WorldWideScience

Sample records for nonpromoter intergenic sequences

  1. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences

    Directory of Open Access Journals (Sweden)

    De Marzo Angelo M

    2011-06-01

    Abstract Background: DNA methylation has been linked to genome regulation and dysregulation in health and disease respectively, and methods for characterizing genomic DNA methylation patterns are rapidly emerging. We have developed/refined methods for enrichment of methylated genomic fragments using the methyl-binding domain of the human MBD2 protein (MBD2-MBD) followed by analysis with high-density tiling microarrays. This MBD-chip approach was used to characterize DNA methylation patterns across all non-repetitive sequences of human chromosomes 21 and 22 at high resolution in normal and malignant prostate cells. Results: Examining these data using computational methods that were designed specifically for DNA methylation tiling array data revealed widespread methylation of both gene promoter and non-promoter regions in cancer and normal cells. In addition to identifying several novel cancer hypermethylated 5' gene upstream regions that mediated epigenetic gene silencing, we also found several hypermethylated 3' gene downstream, intragenic and intergenic regions. The hypermethylated intragenic regions were highly enriched for overlap with intron-exon boundaries, suggesting a possible role in regulation of alternative transcriptional start sites, exon usage and/or splicing. The hypermethylated intergenic regions showed significant enrichment for conservation across vertebrate species. A sampling of these newly identified promoter (ADAMTS1 and SCARF2 genes) and non-promoter (downstream or within DSCR9, C21orf57 and HLCS genes) hypermethylated regions were effective in distinguishing malignant from normal prostate tissues and/or cell lines. Conclusions: Comparison of chromosome-wide DNA methylation patterns in normal and malignant prostate cells revealed significant methylation of gene-proximal and conserved intergenic sequences. Such analyses can be easily extended for genome-wide methylation analysis in health and disease.

  2. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

    Science.gov (United States)

    Francis, Warren R; Wörheide, Gert

    2017-06-01

    One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Local repeat sequence organization of an intergenic spacer

    Indian Academy of Sciences (India)

    The amplification yielded the same uniquely "sequence-scrambled" product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a "unique" new sequence, had lost the repetitive organization of the template genome where it ...

  4. Local repeat sequence organization of an intergenic spacer in the ...

    Indian Academy of Sciences (India)

    Unknown

    chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence ... The discovery of uniparentally inherited streptomycin resistant mutants ... resembles yeast, mitochondrial and phage recombination in that it is typically ...... Sager R and Lane D 1972 Molecular basis of maternal inheritance; Proc.

  5. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Directory of Open Access Journals (Sweden)

    Tran Duc

    2010-05-01

    Abstract Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the

  6. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Smith, David R.; Lee, Robert W.; Cushman, John C.; Magnuson, Jon K.; Tran, Duc; Polle, Juergen E.

    2010-05-07

    Abstract Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the development of a viable

  7. Genomic Variability of Haemophilus influenzae Isolated from Mexican Children Determined by Using Enterobacterial Repetitive Intergenic Consensus Sequences and PCR

    OpenAIRE

    Gomez-De-Leon, Patricia; Santos, Jose I.; Caballero, Javier; Gomez, Demostenes; Espinosa, Luz E.; Moreno, Isabel; Piñero, Daniel; Cravioto, Alejandro

    2000-01-01

    Genomic fingerprints from 92 capsulated and noncapsulated strains of Haemophilus influenzae from Mexican children with different diseases and healthy carriers were generated by PCR using the enterobacterial repetitive intergenic consensus (ERIC) sequences. A cluster analysis by the unweighted pair-group method with arithmetic averages based on the overall similarity as estimated from the characteristics of the genomic fingerprints, was conducted to group the strains. A total of 69 fingerprint...

  8. Intergenic DNA sequences from the human X chromosome reveal high rates of global gene flow

    Directory of Open Access Journals (Sweden)

    Wall Jeffrey D

    2008-11-01

    Abstract Background: Despite intensive efforts devoted to collecting human polymorphism data, little is known about the role of gene flow in the ancestry of human populations. This is partly because most analyses have applied one of two simple models of population structure, the island model or the splitting model, which make unrealistic biological assumptions. Results: Here, we analyze 98-kb of DNA sequence from 20 independently evolving intergenic regions on the X chromosome in a sample of 90 humans from six globally diverse populations. We employ an isolation-with-migration (IM) model, which assumes that populations split and subsequently exchange migrants, to independently estimate effective population sizes and migration rates. While the maximum effective size of modern humans is estimated at ~10,000, individual populations vary substantially in size, with African populations tending to be larger (2,300–9,000) than non-African populations (300–3,300). We estimate mean rates of bidirectional gene flow at 4.8 × 10⁻⁴/generation. Bidirectional migration rates are ~5-fold higher among non-African populations (1.5 × 10⁻³) than among African populations (2.7 × 10⁻⁴). Interestingly, because effective sizes and migration rates are inversely related in African and non-African populations, population migration rates are similar within Africa and Eurasia (e.g., global mean Nm = 2.4). Conclusion: We conclude that gene flow has played an important role in structuring global human populations and that migration rates should be incorporated as critical parameters in models of human demography.

  9. Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin

    2009-05-01

    DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3 % to 2.3 % and an average of 1.2 %. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0 % to 0.1 %. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0 % to 3.1 %, with an average of 2.5 %. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.

  10. Enterobacterial repetitive intergenic consensus sequences and the PCR to generate fingerprints of genomic DNAs from Vibrio cholerae O1, O139, and non-O1 strains.

    OpenAIRE

    Rivera, I G; Chowdhury, M A; Huq, A; Jacobs, D; Martins, M T; Colwell, R R

    1995-01-01

    Enterobacterial repetitive intergenic consensus (ERIC) sequence polymorphism was studied in Vibrio cholerae strains isolated before and after the cholera epidemic in Brazil (in 1991), along with epidemic strains from Peru, Mexico, and India, by PCR. A total of 17 fingerprint patterns (FPs) were detected in the V. cholerae strains examined; 96.7% of the toxigenic V. cholerae O1 strains and 100% of the O139 serogroup strains were found to belong to the same FP group comprising four fragments (F...

  11. Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

    Science.gov (United States)

    Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

    2005-02-01

    We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.

  12. Genomic relationships of Actinobacillus pleuropneumoniae serotype 2 strains evaluated by ribotyping, sequence analysis of ribosomal intergenic regions, and pulsed-field gel electrophoresis

    DEFF Research Database (Denmark)

    Fussing, V.

    1998-01-01

    The aim of the present study was to examine the genomic relationship among 112 Actinobacillus pleuropneumoniae serotype 2 strains obtained throughout Europe and North America. HindIII ribotyping of the strains resulted in five ribotypes of high similarity (87-98%). Sequence analysis of the ribosomal intergenic region of strains representing each ribotype and each country showed no differences. A common ribotype was further characterized by PFGE of 12 strains representing all countries. The resultant five PFGE patterns of European strains showed a similarity of more than 91%, to which the two...

  13. Intergenic sequence between Arabidopsis caseinolytic protease B-cytoplasmic/heat shock protein100 and choline kinase genes functions as a heat-inducible bidirectional promoter.

    Science.gov (United States)

    Mishra, Ratnesh Chandra; Grover, Anil

    2014-11-01

    In Arabidopsis (Arabidopsis thaliana), the At1g74310 locus encodes for caseinolytic protease B-cytoplasmic (ClpB-C)/heat shock protein100 protein (AtClpB-C), which is critical for the acquisition of thermotolerance, and At1g74320 encodes for choline kinase (AtCK2) that catalyzes the first reaction in the Kennedy pathway for phosphatidylcholine biosynthesis. Previous work has established that the knockout mutants of these genes display heat-sensitive phenotypes. While analyzing the AtClpB-C promoter and upstream genomic regions in this study, we noted that AtClpB-C and AtCK2 genes are head-to-head oriented on chromosome 1 of the Arabidopsis genome. Expression analysis showed that transcripts of these genes are rapidly induced in response to heat stress treatment. In stably transformed Arabidopsis plants harboring this intergenic sequence between head-to-head oriented green fluorescent protein and β-glucuronidase reporter genes, both transcripts and proteins of the two reporters were up-regulated upon heat stress. Four heat shock elements were noted in the intergenic region by in silico analysis. In the homozygous transfer DNA insertion mutant Salk_014505, 4,393-bp transfer DNA is inserted at position -517 upstream of ATG of the AtClpB-C gene. As a result, AtCK2 loses proximity to three of the four heat shock elements in the mutant line. Heat-inducible expression of the AtCK2 transcript was completely lost, whereas the expression of AtClpB-C was not affected in the mutant plants. Our results suggest that the 1,329-bp intergenic fragment functions as a heat-inducible bidirectional promoter and the region governing the heat inducibility is possibly shared between the two genes. We propose a model in which AtClpB-C shares its regulatory region with heat-induced choline kinase, which has a possible role in heat signaling. © 2014 American Society of Plant Biologists. All Rights Reserved.

  14. The High Degree of Sequence Plasticity of the Arenavirus Noncoding Intergenic Region (IGR) Enables the Use of a Nonviral Universal Synthetic IGR To Attenuate Arenaviruses.

    Science.gov (United States)

    Iwasaki, Masaharu; Cubitt, Beatrice; Sullivan, Brian M; de la Torre, Juan C

    2016-01-06

    Hemorrhagic fever arenaviruses (HFAs) pose important public health problems in regions where they are endemic. Concerns about human-pathogenic arenaviruses are exacerbated because of the lack of FDA-licensed arenavirus vaccines and because current antiarenaviral therapy is limited to an off-label use of ribavirin that is only partially effective. We have recently shown that the noncoding intergenic region (IGR) present in each arenavirus genome segment, the S and L segments (S-IGR and L-IGR, respectively), plays important roles in the control of virus protein expression and that this knowledge could be harnessed for the development of live-attenuated vaccine strains to combat HFAs. In this study, we further investigated the sequence plasticity of the arenavirus IGR. We demonstrate that recombinants of the prototypic arenavirus lymphocytic choriomeningitis virus (rLCMVs), whose S-IGRs were replaced by the S-IGR of Lassa virus (LASV) or an entirely nonviral S-IGR-like sequence (Ssyn), are viable, indicating that the function of S-IGR tolerates a high degree of sequence plasticity. In addition, rLCMVs whose L-IGRs were replaced by Ssyn or S-IGRs of the very distantly related reptarenavirus Golden Gate virus (GGV) were viable and severely attenuated in vivo but able to elicit protective immunity against a lethal challenge with wild-type LCMV. Our findings indicate that replacement of L-IGR by a nonviral Ssyn could serve as a universal molecular determinant of arenavirus attenuation. Hemorrhagic fever arenaviruses (HFAs) cause high rates of morbidity and mortality and pose important public health problems in regions where they are endemic. Implementation of live-attenuated vaccines (LAVs) will represent a major step to combat HFAs. Here we document that the arenavirus noncoding intergenic region (IGR) has a high degree of plasticity compatible with virus viability. This observation led us to generate recombinant LCMVs containing nonviral synthetic IGRs. These r

  15. Sequence analysis of two alleles reveals that intra- and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo)

    Directory of Open Access Journals (Sweden)

    Budar Françoise

    2010-02-01

    Abstract Background: Land plant genomes contain multiple members of a eukaryote-specific gene family encoding proteins with pentatricopeptide repeat (PPR) motifs. Some PPR proteins were shown to participate in post-transcriptional events involved in organellar gene expression, and this type of function is now thought to be their main biological role. Among PPR genes, restorers of fertility (Rf) of cytoplasmic male sterility systems constitute a peculiar subgroup that is thought to evolve in response to the presence of mitochondrial sterility-inducing genes. Rf genes encoding PPR proteins are associated with very close relatives on complex loci. Results: We sequenced a non-restoring allele (L7rfo) of the Rfo radish locus whose restoring allele (D81Rfo) was previously described, and compared the two alleles and their PPR genes. We identified a ca 13 kb long fragment, likely originating from another part of the radish genome, inserted into the L7rfo sequence. The L7rfo allele carries two genes (PPR-1 and PPR-2) closely related to the three previously described PPR genes of the restorer D81Rfo allele (PPR-A, PPR-B, and PPR-C). Our results indicate that alleles of the Rfo locus have experienced complex evolutionary events, including recombination and insertion of extra-locus sequences, since they diverged. Our analyses strongly suggest that present coding sequences of Rfo PPR genes result from intragenic recombination. We found that the 10 C-terminal PPR repeats in Rfo PPR gene encoded proteins result from the tandem duplication of a 5 PPR repeat block. Conclusions: The Rfo locus appears to experience more complex evolution than its flanking sequences. The Rfo locus and PPR genes therein are likely to evolve as a result of intergenic and intragenic recombination. It is therefore not possible to determine which genes on the two alleles are direct orthologs. Our observations recall some previously reported data on pathogen resistance complex loci.

  16. 16S-23S rDNA intergenic spacer region polymorphism of Lactococcus garvieae, Lactococcus raffinolactis and Lactococcus lactis as revealed by PCR and nucleotide sequence analysis.

    Science.gov (United States)

    Blaiotta, Giuseppe; Pepe, Olimpia; Mauriello, Gianluigi; Villani, Francesco; Andolfi, Rosamaria; Moschetti, Giancarlo

    2002-12-01

    The intergenic spacer region (ISR) between the 16S and 23S rRNA genes was tested as a tool for differentiating lactococci commonly isolated in a dairy environment. Seventeen reference strains, representing 11 different species belonging to the genera Lactococcus, Streptococcus, Lactobacillus, Enterococcus and Leuconostoc, and 127 wild streptococcal strains isolated during the whole fermentation process of "Fior di Latte" cheese were analyzed. After 16S-23S rDNA ISR amplification by PCR, species or genus-specific patterns were obtained for most of the reference strains tested. Moreover, results obtained after nucleotide analysis show that the 16S-23S rDNA ISR sequences vary greatly, in size and sequence, among Lactococcus garvieae, Lactococcus raffinolactis, Lactococcus lactis as well as other streptococci from dairy environments. Because of the high degree of inter-specific polymorphism observed, 16S-23S rDNA ISR can be considered a good potential target for selecting species-specific molecular assays, such as PCR primers or probes, for a rapid and extremely reliable differentiation of dairy lactococcal isolates.

  17. Phylogenetic relationships in the genus Leonardoxa (Leguminosae: Caesalpinioideae) inferred from chloroplast trnL intron and trnL-trnF intergenic spacer sequences.

    Science.gov (United States)

    Brouat, Carine; Gielly, Ludovic; McKey, Doyle

    2001-01-01

    The African genus Leonardoxa (Leguminosae: Caesalpinioideae) comprises two Congolean species and a group of four mostly allopatric subspecies principally located in Cameroon and clustered together in the L. africana complex. Leonardoxa provides a good opportunity to investigate the evolutionary history of ant-plant mutualisms, as it exhibits various grades of ant-plant interactions from diffuse to obligate and symbiotic associations. We present in this paper the first molecular phylogenetic study of this genus. We sequenced both the chloroplast DNA trnL intron (677 aligned base pairs [bp]) and trnL-trnF intergene spacer (598 aligned bp). Inferred phylogenetic relationships suggested first that the genus is paraphyletic. The L. africana complex is clearly separated from the two Congolean species, and the integrity of the genus is thus in question. In the L. africana complex, our data showed a lack of congruence between clades suggested by morphological and chloroplast characters. This, and the low level of molecular divergence found between subspecies, suggests gene flow and introgressive events in the L. africana complex.

  18. Novel Bacteriocinogenic Lactobacillus plantarum Strains and Their Differentiation by Sequence Analysis of 16S rDNA, 16S-23S and 23S-5S Intergenic Spacer Regions and Randomly Amplified Polymorphic DNA Analysis

    Directory of Open Access Journals (Sweden)

    Morteza Shojaei Moghadam

    2010-01-01

    Six strains of bacteriocinogenic Lactobacillus plantarum (TL1, RG11, RS5, UL4, RG14 and RI11) isolated from Malaysian foods were investigated for their structural bacteriocin genes. A new combination of plantaricin EF and plantaricin W bacteriocin structural genes was successfully amplified from all studied strains, suggesting that they were novel bacteriocin-producing L. plantarum strains. A four-base pair variable region was detected in the short 16S-23S intergenic spacer regions of the studied strains by a comparative analysis with 17 L. plantarum strains deposited in the GenBank, implying they were new genotypes. The studied L. plantarum strains were subsequently differentiated into four groups on the basis of the detected four-base pair variable region of the short 16S-23S intergenic spacer region. Further analysis of the DNA sequence of 23S-5S intergenic spacer region revealed only one type of 23S-5S intergenic spacer region present in the studied strains, indicating it was highly conserved among the studied L. plantarum strains. Three randomly amplified polymorphic DNA experiments using three different combinations of arbitrary primers successfully differentiated the studied L. plantarum strains from each other, confirming they were different strains. In conclusion, the studied L. plantarum strains were shown to be novel bacteriocin producers and high level of strain discrimination could be achieved with a combination of randomly amplified polymorphic DNA analysis and the analysis of the variable region of short 16S-23S intergenic spacer region present in L. plantarum strains.

  19. Enterobacterial repetitive intergenic consensus sequences and the PCR to generate fingerprints of genomic DNAs from Vibrio cholerae O1, O139, and non-O1 strains.

    Science.gov (United States)

    Rivera, I G; Chowdhury, M A; Huq, A; Jacobs, D; Martins, M T; Colwell, R R

    1995-08-01

    Enterobacterial repetitive intergenic consensus (ERIC) sequence polymorphism was studied in Vibrio cholerae strains isolated before and after the cholera epidemic in Brazil (in 1991), along with epidemic strains from Peru, Mexico, and India, by PCR. A total of 17 fingerprint patterns (FPs) were detected in the V. cholerae strains examined; 96.7% of the toxigenic V. cholerae O1 strains and 100% of the O139 serogroup strains were found to belong to the same FP group comprising four fragments (FP1). The nontoxigenic V. cholerae O1 also yielded four fragments but constituted a different FP group (FP2). A total of 15 different patterns were observed among the V. cholerae non-O1 strains. Two patterns were observed most frequently for V. cholerae non-O1 strains, 25% of which have FP3, with five fragments, and 16.7% of which have FP4, with two fragments. Three fragments, 1.75, 0.79, and 0.5 kb, were found to be common to both toxigenic and nontoxigenic V. cholerae O1 strains as well as to group FP3, containing V. cholerae non-O1 strains. Two fragments of group FP3, 1.3 and 1.0 kb, were present in FP1 and FP2 respectively. The 0.5-kb fragment was common to all strains and serogroups of V. cholerae analyzed. It is concluded from the results of this study, based on DNA FPs of environmental isolates, that it is possible to detect an emerging virulent strain in a cholera-endemic region. ERIC-PCR constitutes a powerful tool for determination of the virulence potential of V. cholerae O1 strains isolated in surveillance programs and for molecular epidemiological investigations.

  20. A one-step reaction for the rapid identification of Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti using oligonucleotide primers designed from the 16S-23S rRNA intergenic sequences.

    Science.gov (United States)

    Ferchichi, M; Valcheva, R; Prévost, H; Onno, B; Dousset, X

    2008-06-01

    Species-specific primers targeting the 16S-23S ribosomal DNA (rDNA) intergenic spacer region (ISR) were designed to rapidly discriminate between Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti species recently isolated from French sourdough. The 16S-23S ISRs were amplified using primers 16S/p2 and 23S/p7, which anneal to positions 1388-1406 of the 16S rRNA gene and to positions 207-189 of the 23S rRNA gene respectively, Escherichia coli numbering (GenBank accession number V00331). Clone libraries of the resulting amplicons were constructed using a pCR2.1 TA cloning kit and sequenced. Species-specific primers were designed based on the sequences obtained and were used to amplify the 16S-23S ISR in the Lactobacillus species considered. For all of them, two PCR amplicons, designated as small ISR (S-ISR) and large ISR (L-ISR), were obtained. The L-ISR is composed of the corresponding S-ISR, interrupted by a sequence containing tRNA(Ile) and tRNA(Ala) genes. Based on these sequences, species-specific primers were designed and proved to identify accurately the species considered among 30 reference Lactobacillus species tested. Designed species-specific primers enable a rapid and accurate identification of L. mindensis, L. paralimentarius, L. panis, L. pontis and L. frumenti species among other lactobacilli. The proposed method provides a powerful and convenient means of rapidly identifying some sourdough lactobacilli, which could be of help in large starter culture surveys.

  1. [Phylogenetic relationships of the species of Oxytropis DC. subg. Oxytropis and Phacoxytropis (Fabaceae) from Asian Russia inferred from the nucleotide sequence analysis of the intergenic spacers of the chloroplast genome].

    Science.gov (United States)

    Kholina, A B; Kozyrenko, M M; Artyukova, E V; Sandanov, D V; Andrianova, E A

    2016-08-01

    The nucleotide sequence analysis of trnH–psbA, trnL–trnF, and trnS–trnG intergenic spacer regions of chloroplast DNA performed in the representatives of the genus Oxytropis from Asian Russia provided clarification of the phylogenetic relationships of some species and sections in the subgenera Oxytropis and Phacoxytropis and in the genus Oxytropis as a whole. Only the section Mesogaea corresponds to the subgenus Phacoxytropis, while the section Janthina of the same subgenus groups together with the sections of the subgenus Oxytropis. The sections Chrysantha and Ortholoma of the subgenus Oxytropis are not only closely related to each other, but together with the section Mesogaea, they are grouped into the subgenus Phacoxytropis. It seems likely that the sections Chrysantha and Ortholoma should be assigned to the subgenus Phacoxytropis, and the section Janthina should be assigned to the subgenus Oxytropis. The molecular differences were identified between O. coerulea and O. mandshurica from the section Janthina that were indicative of considerable divergence of their chloroplast genomes and the species independence of the taxa. The species independence of O. czukotica belonging to the section Arctobia was also confirmed.

  2. Analysis of a new strain of Euphorbia mosaic virus with distinct replication specificity unveils a lineage of begomoviruses with short Rep sequences in the DNA-B intergenic region

    Directory of Open Access Journals (Sweden)

    Argüello-Astorga Gerardo R

    2010-10-01

    Abstract. Background: Euphorbia mosaic virus (EuMV) is a member of the SLCV clade, a lineage of New World begomoviruses that display distinctive features in their replication-associated protein (Rep) and virion-strand replication origin. The first fully characterized EuMV isolate is native to the Yucatan Peninsula, Mexico; subsequently, EuMV was detected in weeds and pepper plants from another region of Mexico, and partial DNA-A sequences revealed significant differences in their putative replication specificity determinants with respect to EuMV-YP. This study aimed to investigate the replication compatibility between two EuMV isolates from the same country. Results: A new isolate of EuMV was obtained from pepper plants collected in Jalisco, Mexico. Full-length clones of both genomic components of EuMV-Jal were biolistically inoculated into plants of three different species, which developed symptoms indistinguishable from those induced by EuMV-YP. Pseudorecombination experiments with EuMV-Jal and EuMV-YP genomic components demonstrated that these viruses do not form infectious reassortants in Nicotiana benthamiana, presumably because of Rep-iteron incompatibility. Sequence analysis of the EuMV-Jal DNA-B intergenic region (IR) led to the unexpected discovery of a 35-nt-long sequence that is identical to a segment of the rep gene in the cognate viral DNA-A. Similar short rep sequences ranging from 35 to 51 nt in length were identified in all EuMV isolates and in three distinct viruses from South America related to EuMV. These short rep sequences in the DNA-B IR are positioned downstream of a ~160-nt non-coding domain highly similar to the CP promoter of begomoviruses belonging to the SLCV clade. Conclusions: EuMV strains are not compatible in replication, indicating that this begomovirus species probably is not a replicating lineage in nature. The genomic analysis of EuMV-Jal led to the discovery of a subgroup of SLCV clade viruses that contain in

  3. Differentiation of Actinobacillus pleuropneumoniae strains by sequence analysis of 16S rDNA and ribosomal intergenic regions, and development of a species specific oligonucleotide for in situ detection

    DEFF Research Database (Denmark)

    Fussing, Vivian; Paster, Bruce J.; Dewhirst, Floyd E.

    1998-01-01

    The aims of this study were to characterize and determine intraspecies and interspecies relatedness of Actinobacillus pleuropneumoniae to Actinobacillus lignieresii and Actinobacillus suis by sequence analysis of the ribosomal operon and to find a species-specific area for in situ detection of A... The larger RIS's were different between the 3 species tested. The sequence of the 16S ribosomal gene was determined for 8 serotypes of A. pleuropneumoniae. These sequences showed only minor base differences, indicating a close genetic relatedness of these serotypes within the species. An oligonucleotide DNA probe designed from the 16S rRNA gene sequence of A. pleuropneumoniae was specific for all strains of the target species and did not cross react with A. lignieresii, the closest known relative of A. pleuropneumoniae. This species-specific DNA probe labeled with fluorescein was used for in situ...

  4. [Identification of medicinal plant Dendrobium based on the chloroplast psbK-psbI intergenic spacer].

    Science.gov (United States)

    Yao, Hui; Yang, Pei; Zhou, Hong; Ma, Shuang-jiao; Song, Jing-yuan; Chen, Shi-lin

    2015-06-01

    In this paper, the chloroplast psbK-psbI intergenic spacers of 18 species of Dendrobium and their adulterants were amplified and sequenced, and the sequence characteristics were then analyzed. The sequence lengths of the chloroplast psbK-psbI regions of Dendrobium ranged from 474 to 513 bp and the GC contents were 25.4%-27.6%. There were 71 variable sites, of which 46 were informative. The inter-specific genetic distances of Dendrobium calculated with the Kimura 2-parameter (K2P) model were 0.0061-0.0581, with an average of 0.0284. The K2P genetic distances between Dendrobium species and Bulbophyllum odoratissimum were 0.0932-0.1204. The NJ tree showed that the Dendrobium species can be easily differentiated from each other, and 6 samples of the inspected Dendrobium species were identified successfully by sequencing the psbK-psbI intergenic spacer. Therefore, the chloroplast psbK-psbI intergenic spacer can be used as a candidate marker to identify Dendrobium species and their adulterants.
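
    The two summary statistics quoted in this abstract, GC content and the Kimura 2-parameter (K2P) distance, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the input is assumed to be a pair of already aligned sequences, and alignment columns containing gaps or ambiguous bases are simply skipped. The K2P formula used is d = -0.5 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases of seq (hypothetical helper)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "GC" for b in bases) / len(bases)


def k2p_distance(seq1, seq2):
    """K2P distance between two aligned sequences of equal length.

    Columns with gaps or ambiguity codes are ignored (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    p = sum((a, b) in TRANSITIONS for a, b in pairs) / n          # transitions
    q = sum(a != b and (a, b) not in TRANSITIONS for a, b in pairs) / n  # transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    For example, two aligned 10-bp sequences that differ by a single A/G transition give P = 0.1 and Q = 0, so the distance is -0.5 ln(0.8), slightly above the raw mismatch proportion of 0.1.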

  5. Genotypic Characterization of Bradyrhizobium Strains Nodulating Endemic Woody Legumes of the Canary Islands by PCR-Restriction Fragment Length Polymorphism Analysis of Genes Encoding 16S rRNA (16S rDNA) and 16S-23S rDNA Intergenic Spacers, Repetitive Extragenic Palindromic PCR Genomic Fingerprinting, and Partial 16S rDNA Sequencing

    Science.gov (United States)

    Vinuesa, Pablo; Rademaker, Jan L. W.; de Bruijn, Frans J.; Werner, Dietrich

    1998-01-01

    We present a phylogenetic analysis of nine strains of symbiotic nitrogen-fixing bacteria isolated from nodules of tagasaste (Chamaecytisus proliferus) and other endemic woody legumes of the Canary Islands, Spain. These and several reference strains were characterized genotypically at different levels of taxonomic resolution by computer-assisted analysis of 16S ribosomal DNA (rDNA) PCR-restriction fragment length polymorphisms (PCR-RFLPs), 16S-23S rDNA intergenic spacer (IGS) RFLPs, and repetitive extragenic palindromic PCR (rep-PCR) genomic fingerprints with BOX, ERIC, and REP primers. Cluster analysis of 16S rDNA restriction patterns with four tetrameric endonucleases grouped the Canarian isolates with the two reference strains, Bradyrhizobium japonicum USDA 110spc4 and Bradyrhizobium sp. strain (Centrosema) CIAT 3101, resolving three genotypes within these bradyrhizobia. In the analysis of IGS RFLPs with three enzymes, six groups were found, whereas rep-PCR fingerprinting revealed an even greater genotypic diversity, with only two of the Canarian strains having similar fingerprints. Furthermore, we show that IGS RFLPs and even very dissimilar rep-PCR fingerprints can be clustered into phylogenetically sound groupings by combining them with 16S rDNA RFLPs in computer-assisted cluster analysis of electrophoretic patterns. The DNA sequence analysis of a highly variable 264-bp segment of the 16S rRNA genes of these strains was found to be consistent with the fingerprint-based classification. Three different DNA sequences were obtained, one of which was not previously described, and all belonged to the B. japonicum/Rhodopseudomonas rDNA cluster. Nodulation assays revealed that none of the Canarian isolates nodulated Glycine max or Leucaena leucocephala, but all nodulated Acacia pendula, C. proliferus, Macroptilium atropurpureum, and Vigna unguiculata. PMID:9603820

  6. Intergenic and repeat transcription in human, chimpanzee and macaque brains measured by RNA-Seq.

    Directory of Open Access Journals (Sweden)

    Augix Guohua Xu

    Transcription is the first step connecting genetic information with an organism's phenotype. While expression of annotated genes in the human brain has been characterized extensively, our knowledge about the scope and the conservation of transcripts located outside of the known genes' boundaries is limited. Here, we use high-throughput transcriptome sequencing (RNA-Seq) to characterize the total non-ribosomal transcriptome of human, chimpanzee, and rhesus macaque brain. In all species, only 20-28% of non-ribosomal transcripts correspond to annotated exons and 20-23% to introns. By contrast, transcripts originating within intronic and intergenic repetitive sequences constitute 40-48% of the total brain transcriptome. Notably, some repeat families show elevated transcription. In non-repetitive intergenic regions, we identify and characterize 1,093 distinct regions highly expressed in the human brain. These regions are conserved at the RNA expression level across the primates studied and at the DNA sequence level across mammals. A large proportion of these transcripts (20%) represents 3'UTR extensions of known genes and may play roles in alternative microRNA-directed regulation. Finally, we show that while transcriptome divergence between species increases with evolutionary time, intergenic transcripts show more expression differences among species and exons show fewer. Our results show that many as yet uncharacterized, evolutionarily conserved transcripts exist in the human brain. Some of these transcripts may play roles in transcriptional regulation and contribute to the evolution of human-specific phenotypic traits.

  7. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil

    NARCIS (Netherlands)

    de Oliveira, VM; Manfio, GP; Coutinho, HLD; Keijzer-Wolters, AC; van Elsas, JD

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was

  8. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil

    NARCIS (Netherlands)

    Oliveira, de V.M.; Manfio, G.P.; Coutinho, H.L.D.; Keijzer-Wolters, A.C.; Elsas, van J.D.

    2006-01-01

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was

  9. Species identification of medicinal pteridophytes by a DNA barcode marker, the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Ma, Xin-Ye; Xie, Cai-Xiang; Liu, Chang; Song, Jing-Yuan; Yao, Hui; Luo, Kun; Zhu, Ying-Jie; Gao, Ting; Pang, Xiao-Hui; Qian, Jun; Chen, Shi-Lin

    2010-01-01

    Medicinal pteridophytes are an important group used in traditional Chinese medicine; however, there is no simple and universal way to differentiate various species of this group by morphological traits. A novel technology termed "DNA barcoding" could discriminate species by a standard DNA sequence with universal primers and sufficient variation. To determine whether DNA barcoding would be effective for differentiating pteridophyte species, we first analyzed five DNA sequence markers (psbA-trnH intergenic region, rbcL, rpoB, rpoC1, and matK) using six chloroplast genomic sequences from GenBank and found the psbA-trnH intergenic region to be the best candidate in terms of availability of universal primers. Next, we amplified the psbA-trnH region from 79 samples of medicinal pteridophyte plants. These samples represented 51 species from 24 families, including all the authentic pteridophyte species listed in the Chinese pharmacopoeia (2005 version) and some commonly used adulterants. We found that the sequence of the psbA-trnH intergenic region can be determined with both high polymerase chain reaction (PCR) amplification efficiency (94.1%) and high direct sequencing success rate (81.3%). Combined with GenBank data (54 species across 12 pteridophyte families), species discriminative power analysis showed that 90.2% of species could be separated/identified successfully by the TaxonGap method in conjunction with the Basic Local Alignment Search Tool 1 (BLAST1) method. The TaxonGap method results further showed that, for 37 out of 39 separable species with at least two samples each, between-species variation was higher than the relevant within-species variation. Thus, the psbA-trnH intergenic region is a suitable DNA marker for species identification in medicinal pteridophytes.

  10. Intergenic disease-associated regions are abundant in novel transcripts.

    Science.gov (United States)

    Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E

    2017-12-28

    Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.

  11. Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

    Science.gov (United States)

    Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

    2016-11-01

    Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion and separated by an intergenic spacer (IGS). These arrays make up 5% of the A. thaliana genome. IGS are rapidly evolving sequences, and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which makes it possible to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.

  12. Adaptation of the short intergenic spacers between co-directional genes to the Shine-Dalgarno motif among prokaryote genomes

    DEFF Research Database (Denmark)

    Caro, Albert Pallejà; García-Vallvé, Santiago; Romeu, Antoni

    2009-01-01

    ABSTRACT: BACKGROUND: In prokaryote genomes most of the co-directional genes are in close proximity. Even the coding sequence or the stop codon of a gene can overlap with the Shine-Dalgarno (SD) sequence of the downstream co-directional gene. In this paper we analyze how the presence of SD may influence the stop codon usage or the spacing lengths between co-directional genes. RESULTS: The SD sequences for 530 prokaryote genomes have been predicted using computer calculations of the base-pairing free energy between translation initiation regions and the 16S rRNA 3' tail. Genomes with a large... to the discussion of which factors affect the intergenic lengths, which cannot be totally explained by the pressure to compact the prokaryote genomes...

  13. Comparative phylogenetic analysis of intergenic spacers and small ...

    African Journals Online (AJOL)

    The phylogenetic analysis of test isolates included assessment of variation in sequences and length of IGS and SSU-rRNA genes with reference to 16 different microsporidian sequences. The results proved that IGS sequences have more variation than SSU-rRNA gene sequences. Analysis of phylogenetic trees reveal that ...

  14. Promoter2.0: for the recognition of PolII promoter sequences

    DEFF Research Database (Denmark)

    Knudsen, Steen

    1999-01-01

    Motivation: A new approach to the prediction of eukaryotic PolII promoters from DNA sequence takes advantage of a combination of elements similar to neural networks and genetic algorithms to recognize a set of discrete subpatterns with variable separation as one pattern: a promoter. The neural... of optimization, the algorithm was able to discriminate between vertebrate promoter and non-promoter sequences in a test set with a correlation coefficient of 0.63. In addition, all five known transcription start sites on the plus strand of the complete adenovirus genome were within 161 bp of 35 predicted...

  15. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.).

    Science.gov (United States)

    Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A

    2016-01-01

    The present investigation aimed to use bioinformatics tools to identify and characterize simple sequence repeats (SSRs) within the third version of the date palm genome and to develop a new SSR primer database. In addition, single nucleotide polymorphisms (SNPs) located within the SSR flanking regions were identified. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and the gene interactions were determined. A total of 172,075 SSR motifs were identified in the date palm genome sequence, with a frequency of 450.97 SSRs per Mb. Of these, 130,014 SSRs (75.6%) were located within the intergenic regions, with a frequency of 499 SSRs per Mb, while only 42,061 SSRs (24.4%) were located within the genic regions, with a frequency of 347.5 SSRs per Mb. A total of 111,403 SSR primer pairs were designed, representing 291.9 SSR primers per Mb. Of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A total of 250,507 SNPs were identified in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes, only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using the KEGG database. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. For in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.
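
    As a rough illustration of the kind of scan summarized above, the Python sketch below finds perfect SSRs with a regular expression and converts a count into a per-Mb frequency. It is a minimal sketch, not the tool used in the study: the function names, the maximum motif length of 6 bp and the minimum of 5 repeats are all assumptions made for the example, and real SSR miners usually apply motif-specific thresholds.

```python
import re


def find_ssrs(seq, min_repeats=5, max_unit=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats in seq.

    min_repeats and max_unit are illustrative defaults, not values from the study.
    """
    # Lazy unit match so the shortest motif is preferred (e.g. "AT", not "ATAT"),
    # followed by at least (min_repeats - 1) further copies of that motif.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


def ssrs_per_mb(count, length_bp):
    """SSR frequency per megabase for a region of length_bp base pairs."""
    return count / (length_bp / 1_000_000)
```

    Calling find_ssrs on a sequence such as "GG" followed by five copies of "AC" and then "TT" reports one dinucleotide repeat starting at offset 2. Applying ssrs_per_mb separately to genic and intergenic totals gives the kind of per-Mb comparison reported in the abstract.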

  16. Intergenic mRNA molecules resulting from trans-splicing.

    Science.gov (United States)

    Finta, Csaba; Zaphiropoulos, Peter G

    2002-02-22

    Recent evidence indicates that alternative splicing is a generalized process that increases the complexity of human gene expression. Here we show that mRNA production may not necessarily be limited to single genes, as human liver also has the potential to produce a variety of hybrid cytochrome P450 3A mRNA molecules. The four known cytochrome P450 3A genes in humans, CYP3A4, CYP3A5, CYP3A7, and CYP3A43, share a high degree of similarity, consist of 13 exons with conserved exon-intron boundaries, and form a cluster on chromosome 7. The chimeric CYP3A mRNA molecules described herein are characterized by CYP3A43 exon 1 joined at canonical splice sites to distinct sets of CYP3A4 or CYP3A5 exons. Because the CYP3A43 gene is in a head-to-head orientation with the CYP3A4 and CYP3A5 genes, bypassing transcriptional termination cannot account for the formation of hybrid CYP3A mRNAs. Thus, the mechanism generating these molecules has to be an RNA processing event that joins exons of independent pre-mRNA molecules, i.e. trans-splicing. Using quantitative real-time polymerase chain reaction, the ratio of one CYP3A43/3A4 intergenic combination was estimated to be approximately 0.15% that of the CYP3A43 mRNAs. Moreover, trans-splicing has been found not to interfere with polyadenylation. Heterologous expression of the chimeric species composed of CYP3A43 exon 1 joined to exons 2-13 of CYP3A4 revealed catalytic activity toward testosterone.

  17. Inferring a role for methylation of intergenic DNA in the regulation of genes aberrantly expressed in precursor B-cell acute lymphoblastic leukemia.

    Science.gov (United States)

    Almamun, Md; Kholod, Olha; Stuckel, Alexei J; Levinson, Benjamin T; Johnson, Nathan T; Arthur, Gerald L; Davis, J Wade; Taylor, Kristen H

    2017-09-01

    A complete understanding of the mechanisms involved in the development of pre-B ALL is lacking. In this study, we integrated DNA methylation data and gene expression data to elucidate the impact of aberrant intergenic DNA methylation on gene expression in pre-B ALL. We found a subset of differentially methylated intergenic loci that were associated with altered gene expression in pre-B ALL patients. Notably, 84% of these regions were also bound by transcription factors (TF) known to play roles in differentiation and B-cell development in a lymphoblastoid cell line. Further, an overall downregulation of eRNA transcripts was observed in pre-B ALL patients and these transcripts were associated with the downregulation of putative target genes involved in B-cell migration, proliferation, and apoptosis. The identification of novel putative regulatory regions highlights the significance of intergenic DNA sequences and may contribute to the identification of new therapeutic targets for the treatment of pre-B ALL.

  18. Systematically profiling and annotating long intergenic non-coding RNAs in human embryonic stem cell.

    Science.gov (United States)

    Tang, Xing; Hou, Mei; Ding, Yang; Li, Zhaohui; Ren, Lichen; Gao, Ge

    2013-01-01

    While more and more long intergenic non-coding RNAs (lincRNAs) have been found to play important roles in both maintaining pluripotency and regulating differentiation, how these lincRNAs may define and drive cell fate decisions on a global scale is still mostly elusive. Systematic profiling and comprehensive annotation of embryonic stem cell lincRNAs may not only give a clearer overall picture of these novel regulators but also shed light on their functions. Based on multiple RNA-Seq datasets, we systematically identified 300 human embryonic stem cell lincRNAs (hES lincRNAs). Of these, about one fourth (78 out of 300) were further found to be preferentially expressed in human ES cells. Functional analysis showed that they were preferentially involved in several biological processes related to early development. Comparative genomics analysis further suggested that around half of the identified hES lincRNAs were conserved in mouse. To facilitate further investigation of these hES lincRNAs, we constructed an online portal for biologists to access all their sequences and annotations interactively. In addition to navigating through a genome browser interface, users can also locate lincRNAs through an advanced query interface based on both keywords and expression profiles, and analyze results through multiple tools. By integrating multiple RNA-Seq datasets, we systematically characterized and annotated 300 hES lincRNAs. A fully functional web portal is freely available at http://scbrowse.cbi.pku.edu.cn. As the first global profiling and annotation of human embryonic stem cell lincRNAs, this work aims to provide a valuable resource for both experimental biologists and bioinformaticians.

  19. An improved PCR method for direct identification of Porphyra (Bangiales, Rhodophyta) using conchocelis based on a RUBISCO intergenic spacer

    Science.gov (United States)

    Wang, Chao; Dong, Dong; Wang, Guangce; Zhang, Baoyu; Peng, Guang; Xu, Pu; Tang, Xiaorong

    2009-09-01

    An improved PCR method, in which a small segment of conchocelis is amplified directly without DNA extraction, was used to amplify a RUBISCO intergenic spacer DNA fragment from nine species of the red algal genus Porphyra (Bangiales, Rhodophyta), including Porphyra yezoensis (Jiangsu, China), P. haitanensis (Fujian, China), P. oligospermatangia (Qingdao, China), P. katadai (Qingdao, China), P. tenera (Qingdao, China), P. suborbiculata (Fujian, China), P. pseudolinearis (Kogendo, Korea), P. linearis (Devon, England), and P. fallax (Seattle, USA). Standard PCR and the method developed here were both conducted using primers specific for the RUBISCO spacer region, after which the two PCR products were sequenced. The sequencing data of the amplicons obtained using both methods were identical, suggesting that the improved PCR method was functional. These findings indicate that the method developed here may be useful for the rapid identification of species of Porphyra in a germplasm bank. In addition, a phylogenetic tree was constructed using the RUBISCO spacer and partial rbcS sequence, and the results were concordant with possible alternative phylogenies based on traditional morphological taxonomic characteristics, indicating that the RUBISCO spacer is a useful region for phylogenetic studies.

  20. A New Intergenic α-Globin Deletion (α-αΔ125) Found in a Kabyle Population.

    Science.gov (United States)

    Singh, Amrathlal Rabbind; Lacan, Philippe; Cadet, Estelle; Bignet, Patricia; Dumesnil, Cécile; Vannier, Jean-Pierre; Joly, Philippe; Rochette, Jacques

    2016-01-01

    We have identified a deletion of 125 bp (α-α(Δ125)) (NG_000006.1: g.37040_37164del) in the α-globin gene cluster in a Kabyle population. A combination of singlex and multiplex polymerase chain reaction (PCR)-based assays was used to identify the molecular defect. Sequencing of the abnormal PCR amplification product revealed a novel α1-globin promoter deletion. The endpoints of the deletion were characterized by sequencing the deletion junctions of the mutated allele. The observed deletion was located 378 bp upstream of the α1-globin gene transcription initiation site and leaves the α2 gene intact. In some patients, the α-α(Δ125) deletion was shown to segregate with Hb S (HBB: c.20A>T) and/or Hb C (HBB: c.19G>A) or a β-thalassemic allele. The α-α(Δ125) deletion has no discernible effect on red cell indices when inherited with no other abnormal globin genes. The family study demonstrated that the deletion is heritable. This is the only example of an intergenic α2-α1 non-coding DNA deletion that leaves the α2-globin gene and the α1 coding part intact.

  1. Genome-wide identification and characterization of long intergenic non-coding RNAs in Ganoderma lucidum.

    Directory of Open Access Journals (Sweden)

    Jianqin Li

    Ganoderma lucidum is a white-rot fungus best known for its medicinal activities. We have previously sequenced its genome and annotated the protein-coding genes. However, long non-coding RNAs in the G. lucidum genome have not been analyzed. In this study, we have systematically identified and characterized long intergenic non-coding RNAs (lincRNAs) in G. lucidum. We developed a computational pipeline, which was used to analyze RNA-Seq data derived from G. lucidum samples collected from three developmental stages. A total of 402 lincRNA candidates were identified, with an average length of 609 bp. Analysis of their adjacent protein-coding genes (apcGenes) revealed that 46 apcGenes belong to the pathways of triterpenoid biosynthesis and lignin degradation, or to the families of cytochrome P450, mating type B genes, and carbohydrate-active enzymes. To determine whether lincRNAs and these apcGenes have any interactions, the corresponding pairs of lincRNAs and apcGenes were analyzed in detail. We developed a modified 3' RACE method to analyze the transcriptional direction of a transcript. Among the 46 lincRNAs, 37 were found to be unidirectionally transcribed and 9 bidirectionally transcribed. The expression profiles of 16 of these 37 lincRNAs were found to be highly correlated with those of the apcGenes across the three developmental stages. Among them, 11 are positively correlated (r>0.8) and 5 are negatively correlated (r<-0.8). The co-localization and co-expression of lincRNAs and those apcGenes playing important functions is consistent with the notion that lincRNAs might be important regulators of cellular processes. In summary, this represents the very first study to identify and characterize lincRNAs in the genomes of basidiomycetes. The results obtained here have laid the foundation for the study of potential lincRNA-mediated regulation of gene expression in G. lucidum.
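
    The co-expression step described above can be sketched in Python as follows. The abstract gives only the thresholds (r>0.8 and r<-0.8); the use of the Pearson coefficient, the function names and the input format are assumptions of this sketch rather than details confirmed by the source.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


def classify_pair(linc_expr, gene_expr, threshold=0.8):
    """Label a lincRNA/apcGene pair using the |r| > 0.8 cut-off quoted in the abstract."""
    r = pearson_r(linc_expr, gene_expr)
    if r > threshold:
        return "positive"
    if r < -threshold:
        return "negative"
    return "uncorrelated"
```

    Each profile here would hold one expression value per developmental stage, for example classify_pair([1, 2, 3], [3, 2, 1]) for a hypothetical pair whose expression moves in opposite directions. With only three stages the coefficient is estimated from three points, so high values of |r| should be read cautiously.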

  2. Effects of near-ultraviolet light on mutations, intragenic and intergenic recombinations in Saccharomyces cerevisiae

    International Nuclear Information System (INIS)

    Machida, Isamu; Saeki, Tetsuya; Nakai, Sayaka

    1986-01-01

    The effects of far and near ultraviolet light on mutations, intragenic and intergenic recombinations were compared in diploid strains of Saccharomyces cerevisiae. At equivalent survival levels there was not much difference in the induction of nonsense and missense mutations between far- and near-UV radiations. However, frameshift mutations were induced more frequently by near-UV than by far-UV radiation. Near-UV radiation induced intragenic recombination as efficiently as far-UV radiation. A strikingly higher frequency was observed for the intergenic recombination induced by near-UV radiation than by far-UV radiation when compared at equivalent survival levels. Photoreactivation reduced the frequency only slightly in far-UV induced intergenic recombination and not at all in near-UV induction. These results indicate that near-UV damage involves strand breakage in addition to pyrimidine dimers and other lesions induced, whereas far-UV damage consists largely of photoreactivable lesions, pyrimidine dimers, and near-UV induced damage is more efficient for the induction of crossing-over. (Auth.)

  3. Evaluation of Automated Ribosomal Intergenic Spacer Analysis for Bacterial Fingerprinting of Rumen Microbiome Compared to Pyrosequencing Technology

    Directory of Open Access Journals (Sweden)

    Elie Jami

    2014-01-01

    Full Text Available The mammalian gut houses a complex microbial community which is believed to play a significant role in host physiology. In recent years, several microbial community analysis methods have been implemented to study the whole gut microbial environment, in contrast to classical microbiological methods focusing on bacteria which can be cultivated. One of these is automated ribosomal intergenic spacer analysis (ARISA), an inexpensive and popular way of analyzing bacterial diversity and community fingerprinting in ecological samples. ARISA uses the natural variability in length of the DNA fragment found between the 16S and 23S genes in different bacterial lineages to infer diversity. This method is now being supplanted by affordable next-generation sequencing technologies that can also simultaneously annotate operational taxonomic units for taxonomic identification. We compared ARISA and pyrosequencing of samples from the rumen microbiome of cows, previously sampled at different stages of development and varying in microbial complexity using several ecological parameters. We revealed close agreement between ARISA and pyrosequencing outputs, especially in their ability to discriminate samples from different ecological niches. In contrast, the ARISA method seemed to underestimate sample richness. The good performance of the relatively inexpensive ARISA makes it relevant for straightforward use in bacterial fingerprinting analysis as well as for quick cross-validation of pyrosequencing data.

  4. Molecular characterization of a novel bat-associated circovirus with a poly-T tract in the 3' intergenic region.

    Science.gov (United States)

    Zhu, Aiwei; Jiang, Tinglei; Hu, Tingsong; Mi, Shijiang; Zhao, Zihan; Zhang, Fuqiang; Feng, Jiang; Fan, Quanshui; He, Biao; Tu, Changchun

    2018-05-02

    The family Circoviridae comprises a large group of small circular single-stranded DNA viruses with several members causing severe pig and poultry diseases. In recent years the number of new viruses within the family has had an explosive increase showing a high level of genetic diversity and a broad host range. In this report we describe two more circoviruses identified from bats in Yunnan and Heilongjiang provinces in China. Full genome sequencing has revealed that these bat associated circoviruses (bat ACV) should be classified as new species within the genus Circovirus based on the demarcation criteria of the International Committee on the Taxonomy of Viruses (ICTV). The most striking result is the novel finding of a 21-28 nt polythymidine (poly-T) tract in the 3' terminal intergenic region of bat ACV isolates from Heilongjiang province. To understand its role in viral replication, a wild type bat ACV and a mutated version with the entire poly-T deleted were rescued through construction of infectious clones. Replication comparison in vitro showed that the poly-T is not essential for viral replication. Identification of additional bat ACV isolates and study of their biological characteristics will be the main task in future to understand the potential roles of bats in transmission of circoviruses to terrestrial mammals and humans. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. Genome-wide identification of potato long intergenic noncoding RNAs responsive to Pectobacterium carotovorum subspecies brasiliense infection.

    Science.gov (United States)

    Kwenda, Stanford; Birch, Paul R J; Moleleki, Lucy N

    2016-08-11

    Long noncoding RNAs (lncRNAs) represent a class of RNA molecules that are implicated in regulation of gene expression in both mammals and plants. While much progress has been made in determining the biological functions of lncRNAs in mammals, the functional roles of lncRNAs in plants are still poorly understood. Specifically, the roles of long intergenic noncoding RNAs (lincRNAs) in plant defence responses are yet to be fully explored. In this study, we used strand-specific RNA sequencing to identify 1113 lincRNAs in potato (Solanum tuberosum) from stem tissues. The lincRNAs are expressed from all 12 potato chromosomes and are generally smaller in size compared to protein-coding genes. As in other plants, most potato lincRNAs possess single exons. A time-course RNA-seq analysis between a tolerant and a susceptible potato cultivar showed that 559 lincRNAs are responsive to Pectobacterium carotovorum subsp. brasiliense challenge compared to mock-inoculated controls. Moreover, coexpression analysis revealed that 17 of these lincRNAs are highly associated with 12 potato defence-related genes. Together, these results suggest that lincRNAs have potential functional roles in potato defence responses. Furthermore, this work provides the first library of potato lincRNAs and a set of novel lincRNAs implicated in potato defences against P. carotovorum subsp. brasiliense, a member of the soft rot Enterobacteriaceae phytopathogens.

  6. Differential effect of UV irradiation on induction of intragenic and intergenic recombination during commitment to meiosis in Saccharomyces cerevisiae

    International Nuclear Information System (INIS)

    Machida, I.; Nakai, S.

    1980-01-01

    A comparison was made between the induction of intragenic and intergenic recombinations during meiosis in a wild-type diploid of Saccharomyces cerevisiae. Under non-irradiated normal conditions, production of both intragenic and intergenic recombinants greatly increased in the cells with commitment to meiosis. The susceptibility of cells to the induction of both the spontaneous intra- and intergenic recombinations in meiotic cells was similar. However, under conditions of UV irradiation, there were striking differences between intra- and intergenic recombinations. Susceptibility to induction of intragenic recombination by UV irradiation was not enhanced at meiosis compared with mitosis, and was not altered through commitment to meiotic processes. In contrast, however, susceptibility to the induction of intergenic recombination by UV irradiation was enhanced markedly during commitment to meiosis compared with mitosis. Genetic analysis suggested that the enhanced susceptibility to recombination during meiosis is specifically concerned with reciprocal-type recombination (crossing-over) but not non-reciprocal-type recombination (gene conversion). Hence it is concluded that the meiotic process appears to be intimately concerned with the mechanism(s) of induction of recombination, especially reciprocal-type recombination. (orig.)

  7. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil.

    Science.gov (United States)

    de Oliveira, Valéria Maia; Manfio, Gilson Paulo; da Costa Coutinho, Heitor Luiz; Keijzer-Wolters, Anneke Christina; van Elsas, Jan Dirk

    2006-03-01

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was followed by specific amplification of (1) sequences affiliated with Rhizobium leguminosarum "sensu lato" and (2) R. tropici. Using analysis of the amplified sequences in clone libraries obtained on the basis of soil DNA, this two-sided method was shown to be very specific for rhizobial subpopulations in soil. It was then further validated as a direct fingerprinting tool of the target rhizobia based on denaturing gradient gel electrophoresis (DGGE). The PCR-DGGE approach was applied to soils from fields in Brazil cultivated with common bean (Phaseolus vulgaris) under conventional or no-tillage (NT) practices. The community fingerprints obtained allowed the direct analysis of the respective rhizobial community structures in soil samples from the two contrasting agricultural practices. Data obtained with both primer sets revealed clustering of the community structures of the target rhizobial types along treatment. Moreover, the DGGE profiles obtained with the R. tropici primer set indicated that the abundance and diversity of these organisms were favoured under NT practices. These results suggest that the R. leguminosarum- as well as R. tropici-targeted IGS-based nested PCR and DGGE are useful tools for monitoring the effect of agricultural practices on these and related rhizobial subpopulations in soils.

  8. Long Intergenic Noncoding RNAs Mediate the Human Chondrocyte Inflammatory Response and Are Differentially Expressed in Osteoarthritis Cartilage.

    Science.gov (United States)

    Pearson, Mark J; Philp, Ashleigh M; Heward, James A; Roux, Benoit T; Walsh, David A; Davis, Edward T; Lindsay, Mark A; Jones, Simon W

    2016-04-01

    To identify long noncoding RNAs (lncRNAs), including long intergenic noncoding RNAs (lincRNAs), antisense RNAs, and pseudogenes, associated with the inflammatory response in human primary osteoarthritis (OA) chondrocytes and to explore their expression and function in OA. OA cartilage was obtained from patients with hip or knee OA following joint replacement surgery. Non-OA cartilage was obtained from postmortem donors and patients with fracture of the neck of the femur. Primary OA chondrocytes were isolated by collagenase digestion. LncRNA expression analysis was performed by RNA sequencing (RNAseq) and quantitative reverse transcriptase-polymerase chain reaction. Modulation of lncRNA chondrocyte expression was achieved using LNA longRNA GapmeRs (Exiqon). Cytokine production was measured with Luminex. RNAseq identified 983 lncRNAs in primary human hip OA chondrocytes, 183 of which had not previously been identified. Following interleukin-1β (IL-1β) stimulation, we identified 125 lincRNAs that were differentially expressed. The lincRNA p50-associated cyclooxygenase 2-extragenic RNA (PACER) and 2 novel chondrocyte inflammation-associated lincRNAs (CILinc01 and CILinc02) were differentially expressed in both knee and hip OA cartilage compared to non-OA cartilage. In primary OA chondrocytes, these lincRNAs were rapidly and transiently induced in response to multiple proinflammatory cytokines. Knockdown of CILinc01 and CILinc02 expression in human chondrocytes significantly enhanced the IL-1-stimulated secretion of proinflammatory cytokines. The inflammatory response in human OA chondrocytes is associated with widespread changes in the profile of lncRNAs, including PACER, CILinc01, and CILinc02. 
Differential expression of CILinc01 and CILinc02 in hip and knee OA cartilage, and their role in modulating cytokine production during the chondrocyte inflammatory response, suggest that they may play an important role in mediating inflammation-driven cartilage degeneration in

  9. Exploring DNA methylation changes in promoter, intragenic, and intergenic regions as early and late events in breast cancer formation

    International Nuclear Information System (INIS)

    Rauscher, Garth H.; Kresovich, Jacob K.; Poulin, Matthew; Yan, Liying; Macias, Virgilia; Mahmoud, Abeer M.; Al-Alem, Umaima; Kajdacsy-Balla, Andre; Wiley, Elizabeth L.; Tonetti, Debra; Ehrlich, Melanie

    2015-01-01

    Breast cancer formation is associated with frequent changes in DNA methylation but the extent of very early alterations in DNA methylation and the biological significance of cancer-associated epigenetic changes need further elucidation. Pyrosequencing was done on bisulfite-treated DNA from formalin-fixed, paraffin-embedded sections containing invasive tumor and paired samples of histologically normal tissue adjacent to the cancers as well as control reduction mammoplasty samples from unaffected women. The DNA regions studied were promoters (BRCA1, CD44, ESR1, GSTM2, GSTP1, MAGEA1, MSI1, NFE2L3, RASSF1A, RUNX3, SIX3 and TFF1), far-upstream regions (EN1, PAX3, PITX2, and SGK1), introns (APC, EGFR, LHX2, RFX1 and SOX9) and the LINE-1 and satellite 2 DNA repeats. These choices were based upon previous literature or publicly available DNA methylome profiles. The percent methylation was averaged across neighboring CpG sites. Most of the assayed gene regions displayed hypermethylation in cancer vs. adjacent tissue but the TFF1 and MAGEA1 regions were significantly hypomethylated (p ≤0.001). Importantly, six of the 16 regions examined in a large collection of patients (105 – 129) and in 15-18 reduction mammoplasty samples were already aberrantly methylated in adjacent, histologically normal tissue vs. non-cancerous mammoplasty samples (p ≤0.01). In addition, examination of transcriptome and DNA methylation databases indicated that methylation at three non-promoter regions (far-upstream EN1 and PITX2 and intronic LHX2) was associated with higher gene expression, unlike the inverse associations between cancer DNA hypermethylation and cancer-altered gene expression usually reported. These three non-promoter regions also exhibited normal tissue-specific hypermethylation positively associated with differentiation-related gene expression (in muscle progenitor cells vs. many other types of normal cells). The importance of considering the exact DNA region analyzed and the

  10. Comparing Generative and Inter-generative Subjectivity in Post-Revolutionary Academic Generations in Iran

    Directory of Open Access Journals (Sweden)

    Mehran Sohrabzadeh

    2010-01-01

    Full Text Available Comparative studies of post-revolutionary generations have been widely undertaken by social scientists. Some believe there is a gulf between generations, while others, acknowledging small differences among generations, emphasize that this variety is natural. Without committing to either view, the present study compares three post-revolutionary academic generations using the theory of “generative objects”, which explores generations’ views of their behaviors, beliefs, and historical monuments. Sampling was carried out among three generations: first, those who were students in the 60s and are now experienced university faculty; second, those recently employed as faculty members; and third, those who are currently university students. Results show that within each of the three generations there are essential in-generation similarities, while the inter-generative comparison reveals some differences.

  11. Optimisation of automated ribosomal intergenic spacer analysis for the estimation of microbial diversity in fynbos soil

    Directory of Open Access Journals (Sweden)

    Karin Jacobs

    2010-07-01

    Full Text Available Automated ribosomal intergenic spacer analysis (ARISA) has become a commonly used molecular technique for the study of microbial populations in environmental samples. The reproducibility and accuracy of ARISA, with and without the polymerase chain reaction (PCR), are important aspects that influence the results and effectiveness of these techniques. We used the primer set ITS4/ITS5 for ARISA to assess the fungal community composition of two sites situated in the Sand Fynbos. The primer set proved to deliver reproducible ARISA profiles of the fungal community composition with little variation observed between ARISA-PCRs. Variation that occurred in a sample due to repeated DNA extraction is expected for ecological studies. This reproducibility made ARISA a useful tool for the assessment and comparison of diversity in ecological samples. In this paper, we also offered particular suggestions concerning the binning strategy for the analysis of ARISA profiles.

  12. Identification and characterization of long intergenic noncoding RNAs in bovine mammary glands.

    Science.gov (United States)

    Tong, Chao; Chen, Qiaoling; Zhao, Lili; Ma, Junfei; Ibeagha-Awemu, Eveline M; Zhao, Xin

    2017-06-19

    Mammary glands of dairy cattle produce milk for the newborn offspring and for human consumption. Long intergenic noncoding RNAs (lincRNAs) play various functions in eukaryotic cells. However, types and roles of lincRNAs in bovine mammary glands are still poorly understood. Using computational methods, 886 unknown intergenic transcripts (UITs) were identified from five RNA-seq datasets from bovine mammary glands. Their non-coding potentials were predicted by using the combination of four software programs (CPAT, CNCI, CPC and hmmscan), with 184 lincRNAs identified. By comparison to the NONCODE2016 database and a domestic-animal long noncoding RNA database (ALDB), 112 novel lincRNAs were revealed in bovine mammary glands. Many lincRNAs were found to be located in quantitative trait loci (QTL). In particular, 36 lincRNAs were found in 172 milk related QTLs, whereas one lincRNA was within clinical mastitis QTL region. In addition, targeted genes for 10 lincRNAs with the highest fragments per kilobase of transcript per million fragments mapped (FPKM) were predicted by LncTar for forecasting potential biological functions of these lincRNAs. Further analyses indicate involvement of lincRNAs in several biological functions and different pathways. Our study has provided a panoramic view of lincRNAs in bovine mammary glands and suggested their involvement in many biological functions including susceptibility to clinical mastitis as well as milk quality and production. This integrative annotation of mammary gland lincRNAs broadens and deepens our understanding of bovine mammary gland biology.
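
    The ranking step above (selecting the 10 lincRNAs with the highest FPKM) relies on the standard FPKM normalization, which can be sketched as follows. The transcript identifiers, counts, and lengths are hypothetical and are not taken from the study.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

def top_by_fpkm(counts, lengths, total_mapped_fragments, n=10):
    """Return the n transcript IDs with the highest FPKM, highest first."""
    scores = {
        tid: fpkm(counts[tid], lengths[tid], total_mapped_fragments)
        for tid in counts
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical inputs: fragment counts, transcript lengths (bp), library size.
counts = {"linc_A": 500, "linc_B": 50, "linc_C": 1200}
lengths = {"linc_A": 2000, "linc_B": 400, "linc_C": 6000}
total = 25_000_000

print(fpkm(500, 2000, total))                     # 10.0
print(top_by_fpkm(counts, lengths, total, n=2))   # ['linc_A', 'linc_C']
```

    Dividing by transcript length keeps long transcripts from ranking highly merely because they collect more fragments: linc_C has the most fragments here but ranks below linc_A once length is accounted for.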

  13. Genome-wide identification, characterization and evolutionary analysis of long intergenic noncoding RNAs in cucumber.

    Directory of Open Access Journals (Sweden)

    Zhiqiang Hao

    Full Text Available Long intergenic noncoding RNAs (lincRNAs) are intergenic transcripts with a length of at least 200 nt that lack coding potential. Emerging evidence suggests that lincRNAs from animals participate in many fundamental biological processes. However, the systemic identification of lincRNAs has been undertaken in only a few plants. We chose to use cucumber (Cucumis sativus) as a model to analyze lincRNAs due to its importance as a model plant for studying sex differentiation and fruit development and the rich genomic and transcriptome data available. The application of a bioinformatics pipeline to multiple types of gene expression data resulted in the identification and characterization of 3,274 lincRNAs. Next, 10 lincRNAs targeted by 17 miRNAs were also explored. Based on co-expression analysis between lincRNAs and mRNAs, 94 lincRNAs were annotated, which may be involved in response to stimuli, multi-organism processes, reproduction, reproductive processes, and growth. Finally, examination of the evolution of lincRNAs showed that most lincRNAs are under purifying selection, while 16 lincRNAs are under natural selection. Our results provide a rich resource for further validation of cucumber lincRNAs and their function. The identification of lincRNAs targeted by miRNAs offers new clues for investigations into the role of lincRNAs in regulating gene expression. Finally, evaluation of the lincRNAs suggested that some lincRNAs are under positive and balancing selection.

  14. Paenibacillus larvae 16S-23S rDNA intergenic transcribed spacer (ITS) regions: DNA fingerprinting and characterization.

    Science.gov (United States)

    Dingman, Douglas W

    2012-07-01

    Paenibacillus larvae is the causative agent of American foulbrood in honey bee (Apis mellifera) larvae. PCR amplification of the 16S-23S ribosomal DNA (rDNA) intergenic transcribed spacer (ITS) regions, and agarose gel electrophoresis of the amplified DNA, was performed using genomic DNA collected from 134 P. larvae strains isolated in Connecticut, six Northern Regional Research Laboratory stock strains, four strains isolated in Argentina, and one strain isolated in Chile. Following electrophoresis of amplified DNA, all isolates exhibited a common migratory profile (i.e., ITS-PCR fingerprint pattern) of six DNA bands. This profile represented a unique ITS-PCR DNA fingerprint that was useful as a fast, simple, and accurate procedure for identification of P. larvae. Digestion of ITS-PCR amplified DNA, using mung bean nuclease prior to electrophoresis, characterized only three of the six electrophoresis bands as homoduplex DNA, indicating three true ITS regions. These three ITS regions, DNA migratory band sizes of 915, 1010, and 1474 bp, signify a minimum of three types of rrn operons within P. larvae. DNA sequence analysis of ITS region DNA, using P. larvae NRRL B-3553, identified the 3' terminal nucleotides of the 16S rRNA gene, 5' terminal nucleotides of the 23S rRNA gene, and the complete DNA sequences of the 5S rRNA, tRNA(ala), and tRNA(ile) genes. Gene organization within the three rrn operon types was 16S-23S, 16S-tRNA(ala)-23S, and 16S-5S-tRNA(ile)-tRNA(ala)-23S and these operons were named rrnA, rrnF, and rrnG, respectively. The 23S rRNA gene was shown by I-CeuI digestion and pulsed-field gel electrophoresis of genomic DNA to be present as seven copies. This was suggestive of seven rrn operon copies within the P. larvae genome. Investigation of the 16S-23S rDNA regions of this bacterium has aided the development of a diagnostic procedure and has helped genomic mapping investigations via characterization of the ITS regions. Copyright © 2012 Elsevier Inc

  15. Differentiation between Aspergillus flavus and Aspergillus parasiticus from Pure Culture and Aflatoxin-Contaminated Grapes Using PCR-RFLP Analysis of aflR-aflJ Intergenic Spacer

    International Nuclear Information System (INIS)

    El Khoury, A.; Atoui, A.; Lebrihi, A.; Rizk, T.; Lteif, R.; Kallassy, M.

    2011-01-01

    Aflatoxins (AFs) represent the most important single mycotoxin-related food safety problem in developed and developing countries as they have adverse effects on human and animal health. They are produced mainly by Aspergillus flavus and A. parasiticus. The two species have different aflatoxinogenic profiles. In order to distinguish between A. flavus and A. parasiticus, gene-specific primers were designed to target the intergenic spacer (IGS) for the AF biosynthesis genes, aflJ and aflR. Polymerase chain reaction (PCR) products were subjected to restriction endonuclease analysis using BglII to look for restriction fragment length polymorphisms (RFLPs). Our results showed that the two species displayed different PCR-based RFLP (PCR-RFLP) profiles. PCR products from A. flavus were cleaved into 3 fragments of 362, 210, and 102 bp. However, there is only one restriction site for this enzyme in the sequence of A. parasiticus, which produced only 2 fragments of 363 and 311 bp. The method was successfully applied to contaminated grape samples. This approach to differentiating these 2 species would be simpler, less costly, and quicker than conventional sequencing of PCR products and/or morphological identification. (author)
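
    The PCR-RFLP logic above can be illustrated with a small in-silico digest. BglII recognizes AGATCT and cuts after the first A (A^GATCT). The two amplicons below are synthetic placeholders built only to reproduce the reported fragment sizes; they are not the real aflR-aflJ spacer sequences, and the function name is hypothetical.

```python
BGLII_SITE = "AGATCT"   # BglII recognition sequence
BGLII_CUT_OFFSET = 1    # BglII cuts after the first base: A^GATCT

def digest(sequence, site=BGLII_SITE, cut_offset=BGLII_CUT_OFFSET):
    """Return fragment lengths from cutting a linear sequence at every site."""
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Synthetic 674 bp amplicons (poly-C filler, NOT real sequence data).
flavus_like = "C" * 361 + "AGATCT" + "C" * 204 + "AGATCT" + "C" * 97
parasiticus_like = "C" * 362 + "AGATCT" + "C" * 306

print(digest(flavus_like))       # [362, 210, 102]
print(digest(parasiticus_like))  # [363, 311]
```

    Both reported fragment sets sum to 674 bp, so the two species yield amplicons of the same length and are told apart only by the number and position of BglII sites.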

  16. Long intergenic non-coding RNA TUG1 is overexpressed in urothelial carcinoma of the bladder.

    Science.gov (United States)

    Han, Yonghua; Liu, Yuchen; Gui, Yaoting; Cai, Zhiming

    2013-04-01

    Long intergenic non-coding RNAs (lincRNAs) are a class of non-coding RNAs that regulate gene expression via chromatin reprogramming. Taurine Up-regulated Gene 1 (TUG1) is a lincRNA that is associated with chromatin-modifying complexes and plays roles in gene regulation. In this study, we determined the expression patterns of TUG1 and the cell proliferation inhibition and apoptosis induced by silencing TUG1 in urothelial carcinoma of the bladder. The expression levels of TUG1 were determined using Real-Time qPCR in a total of 44 patients with bladder urothelial carcinomas. Bladder urothelial carcinoma T24 and 5637 cells were transfected with TUG1 siRNA or negative control siRNA. Cell proliferation was evaluated using MTT assay. Apoptosis was determined using ELISA assay. TUG1 was up-regulated in bladder urothelial carcinoma compared to paired normal urothelium. High TUG1 expression levels were associated with high grade and stage carcinomas. Cell proliferation inhibition and apoptosis induction were observed in TUG1 siRNA-transfected bladder urothelial carcinoma T24 and 5637 cells. Our data suggest that lincRNA TUG1 is emerging as a novel player in the disease state of bladder urothelial carcinoma. TUG1 may have potential roles as a biomarker and/or a therapeutic target in bladder urothelial carcinoma. Copyright © 2012 Wiley Periodicals, Inc.

  17. The evolutionary landscape of intergenic trans-splicing events in insects

    Science.gov (United States)

    Kong, Yimeng; Zhou, Hongxia; Yu, Yao; Chen, Longxian; Hao, Pei; Li, Xuan

    2015-01-01

    To explore the landscape of intergenic trans-splicing events and characterize their functions and evolutionary dynamics, we conduct a mega-data study of a phylogeny containing eight species across five orders of class Insecta, a model system spanning 400 million years of evolution. A total of 1,627 trans-splicing events involving 2,199 genes are identified, accounting for 1.58% of the total genes. Homology analysis reveals that mod(mdg4)-like trans-splicing is the only conserved event that is consistently observed in multiple species across two orders, which represents a unique case of functional diversification involving trans-splicing. Thus, evolutionarily its potential for generating proteins with novel function is not broadly utilized by insects. Furthermore, 146 non-mod trans-spliced transcripts are found to resemble canonical genes from different species. Trans-splicing preserving the function of ‘breakup' genes may serve as a general mechanism for relaxing the constraints on gene structure, with profound implications for the evolution of genes and genomes. PMID:26521696

  18. A novel intergenic ETnII-β insertion mutation causes multiple malformations in polypodia mice.

    Directory of Open Access Journals (Sweden)

    Jessica A Lehoczky

    Full Text Available Mouse early transposon insertions are responsible for ~10% of spontaneous mutant phenotypes. We previously reported the phenotypes and genetic mapping of Polypodia (Ppd), a spontaneous, X-linked dominant mutation with profound effects on body plan morphogenesis. Our new data show that mutant mice are not born in expected Mendelian ratios secondary to loss after E9.5. In addition, we refined the Ppd genetic interval and discovered a novel ETnII-β early transposon insertion between the genes for Dusp9 and Pnck. The ETn inserted 1.6 kb downstream and antisense to Dusp9 and does not disrupt polyadenylation or splicing of either gene. Knock-in mice engineered to carry the ETn display Ppd characteristic ectopic caudal limb phenotypes, showing that the ETn insertion is the Ppd molecular lesion. Early transposons are actively expressed in the early blastocyst. To explore the consequences of the ETn on the genomic landscape at an early stage of development, we compared interval gene expression between wild-type and mutant ES cells. Mutant ES cell expression analysis revealed marked upregulation of Dusp9 mRNA and protein expression. Evaluation of the 5' LTR CpG methylation state in adult mice revealed no correlation with the occurrence or severity of Ppd phenotypes at birth. Thus, the broad range of phenotypes observed in this mutant is secondary to a novel intergenic ETn insertion whose effects include dysregulation of nearby interval gene expression at early stages of development.

  19. Annotating long intergenic non-coding RNAs under artificial selection during chicken domestication.

    Science.gov (United States)

    Wang, Yun-Mei; Xu, Hai-Bo; Wang, Ming-Shan; Otecko, Newton Otieno; Ye, Ling-Qun; Wu, Dong-Dong; Zhang, Ya-Ping

    2017-08-15

    Numerous biological functions of long intergenic non-coding RNAs (lincRNAs) have been identified. However, the contribution of lincRNAs to the domestication process has remained elusive. Following domestication from their wild ancestors, animals display substantial changes in many phenotypic traits. Therefore, it is possible that diverse molecular drivers play important roles in this process. We analyzed 821 transcriptomes in this study and annotated 4754 lincRNA genes in the chicken genome. Our population genomic analysis indicates that 419 lincRNAs potentially evolved during artificial selection related to the domestication of chicken, while a comparative transcriptomic analysis identified 68 lincRNAs that were differentially expressed under different conditions. We also found 47 lincRNAs linked to special phenotypes. Our study provides a comprehensive view of the genome-wide landscape of lincRNAs in chicken. This will promote a better understanding of the roles of lincRNAs in domestication, and the genetic mechanisms associated with the artificial selection of domestic animals.

  20. The Intergenic Recombinant HLA-B*46:01 Has a Distinctive Peptidome that Includes KIR2DL3 Ligands

    DEFF Research Database (Denmark)

    Hilton, Hugo G.; McMurtrey, Curtis P.; Han, Alex S.

    2017-01-01

    HLA-B*46:01 was formed by an intergenic mini-conversion, between HLA-B*15:01 and HLA-C*01:02, in Southeast Asia during the last 50,000 years, and it has since become the most common HLA-B allele in the region. A functional effect of the mini-conversion was introduction of the C1 epitope into HLA-...

  1. Copy Number Variations in Candidate Genes and Intergenic Regions Affect Body Mass Index and Abdominal Obesity in Mexican Children

    Science.gov (United States)

    Burguete-García, Ana Isabel; Bonnefond, Amélie; Peralta-Romero, Jesús; Froguel, Philippe

    2017-01-01

    Introduction. Increase in body weight is a gradual process that usually begins in childhood and in adolescence as a result of multiple interactions among environmental and genetic factors. This study aimed to analyze the relationship between copy number variants (CNVs) in five genes and four intergenic regions with obesity in Mexican children. Methods. We studied 1423 children aged 6–12 years. Anthropometric measurements and blood levels of biochemical parameters were obtained. Identification of CNVs was performed by real-time PCR. The effect of CNVs on obesity or body composition was assessed using regression models adjusted for age, gender, and family history of obesity. Results. Gains in copy numbers of LEPR and NEGR1 were associated with decreased body mass index (BMI), waist circumference (WC), and risk of abdominal obesity, whereas gain in ARHGEF4 and CPXCR1 and the intergenic regions 12q15c, 15q21.1a, and 22q11.21d and losses in INS were associated with increased BMI and WC. Conclusion. Our results indicate a possible contribution of CNVs in LEPR, NEGR1, ARHGEF4, and CPXCR1 and the intergenic regions 12q15c, 15q21.1a, and 22q11.21d to the development of obesity, particularly abdominal obesity in Mexican children. PMID:28428959

  2. Population genetic structure and phylogeographical pattern of a relict tree fern, Alsophila spinulosa (Cyatheaceae), inferred from cpDNA atpB-rbcL intergenic spacers.

    Science.gov (United States)

    Su, Yingjuan; Wang, Ting; Zheng, Bo; Jiang, Yu; Chen, Guopei; Gu, Hongya

    2004-11-01

    Sequences of chloroplast DNA (cpDNA) atpB-rbcL intergenic spacers of individuals of a tree fern species, Alsophila spinulosa, collected from ten relict populations distributed in the Hainan and Guangdong provinces, and the Guangxi Zhuang region in southern China, were determined. Sequence length varied from 724 bp to 731 bp, showing length polymorphism, and base composition showed a high A+T content, between 63.17% and 63.95%. Sequences were neutral in terms of evolution (Tajima's criterion D=-1.01899, P>0.10 and Fu and Li's test D*=-1.39008, P>0.10; F*=-1.49775, P>0.10). A total of 19 haplotypes were identified based on nucleotide variation. High levels of haplotype diversity (h=0.744) and nucleotide diversity (Dij=0.01130) were detected in A. spinulosa, probably associated with its long evolutionary history, which has allowed the accumulation of genetic variation within lineages. Both the minimum spanning network and neighbor-joining trees generated for haplotypes demonstrated that current populations of A. spinulosa existing in Hainan, Guangdong, and Guangxi were subdivided into two geographical groups. An analysis of molecular variance indicated that most of the genetic variation (93.49%, P<0.001) was partitioned among regions. Wright's isolation by distance model was not supported across extant populations. Reduced gene flow by the Qiongzhou Strait and inbreeding may result in the geographical subdivision between the Hainan and Guangdong + Guangxi populations (FST=0.95, Nm=0.03). Within each region, the star-like pattern of phylogeography of haplotypes implied a population expansion process during evolutionary history. Gene genealogies together with coalescent theory provided significant information for uncovering phylogeography of A. spinulosa.
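
    The abstract reports FST=0.95 alongside Nm=0.03 but does not say how Nm was derived. A minimal sketch, assuming the standard island-model approximation for a haploid, uniparentally inherited marker such as cpDNA, Nm = (1/FST - 1)/2; the function name and the `haploid` switch are illustrative, not taken from the paper. Under that assumption the reported pair of values is reproduced after rounding.

```python
def nm_from_fst(fst: float, haploid: bool = True) -> float:
    """Island-model estimate of effective migrants per generation from F_ST.

    Assumption: Nm = (1/F_ST - 1) / 2 for haploid organelle markers and
    Nm = (1/F_ST - 1) / 4 for diploid nuclear markers.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    divisor = 2.0 if haploid else 4.0
    return (1.0 / fst - 1.0) / divisor


# F_ST value quoted in the abstract above
nm = nm_from_fst(0.95)
print(round(nm, 2))  # 0.03
```

    The diploid variant (`haploid=False`) would give roughly half that value, so which formula applies is itself an assumption worth checking against the full paper.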

  3. Bat white-nose syndrome: a real-time TaqMan polymerase chain reaction test targeting the intergenic spacer region of Geomyces destructans.

    Science.gov (United States)

    Muller, Laura K.; Lorch, Jeffrey M.; Lindner, Daniel L.; O'Connor, Michael; Gargas, Andrea; Blehert, David S.

    2013-01-01

    The fungus Geomyces destructans is the causative agent of white-nose syndrome (WNS), a disease that has killed millions of North American hibernating bats. We describe a real-time TaqMan PCR test that detects DNA from G. destructans by targeting a portion of the multicopy intergenic spacer region of the rRNA gene complex. The test is highly sensitive, consistently detecting as little as 3.3 fg of genomic DNA from G. destructans. The real-time PCR test specifically amplified genomic DNA from G. destructans but did not amplify target sequence from 54 closely related fungal isolates (including 43 Geomyces spp. isolates) associated with bats. The test was further qualified by analyzing DNA extracted from 91 bat wing skin samples, and PCR results matched histopathology findings. These data indicate the real-time TaqMan PCR method described herein is a sensitive, specific, and rapid test to detect DNA from G. destructans and provides a valuable tool for WNS diagnostics and research.

  4. The music of trees: the intergenerative tie between primary care and public health.

    Science.gov (United States)

    Whitehouse, Peter

    2016-01-01

    Stories help us frame and understand complex ideas and challenges. Metaphors are particularly powerful linguistic devices that guide and extend our thinking by bridging conceptual domains, for example to consider the brain as a digital computer. Trees are widely used as metaphors for broad concepts like evolution, history, society, and even life itself, i.e. 'the tree of life'. Tree-like diagrams of roots and branches are used to demonstrate historical and cultural relationships, for example, between different species or different languages. In this paper, we describe a theatrical character called a tree doctor which is a living metaphor. A human being, namely the author, lectures, acts or dances as a tree and offers lessons to Homo sapiens about 'holistic' ideas of health. The character teaches us not only to see the value of our relationships to trees, but also the importance of seeing forests as well as the individual trees. The metaphorical statement that we should not 'miss the forest for the trees' means we should learn to think of health embedded in systems and communities. In medicine, we too often focus on individual molecules, pharmaceuticals, or even patients and miss the bigger picture of public and environmental health. In a time of great ecological system change, the tree doctor points to broad ethical responsibility for each other and future generations of humans and other living creatures. The character embraces arts and particularly music as a powerful way of infusing purpose and improving the qualities of our lives together, especially as we age. The tree doctor knows the value of intergenerational relationships. But it also points to intergenerative innovations across many cultural domains, disciplines and professions. The tree doctor supports primary care and empowers the value of intergenerational relationships, art and music in the recommendations doctors make to patients to improve their health and well-being.

  5. Evolution of the rpoB-psbZ region in fern plastid genomes: notable structural rearrangements and highly variable intergenic spacers.

    Science.gov (United States)

    Gao, Lei; Zhou, Yuan; Wang, Zhi-Wei; Su, Ying-Juan; Wang, Ting

    2011-04-13

    The rpoB-psbZ (BZ) region of some fern plastid genomes (plastomes) has been noted to go through considerable genomic changes. Unraveling its evolutionary dynamics across all fern lineages will help clarify the fundamental process shaping fern plastome structure and organization. A total of 24 fern BZ sequences were investigated with taxon sampling covering all the extant fern orders. We found that: (i) a tree fern Plagiogyria japonica contained a novel gene order that can be generated from either the ancestral Angiopteris type or the derived Adiantum type via a single inversion; (ii) the trnY-trnE intergenic spacer (IGS) of the filmy fern Vandenboschia radicans was expanded 3-fold due to the tandem 27-bp repeats which showed strong sequence similarity with the anticodon domain of trnY; (iii) the trnY-trnE IGSs of two horsetail ferns Equisetum ramosissimum and E. arvense underwent an unprecedented 5-kb long expansion, more than a quarter of which consisted of a single type of direct repeats also relevant to the trnY anticodon domain; and (iv) ycf66 has been independently lost at least four times in ferns. Our results provided fresh insights into the evolutionary process of fern BZ regions. The intermediate BZ gene order was not detected, supporting that the Adiantum type was generated by two inversions occurring in pairs. The occurrence of Vandenboschia 27-bp repeats represents the first evidence of partial tRNA gene duplication in fern plastomes. Repeats potentially forming a stem-loop structure play major roles in the expansion of the trnY-trnE IGS.

  6. Cytomolecular analysis of ribosomal DNA evolution in a natural allotetraploid Brachypodium hybridum and its putative ancestors – dissecting complex repetitive structure of intergenic spacers

    Directory of Open Access Journals (Sweden)

    Natalia Borowska-Zuchowska

    2016-10-01

    Full Text Available Nucleolar dominance is an epigenetic phenomenon associated with nuclear 35S rRNA genes and consists in selective suppression of gene loci inherited from one of the progenitors in the allopolyploid. Our understanding of the exact mechanisms that determine this process is still fragmentary, especially in the case of grass species. This study aimed to shed some light on the molecular basis of this genome-specific inactivation of 35S rDNA loci in an allotetraploid Brachypodium hybridum (2n=30), which arose from the interspecific hybridization between two diploid ancestors that were very similar to modern B. distachyon (2n=10) and B. stacei (2n=20). Using fluorescence in situ hybridization with 25S rDNA and chromosome-specific BAC clones as probes we revealed that the nucleolar dominance is present not only in meristematic root-tip cells but also in the differentiated cell fraction of B. hybridum. Additionally, the intergenic spacers (IGSs) from both of the putative ancestors and the allotetraploid were sequenced and analyzed. The presumptive transcription initiation sites, spacer promoters and repeated elements were identified within the IGSs. Two different length variants, 2.3 kb and 3.5 kb, of IGSs were identified in B. distachyon and B. stacei, respectively; however, only the IGS that had originated from the B. distachyon-like ancestor was present in the allotetraploid. The amplification pattern of B. hybridum IGSs suggests that some genetic changes occurred in inactive B. stacei-like rDNA loci during the evolution of the allotetraploid. We hypothesize that their preferential silencing is an effect of structural changes in the sequence rather than just the result of the sole inactivation at the epigenetic level.

  7. Evolution of the rpoB-psbZ region in fern plastid genomes: notable structural rearrangements and highly variable intergenic spacers

    Directory of Open Access Journals (Sweden)

    Su Ying-Juan

    2011-04-01

    Full Text Available Abstract Background The rpoB-psbZ (BZ) region of some fern plastid genomes (plastomes) has been noted to go through considerable genomic changes. Unraveling its evolutionary dynamics across all fern lineages will help clarify the fundamental process shaping fern plastome structure and organization. Results A total of 24 fern BZ sequences were investigated with taxon sampling covering all the extant fern orders. We found that: (i) a tree fern Plagiogyria japonica contained a novel gene order that can be generated from either the ancestral Angiopteris type or the derived Adiantum type via a single inversion; (ii) the trnY-trnE intergenic spacer (IGS) of the filmy fern Vandenboschia radicans was expanded 3-fold due to the tandem 27-bp repeats which showed strong sequence similarity with the anticodon domain of trnY; (iii) the trnY-trnE IGSs of two horsetail ferns Equisetum ramosissimum and E. arvense underwent an unprecedented 5-kb long expansion, more than a quarter of which consisted of a single type of direct repeats also relevant to the trnY anticodon domain; and (iv) ycf66 has been independently lost at least four times in ferns. Conclusions Our results provided fresh insights into the evolutionary process of fern BZ regions. The intermediate BZ gene order was not detected, supporting that the Adiantum type was generated by two inversions occurring in pairs. The occurrence of Vandenboschia 27-bp repeats represents the first evidence of partial tRNA gene duplication in fern plastomes. Repeats potentially forming a stem-loop structure play major roles in the expansion of the trnY-trnE IGS.

  8. Functional Characterization of MC1R-TUBB3 Intergenic Splice Variants of the Human Melanocortin 1 Receptor.

    Directory of Open Access Journals (Sweden)

    Cecilia Herraiz

    Full Text Available The melanocortin 1 receptor gene (MC1R) expressed in melanocytes is a major determinant of skin pigmentation. It encodes a Gs protein-coupled receptor activated by α-melanocyte stimulating hormone (αMSH). Human MC1R has an inefficient poly(A) site allowing intergenic splicing with its downstream neighbour Tubulin-β-III (TUBB3). Intergenic splicing produces two MC1R isoforms, designated Iso1 and Iso2, bearing the complete seven transmembrane helices from MC1R fused to TUBB3-derived C-terminal extensions, in-frame for Iso1 and out-of-frame for Iso2. It has been reported that exposure to ultraviolet radiation (UVR) might promote an isoform switch from canonical MC1R (MC1R-001) to the MC1R-TUBB3 chimeras, which might lead to novel phenotypes required for tanning. We expressed the Flag epitope-tagged intergenic isoforms in heterologous HEK293T cells and human melanoma cells, for functional characterization. Iso1 was expressed with the expected size. Iso2 yielded a doublet of Mr significantly lower than predicted, and impaired intracellular stability. Although Iso1 and Iso2 bound radiolabelled agonist with the same affinity as MC1R-001, their plasma membrane expression was strongly reduced. Decreased surface expression mostly resulted from aberrant forward trafficking, rather than high rates of endocytosis. Functional coupling of both isoforms to cAMP was lower than wild-type, but ERK activation upon binding of αMSH was unimpaired, suggesting imbalanced signaling from the splice variants. Heterodimerization of differentially labelled MC1R-001 with the splicing isoforms analyzed by co-immunoprecipitation was efficient and caused decreased surface expression of binding sites. Thus, UVR-induced MC1R isoforms might contribute to fine-tune the tanning response by modulating MC1R-001 availability and functional parameters.

  9. Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.).

    Science.gov (United States)

    Lv, Yuanda; Liang, Zhikai; Ge, Min; Qi, Weicong; Zhang, Tifu; Lin, Feng; Peng, Zhaohua; Zhao, Han

    2016-05-11

    Nitrogen (N) is an essential and often limiting nutrient to plant growth and development. Previous studies have shown that the mRNA expressions of numerous genes are regulated by nitrogen supplies; however, little is known about the expressed non-coding elements, for example long non-coding RNAs (lncRNAs) that control the response of maize (Zea mays L.) to nitrogen. LncRNAs are a class of non-coding RNAs larger than 200 bp, which have emerged as key regulators in gene expression. In this study, we surveyed the intergenic/intronic lncRNAs in maize B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency using ribosomal RNA depletion and ultra-deep total RNA sequencing approaches. By integration with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen-responsive lncRNAs were identified that exhibited unique expression patterns. Co-expression network analysis showed that the nitrogen-responsive lncRNAs were enriched mainly in one of the three co-expressed modules. The genes in the enriched module are mainly involved in NADH dehydrogenase activity, oxidative phosphorylation and the nitrogen compounds metabolic process. We identified a large number of lncRNAs in maize and illustrated their potential regulatory roles in response to N stress. The results lay the foundation for further in-depth understanding of the molecular mechanisms of lncRNAs' role in response to nitrogen stresses.

  10. Methylation pattern of the intergenic spacer of rRNA genes in excised cotyledons of Cucurbita pepo L. (Zucchini) after hormone treatment

    International Nuclear Information System (INIS)

    Ananiev, E.; Abdulova, G.; Grozdanov, P.; Karagyozov, L.

    2003-01-01

    High molecular mass genomic DNA was isolated from excised marrow cotyledons (Cucurbita pepo L. zucchini) treated with 6-benzyladenine (BA) or methyl ester of jasmonic acid (MeJA) for 24 h in darkness. DNA purified from contaminating polysaccharides with a Celite column was completely digested with the restriction enzyme Eco RI, and the changes in the methylation pattern of the intergenic spacer (IGS) of rRNA genes were studied after subsequent digestion with the pair of isoschizomer restriction enzymes Msp I and Hpa II by the method of 'indirect end labelling'. As the rDNA unit probe, a cloned 32P-labelled Eco RI 2.1 kb fragment spanning most of the 18S rRNA gene from flax rDNA was used. Results showed heavy methylation of the rRNA genes. As judged from the almost total lack of digestion with Hpa II, few if any methylation-free regions were present in the repeated rDNA units. A hypomethylated Hpa II site was detected near the promoter region in some of the repeats. Digestion with Msp I affected nearly 50% of the repeating units. The Msp I digestion fragments of the 6.2 kb Eco RI fragment of rDNA were few in number and large in size (0.5 - 2.5 kb). This suggested that, in addition to -CpG- sequences, methylation in -CpNpG- might not be random. Methylation pattern in IGS was not changed upon treatment of the cotyledons in vivo with BA and MeJA. Thus, previously observed hormone-mediated effects on the activity of rRNA gene expression were not accompanied by any significant changes of the methylation pattern in IGS. (authors)

  11. Trypanosoma cruzi I genotypes in different geographic regions and transmission cycles based on a microsatellite motif of the intergenic spacer of spliced leader genes

    Science.gov (United States)

    Cura, Carolina I.; Mejía-Jaramillo, Ana M.; Duffy, Tomás; Burgos, Juan M.; Rodriguero, Marcela; Cardinal, Marta V.; Kjos, Sonia; Gurgel-Gonçalves, Rodrigo; Blanchet, Denis; De Pablos, Luis M.; Tomasini, Nicolás; Silva, Alex Da; Russomando, Graciela; Cuba Cuba, Cesar A.; Aznar, Christine; Abate, Teresa; Levin, Mariano J.; Osuna, Antonio; Gürtler, Ricardo E.; Diosque, Patricio; Solari, Aldo; Triana-Chávez, Omar; Schijman, Alejandro G.

    2011-01-01

    The intergenic region of spliced-leader (SL-IR) genes from 105 Trypanosoma cruzi I (Tc I) infected biological samples, culture isolates and stocks from 11 endemic countries, from Argentina to the USA were characterised, allowing identification of 76 genotypes with 54 polymorphic sites from 123 aligned sequences. On the basis of the microsatellite motif proposed by Herrera et al. (2007) to define four haplotypes in Colombia, we could classify these genotypes into four distinct Tc I SL-IR groups, three corresponding to the former haplotypes Ia (11 genotypes), Ib (11 genotypes) and Id (35 genotypes); and one novel group, Ie (19 genotypes). Genotypes harboring the Tc Ic motif were not detected in our study. Tc Ia was associated with domestic cycles in southern and northern South America and sylvatic cycles in Central and North America. Tc Ib was found in all transmission cycles from Colombia. Tc Id was identified in all transmission cycles from Argentina and Colombia, including Chagas cardiomyopathy patients, sylvatic Brazilian samples and human cases from French Guiana, Panama and Venezuela. Tc Ie gathered five samples from domestic Triatoma infestans from northern Argentina, nine samples from wild Mepraia spinolai and Mepraia gajardoi and two chagasic patients from Chile and one from a Bolivian patient with chagasic reactivation. Mixed infections by Tc Ia + Tc Id, Tc Ia + Tc Ie and Tc Id + Tc Ie were detected in vector faeces and isolates from human and vector samples. In addition, Tc Ia and Tc Id were identified in different tissues from a heart transplanted Chagas cardiomyopathy patient with reactivation, denoting histotropism. Trypanosoma cruzi I SL-IR genotypes from parasites infecting Triatoma gerstaeckeri and Didelphis virginiana from USA, T. infestans from Paraguay, Rhodnius nasutus and Rhodnius neglectus from Brazil and M. spinolai and M. gajardoi from Chile are to our knowledge described for the first time. PMID:20670628

  12. A two-locus DNA sequence database for typing plant and human pathogens within the Fusarium oxysporum species complex

    DEFF Research Database (Denmark)

    O'Donnell, Kerry; Gueidan, C; Sink, S

    2009-01-01

    We constructed a two-locus database, comprising partial translation elongation factor (EF-1alpha) gene sequences and nearly full-length sequences of the nuclear ribosomal intergenic spacer region (IGS rDNA) for 850 isolates spanning the phylogenetic breadth of the Fusarium oxysporum species compl...... of the IGS rDNA sequences may be non-orthologous. We also evaluated enniatin, fumonisin and moniliformin mycotoxin production in vitro within a phylogenetic framework....

  13. RANDNA: a random DNA sequence generator.

    Science.gov (United States)

    Piva, Francesco; Principato, Giovanni

    2006-01-01

    Monte Carlo simulations are useful to verify the significance of data. Genomic regularities, such as the nucleotide correlations or the not uniform distribution of the motifs throughout genomic or mature mRNA sequences, exist and their significance can be checked by means of the Monte Carlo test. The test needs good quality random sequences in order to work, moreover they should have the same nucleotide distribution as the sequences in which the regularities have been found. Random DNA sequences are also useful to estimate the background score of an alignment, that is a threshold below which the resulting score is merely due to chance. We have developed RANDNA, a free software which allows to produce random DNA or RNA sequences setting both their length and the percentage of nucleotide composition. Sequences having the same nucleotide distribution of exonic, intronic or intergenic sequences can be generated. Its graphic interface makes it possible to easily set the parameters that characterize the sequences being produced and saved in a text format file. The pseudo-random number generator function of Borland Delphi 6 is used, since it guarantees a good randomness, a long cycle length and a high speed. We have checked the quality of sequences generated by the software, by means of well-known tests, both by themselves and versus genuine random sequences. We show the good quality of the generated sequences. The software, complete with examples and documentation, is freely available to users from: http://www.introni.it/en/software.
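
    The core idea in the RANDNA abstract above, producing a random sequence of a chosen length and nucleotide composition, can be sketched in a few lines. This is not RANDNA's code: the abstract says the tool uses the Borland Delphi 6 generator, whereas this sketch assumes Python's standard `random` module, and the function name, parameters, and composition values are hypothetical.

```python
import random


def random_sequence(length, composition, seed=None):
    """Draw a random sequence with the given per-letter probabilities.

    `composition` maps each letter (e.g. A, C, G, T) to a non-negative
    weight; weights are relative and need not sum to 1.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    letters = list(composition)
    weights = [composition[letter] for letter in letters]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    rng = random.Random(seed)  # a private generator keeps runs reproducible
    return "".join(rng.choices(letters, weights=weights, k=length))


# Illustrative AT-rich composition (made-up values, not from the paper)
at_rich = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
print(random_sequence(60, at_rich, seed=1))
```

    Each position is drawn independently, so the realized composition only approximates the target; matching it exactly would require building the sequence with fixed letter counts and shuffling it instead.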

  14. Phylogenetic relationships in Solanaceae and related species based on cpDNA sequence from plastid trnE-trnT region

    Directory of Open Access Journals (Sweden)

    Danila Montewka Melotto-Passarin

    2008-01-01

    Full Text Available Intergenic spacers of chloroplast DNA (cpDNA) are very useful in phylogenetic and population genetic studies of plant species, to study their potential integration in phylogenetic analysis. The non-coding trnE-trnT intergenic spacer of cpDNA was analyzed to assess the nucleotide sequence polymorphism of 16 Solanaceae species and to estimate its ability to contribute to the resolution of phylogenetic studies of this group. Multiple alignments of DNA sequences of trnE-trnT intergenic spacer made the identification of nucleotide variability in this region possible and the phylogeny was estimated by maximum parsimony and rooted with Convolvulaceae Ipomoea batatas, the most closely related family. Besides, this intergenic spacer was tested for the phylogenetic ability to differentiate taxonomic levels. For this purpose, species from four other families were analyzed and compared with Solanaceae species. Results confirmed polymorphism in the trnE-trnT region at different taxonomic levels.

  15. The Intergenic Recombinant HLA-B*46:01 Has a Distinctive Peptidome that Includes KIR2DL3 Ligands

    Directory of Open Access Journals (Sweden)

    Hugo G. Hilton

    2017-05-01

    Full Text Available HLA-B*46:01 was formed by an intergenic mini-conversion, between HLA-B*15:01 and HLA-C*01:02, in Southeast Asia during the last 50,000 years, and it has since become the most common HLA-B allele in the region. A functional effect of the mini-conversion was introduction of the C1 epitope into HLA-B*46:01, making it an exceptional HLA-B allotype that is recognized by the C1-specific natural killer (NK) cell receptor KIR2DL3. High-resolution mass spectrometry showed that HLA-B*46:01 has a low-diversity peptidome that is distinct from those of its parents. A minority (21%) of HLA-B*46:01 peptides, with common C-terminal characteristics, form ligands for KIR2DL3. The HLA-B*46:01 peptidome is predicted to be enriched for peptide antigens derived from Mycobacterium leprae. Overall, the results indicate that the distinctive peptidome and functions of HLA-B*46:01 provide carriers with resistance to leprosy, which drove its rapid rise in frequency in Southeast Asia.

  16. Genome-wide CpG island methylation and intergenic demethylation propensities vary among different tumor sites.

    Science.gov (United States)

    Lee, Seung-Tae; Wiemels, Joseph L

    2016-02-18

    The epigenetic landscape of cancer includes both focal hypermethylation and broader hypomethylation in a genome-wide manner. By means of a comprehensive genomic analysis on 6637 tissues of 21 tumor types, we here show that the degrees of overall methylation in CpG island (CGI) and demethylation in intergenic regions, defined as 'backbone', largely vary among different tumors. Depending on tumor type, both CGI methylation and backbone demethylation are often associated with clinical, epidemiological and biological features such as age, sex, smoking history, anatomic location, histological type and grade, stage, molecular subtype and biological pathways. We found connections between CGI methylation and hypermutability, microsatellite instability, IDH1 mutation, 19p gain and polycomb features, and backbone demethylation with chromosomal instability, NSD1 and TP53 mutations, 5q and 19p loss and long repressive domains. These broad epigenetic patterns add a new dimension to our understanding of tumor biology and its clinical implications. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Genetic Diversity in Salmonella Isolates from Ducks and their Environments in Penang, Malaysia using Enterobacterial Repetitive Intergenic Consensus

    Directory of Open Access Journals (Sweden)

    Frederick Adzitey, Gulam Rusul Rahmat Ali, Nurul Huda and Rosma Ahmad

    2013-07-01

    Full Text Available A total of 107 Salmonella isolates (37 S. typhimurium, 26 S. hadar, 15 S. enteritidis, 15 S. braenderup, and 14 S. albany) isolated from ducks and their environments in Penang, Malaysia were typed using enterobacterial repetitive intergenic consensus (ERIC) to determine their genetic diversity. Analysis of the Salmonella strains by ERIC produced DNA fingerprints of different sizes for differentiation purposes. The DNA fingerprints or band sizes ranged from 14-8300 bp for S. typhimurium, 146-6593 bp for S. hadar, 15-4929 bp for S. enteritidis, 14-5142 bp for S. braenderup and 7-5712 bp for S. albany. Cluster analysis at a coefficient of 0.85 grouped the Salmonella strains into various clusters and singletons. S. typhimurium were grouped into 10 clusters and 6 singletons, S. hadar were grouped into 3 clusters and 18 singletons, S. enteritidis were grouped into 3 clusters and 7 singletons, S. braenderup were grouped into 4 clusters and 7 singletons, and S. albany were grouped into 3 clusters and 7 singletons, with discriminatory index (D) ranging from 0.92-0.98. ERIC proved to be a useful typing tool for determining the genetic diversity of the duck Salmonella strains.
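
    The abstract above quotes a discriminatory index (D) without giving its formula or the cluster sizes behind it. A minimal sketch, assuming D is Simpson's index of diversity as commonly applied to typing methods (the Hunter-Gaston form), D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), where n_j is the number of isolates of type j and N the total. The group sizes below are invented for illustration and are not the study's data.

```python
def discriminatory_index(type_sizes):
    """Simpson's index of diversity for a typing scheme.

    `type_sizes` lists how many isolates fall into each type
    (clusters have size >= 2, singletons have size 1).
    """
    total = sum(type_sizes)
    if total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(n * (n - 1) for n in type_sizes)
    return 1.0 - same_type_pairs / (total * (total - 1))


# Hypothetical example: two clusters (3 and 2 isolates) plus two singletons
print(round(discriminatory_index([3, 2, 1, 1]), 4))  # 0.8095
```

    D is the probability that two isolates drawn at random without replacement fall into different types, so it reaches 1.0 when every isolate is a singleton and 0.0 when all isolates share one type.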

  18. Correlation of maple sap composition with bacterial and fungal communities determined by multiplex automated ribosomal intergenic spacer analysis (MARISA).

    Science.gov (United States)

    Filteau, Marie; Lagacé, Luc; LaPointe, Gisèle; Roy, Denis

    2011-08-01

    During collection, maple sap is contaminated by bacteria and fungi that subsequently colonize the tubing system. The bacterial microbiota has been better characterized than the fungal microbiota, but the impact of both components on maple sap quality remains unclear. This study focused on identifying bacterial and fungal members of maple sap and correlating microbiota composition with maple sap properties. A multiplex automated ribosomal intergenic spacer analysis (MARISA) method was developed to presumptively identify bacterial and fungal members of maple sap samples collected from 19 production sites during the tapping period. Results indicate that the fungal community of maple sap is mainly composed of yeast related to Mrakia sp., Mrakiella sp., Guehomyces pullulans, Cryptococcus victoriae and Williopsis saturnus. Mrakia, Mrakiella and Guehomyces peaks were identified in samples of all production sites and can be considered dominant and stable members of the fungal microbiota of maple sap. A multivariate analysis based on MARISA profiles and maple sap chemical composition data showed correlations between Candida sake, Janthinobacterium lividum, Williopsis sp., Leuconostoc mesenteroides, Mrakia sp., Rhodococcus sp., Pseudomonas tolaasii, G. pullulans and maple sap composition at different flow periods. This study provides new insights on the relationship between microbial community and maple sap quality. Copyright © 2011 Elsevier Ltd. All rights reserved.

  19. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

    Science.gov (United States)

    Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G; Hardoim, Pablo R; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M T

    2017-03-04

    Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods or workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs, in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.

  20. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

    Directory of Open Access Journals (Sweden)

    Lucas Maciel Vieira

    2017-03-01

    Full Text Available Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to species other than those for which they were originally designed. Prediction of lncRNAs has been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods or workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs in plants that combines known bioinformatics tools with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed us to identify novel lincRNAs, in sugarcane (Saccharum spp.) and in maize (Zea mays). From the results, we could also identify differentially expressed lincRNAs in sugarcane and maize plants exposed to pathogenic and beneficial microorganisms.

  1. Large Intergenic Non-coding RNA-RoR Inhibits Aerobic Glycolysis of Glioblastoma Cells via Akt Pathway

    Science.gov (United States)

    Li, Yong; He, Zhi-Cheng; Liu, Qing; Zhou, Kai; Shi, Yu; Yao, Xiao-Hong; Zhang, Xia; Kung, Hsiang-Fu; Ping, Yi-Fang; Bian, Xiu-Wu

    2018-01-01

    Reprogramming energy metabolism is a hallmark of malignant tumors, including glioblastoma (GBM). Aerobic glycolysis is often utilized by tumor cells to maintain survival and proliferation. However, the underlying mechanisms of aerobic glycolysis in GBM remain elusive. Herein, we demonstrated that large intergenic non-coding RNA-RoR (LincRNA-RoR) functioned as a critical suppressor to inhibit the aerobic glycolysis and viability of GBM cells. We found that LincRNA-RoR was markedly reduced in GBM tissues compared with adjacent non-tumor tissues from 10 cases of GBM patients. Consistently, LincRNA-RoR expression in GBM cells was significantly lower than that in normal glial cells. The aerobic glycolysis of GBM cells, as determined by the measurement of glucose uptake and lactate production, was impaired by LincRNA-RoR overexpression. Mechanistically, LincRNA-RoR inhibited the expression of Rictor, the key component of mTORC2 (mammalian target of rapamycin complex 2), to suppress the activity of Akt pathway and impair the expression of glycolytic effectors, including Glut1, HK2, PKM2 and LDHA. Finally, enforced expression of LincRNA-RoR reduced the proliferation of GBM cells in vitro, restrained tumor growth in vivo, and repressed the expression of glycolytic molecules in GBM xenografts. Collectively, our results underscore LincRNA-RoR as a new suppressor of GBM aerobic glycolysis with therapeutic potential. PMID:29581766

  2. Bat white-nose syndrome: A real-time TaqMan polymerase chain reaction test targeting the intergenic spacer region of Geomyces destructans

    Science.gov (United States)

    Laura K Muller; Jeffrey M. Lorch; Daniel L. Lindner; Michael O' Connor; Andrea Gargas; David S. Blehert

    2013-01-01

    The fungus Geomyces destructans is the causative agent of white-nose syndrome (WNS), a disease that has killed millions of North American hibernating bats. We describe a real-time TaqMan PCR test that detects DNA from G. destructans by targeting a portion of the multicopy intergenic spacer region of the rRNA gene complex. The...

  3. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    Full Text Available The availability of chloroplast genome (cpDNA) sequences of Atropa belladonna, Nicotiana sylvestris, N. tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species, allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs in cpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than genic cpSSRs. The mononucleotide repeats were the most frequent in studied species, but we also identified di-, tri-, tetra-, penta- and hexanucleotide repeats. Multiple alignments of all cpSSRs sequences from Solanaceae species made the identification of nucleotide variability possible and the phylogeny was estimated by maximum parsimony. Our study showed that the plastome database can be exploited for phylogenetic analysis and biotechnological approaches.
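As an illustration of how mono- to hexanucleotide repeats like those described above can be located in a sequence, here is a minimal Python sketch. The abstract does not say which tool or thresholds the authors used; the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for the example, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Hypothetical minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {1: 8, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits

# Toy sequence: a run of 10 A's and five copies of "AT".
print(find_ssrs("GG" + "A" * 10 + "CC" + "AT" * 5 + "G"))
```

Start positions are 0-based. To separate genic from intergenic hits, as the abstract does, the positions would additionally have to be intersected with a gene annotation, which this sketch leaves out.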

  4. Rapid identification of 11 human intestinal Lactobacillus species by multiplex PCR assays using group- and species-specific primers derived from the 16S-23S rRNA intergenic spacer region and its flanking 23S rRNA.

    Science.gov (United States)

    Song, Y; Kato, N; Liu, C; Matsumiya, Y; Kato, H; Watanabe, K

    2000-06-15

    Rapid and reliable two-step multiplex polymerase chain reaction (PCR) assays were established to identify human intestinal lactobacilli; a multiplex PCR was used for grouping of lactobacilli with a mixture of group-specific primers followed by four multiplex PCR assays with four sorts of species-specific primer mixtures for identification at the species level. Primers used were designed from nucleotide sequences of the 16S-23S rRNA intergenic spacer region and its flanking 23S rRNA gene of members of the genus Lactobacillus which are commonly isolated from human stool specimens: Lactobacillus acidophilus, Lactobacillus crispatus, Lactobacillus delbrueckii (ssp. bulgaricus and ssp. lactis), Lactobacillus fermentum, Lactobacillus gasseri, Lactobacillus jensenii, Lactobacillus paracasei (ssp. paracasei and ssp. tolerans), Lactobacillus plantarum, Lactobacillus reuteri, Lactobacillus rhamnosus and Lactobacillus salivarius (ssp. salicinius and ssp. salivarius). The established two-step multiplex PCR assays were applied to the identification of 84 Lactobacillus strains isolated from human stool specimens and the PCR results were consistent with the results from the DNA-DNA hybridization assay. These results suggest that the multiplex PCR system established in this study is a simple, rapid and reliable method for the identification of common Lactobacillus isolates from human stool samples.

  5. Identification and Functional Analysis of Long Intergenic Non-coding RNAs Underlying Intramuscular Fat Content in Pigs

    Directory of Open Access Journals (Sweden)

    Cheng Zou

    2018-03-01

    Full Text Available Intramuscular fat (IMF) content is an important trait that can affect pork quality. Previous studies have identified many genes that can regulate IMF. Long intergenic non-coding RNAs (lincRNAs) are emerging as key regulators in various biological processes. However, lincRNAs related to IMF in pig are largely unknown, and the mechanisms by which they regulate IMF are yet to be elucidated. Here we reconstructed 105,687 transcripts and identified 1,032 lincRNAs in pig longissimus dorsi muscle (LDM) of four stages with different IMF contents based on published RNA-seq. These lincRNAs show typical characteristics such as shorter length and lower expression compared with protein-coding genes. Combined with methylation data, we found that both the promoter and genebody methylation of lincRNAs can negatively regulate lincRNA expression. We found that lincRNAs exhibit high correlation with their protein-coding neighbors in expression. Co-expression network analysis resulted in eight stage-specific modules; gene ontology and pathway analysis of them suggested that some lincRNAs were involved in IMF-related processes, such as fatty acid metabolism and peroxisome proliferator-activated receptor signaling pathway. Furthermore, we identified hub lincRNAs and found six of them may play important roles in IMF development. This work detailed some lincRNAs which may affect IMF development in pig, and facilitated future research on these lincRNAs and molecular assisted breeding for pig.

  6. Identification of clinically relevant nonhemolytic Streptococci on the basis of sequence analysis of 16S-23S intergenic spacer region and partial gdh gene

    DEFF Research Database (Denmark)

    Nielsen, Xiaohui Chen; Justesen, Ulrik Stenz; Dargis, Rimtas

    2009-01-01

    Nonhemolytic streptococci (NHS) cause serious infections, such as endocarditis and septicemia. Many conventional phenotypic methods are insufficient for the identification of bacteria in this group to the species level. Genetic analysis has revealed that single-gene analysis is insufficient...

  7. Concerted evolution rapidly eliminates sequence variation in rDNA coding regions but not in intergenic spacers in Nicotiana tabacum allotetraploid

    Czech Academy of Sciences Publication Activity Database

    Lunerová Bedřichová, Jana; Renny-Byfield, S.; Matyášek, Roman; Leitch, A.; Kovařík, Aleš

    2017-01-01

    Roč. 303, č. 8 (2017), s. 1043-1060 ISSN 0378-2697 R&D Projects: GA ČR(CZ) GA17-11642S; GA ČR(CZ) GC16-02149J Institutional support: RVO:68081707 Keywords : Concerted evolution * Immunomodulation * Neutrophils Subject RIV: EB - Genetics ; Molecular Biology OBOR OECD: Genetics and heredity (medical genetics to be 3) Impact factor: 1.239, year: 2016

  8. Sequence Classification: 13038 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|15669589|ref|NP_248402.1| alignment in /usr/local/projec...ts/ARG/Intergenic/ARG_R584_orf2.nr || http://www.ncbi.nlm.nih.gov/protein/15669589 ...

  9. Sequence Classification: 12660 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|15669211|ref|NP_248016.1| alignment in /usr/local/projec...ts/ARG/Intergenic/ARG_R428_orf1.nr || http://www.ncbi.nlm.nih.gov/protein/15669211 ...

  10. An intergenic risk locus containing an enhancer deletion in 2q35 modulates breast cancer risk by deregulating IGFBP5 expression

    Science.gov (United States)

    Wyszynski, Asaf; Hong, Chi-Chen; Lam, Kristin; Michailidou, Kyriaki; Lytle, Christian; Yao, Song; Zhang, Yali; Bolla, Manjeet K.; Wang, Qin; Dennis, Joe; Hopper, John L.; Southey, Melissa C.; Schmidt, Marjanka K.; Broeks, Annegien; Muir, Kenneth; Lophatananon, Artitaya; Fasching, Peter A.; Beckmann, Matthias W.; Peto, Julian; dos-Santos-Silva, Isabel; Sawyer, Elinor J.; Tomlinson, Ian; Burwinkel, Barbara; Marme, Frederik; Guénel, Pascal; Truong, Thérèse; Bojesen, Stig E.; Nordestgaard, Børge G.; González-Neira, Anna; Benitez, Javier; Neuhausen, Susan L.; Brenner, Hermann; Dieffenbach, Aida Karina; Meindl, Alfons; Schmutzler, Rita K.; Brauch, Hiltrud; Nevanlinna, Heli; Khan, Sofia; Matsuo, Keitaro; Ito, Hidemi; Dörk, Thilo; Bogdanova, Natalia V.; Lindblom, Annika; Margolin, Sara; Mannermaa, Arto; Kosma, Veli-Matti; Wu, Anna H.; Van Den Berg, David; Lambrechts, Diether; Wildiers, Hans; Chang-Claude, Jenny; Rudolph, Anja; Radice, Paolo; Peterlongo, Paolo; Couch, Fergus J.; Olson, Janet E.; Giles, Graham G.; Milne, Roger L.; Haiman, Christopher A.; Henderson, Brian E.; Dumont, Martine; Teo, Soo Hwang; Wong, Tien Y.; Kristensen, Vessela; Zheng, Wei; Long, Jirong; Winqvist, Robert; Pylkäs, Katri; Andrulis, Irene L.; Knight, Julia A.; Devilee, Peter; Seynaeve, Caroline; García-Closas, Montserrat; Figueroa, Jonine; Klevebring, Daniel; Czene, Kamila; Hooning, Maartje J.; van den Ouweland, Ans M.W.; Darabi, Hatef; Shu, Xiao-Ou; Gao, Yu-Tang; Cox, Angela; Blot, William; Signorello, Lisa B.; Shah, Mitul; Kang, Daehee; Choi, Ji-Yeob; Hartman, Mikael; Miao, Hui; Hamann, Ute; Jakubowska, Anna; Lubinski, Jan; Sangrajrang, Suleeporn; McKay, James; Toland, Amanda E.; Yannoukakos, Drakoulis; Shen, Chen-Yang; Wu, Pei-Ei; Swerdlow, Anthony; Orr, Nick; Simard, Jacques; Pharoah, Paul D.P.; Dunning, Alison M.; Chenevix-Trench, Georgia; Hall, Per; Bandera, Elisa; Amos, Chris; Ambrosone, Christine; Easton, Douglas F.; Cole, Michael D.

    2016-01-01

    Breast cancer is the most diagnosed malignancy and the second leading cause of cancer mortality in females. Previous association studies have identified variants on 2q35 associated with the risk of breast cancer. To identify functional susceptibility loci for breast cancer, we interrogated the 2q35 gene desert for chromatin architecture and functional variation correlated with gene expression. We report a novel intergenic breast cancer risk locus containing an enhancer copy number variation (enCNV; deletion) located approximately 400 kb upstream of IGFBP5, which overlaps an intergenic ERα-bound enhancer that loops to the IGFBP5 promoter. The enCNV is correlated with modified ERα binding and monoallelic repression of IGFBP5 following oestrogen treatment. We investigated the association of enCNV genotype with breast cancer in 1,182 cases and 1,362 controls, and replicated our findings in an independent set of 62,533 cases and 60,966 controls from 41 case control studies and 11 GWAS. We report a dose-dependent inverse association of 2q35 enCNV genotype (per-copy OR = 0.68, 95% CI 0.55–0.83, P = 0.0002; replication OR = 0.77, 95% CI 0.73–0.82, P = 2.1 × 10^−19) and identify 13 additional linked variants (r^2 > 0.8) in the 20 kb linkage block containing the enCNV (P = 3.2 × 10^−15 to 5.6 × 10^−17). These associations were independent of previously reported 2q35 variants, rs13387042/rs4442975 and rs16857609, and were stronger for ER-positive than ER-negative disease. Together, these results suggest that 2q35 breast cancer risk loci may be mediating their effect through IGFBP5. PMID:27402876
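The relationship between an odds ratio, its 95% confidence interval and the reported P value can be checked roughly with a few lines of Python. This is only a sketch under an assumption that the abstract does not confirm: that the interval is a Wald-type interval, symmetric on the log-odds scale. The function name `wald_check` is hypothetical, and because the inputs are rounded, the result can only be expected to agree with the reported P value in order of magnitude.

```python
import math

def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back out the log-scale standard error, z statistic and two-sided P value."""
    # Assumes the 95% CI is exp(log(OR) +/- z_crit * SE).
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# Per-copy estimate quoted in the abstract: OR = 0.68, 95% CI 0.55-0.83.
se, z, p = wald_check(0.68, 0.55, 0.83)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.1e}")
```

For the discovery-stage estimate this gives a P value on the order of 10^-4, in line with the reported P = 0.0002. For very narrow intervals such as the replication estimate, rounding of the interval limits to two decimals shifts the recovered P value considerably, so the check is less informative there.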

  11. Genome-wide Anaplasma phagocytophilum AnkA-DNA interactions are enriched in intergenic regions and gene promoters and correlate with infection-induced differential gene expression.

    Directory of Open Access Journals (Sweden)

    J Stephen Dumler

    2016-09-01

    Full Text Available Anaplasma phagocytophilum, an obligate intracellular prokaryote, infects neutrophils and alters cardinal functions via reprogrammed transcription. Large contiguous regions of neutrophil chromosomes are differentially expressed during infection. Secreted A. phagocytophilum effector AnkA transits into the neutrophil or granulocyte nucleus to complex with DNA in heterochromatin across all chromosomes. AnkA binds to gene promoters to dampen cis-transcription and also has features of matrix attachment region (MAR)-binding proteins that regulate three-dimensional chromatin architecture and coordinate transcriptional programs encoded in topologically-associated chromatin domains. We hypothesize that identification of additional AnkA binding sites will better delineate how A. phagocytophilum infection results in reprogramming of the neutrophil genome. Using AnkA-binding ChIP-seq, we showed that AnkA binds broadly throughout all chromosomes in a reproducible pattern, especially at: (i) intergenic regions predicted to be matrix attachment regions (MARs); (ii) within predicted lamina-associated domains; and (iii) at promoters ≤3,000 bp upstream of transcriptional start sites. These findings provide genome-wide support for AnkA as a regulator of cis-gene transcription. Moreover, the dominant mark of AnkA in distal intergenic regions known to be AT-enriched, coupled with frequent enrichment in the nuclear lamina, provides strong support for its role as a MAR-binding protein and genome re-organizer. AnkA must be considered a prime candidate to promote neutrophil reprogramming and subsequent functional changes that belie improved microbial fitness and pathogenicity.

  12. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data have remained unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  13. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr......

  14. Cloning, nucleotide sequence and transcriptional analysis of the uvrA gene from Neisseria gonorrhoeae

    International Nuclear Information System (INIS)

    Black, C.G.; Fyfe, J.A.M.; Davies, J.K.

    1997-01-01

    A recombinant plasmid capable of restoring UV resistance to an Escherichia coli uvrA mutant was isolated from a genomic library of Neisseria gonorrhoeae. Sequence analysis revealed an open reading frame whose deduced amino acid sequence displayed significant similarity to those of the UvrA proteins of other bacterial species. A second open reading frame (ORF259) was identified upstream from, and in the opposite orientation to, the gonococcal uvrA gene. Transcriptional fusions between portions of the gonococcal uvrA upstream region and a reporter gene were used to localise promoter activity in both E. coli and N. gonorrhoeae. The transcriptional starting points of uvrA and ORF259 were mapped in E. coli by primer extension analysis, and corresponding σ70 promoters were identified. The arrangement of the uvrA-ORF259 intergenic region is similar to that of the gonococcal recA-aroD intergenic region. Both contain inverted copies of the 10 bp neisserial DNA uptake sequence situated between divergently transcribed genes. However, there is no evidence that either the uptake sequence or the proximity of the promoters influences expression of these genes. (author)

  15. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA) is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1–1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  16. Identification of novel growth phase- and media-dependent small non-coding RNAs in Streptococcus pyogenes M49 using intergenic tiling arrays

    Directory of Open Access Journals (Sweden)

    Patenge Nadja

    2012-10-01

    Full Text Available Abstract Background Small non-coding RNAs (sRNAs) have attracted attention as a new class of gene regulators in both eukaryotes and bacteria. Genome-wide screening methods have been successfully applied in Gram-negative bacteria to identify sRNA regulators. Many sRNAs are well characterized, including their target mRNAs and mode of action. In comparison, little is known about sRNAs in Gram-positive pathogens. In this study, we identified novel sRNAs in the exclusively human pathogen Streptococcus pyogenes M49 (Group A Streptococcus, GAS M49), employing a whole genome intergenic tiling array approach. GAS is an important pathogen that causes diseases ranging from mild superficial infections of the skin and mucous membranes of the naso-pharynx, to severe toxic and invasive diseases. Results We identified 55 putative sRNAs in GAS M49 that were expressed during growth. Of these, 42 were novel. Some of the newly-identified sRNAs belonged to one of the common non-coding RNA families described in the Rfam database. Comparison of the results of our screen with the outcome of two recently published bioinformatics tools showed a low level of overlap between putative sRNA genes. Previously, 40 potential sRNAs have been reported to be expressed in a GAS M1T1 serotype, as detected by a whole genome intergenic tiling array approach. Our screen detected 12 putative sRNA genes that were expressed in both strains. Twenty sRNA candidates appeared to be regulated in a medium-dependent fashion, while eight sRNA genes were regulated throughout growth in chemically defined medium. Expression of candidate genes was verified by reverse transcriptase-qPCR. For a subset of sRNAs, the transcriptional start was determined by 5′ rapid amplification of cDNA ends-PCR (RACE-PCR) analysis. Conclusions In accord with the results of previous studies, we found little overlap between different screening methods, which underlines the fact that a comprehensive analysis of s

  17. Long Intergenic Noncoding RNA 00511 Acts as an Oncogene in Non–small-cell Lung Cancer by Binding to EZH2 and Suppressing p57

    Directory of Open Access Journals (Sweden)

    Cheng-Cao Sun

    2016-01-01

    Full Text Available Long noncoding RNAs (lncRNAs) play crucial roles in carcinogenesis. However, the function and mechanism of lncRNAs in human non–small-cell lung cancer (NSCLC) remain largely unknown. Long intergenic noncoding RNA 00511 (LINC00511) has been found to be upregulated and to act as an oncogene in breast cancer, but little is known about its expression pattern, biological function and underlying mechanism in NSCLC. Herein, we identified LINC00511 as an oncogenic lncRNA that drives tumorigenesis in NSCLC. We found LINC00511 was upregulated and associated with oncogenesis, tumor size, metastasis, and poor prognosis in NSCLC. Moreover, LINC00511 affected cell proliferation, invasiveness, metastasis, and apoptosis in multiple NSCLC cell lines. Mechanistically, LINC00511 bound histone methyltransferase enhancer of zeste homolog 2 (EZH2, the catalytic subunit of the polycomb repressive complex 2 (PRC2), a highly conserved protein complex that regulates gene expression by methylating lysine 27 on histone H3), and acted as a modular scaffold of EZH2/PRC2 complexes, coordinated their localization, and specified the histone modification pattern on the target genes, including p57, and consequently altered NSCLC cell biology. Thus, LINC00511 is mechanistically, functionally, and clinically oncogenic in NSCLC. Targeting LINC00511 and its pathway may be meaningful for treating patients with NSCLC.

  18. TALEN-mediated single-base-pair editing identification of an intergenic mutation upstream of BUB1B as causative of PCS (MVA) syndrome

    Science.gov (United States)

    Ochiai, Hiroshi; Miyamoto, Tatsuo; Kanai, Akinori; Hosoba, Kosuke; Sakuma, Tetsushi; Kudo, Yoshiki; Asami, Keiko; Ogawa, Atsushi; Watanabe, Akihiro; Kajii, Tadashi; Yamamoto, Takashi; Matsuura, Shinya

    2014-01-01

    Cancer-prone syndrome of premature chromatid separation with mosaic variegated aneuploidy [PCS (MVA) syndrome] is a rare autosomal recessive disorder characterized by constitutional aneuploidy and a high risk of childhood cancer. We previously reported monoallelic mutations in the BUB1B gene (encoding BUBR1) in seven Japanese families with the syndrome. No second mutation was found in the opposite allele of any of the families studied, although a conserved BUB1B haplotype and a decreased transcript were identified. To clarify the molecular pathology of the second allele, we extended our mutational search to a candidate region surrounding BUB1B. A unique single nucleotide substitution, G > A at ss802470619, was identified in an intergenic region 44 kb upstream of a BUB1B transcription start site, which cosegregated with the disorder. To examine whether this is the causal mutation, we designed a transcription activator-like effector nuclease–mediated two-step single-base pair editing strategy and biallelically introduced this substitution into cultured human cells. The cell clones showed reduced BUB1B transcripts, increased PCS frequency, and MVA, which are the hallmarks of the syndrome. We also encountered a case of a Japanese infant with PCS (MVA) syndrome carrying a homozygous single nucleotide substitution at ss802470619. These results suggested that the nucleotide substitution identified was the causal mutation of PCS (MVA) syndrome. PMID:24344301

  19. Genotyping of virulent Escherichia coli obtained from poultry and poultry farm workers using enterobacterial repetitive intergenic consensus-polymerase chain reaction

    Directory of Open Access Journals (Sweden)

    M. Soma Sekhar

    2017-11-01

    Full Text Available Aim: The aim of this study was to characterize virulent Escherichia coli isolated from different poultry species and poultry farm workers using enterobacterial repetitive intergenic consensus-polymerase chain reaction (ERIC-PCR) genotyping. Materials and Methods: Fecal swabs from different poultry species (n=150) and poultry farm workers (n=15) were analyzed for E. coli and screened for virulence genes (stx1, stx2, eaeA, and hlyA) by multiplex PCR. Virulent E. coli was serotyped based on their "O" antigen and then genotyped using ERIC-PCR. Results: A total of 134 E. coli isolates (122/150 from poultry and 12/15 from farm workers) were recovered. Virulence genes were detected in a total of 12 isolates. Serological typing of the 12 virulent E. coli revealed nine different serotypes (O2, O49, O60, O63, O83, O101, O120, UT, and Rough). ERIC-PCR genotyping allowed discrimination of 12 virulent E. coli isolates into 11 ERIC-PCR genotypes. The numerical index of discrimination was 0.999. Conclusion: Our findings provide information about the wide genetic diversity and discrimination of virulent E. coli in apparently healthy poultry and poultry farm workers of Andhra Pradesh (India) based on their genotype.
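The abstract does not state which formula was used for the numerical index of discrimination. One commonly used definition in bacterial typing is Simpson's index of diversity in the form given by Hunter and Gaston, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where N is the number of isolates and n_j the number of isolates of type j. The Python sketch below assumes that formula; the function name and the genotype labels are hypothetical, so the output is not expected to reproduce the value reported in the abstract.

```python
from collections import Counter

def discriminatory_index(type_labels):
    """Hunter-Gaston index: D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))."""
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(n * (n - 1) for n in Counter(type_labels).values())
    return 1 - same_type_pairs / (n_total * (n_total - 1))

# Hypothetical genotype labels for four isolates, two of which share a type.
example = ["E1", "E1", "E2", "E3"]
print(round(discriminatory_index(example), 3))
```

D is the probability that two isolates drawn at random without replacement belong to different types, so it equals 1 only when every isolate has its own type.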

  20. Genome wide discovery of long intergenic non-coding RNAs in Diamondback moth (Plutella xylostella) and their expression in insecticide resistant strains

    Science.gov (United States)

    Etebari, Kayvan; Furlong, Michael J.; Asgari, Sassan

    2015-01-01

    Long non-coding RNAs (lncRNAs) play important roles in genomic imprinting, cancer, differentiation and regulation of gene expression. Here, we identified 3844 long intergenic ncRNAs (lincRNAs) in Plutella xylostella, which is a notorious pest of cruciferous plants that has developed field resistance to all classes of insecticides, including Bacillus thuringiensis (Bt) endotoxins. Further, we found that some of those lincRNAs may potentially serve as precursors for the production of small ncRNAs. We found 280 and 350 lincRNAs that are differentially expressed in Chlorpyrifos and Fipronil resistant larvae. A survey on P. xylostella midgut transcriptome data from Bt-resistant populations revealed 59 altered lincRNAs in two resistant strains compared with the susceptible population. We validated the transcript levels of a number of putative lincRNAs in deltamethrin-resistant larvae that were exposed to deltamethrin, which indicated that this group of lincRNAs might be involved in the response to xenobiotics in this insect. To functionally characterize DBM lincRNAs, gene ontology (GO) enrichment of their associated protein-coding genes was extracted and showed overrepresentation of protein, DNA and RNA binding GO terms. The data presented here will facilitate future studies to unravel the function of lincRNAs in insecticide resistance or the response to xenobiotics of eukaryotic cells. PMID:26411386

  1. Quantitative Proteomics Analysis Reveals Novel Insights into Mechanisms of Action of Long Noncoding RNA Hox Transcript Antisense Intergenic RNA (HOTAIR) in HeLa Cells*

    Science.gov (United States)

    Zheng, Peng; Xiong, Qian; Wu, Ying; Chen, Ying; Chen, Zhuo; Fleming, Joy; Gao, Ding; Bi, Lijun; Ge, Feng

    2015-01-01

    Long noncoding RNAs (lncRNAs), which have emerged in recent years as a new and crucial layer of gene regulators, regulate various biological processes such as carcinogenesis and metastasis. HOTAIR (Hox transcript antisense intergenic RNA), a lncRNA overexpressed in most human cancers, has been shown to be an oncogenic lncRNA. Here, we explored the role of HOTAIR in HeLa cells and searched for proteins regulated by HOTAIR. To understand the mechanism of action of HOTAIR from a systems perspective, we employed a quantitative proteomic strategy to systematically identify potential targets of HOTAIR. The expression of 170 proteins was significantly dys-regulated after inhibition of HOTAIR, implying that they could be potential targets of HOTAIR. Analysis of this data at the systems level revealed major changes in proteins involved in diverse cellular components, including the cytoskeleton and the respiratory chain. Further functional studies on vimentin (VIM), a key protein involved in the cytoskeleton, revealed that HOTAIR exerts its effects on migration and invasion of HeLa cells, at least in part, through the regulation of VIM expression. Inhibition of HOTAIR leads to mitochondrial dysfunction and ultrastructural alterations, suggesting a novel role of HOTAIR in maintaining mitochondrial function in cancer cells. Our results provide novel insights into the mechanisms underlying the function of HOTAIR in cancer cells. We expect that the methods used in this study will become an integral part of functional studies of lncRNAs. PMID:25762744

  2. Potentials and limitations of histone repeat sequences for phylogenetic reconstruction of Sophophora.

    Science.gov (United States)

    Baldo, A M; Les, D H; Strausbaugh, L D

    1999-11-01

    Simplified DNA sequence acquisition has provided many new data sets that are useful for phylogenetic reconstruction, including single- and multiple-copy nuclear and organellar genes. Although transcribed regions receive much attention, nontranscribed regions have recently been added to the repertoire of sequences suitable for phylogenetic studies, especially for closely related taxa. We evaluated the efficacy of a small portion of the histone repeat for phylogenetic reconstruction among Drosophila species. Histone repeats in invertebrates offer distinct advantages similar to those of widely used ribosomal repeats. First, the units are tandemly repeated and undergo concerted evolution. Second, histone repeats include both highly conserved coding and variable intergenic regions. This composition facilitates application of "universal" primers spanning potentially informative sites. We examined a small region of the histone repeat, including the intergenic spacer segments of coding regions from the divergently transcribed H2A and H2B histone genes. The spacer (about 230 bp) exists as a mosaic with highly conserved functional motifs interspersed with rapidly diverging regions; the former aid in alignment of the spacer. There are no ambiguities in alignment of coding regions. Coding and noncoding regions were analyzed together and separately for phylogenetic information. Parsimony, distance, and maximum-likelihood methods successfully retrieve the corroborated phylogeny for the taxa examined. This study demonstrates the resolving power of a small histone region which may now be added to the growing collection of phylogenetically useful DNA sequences.

  3. Whole genome sequencing: an efficient approach to ensuring food safety

    Science.gov (United States)

    Lakicevic, B.; Nastasijevic, I.; Dimitrijevic, M.

    2017-09-01

    Whole genome sequencing is an effective, powerful tool that can be applied to a wide range of public health and food safety applications. A major difference between WGS and the traditional typing techniques is that WGS allows all genes to be included in the analysis, instead of a well-defined subset of genes or variable intergenic regions. Also, the use of WGS can facilitate the understanding of contamination/colonization routes of foodborne pathogens within the food production environment, and can also afford efficient tracking of pathogens’ entry routes and distribution from farm-to-consumer. Tracking foodborne pathogens in the food processing-distribution-retail-consumer continuum is of the utmost importance for facilitation of outbreak investigations and rapid action in controlling/preventing foodborne outbreaks. Therefore, WGS likely will replace most of the numerous workflows used in public health laboratories to characterize foodborne pathogens into one consolidated, efficient workflow.

  4. A Sequence-Specific Interaction between the Saccharomyces cerevisiae rRNA Gene Repeats and a Locus Encoding an RNA Polymerase I Subunit Affects Ribosomal DNA Stability

    Science.gov (United States)

    Cahyani, Inswasti; Cridge, Andrew G.; Engelke, David R.; Ganley, Austen R. D.

    2014-01-01

    The spatial organization of eukaryotic genomes is linked to their functions. However, how individual features of the global spatial structure contribute to nuclear function remains largely unknown. We previously identified a high-frequency interchromosomal interaction within the Saccharomyces cerevisiae genome that occurs between the intergenic spacer of the ribosomal DNA (rDNA) repeats and the intergenic sequence between the locus encoding the second largest RNA polymerase I subunit and a lysine tRNA gene [i.e., RPA135-tK(CUU)P]. Here, we used quantitative chromosome conformation capture in combination with replacement mapping to identify a 75-bp sequence within the RPA135-tK(CUU)P intergenic region that is involved in the interaction. We demonstrate that the RPA135-IGS1 interaction is dependent on the rDNA copy number and the Msn2 protein. Surprisingly, we found that the interaction does not govern RPA135 transcription. Instead, replacement of a 605-bp region within the RPA135-tK(CUU)P intergenic region results in a reduction in the RPA135-IGS1 interaction level and fluctuations in rDNA copy number. We conclude that the chromosomal interaction that occurs between the RPA135-tK(CUU)P and rDNA IGS1 loci stabilizes rDNA repeat number and contributes to the maintenance of nucleolar stability. Our results provide evidence that the DNA loci involved in chromosomal interactions are composite elements, sections of which function in stabilizing the interaction or mediating a functional outcome. PMID:25421713

  5. The development and application of a Mycoplasma gallisepticum sequence database.

    Science.gov (United States)

    Armour, Natalie K; Laibinis, Victoria A; Collett, Stephen R; Ferguson-Noel, Naola

    2013-01-01

    Molecular analysis was conducted on 36 Mycoplasma gallisepticum DNA extracts from tracheal swab samples of commercial poultry in seven South African provinces between 2009 and 2012. Twelve unique M. gallisepticum genotypes were identified by polymerase chain reaction and sequence analysis of the 16S-23S rRNA intergenic spacer region (IGSR), M. gallisepticum cytadhesin 2 (mgc2), MGA_0319 and gapA genetic regions. The DNA sequences of these genotypes were distinct from those of M. gallisepticum isolates in a database composed of sequences from other countries, vaccine and reference strains. The most prevalent genotype (SA-WT#7) was detected in samples from commercial broilers, broiler breeders and layers in five provinces. South African M. gallisepticum sequences were more similar to those of the live vaccines commercially available in South Africa, but were distinct from that of F strain vaccine, which is not registered for use in South Africa. The IGSR, mgc2 or MGA_0319 sequences of three South African genotypes were identical to those of the ts-11 vaccine strain, necessitating a combination of mgc2 and IGSR targeted sequencing to differentiate South African wild-type genotypes from ts-11 vaccine. To identify and differentiate all 12 wild-types, mgc2, IGSR and MGA_0319 sequencing was required. Sequencing of gapA was least effective at strain differentiation. This research serves as a model for the development of an M. gallisepticum sequence database, and illustrates its application to characterize M. gallisepticum genotypes, select diagnostic tests and better understand the epidemiology of M. gallisepticum.

  6. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  7. Genome-wide characterization of long intergenic non-coding RNAs (lincRNAs) provides new insight into viral diseases in honey bees Apis cerana and Apis mellifera.

    Science.gov (United States)

    Jayakodi, Murukarthick; Jung, Je Won; Park, Doori; Ahn, Young-Joon; Lee, Sang-Choon; Shin, Sang-Yoon; Shin, Chanseok; Yang, Tae-Jin; Kwon, Hyung Wook

    2015-09-04

    Long non-coding RNAs (lncRNAs) are a class of RNAs that do not encode proteins. Recently, lncRNAs have gained special attention for their roles in various biological processes and diseases. In an attempt to identify long intergenic non-coding RNAs (lincRNAs) and their possible involvement in honey bee development and diseases, we analyzed RNA-seq datasets generated from Asian honey bee (Apis cerana) and western honey bee (Apis mellifera). We identified 2470 lincRNAs with an average length of 1011 bp from A. cerana and 1514 lincRNAs with an average length of 790 bp in A. mellifera. Comparative analysis revealed that 5% of the total lincRNAs derived from both species are unique in each species. Our comparative digital gene expression analysis revealed a high degree of tissue-specific expression among the seven major tissues of honey bee, different from mRNA expression patterns. A total of 863 (57%) and 464 (18%) lincRNAs showed tissue-dependent expression in A. mellifera and A. cerana, respectively, most preferentially in ovary and fat body tissues. Importantly, we identified 11 lincRNAs that are specifically regulated upon viral infection in honey bees, and 10 of them appear to play roles during infection with various viruses. This study provides the first comprehensive set of lincRNAs for honey bees and opens the door to discover lincRNAs associated with biological and hormone signaling pathways as well as various diseases of honey bee.

  8. Identification of virulence factors in 16S-23S rRNA intergenic spacer genotyped Staphylococcus aureus isolated from water buffaloes and small ruminants.

    Science.gov (United States)

    Cremonesi, P; Zottola, T; Locatelli, C; Pollera, C; Castiglioni, B; Scaccabarozzi, L; Moroni, P

    2013-01-01

    Staphylococcus aureus is an important human and animal pathogen, and is regarded as an important cause of intramammary infection (IMI) in ruminants. Staphylococcus aureus genetic variability and virulence factors have been well studied in veterinary medicine, especially in cows as support for control and management of IMI. The aim of the present study was to genotype 71 Staph. aureus isolates from the bulk tank and foremilk of water buffaloes (n=40) and from udder tissue (n=7) and foremilk (n=24) from small ruminants. The method used was previously applied to bovine Staph. aureus and is based on the amplification of the 16S-23S rRNA intergenic spacer region. The technique applied was able to identify different Staph. aureus genotypes isolated from dairy species other than the bovine species, and cluster the genotypes according to species and herds. Virulence gene distribution was consistent with genotype differentiation. The isolates were also characterized through determination of the presence of 19 virulence-associated genes by specific PCR. Enterotoxins A, C, D, G, I, J, and L were associated with Staph. aureus isolates from buffaloes, whereas enterotoxins C and L were linked to small ruminants. Genes coding for methicillin resistance, Panton-Valentine leukocidin, exfoliative toxins A and B, and enterotoxins B, E, and H were undetected. These findings indicate that RNA template-specific PCR is a valid technique for typing Staph. aureus from buffaloes and small ruminants and is a useful tool for understanding udder infection epidemiology. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  9. Genotypic Characterization of Escherichia coli O157:H7 Isolates from Different Sources in the North-West Province, South Africa, Using Enterobacterial Repetitive Intergenic Consensus PCR Analysis

    Directory of Open Access Journals (Sweden)

    Collins Njie Ateba

    2014-05-01

    In many developing countries, proper hygiene is not strictly implemented when animals are slaughtered and meat products become contaminated. Contaminated meat may contain Escherichia coli (E. coli) O157:H7 that could cause diseases in humans if these food products are consumed undercooked. In the present study, a total of 94 confirmed E. coli O157:H7 isolates were subjected to the enterobacterial repetitive intergenic consensus (ERIC) polymerase chain reaction (PCR) typing to generate genetic fingerprints. The ERIC fragments were resolved by electrophoresis on 2% (w/v) agarose gels. The presence, absence and intensity of band data were obtained, exported to Microsoft Excel (Microsoft Office 2003) and used to generate a data matrix. The unweighted pair group method with arithmetic mean (UPGMA) and complete linkage algorithms were used to analyze the percentage of similarity and matrix data. Relationships between the various profiles and/or lanes were expressed as dendrograms. Data from groups of related lanes were compiled and reported on cluster tables. ERIC fragments ranged from one to 15 per isolate, and their sizes varied from 0.25 to 0.771 kb. A large proportion of the isolates produced an ERIC banding pattern with three duplets ranging in sizes from 0.408 to 0.628 kb. Eight major clusters (I–VIII) were identified. Overall, the remarkable similarities (72% to 91%) between the ERIC profiles for the isolate from animal species and their corresponding food products indicated some form of contamination, which may not exclude those at the level of the abattoirs. These results reveal that ERIC PCR analysis can be reliable in comparing the genetic profiles of E. coli O157:H7 from different sources in the North-West Province of South Africa.

  10. Genotypic characterization of Escherichia coli O157:H7 isolates from different sources in the North-West Province, South Africa, using enterobacterial repetitive intergenic consensus PCR analysis.

    Science.gov (United States)

    Ateba, Collins Njie; Mbewe, Moses

    2014-05-30

    In many developing countries, proper hygiene is not strictly implemented when animals are slaughtered and meat products become contaminated. Contaminated meat may contain Escherichia coli (E. coli) O157:H7 that could cause diseases in humans if these food products are consumed undercooked. In the present study, a total of 94 confirmed E. coli O157:H7 isolates were subjected to the enterobacterial repetitive intergenic consensus (ERIC) polymerase chain reaction (PCR) typing to generate genetic fingerprints. The ERIC fragments were resolved by electrophoresis on 2% (w/v) agarose gels. The presence, absence and intensity of band data were obtained, exported to Microsoft Excel (Microsoft Office 2003) and used to generate a data matrix. The unweighted pair group method with arithmetic mean (UPGMA) and complete linkage algorithms were used to analyze the percentage of similarity and matrix data. Relationships between the various profiles and/or lanes were expressed as dendrograms. Data from groups of related lanes were compiled and reported on cluster tables. ERIC fragments ranged from one to 15 per isolate, and their sizes varied from 0.25 to 0.771 kb. A large proportion of the isolates produced an ERIC banding pattern with three duplets ranging in sizes from 0.408 to 0.628 kb. Eight major clusters (I-VIII) were identified. Overall, the remarkable similarities (72% to 91%) between the ERIC profiles for the isolate from animal species and their corresponding food products indicated some form of contamination, which may not exclude those at the level of the abattoirs. These results reveal that ERIC PCR analysis can be reliable in comparing the genetic profiles of E. coli O157:H7 from different sources in the North-West Province of South Africa.
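    The band-matrix and UPGMA steps described in this abstract can be illustrated with a small sketch. The abstract does not name the similarity coefficient, so the Dice coefficient used here is an assumption, and the isolate names and band profiles are hypothetical rather than data from the study.

```python
from itertools import combinations


def dice_similarity(a, b):
    """Dice coefficient between two binary band profiles (1 = band present)."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total else 1.0


def upgma(profiles):
    """Return UPGMA merge steps as (cluster_a, cluster_b, average_distance)."""
    # Each cluster is a tuple of the original isolate labels it contains.
    clusters = [(name,) for name in profiles]
    # Pairwise distances between isolates: 1 - Dice similarity (assumed metric).
    dist = {
        frozenset((a, b)): 1.0 - dice_similarity(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }

    def avg_distance(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_distance(*p))
        merges.append((c1, c2, avg_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical presence/absence profiles at four band positions.
profiles = {
    "iso1": [1, 1, 0, 1],
    "iso2": [1, 1, 0, 1],
    "iso3": [0, 1, 1, 0],
}
for left, right, d in upgma(profiles):
    print(left, right, round(d, 3))
```

    The merge list can be read as a dendrogram from the bottom up: isolates with identical banding patterns join at distance 0, and less similar profiles join later at larger distances.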

  11. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yongyan [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Ai, Zhiying [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Yao, Kezhen [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Cao, Lixia; Du, Juan; Shi, Xiaoyan [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Guo, Zekun, E-mail: gzk@nwsuaf.edu.cn [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Zhang, Yong, E-mail: zhylab@hotmail.com [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China)

    2013-10-15

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR.

  12. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    International Nuclear Information System (INIS)

    Wu, Yongyan; Ai, Zhiying; Yao, Kezhen; Cao, Lixia; Du, Juan; Shi, Xiaoyan; Guo, Zekun; Zhang, Yong

    2013-01-01

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR

  13. Suitability of the molecular subtyping methods intergenic spacer region, direct genome restriction analysis, and pulsed-field gel electrophoresis for clinical and environmental Vibrio parahaemolyticus isolates.

    Science.gov (United States)

    Lüdeke, Catharina H M; Fischer, Markus; LaFon, Patti; Cooper, Kara; Jones, Jessica L

    2014-07-01

    Vibrio parahaemolyticus is the leading cause of infectious illness associated with seafood consumption in the United States. Molecular fingerprinting of strains has become a valuable research tool for understanding this pathogen. However, there are many subtyping methods available and little information on how they compare to one another. For this study, a collection of 67 oyster and 77 clinical V. parahaemolyticus isolates was analyzed by three subtyping methods--intergenic spacer region (ISR-1), direct genome restriction analysis (DGREA), and pulsed-field gel electrophoresis (PFGE)--to determine the utility of these methods for discriminatory subtyping. ISR-1 analysis, run as previously described, provided the lowest discrimination of all the methods (discriminatory index [DI]=0.8665). However, using a broader analytical range than previously reported, ISR-1 clustered isolates based on origin (oyster versus clinical) and had a DI=0.9986. DGREA provided a DI=0.9993-0.9995, but did not consistently cluster the isolates by any identifiable characteristics (origin, serotype, or virulence genotype) and about 15% of isolates were untypeable by this method. PFGE provided a DI=0.9998 when using the combined pattern analysis of both restriction enzymes, SfiI and NotI. This analysis was more discriminatory than using either enzyme pattern alone and primarily grouped isolates by serotype, regardless of strain origin (clinical or oyster) or presence of currently accepted virulence markers. These results indicate that PFGE and ISR-1 are more reliable methods for subtyping V. parahaemolyticus than DGREA. Additionally, ISR-1 may provide an indication of pathogenic potential; however, more detailed studies are needed. These data highlight the diversity within V. parahaemolyticus and the need for appropriate selection of subtyping methods depending on the study objectives.
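    The discriminatory index (DI) values quoted above can be made concrete with a short sketch. The abstract does not give the formula; the code below assumes the Simpson's index of diversity in the form commonly used for typing methods (the Hunter-Gaston formulation), and the subtype labels are hypothetical.

```python
from collections import Counter


def discriminatory_index(subtypes):
    """Probability that two isolates drawn without replacement differ in subtype."""
    n = len(subtypes)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(subtypes).values()
    # Number of ordered pairs of isolates that share a subtype.
    same_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_pairs / (n * (n - 1))


# Hypothetical subtype assignments for six isolates (not data from the study).
example = ["A", "A", "B", "C", "C", "C"]
print(round(discriminatory_index(example), 4))
```

    Under this assumed definition, a DI near 1 means almost every isolate received its own subtype, which is why values such as 0.9998 indicate very fine discrimination.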

  14. Comparison of automated ribosomal intergenic spacer analysis (ARISA) and denaturing gradient gel electrophoresis (DGGE) techniques for analysing the influence of diet on ruminal bacterial diversity.

    Science.gov (United States)

    Saro, Cristina; Molina-Alcaide, Eduarda; Abecia, Leticia; Ranilla, María José; Carro, María Dolores

    2018-04-01

    The objective of this study was to compare the automated ribosomal intergenic spacer analysis (ARISA) and the denaturing gradient gel electrophoresis (DGGE) techniques for analysing the effects of diet on diversity in bacterial pellets isolated from the liquid (liquid-associated bacteria (LAB)) and solid (solid-associated bacteria (SAB)) phase of the rumen. The four experimental diets contained forage to concentrate ratios of 70:30 or 30:70 and had either alfalfa hay or grass hay as forage. Four rumen-fistulated animals (two sheep and two goats) received the diets in a Latin square design. Bacterial pellets (LAB and SAB) were isolated at 2 h post-feeding for DNA extraction and analysed by ARISA and DGGE. The number of peaks in individual samples ranged from 48 to 99 for LAB and from 41 to 95 for SAB with ARISA, and values of DGGE-bands ranged from 27 to 50 for LAB and from 18 to 45 for SAB. The LAB samples from high concentrate-fed animals tended (p forage-fed animals with ARISA, but no differences were identified with DGGE. The SAB samples from high concentrate-fed animals had lower (p forage diets with ARISA, but only a trend was noticed for these parameters with DGGE (p forage type on LAB diversity was detected by any technique. In this study, ARISA detected some changes in ruminal bacterial communities that were not detected by DGGE, and therefore ARISA was considered more appropriate for assessing bacterial diversity of ruminal bacterial pellets. The results highlight the impact of the fingerprinting technique used to draw conclusions on dietary factors affecting bacterial diversity in ruminal bacterial pellets.

  15. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    Science.gov (United States)

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619
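    The transition/transversion split reported above follows from a simple rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. A minimal sketch, with hypothetical function names and example SNPs that are not taken from the study:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    # Both alleles in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_fraction(snps):
    """Fraction of (ref, alt) pairs that are transitions."""
    kinds = [classify_snp(ref, alt) for ref, alt in snps]
    return kinds.count("transition") / len(kinds)


# Hypothetical SNP calls as (reference, alternate) pairs.
calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
print(transition_fraction(calls))
```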

  16. REMap: Operon map of M. tuberculosis based on RNA sequence data.

    Science.gov (United States)

    Pelly, Shaaretha; Winglee, Kathryn; Xia, Fang Fang; Stevens, Rick L; Bishai, William R; Lamichhane, Gyanu

    2016-07-01

    A map of the transcriptional organization of genes of an organism is a basic tool that is necessary to understand and facilitate a more accurate genetic manipulation of the organism. Operon maps are largely generated by computational prediction programs that rely on gene conservation and genome architecture and may not be physiologically relevant. With the widespread use of RNA sequencing (RNAseq), the prediction of operons based on actual transcriptome sequencing rather than computational genomics alone is much needed. Here, we report a validated operon map of Mycobacterium tuberculosis, developed using RNAseq data from both the exponential and stationary phases of growth. At least 58.4% of M. tuberculosis genes are organized into 749 operons. Our prediction algorithm, REMap (RNA Expression Mapping of operons), considers the many cases of transcription coverage of intergenic regions, and avoids dependencies on functional annotation and arbitrary assumptions about gene structure. As a result, we demonstrate that REMap is able to more accurately predict operons, especially those that contain long intergenic regions or functionally unrelated genes, than previous operon prediction programs. The REMap algorithm is publicly available as a user-friendly tool that can be readily modified to predict operons in other bacteria. Copyright © 2016 Elsevier Ltd. All rights reserved.
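    The idea of using transcript coverage across intergenic regions to link neighbouring genes can be illustrated with a toy rule. This is not the REMap algorithm, whose details the abstract does not give; the function name, the coverage threshold and the gene list are all assumptions made for illustration.

```python
def group_operons(genes, gap_coverage, min_coverage=5.0):
    """Group adjacent genes into putative operons.

    genes: list of (name, strand) tuples in genome order.
    gap_coverage: mean read depth of each intergenic gap, where
        gap_coverage[i] lies between genes[i] and genes[i + 1].
    min_coverage: hypothetical depth above which a gap counts as transcribed.
    """
    if len(gap_coverage) != max(len(genes) - 1, 0):
        raise ValueError("expected one coverage value per intergenic gap")
    if not genes:
        return []
    operons = [[genes[0][0]]]
    for i in range(1, len(genes)):
        same_strand = genes[i][1] == genes[i - 1][1]
        # Join neighbours only if they share a strand and the gap is transcribed.
        if same_strand and gap_coverage[i - 1] >= min_coverage:
            operons[-1].append(genes[i][0])
        else:
            operons.append([genes[i][0]])
    return operons


# Hypothetical genes and intergenic read depths.
genes = [("g1", "+"), ("g2", "+"), ("g3", "+"), ("g4", "-")]
gaps = [12.0, 0.5, 9.0]
print(group_operons(genes, gaps))
```

    Note that this rule depends only on strand and coverage, not on gap length or gene annotation, which mirrors the abstract's point that coverage-based prediction can join genes separated by long intergenic regions or genes with unrelated functions.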

  17. Genetic diversity in Breonadia salicina based on intra-species sequence variation of chloroplast DNA spacer sequence

    International Nuclear Information System (INIS)

    Qurainy, F.A.; Gaafar, A.R.Z.

    2014-01-01

    Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants is very important for effective conservation. Intergenic spacer sequence variation at the psbA-trnH locus of the chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered plant species endemic to the southwestern part of the Kingdom of Saudi Arabia. The obtained sequence data from 19 individuals in three populations revealed nine haplotypes. The aligned sequences obtained from the overall Saudi accessions extended to 355 bp, revealing nine haplotypes. A high level of haplotype diversity (Hd = 0.842) and low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and constructed neighbor-joining tree indicated null genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to effects of the abundance of ancestral haplotype sharing and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity in comparison with Yemeni accessions, in which Saudi accessions were sharing three haplotypes from the four haplotypes found in Yemeni accessions. (author)
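    The two diversity statistics quoted above (Hd and Pi) can be sketched as follows. The abstract does not state which estimators were used; this sketch assumes the standard definitions, with haplotype diversity as n/(n-1) times (1 minus the sum of squared haplotype frequencies) and nucleotide diversity as the mean number of pairwise differences per site. The short aligned sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Pi = mean pairwise differences per site over aligned sequences."""
    length = len(seqs[0])
    if len(seqs) < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need two or more aligned sequences of equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical aligned sequences (not data from the study).
aligned = ["ACGT", "ACGT", "ACGA", "TCGA"]
print(round(haplotype_diversity(aligned), 4))
print(round(nucleotide_diversity(aligned), 4))
```

    A pattern like the one reported (high Hd with low Pi) arises when many distinct haplotypes differ from each other at only a few sites.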

  18. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Science.gov (United States)

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  19. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  20. Overexpression of long intergenic noncoding RNA LINC00312 inhibits the invasion and migration of thyroid cancer cells by down-regulating microRNA-197-3p.

    Science.gov (United States)

    Liu, Kai; Huang, Wen; Yan, Dan-Qing; Luo, Qing; Min, Xiang

    2017-08-31

    The study evaluated the ability of long intergenic noncoding RNA LINC00312 (LINC00312) to influence the proliferation, invasion, and migration of thyroid cancer (TC) cells by regulating miRNA-197-3p. TC tissues and adjacent normal tissues were collected from 211 TC patients. K1 (papillary TC), SW579 (squamous TC), and 8505C (anaplastic TC) cell lines were assigned into a blank, negative control (NC), LINC00312 overexpression, miR-197-3p inhibitors, and LINC00312 overexpression + miR-197-3p mimics group. The expression of LINC00312, miR-197-3p, and p120 was measured using quantitative real-time PCR (qRT-PCR) and Western blotting. Cell proliferation was assessed via CCK8 assay, cell invasion through the scratch test, and cell migration via Transwell assay. In comparison with adjacent normal tissues, the expression of LINC00312 was down-regulated and the expression of miR-197-3p was up-regulated in TC tissues. The dual luciferase reporter gene assay confirmed that p120 is a target of miR-197-3p. The expression of LINC00312 and p120 was higher in the LINC00312 overexpression group than in the blank and NC groups. However, the expression of miR-197-3p was lower in the LINC00312 overexpression group than in the blank and NC groups. The miR-197-3p inhibitors group had a higher expression of miR-197-3p, but a lower expression of p120 than the blank and NC groups. The LINC00312 overexpression and miR-197-3p inhibitor groups had lower cell proliferation, invasion, and migration than the blank and NC groups. These results indicate that LINC00312 overexpression inhibits the proliferation, invasion, and migration of TC cells and that this can be achieved by down-regulating miR-197-3p. © 2017 The Author(s).

  1. RNA sequencing of the exercise transcriptome in equine athletes.

    Directory of Open Access Journals (Sweden)

    Stefano Capomaccio

    Full Text Available The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of unannotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream of the genes), suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases), as well as "nucleic acid binding" and "signal transduction activity" functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise

  2. Molecular profiling of microbial communities from contaminated sources: Use of subtractive cloning methods and rDNA spacer sequences. 1998 annual progress report

    International Nuclear Information System (INIS)

    Robb, F.T.

    1998-01-01

    'The major objective of the research is to provide appropriate sequences and to assemble a high-density DNA array of oligonucleotides that can be used for rapid profiling of microbial populations from polluted areas. The sequences to be assigned to the DNA array are chosen from cloned genomic DNA sequences (the ribosomal operon, described below) from groundwater at DOE sites containing organic solvents. The sites, Hanford Nuclear Plant and Lawrence Livermore Site 300, have well characterized pollutant histories, which have been provided by the collaborators. At this mid-point of the project, over 60 unique sequence classes of intergenic spacer region have been identified from the first sample site. The use of these sequences as hybridization probes, and their frequency of occurrence, allow a clear distinction between bacterial communities before and after remediation by acetate/nitrate pumping. The authors have developed the hybridization conditions for identifying PCR products in a 96-well format. A versatile alignment and visualization program (acronym: MALIGN), developed by Dr. Dennis Maeder, has been used to align the ISRs, which are variable in length and sometimes in the position of the tRNAs. Finally, in collaboration with Dr. W. Chen and Dr. J. Zhou at ORNL, they have significant evidence that mass spectrometer analysis can be used to determine the lengths of PCR amplified intergenic spacer DNA.'

  3. Multispacer sequence typing relapsing fever Borreliae in Africa.

    Directory of Open Access Journals (Sweden)

    Haitham Elbir

    Full Text Available BACKGROUND: In Africa, relapsing fevers are neglected arthropod-borne infections caused by closely related Borrelia species. They cause mild to deadly undifferentiated fever, which is particularly severe in pregnant women. Lack of a tool to genotype these Borrelia organisms limits knowledge regarding their reservoirs and their epidemiology. METHODOLOGY/PRINCIPAL FINDINGS: Genome sequence analysis of Borrelia crocidurae, Borrelia duttonii and Borrelia recurrentis yielded 5 intergenic spacers scattered between 10 chromosomal genes that were incorporated into a multispacer sequence typing (MST) approach. Sequencing these spacers directly from human blood specimens previously found to be infected by B. recurrentis (30 specimens), B. duttonii (17 specimens) and B. crocidurae (13 specimens) resolved these 60 strains and the 3 type strains into 13 species-specific spacer types in the presence of negative controls. B. crocidurae comprised 8 spacer types, B. duttonii 3 spacer types and B. recurrentis 2 spacer types. CONCLUSIONS/SIGNIFICANCE: Phylogenetic analyses of MST data suggested that B. duttonii, B. crocidurae and B. recurrentis are variants of a unique ancestral Borrelia species. MST proved to be a suitable approach for identifying and genotyping relapsing fever borreliae in Africa. It could be applied to both vectors and clinical specimens.

  4. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    Full Text Available In addition to protein coding genes, a substantial proportion of mammalian genomes is transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq) in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq) of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A)-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  5. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    International Nuclear Information System (INIS)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-01-01

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae.

  6. Complete nucleotide sequences of avian metapneumovirus subtype B genome.

    Science.gov (United States)

    Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro

    2010-12-01

    Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.

  7. DNA barcode and identification of the varieties and provenances of Taiwan's domestic and imported made teas using ribosomal internal transcribed spacer 2 sequences.

    Science.gov (United States)

    Lee, Shih-Chieh; Wang, Chia-Hsiang; Yen, Cheng-En; Chang, Chieh

    2017-04-01

    The major aim of made tea identification is to identify the variety and provenance of the tea plant. The present experiment used 113 tea plants [Camellia sinensis (L.) O. Kuntze] housed at the Tea Research and Extension Substation, from which 113 internal transcribed spacer 2 (ITS2) fragments, 104 trnL intron, and 98 trnL-trnF intergenic sequence region DNA sequences were successfully sequenced. The similarity of the ITS2 nucleotide sequences between tea plants housed at the Tea Research and Extension Substation was 0.379-0.994. In this polymerase chain reaction-amplified noncoding region, no varieties possessed identical sequences. Compared with the trnL intron and trnL-trnF intergenic sequence fragments of chloroplast cpDNA, the proportion of ITS2 nucleotide sequence variation was large and is more suitable for establishing a DNA barcode database to identify tea plant varieties. After establishing the database, 30 imported teas and 35 domestic made teas were used in this model system to explore the feasibility of using ITS2 sequences to identify the varieties and provenances of made teas. A phylogenetic tree was constructed using ITS2 sequences with the unweighted pair group method with arithmetic mean, which indicated that the same variety of tea plant is likely to be successfully categorized into one cluster, but contamination from other tea plants was also detected. This result provides molecular evidence that the similarity between important tea varieties in Taiwan remains high. We suggest a direct, wide collection of made tea and original samples of tea plants to establish an ITS2 sequence molecular barcode identification database to identify the varieties and provenances of tea plants. The DNA barcode comparison method can satisfy the need for a rapid, low-cost, frontline differentiation of the large amount of made teas from Taiwan and abroad, and can provide molecular evidence of their varieties and provenances. Copyright © 2016. Published by Elsevier B.V.

  8. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    Science.gov (United States)

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  9. Quality Control of the Traditional Patent Medicine Yimu Wan Based on SMRT Sequencing and DNA Barcoding

    Science.gov (United States)

    Jia, Jing; Xu, Zhichao; Xin, Tianyi; Shi, Linchun; Song, Jingyuan

    2017-01-01

    Substandard traditional patent medicines may lead to global safety-related issues. Protecting consumers from the health risks associated with the integrity and authenticity of herbal preparations is of great concern. Of particular concern is quality control for traditional patent medicines. Here, we establish an effective approach for verifying the biological composition of traditional patent medicines based on single-molecule real-time (SMRT) sequencing and DNA barcoding. Yimu Wan (YMW), a classical herbal prescription recorded in the Chinese Pharmacopoeia, was chosen to test the method. Two reference YMW samples were used to establish a standard method for analysis, which was then applied to three different batches of commercial YMW samples. A total of 3703 and 4810 circular-consensus sequencing (CCS) reads from two reference and three commercial YMW samples were mapped to the ITS2 and psbA-trnH regions, respectively. Moreover, comparison of intraspecific genetic distances based on SMRT sequencing data with reference data from Sanger sequencing revealed an ITS2 and psbA-trnH intergenic spacer that exhibited high intraspecific divergence, with the sites of variation showing significant differences within species. Using the CCS strategy for SMRT sequencing analysis was adequate to guarantee the accuracy of identification. This study demonstrates the application of SMRT sequencing to detect the biological ingredients of herbal preparations. SMRT sequencing provides an affordable way to monitor the legality and safety of traditional patent medicines. PMID:28620408

  10. Inheritance of the yeast mitochondrial genome

    DEFF Research Database (Denmark)

    Piskur, Jure

    1994-01-01

    Mitochondrion, extrachromosomal genetics, intergenic sequences, genome size, mitochondrial DNA, petite mutation, yeast...

  11. Phylogenetic relationships within Luzula DC. and Juncus L. (Juncaceae): A comparison of phylogenetic signals of trnL-trnF intergenic spacer, trnL intron and rbcL plastome sequence data

    Czech Academy of Sciences Publication Activity Database

    Drábková, Lenka; Kirschner, Jan; Vlček, Čestmír

    2006-01-01

    Vol. 22, No. 2 (2006), pp. 132-143 ISSN 0748-3007 R&D Projects: GA ČR GA206/02/0355 Institutional research plan: CEZ:AV0Z60050516; CEZ:AV0Z50520514 Keywords: Luzula * Juncus * Juncaceae Subject RIV: EF - Botanics Impact factor: 4.270, year: 2006

  12. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternatively spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.
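
    The idea of reconstructing a protein from overlapping peptides can be illustrated with a toy greedy overlap merge. This is only a sketch of the general principle under simplifying assumptions (error-free peptides, a single protein, unambiguous overlaps); it is not the algorithm of the report, and the function names, the `min_len` parameter, and the example peptides are hypothetical.

```python
def overlap(a, b, min_len=2):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides, min_len=2):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = set(peptides)
    # Drop peptides fully contained in another peptide; they add no new sequence.
    frags = sorted(p for p in frags if not any(p != q and p in q for q in frags))
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; return the unmerged contigs
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Two hypothetical digests of the same toy protein, cut at different positions.
digest_a = ["MK", "TAYIAK", "QRQISFVK"]
digest_b = ["MKTAY", "IAKQRQ", "ISFVK"]
contigs = greedy_assemble(digest_a + digest_b)
```

    Because the two digests cut at different positions, peptides from one digest bridge the cut sites of the other, which is what makes the reconstruction possible.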

  13. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  15. An Intergenic Region Shared by At4g35985 and At4g35987 in Arabidopsis thaliana Is a Tissue Specific and Stress Inducible Bidirectional Promoter Analyzed in Transgenic Arabidopsis and Tobacco Plants

    Science.gov (United States)

    Banerjee, Joydeep; Sahoo, Dipak Kumar; Dey, Nrisingha; Houtz, Robert L.; Maiti, Indu Bhushan

    2013-01-01

    On chromosome 4 in the Arabidopsis genome, two neighboring genes (calmodulin methyl transferase At4g35987 and senescence associated gene At4g35985) are located in a head-to-head divergent orientation sharing a putative bidirectional promoter. This 1258 bp intergenic region contains a number of environmental stress responsive and tissue specific cis-regulatory elements. Transcript analysis of At4g35985 and At4g35987 genes by quantitative real time PCR showed tissue specific and stress inducible expression profiles. We tested the bidirectional promoter-function of the intergenic region shared by the divergent genes At4g35985 and At4g35987 using two reporter genes (GFP and GUS) in both orientations in transient tobacco protoplast and Agro-infiltration assays, as well as in stably transformed transgenic Arabidopsis and tobacco plants. In transient assays with GFP and GUS reporter genes the At4g35985 promoter (P85) showed stronger expression (about 3.5 fold) compared to the At4g35987 promoter (P87). The tissue specific as well as stress responsive functional nature of the bidirectional promoter was evaluated in independent transgenic Arabidopsis and tobacco lines. Expression of P85 activity was detected in the midrib of leaves, leaf trichomes, apical meristemic regions, throughout the root, lateral roots and flowers. The expression of P87 was observed in leaf-tip, hydathodes, apical meristem, root tips, emerging lateral root tips, root stele region and in floral tissues. The bidirectional promoter in both orientations shows differential up-regulation (2.5 to 3 fold) under salt stress. Use of such regulatory elements of bidirectional promoters showing spatial and stress inducible promoter-functions in heterologous system might be an important tool for plant biotechnology and gene stacking applications. PMID:24260266

  16. An intergenic region shared by At4g35985 and At4g35987 in Arabidopsis thaliana is a tissue specific and stress inducible bidirectional promoter analyzed in transgenic arabidopsis and tobacco plants.

    Directory of Open Access Journals (Sweden)

    Joydeep Banerjee

    Full Text Available On chromosome 4 in the Arabidopsis genome, two neighboring genes (calmodulin methyl transferase At4g35987 and senescence associated gene At4g35985) are located in a head-to-head divergent orientation sharing a putative bidirectional promoter. This 1258 bp intergenic region contains a number of environmental stress responsive and tissue specific cis-regulatory elements. Transcript analysis of At4g35985 and At4g35987 genes by quantitative real time PCR showed tissue specific and stress inducible expression profiles. We tested the bidirectional promoter-function of the intergenic region shared by the divergent genes At4g35985 and At4g35987 using two reporter genes (GFP and GUS) in both orientations in transient tobacco protoplast and Agro-infiltration assays, as well as in stably transformed transgenic Arabidopsis and tobacco plants. In transient assays with GFP and GUS reporter genes the At4g35985 promoter (P85) showed stronger expression (about 3.5 fold) compared to the At4g35987 promoter (P87). The tissue specific as well as stress responsive functional nature of the bidirectional promoter was evaluated in independent transgenic Arabidopsis and tobacco lines. Expression of P85 activity was detected in the midrib of leaves, leaf trichomes, apical meristemic regions, throughout the root, lateral roots and flowers. The expression of P87 was observed in leaf-tip, hydathodes, apical meristem, root tips, emerging lateral root tips, root stele region and in floral tissues. The bidirectional promoter in both orientations shows differential up-regulation (2.5 to 3 fold) under salt stress. Use of such regulatory elements of bidirectional promoters showing spatial and stress inducible promoter-functions in heterologous system might be an important tool for plant biotechnology and gene stacking applications.

  17. Methylome-wide Sequencing Detects DNA Hypermethylation Distinguishing Indolent from Aggressive Prostate Cancer

    Directory of Open Access Journals (Sweden)

    Jeffrey M. Bhasin

    2015-12-01

    Full Text Available A critical need in understanding the biology of prostate cancer is characterizing the molecular differences between indolent and aggressive cases. Because DNA methylation can capture the regulatory state of tumors, we analyzed differential methylation patterns genome-wide among benign prostatic tissue and low-grade and high-grade prostate cancer and found extensive, focal hypermethylation regions unique to high-grade disease. These hypermethylation regions occurred not only in the promoters of genes but also in gene bodies and at intergenic regions that are enriched for DNA-protein binding sites. Integration with existing RNA-sequencing (RNA-seq) and survival data revealed regions where DNA methylation correlates with reduced gene expression associated with poor outcome. Regions specific to aggressive disease are proximal to genes with distinct functions from regions shared by indolent and aggressive disease. Our compendium of methylation changes reveals crucial molecular distinctions between indolent and aggressive prostate cancer.

  18. Polymorphic DNA sequences of the fungal honey bee pathogen Ascosphaera apis

    DEFF Research Database (Denmark)

    Jensen, Annette B; Welker, Dennis L; Kryger, Per

    2012-01-01

    The pathogenic fungus Ascosphaera apis is ubiquitous in honey bee populations. We used the draft genome assembly of this pathogen to search for polymorphic intergenic loci that could be used to differentiate haplotypes. Primers were developed for five such loci, and the species specificities were verified using DNA from nine closely related species. The sequence variation was compared among 12 A. apis isolates at each of these loci, and two additional loci, the internal transcribed spacer of the ribosomal RNA (ITS) and a variable part of the elongation factor 1α (Ef1α). The degree of variation was then compared among the different loci, and three were found to have the greatest detection power for identifying A. apis haplotypes. The described loci can help to resolve strain differences and population genetic structures, to elucidate host–pathogen interaction and to test evolutionary hypotheses...

  19. GEITLERINEMA SPECIES (OSCILLATORIALES, CYANOBACTERIA) REVEALED BY CELLULAR MORPHOLOGY, ULTRASTRUCTURE, AND DNA SEQUENCING(1).

    Science.gov (United States)

    Do Carmo Bittencourt-Oliveira, Maria; Do Nascimento Moura, Ariadne; De Oliveira, Mariana Cabral; Sidnei Massola, Nelson

    2009-06-01

    Geitlerinema amphibium (C. Agardh ex Gomont) Anagn. and G. unigranulatum (Rama N. Singh) Komárek et M. T. P. Azevedo are morphologically close species with characteristics frequently overlapping. Ten strains of Geitlerinema (six of G. amphibium and four of G. unigranulatum) were analyzed by DNA sequencing and transmission electron and optical microscopy. Among the investigated strains, the two species were not separated with respect to cellular dimensions, and cellular width was the most varying characteristic. The number and localization of granules, as well as other ultrastructural characteristics, did not provide a means to discriminate between the two species. The two species were not separated either by geography or environment. These results were further corroborated by the analysis of the cpcB-cpcA intergenic spacer (PC-IGS) sequences. Given the fact that morphology is very uniform, plus the coexistence of these populations in the same habitat, it would be nearly impossible to distinguish between them in nature. On the other hand, two of the analyzed strains were distinct from all others based on the PC-IGS sequences, in spite of their morphological similarity. PC-IGS sequences indicate that these two strains could be a different species of Geitlerinema. Using morphology, cell ultrastructure, and PC-IGS sequences, it is not possible to distinguish G. amphibium and G. unigranulatum. Therefore, they should be treated as one species, with G. unigranulatum as a synonym of G. amphibium. © 2009 Phycological Society of America.

  20. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons.

    Science.gov (United States)

    Locati, Mauro D; Pagano, Johanna F B; Ensink, Wim A; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J; Dekker, Rob J; Breit, Timo M

    2017-04-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. © 2017 Locati et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  1. Molecular Profiling of Microbial Communities from Contaminated Sources: Use of Subtractive Cloning Methods and rDNA Spacer Sequences; FINAL

    International Nuclear Information System (INIS)

    Robb, Frank T.

    2001-01-01

    The major objective of this research was to provide appropriate sequences and assemble a DNA array of oligonucleotides to be used for rapid profiling of microbial populations from polluted areas and other areas of interest. The sequences to be assigned to the DNA array were chosen from cloned genomic DNA taken from groundwater sites having well characterized pollutant histories at Hanford Nuclear Plant and Lawrence Livermore Site 300. Glass-slide arrays were made and tested, and a new multiplexed, bead-based method was developed that uses nucleic acid hybridization on the surface of microscopic polystyrene spheres to identify specific sequences in heterogeneous mixtures of DNA sequences. The test data revealed considerable strain variation between sample sites, showing a striking distribution of sequences. They also suggest that diversity varies greatly with bioremediation, and that there are many bacterial intergenic spacer region sequences that can indicate its effects. The bead method exhibited superior sequence discrimination and has features for easier and more accurate measurement.

  2. Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

    Science.gov (United States)

    Oggioni, M R; Claverys, J P

    1999-10-01

    A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.

  3. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  4. Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

    Science.gov (United States)

    Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

    2013-01-01

    Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps, best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic—often coding—DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328

  5. Sequencing, Characterization, and Comparative Analyses of the Plastome of Caragana rosea var. rosea

    Directory of Open Access Journals (Sweden)

    Mei Jiang

    2018-05-01

    Full Text Available To exploit the drought-resistant Caragana species, we performed a comparative study of the plastomes from four species: Caragana rosea, C. microphylla, C. kozlowii, and C. korshinskii. The complete plastome sequence of C. rosea was obtained using next-generation DNA sequencing technology. The genome is a circular structure of 133,122 bases and it lacks an inverted repeat. It contains 111 unique genes, including 76 protein-coding, 30 tRNA, and four rRNA genes. Repeat analyses obtained 239, 244, 258, and 246 simple sequence repeats in C. rosea, C. microphylla, C. kozlowii, and C. korshinskii, respectively. Analyses of sequence divergence found two intergenic regions, trnI-CAU-ycf2 and trnN-GUU-ycf1, that exhibit a high degree of variation. Phylogenetic analyses showed that the four Caragana species belong to a monophyletic clade. Analyses of Ka/Ks ratios revealed that five genes (rpl16, rpl20, rps11, rps7, and ycf1) and several sites have undergone strong positive selection in the Caragana branch. The results lay the foundation for the development of molecular markers and the understanding of the evolutionary process for drought-resistant characteristics.

  6. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.

  7. Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Murat, Claude [INRA, Nancy, France; Morin, Emmanuelle [INRA, Nancy, France; Le Tacon, F [UMR, France; Martin, Francis [INRA, Nancy, France

    2011-01-01

    It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.
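
    The in silico SSR survey described above can be illustrated with a minimal sketch. This is not the MISA program and not the authors' pipeline: the function name `find_ssrs` and the minimum repeat counts per motif length are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of repeats).
# These are illustrative assumptions, not the settings used in the study.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    seq = seq.upper()
    hits = []
    for k, n in thresholds.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. "AA" runs, found as "A"
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def ssr_density_per_mb(n_ssrs, seq_length_bp):
    """SSRs per megabase for a region of the given length."""
    return n_ssrs / (seq_length_bp / 1_000_000)


example = "GG" + "AT" * 6 + "CC" + "A" * 10 + "T" + "CAG" * 5
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

    Applied separately to intergenic, exonic, and transposable-element coordinates, counts of this kind would give the per-region densities (SSRs/Mb) that the abstract reports.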

  8. Long sequence correlation coprocessor

    Science.gov (United States)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
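
    The counting scheme described above can be sketched in software. This is a sequential illustration under stated assumptions, not the circuit itself: the function name `correlation_scores` is hypothetical, the inner loop stands in for counters that the hardware updates in parallel, and positions that run past the end of the second sequence are simply skipped (the record does not say how the ends are handled).

```python
def correlation_scores(first, second, n_alignments=16):
    """Count matching bits for n_alignments adjacent alignments.

    scores[k] is the number of positions i where first[i] == second[i + k].
    """
    scores = [0] * n_alignments
    for i, bit in enumerate(first):       # one "clock cycle" per bit of first
        for k in range(n_alignments):     # one counter per alignment
            j = i + k
            # Each counter increments only if the two compared bits agree.
            if j < len(second) and second[j] == bit:
                scores[k] += 1
    return scores


first = [1, 0, 1, 1]
second = [0, 1, 0, 1, 1, 0]
print(correlation_scores(first, second, n_alignments=3))
```

    In this toy input the first sequence appears verbatim in the second at offset 1, so the counter for that alignment reaches the full length of the first sequence.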

  9. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.

    1991-12-31

    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological function, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats, without and with respectively known function, and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  10. Anomaly Detection in Sequences

    Data.gov (United States)

    National Aeronautics and Space Administration — We present a set of novel algorithms which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that...

  11. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  12. sequenceMiner algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — Detecting and describing anomalies in large repositories of discrete symbol sequences. sequenceMiner has been open-sourced! Download the file below to try it out....

  13. Identification of a novel intergenic miRNA located between the human DDC and COBL genes with a potential function in cell cycle arrest.

    Science.gov (United States)

    Hoballa, Mohamad Hussein; Soltani, Bahram M; Mowla, Seyed Javad; Sheikhpour, Mojgan; Kay, Maryam

    2018-07-01

    Frequent abnormalities in the 7p12 locus in different tumors, such as lung cancer, make this region a candidate for harboring novel regulatory elements. MiRNAs, as novel regulatory elements encoded within the human genome, are potentially oncomiRs or miR suppressors. Here, we have used bioinformatics tools to search for novel miRNAs embedded within human chromosome 7p12. A bona fide stem loop (named mirZa precursor) had the features of producing a real miRNA (named miRZa), which was detected through RT-qPCR following the overexpression of its precursor. Then, endogenous miRZa was detected in human cell lines and tissues and sequenced. Consistent with the bioinformatics prediction, RT-qPCR as well as dual luciferase assay indicated that SMAD3 and IGF1R genes were targeted by miRZa. MiRZa-3p and miRZa-5p were downregulated in lung tumor tissue samples as detected by RT-qPCR, and mirZa precursor overexpression in SW480 cells resulted in an increased sub-G1 cell population. Overall, here we introduce a novel miRNA which is capable of targeting the SMAD3 and IGF1R regulatory genes and increases the cell population in the sub-G1 stage.

  14. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. 
Therefore, our

  15. The identification of two Trypanosoma cruzi I genotypes from domestic and sylvatic transmission cycles in Colombia based on a single polymerase chain reaction amplification of the spliced-leader intergenic region

    Directory of Open Access Journals (Sweden)

    Lina Marcela Villa

    2013-11-01

    Full Text Available A single polymerase chain reaction (PCR) targeting the spliced-leader intergenic region of Trypanosoma cruzi I was standardised by amplifying a 231 bp fragment in domestic (TcIDOM) strains or clones and 450 and 550 bp fragments in sylvatic strains or clones. This reaction was validated using 44 blind coded samples and 184 non-coded T. cruzi I clones isolated from sylvatic triatomines and the correspondence between the amplified fragments and their domestic or sylvatic origin was determined. Six of the nine strains isolated from acute cases suspected of oral infection had the sylvatic T. cruzi I profile. These results confirmed that the sylvatic T. cruzi I genotype is linked to cases of oral Chagas disease in Colombia. We therefore propose the use of this novel PCR in strains or clones previously characterised as T. cruzi I to distinguish TcIDOM from sylvatic genotypes in studies of transmission dynamics, including the verification of population selection within hosts or detection of the frequency of mixed infections by both T. cruzi I genotypes in Colombia.

  16. Whole genome sequencing and evolutionary analysis of human respiratory syncytial virus A and B from Milwaukee, WI 1998-2010.

    Directory of Open Access Journals (Sweden)

    Cecilia Rebuffo-Scheer

    Full Text Available BACKGROUND: Respiratory Syncytial Virus (RSV) is the leading cause of lower respiratory-tract infections in infants and young children worldwide. Despite this, only six complete genome sequences of original strains have been previously published, the most recent of which dates back 35 and 26 years for RSV group A and group B respectively. METHODOLOGY/PRINCIPAL FINDINGS: We present a semi-automated sequencing method allowing for the sequencing of four RSV whole genomes simultaneously. We were able to sequence the complete coding sequences of 13 RSV A and 4 RSV B strains from Milwaukee collected from 1998-2010. Another 12 RSV A and 5 RSV B strains sequenced in this study cover the majority of the genome. All RSV A and RSV B sequences were analyzed by neighbor-joining, maximum parsimony and Bayesian phylogeny methods. Genetic diversity was high among RSV A viruses in Milwaukee including the circulation of multiple genotypes (GA1, GA2, GA5, GA7) with GA2 persisting throughout the 13 years of the study. However, RSV B genomes showed little variation with all belonging to the BA genotype. For RSV A, the same evolutionary patterns and clades were seen consistently across the whole genome including all intergenic, coding, and non-coding region sequences. CONCLUSIONS/SIGNIFICANCE: The sequencing strategy presented in this work allows for RSV A and B genomes to be sequenced simultaneously in two working days and with a low cost. We have significantly increased the amount of genomic data that is available for both RSV A and B, providing the basic molecular characteristics of RSV strains circulating in Milwaukee over the last 13 years. This information can be used for comparative analysis with strains circulating in other communities around the world which should also help with the development of new strategies for control of RSV, specifically vaccine development and improvement of RSV diagnostics.

  17. Mitochondrial sequencing reveals five separate origins of 'black' Apis mellifera (Hymenoptera: Apidae) in eastern Australian commercial colonies.

    Science.gov (United States)

    Oxley, P R; Oldroyd, B P

    2009-04-01

    Establishment of a closed population honey bee, Apis mellifera L. (Hymenoptera: Apidae), breeding program based on 'black' strains has been proposed for eastern Australia. Long-term success of such a program requires a high level of genetic variance. To determine the likely extent of genetic variation available, 50 colonies from 11 different commercial apiaries were sequenced in the mitochondrial cytochrome oxidase I and II intergenic region. Five distinct and novel mitotypes were identified. No colonies were found with the A. mellifera mellifera mitotype, which is often associated with undesirable feral strains. One group of mitotypes was consistent with a caucasica origin, two with carnica, and two with ligustica. The results suggest that there is sufficient genetic diversity to support a breeding program provided all these five sources were pooled.

  18. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum).

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    Full Text Available We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum) obtained by next-generation sequencing technology and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. Copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid sequences of the coding regions are highly conserved, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  19. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Konstantin Okonechnikov

    Full Text Available Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion.

  20. Sequences for Student Investigation

    Science.gov (United States)

    Barton, Jeffrey; Feil, David; Lartigue, David; Mullins, Bernadette

    2004-01-01

    We describe two classes of sequences that give rise to accessible problems for undergraduate research. These problems may be understood with virtually no prerequisites and are well suited for computer-aided investigation. The first sequence is a variation of one introduced by Stephen Wolfram in connection with his study of cellular automata. The…

  1. Sequence History Update Tool

    Science.gov (United States)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there is also a considerable time and effort savings. With the use of The Sequence History Update Tool what previously took minutes is now done in less than 30 seconds, and now provides a more accurate archival record of the sequence commanding for MRO.

  2. Expression of the Long Intergenic Non-Protein Coding RNA 665 (LINC00665) Gene and the Cell Cycle in Hepatocellular Carcinoma Using The Cancer Genome Atlas, the Gene Expression Omnibus, and Quantitative Real-Time Polymerase Chain Reaction.

    Science.gov (United States)

    Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong

    2018-05-05

    BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.477; 95% CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.

  3. Development and Validation of an Improved PCR Method Using the 23S-5S Intergenic Spacer for Detection of Rickettsiae in Dermacentor variabilis Ticks and Tissue Samples from Humans and Laboratory Animals.

    Science.gov (United States)

    Kakumanu, Madhavi L; Ponnusamy, Loganathan; Sutton, Haley T; Meshnick, Steven R; Nicholson, William L; Apperson, Charles S

    2016-04-01

    A novel nested PCR assay was developed to detect Rickettsia spp. in ticks and tissue samples from humans and laboratory animals. Primers were designed for the nested run to amplify a variable region of the 23S-5S intergenic spacer (IGS) of Rickettsia spp. The newly designed primers were evaluated using genomic DNA from 11 Rickettsia species belonging to the spotted fever, typhus, and ancestral groups and, in parallel, compared to other Rickettsia-specific PCR targets (ompA, gltA, and the 17-kDa protein gene). The new 23S-5S IGS nested PCR assay amplified all 11 Rickettsia spp., but the assays employing other PCR targets did not. The novel nested assay was sensitive enough to detect one copy of a cloned 23S-5S IGS fragment from "Candidatus Rickettsia amblyommii." Subsequently, the detection efficiency of the 23S-5S IGS nested assay was compared to those of the other three assays using genomic DNA extracted from 40 adult Dermacentor variabilis ticks. The nested 23S-5S IGS assay detected Rickettsia DNA in 45% of the ticks, while the amplification rates of the other three assays ranged between 5 and 20%. The novel PCR assay was validated using clinical samples from humans and laboratory animals that were known to be infected with pathogenic species of Rickettsia. The nested 23S-5S IGS PCR assay was coupled with reverse line blot hybridization with species-specific probes for high-throughput detection and simultaneous identification of the species of Rickettsia in the ticks. "Candidatus Rickettsia amblyommii," R. montanensis, R. felis, and R. bellii were frequently identified species, along with some potentially novel Rickettsia strains that were closely related to R. bellii and R. conorii. Copyright © 2016 Kakumanu et al.

  4. Short communication: Identification of coagulase-negative staphylococcus species from goat milk with the API Staph identification test and with transfer RNA-intergenic spacer PCR combined with capillary electrophoresis.

    Science.gov (United States)

    Koop, G; De Visscher, A; Collar, C A; Bacon, D A C; Maga, E A; Murray, J D; Supré, K; De Vliegher, S; Haesebrouck, F; Rowe, J D; Nielen, M; van Werven, T

    2012-12-01

    Coagulase-negative staphylococci (CNS) are the most commonly isolated bacteria from goat milk, but they have often been identified with phenotypic methods, which may have resulted in misclassification. The aims of this paper were to assess the amount of misclassification of a phenotypic test for identifying CNS species from goat milk compared with transfer RNA intergenic spacer PCR (tDNA-PCR) followed by capillary electrophoresis, and to apply the tDNA-PCR technique on different capillary electrophoresis equipment. Milk samples were collected from 416 does in 5 Californian dairy goat herds on 3 occasions during lactation. In total, 219 CNS isolates were identified at the species level with tDNA-PCR and subjected to the API 20 Staph identification test kit (API Staph; bioMérieux, Durham, NC). If the same species was isolated multiple times from the same udder gland, only the first isolate was used for further analyses, resulting in 115 unique CNS isolates. According to the tDNA-PCR test, the most prevalent CNS species were Staphylococcus epidermidis, Staphylococcus caprae, and Staphylococcus simulans. Typeability with API staph was low (72%). Although the API Staph test was capable of identifying the majority of Staph. epidermidis and Staph. caprae isolates, sensitivity for identification of Staph. simulans was low. The true positive fraction was high for the 3 most prevalent species. It was concluded that the overall performance of API Staph in differentiating CNS species from goat milk was moderate to low, mainly because of the low typeability, and that genotypic methods such as tDNA-PCR are preferred. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  5. Development and Validation of an Improved PCR Method Using the 23S-5S Intergenic Spacer for Detection of Rickettsiae in Dermacentor variabilis Ticks and Tissue Samples from Humans and Laboratory Animals

    Science.gov (United States)

    Kakumanu, Madhavi L.; Ponnusamy, Loganathan; Sutton, Haley T.; Meshnick, Steven R.; Nicholson, William L.

    2016-01-01

    A novel nested PCR assay was developed to detect Rickettsia spp. in ticks and tissue samples from humans and laboratory animals. Primers were designed for the nested run to amplify a variable region of the 23S-5S intergenic spacer (IGS) of Rickettsia spp. The newly designed primers were evaluated using genomic DNA from 11 Rickettsia species belonging to the spotted fever, typhus, and ancestral groups and, in parallel, compared to other Rickettsia-specific PCR targets (ompA, gltA, and the 17-kDa protein gene). The new 23S-5S IGS nested PCR assay amplified all 11 Rickettsia spp., but the assays employing other PCR targets did not. The novel nested assay was sensitive enough to detect one copy of a cloned 23S-5S IGS fragment from “Candidatus Rickettsia amblyommii.” Subsequently, the detection efficiency of the 23S-5S IGS nested assay was compared to those of the other three assays using genomic DNA extracted from 40 adult Dermacentor variabilis ticks. The nested 23S-5S IGS assay detected Rickettsia DNA in 45% of the ticks, while the amplification rates of the other three assays ranged between 5 and 20%. The novel PCR assay was validated using clinical samples from humans and laboratory animals that were known to be infected with pathogenic species of Rickettsia. The nested 23S-5S IGS PCR assay was coupled with reverse line blot hybridization with species-specific probes for high-throughput detection and simultaneous identification of the species of Rickettsia in the ticks. “Candidatus Rickettsia amblyommii,” R. montanensis, R. felis, and R. bellii were frequently identified species, along with some potentially novel Rickettsia strains that were closely related to R. bellii and R. conorii. PMID:26818674

  6. HIV Sequence Compendium 2015

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Brian Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas Kenneth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Cristian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Pennsylvania, Philadelphia, PA (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette Tina Marie [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-10-05

    This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  7. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Full Text Available Abstract Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
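
    As a rough illustration of the quantity being summed in an N-map, the sketch below scores a single part placed without gaps over a target sequence, trying both the part and its reverse complement. It is only a building block, not the authors' O(|s| × |t| × N) algorithm; the +1/-1 scoring and the function names are assumptions made for this example.

```python
# Hypothetical building block: best gapless placement of one part over a target.
# Scoring (+1 match, -1 mismatch) is an assumed scheme, not taken from the paper.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[c] for c in reversed(seq))


def gapless_score(part, target, offset, match=1, mismatch=-1):
    """Score `part` laid over `target` starting at `offset`, with no gaps."""
    window = target[offset:offset + len(part)]
    return sum(match if x == y else mismatch for x, y in zip(part, window))


def best_placement(part, target):
    """Return (score, offset, strand) of the best gapless placement, or None.

    Strand "+" is the part as given, "-" is its reverse complement.
    Ties keep the first placement found (forward strand, lowest offset).
    """
    best = None
    for strand, seq in (("+", part), ("-", revcomp(part))):
        for offset in range(len(target) - len(seq) + 1):
            score = gapless_score(seq, target, offset)
            if best is None or score > best[0]:
                best = (score, offset, strand)
    return best
```

    An optimal N-map would additionally choose where to cut s into N parts so that the sum of such scores is maximal, which is what the dynamic programme described in the abstract is said to do.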

  8. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    Full Text Available The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  9. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Science.gov (United States)

    Zhang, Lei; Wang, Xumin; Qian, Hao; Chi, Shan; Liu, Cui; Liu, Tao

    2012-01-01

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  10. The Solanum commersonii Genome Sequence Provides Insights into Adaptation to Stress Conditions and Genome Evolution of Wild Potato Relatives

    Science.gov (United States)

    Aversano, Riccardo; Contaldi, Felice; Ercolano, Maria Raffaella; Grosso, Valentina; Iorizzo, Massimo; Tatino, Filippo; Xumerle, Luciano; Dal Molin, Alessandra; Avanzato, Carla; Ferrarini, Alberto; Delledonne, Massimo; Sanseverino, Walter; Cigliano, Riccardo Aiese; Capella-Gutierrez, Salvador; Gabaldón, Toni; Frusciante, Luigi; Bradeen, James M.; Carputo, Domenico

    2015-01-01

    Here, we report the draft genome sequence of Solanum commersonii, which consists of ∼830 megabases with an N50 of 44,303 bp anchored to 12 chromosomes, using the potato (Solanum tuberosum) genome sequence as a reference. Compared with potato, S. commersonii shows a striking reduction in heterozygosity (1.5% versus 53 to 59%), and differences in genome sizes were mainly due to variations in intergenic sequence length. Gene annotation by ab initio prediction supported by RNA-seq data produced a catalog of 1703 predicted microRNAs, 18,882 long noncoding RNAs of which 20% are shown to target cold-responsive genes, and 39,290 protein-coding genes with a significant repertoire of nonredundant nucleotide binding site-encoding genes and 126 cold-related genes that are lacking in S. tuberosum. Phylogenetic analyses indicate that domesticated potato and S. commersonii lineages diverged ∼2.3 million years ago. Three duplication periods corresponding to genome enrichment for particular gene families related to response to salt stress, water transport, growth, and defense response were discovered. The draft genome sequence of S. commersonii substantially increases our understanding of the domesticated germplasm, facilitating translation of acquired knowledge into advances in crop stability in light of global climate and environmental changes. PMID:25873387

  11. The Colliding Beams Sequencer

    International Nuclear Information System (INIS)

    Johnson, D.E.; Johnson, R.P.

    1989-01-01

    The Colliding Beam Sequencer (CBS) is a computer program used to operate the pbar-p Collider by synchronizing the applications programs and simulating the activities of the accelerator operators during filling and storage. The Sequencer acts as a meta-program, running otherwise stand alone applications programs, to do the set-up, beam transfers, acceleration, low beta turn on, and diagnostics for the transfers and storage. The Sequencer and its operational performance will be described along with its special features which include a periodic scheduler and command logger. 14 refs., 3 figs

  12. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
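
    One common way to measure sequence divergence, shown here only as an illustration and not necessarily the treatment used in the chapter, is the proportion of differing sites (p-distance) corrected with the Jukes-Cantor model, d = -(3/4) ln(1 - 4p/3). The function names below are hypothetical.

```python
import math


def p_distance(a, b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jc69_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site) from p-distance."""
    if not 0.0 <= p < 0.75:
        raise ValueError("the Jukes-Cantor correction is defined for 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

    Pairwise distances of this kind are the input to distance-based reconstruction methods; the correction grows faster than p because multiple substitutions at one site are hidden in the raw count.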

  13. Sequence characterization of cotton leaf curl virus from Rajasthan: phylogenetic relationship with other members of geminiviruses and detection of recombination.

    Science.gov (United States)

    Kumar, A; Kumar, J; Khan, J A

    2010-04-01

    Diseased cotton plants showing typical leaf curl symptoms were collected from an experimental plot of the Agriculture Research Station-Sriganganagar, Rajasthan. The complete DNA-A components from samples taken from two areas were amplified through rolling circle amplification (RCA) using the TempliPhi kit (GE Healthcare) and characterized. The DNA-A of one isolate consists of 2751 nucleotides and that of the second isolate of 2759 nucleotides. Both sequences comprised six ORFs. The genome organization of the DNA-A of one isolate shows high sequence similarity with other characterized local begomovirus isolates of Rajasthan, while the other isolate shows high sequence similarity with CLCuV reported from Pakistan. The first isolate, CLCuV-SG01, shows highest sequence identity with Cotton leaf curl Abohar (Rajasthan) virus, and the second isolate, CLCuV-SG02, shows highest sequence identity with cotton leaf curl virus from Pakistan. The two isolates showed 85% similarity with each other. The sequence data revealed probable infiltration of some strains of Cotton leaf curl virus from Pakistan to India, or co-existence of different isolates under similar geographical conditions. While CLCuV-SG01 shows highest nt sequence similarity with CLCuV Rajasthan (Abohar), the V1 ORF (encoding coat protein) of SG01 shows the highest nt identity (100%) with CLCuV Multan (Bhatinda) and Abohar virus, while the AC1 region also showed differences. The complete nucleotide sequence of SG01 shows only 86% similarity with CLCuV Multan virus. A similarity search revealed significant differences in the AV1 and AC1 regions with respect to DNA-A, suggesting an evolutionary history of recombination. Computer-based analysis with the Recombination Detection Program (RDP) supports the recombination hypothesis and indicated that recombination with other begomoviruses had taken place within the V1 ORF and AC1 ORF of CLCuV-SG01, the AC1 ORF of CLCuV-SG02, and the noncoding intergenic region (IR).

  14. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Full Text Available Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47) required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNPs) arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp) in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75%) steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs) near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3), a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  15. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    Science.gov (United States)

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′ N-U-P-M-F-HN-L 5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  16. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  17. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  18. Sequence Analysis of Mitochondrial Genome of Toxascaris leonina from a South China Tiger.

    Science.gov (United States)

    Li, Kangxin; Yang, Fang; Abdullahi, A Y; Song, Meiran; Shi, Xianli; Wang, Minwei; Fu, Yeqi; Pan, Weida; Shan, Fang; Chen, Wu; Li, Guoqing

    2016-12-01

    Toxascaris leonina is a common parasitic nematode of wild mammals and has significant impacts on the protection of rare wild animals. To analyze population genetic characteristics of T. leonina from South China tiger, its mitochondrial (mt) genome was sequenced. Its complete circular mt genome was 14,277 bp in length, including 12 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 non-coding regions. The nucleotide composition was biased toward A and T. The most common start codon and stop codon were TTG and TAG, and 4 genes ended with an incomplete stop codon. There were 13 intergenic regions ranging from 1 to 10 bp in size. Phylogenetically, T. leonina from a South China tiger was close to canine T. leonina. This study reports for the first time a complete mt genome sequence of T. leonina from the South China tiger, and provides a scientific basis for studying the genetic diversity of nematodes between different hosts.

  19. DNA Sequence-Mediated, Evolutionarily Rapid Redistribution of Meiotic Recombination Hotspots

    Science.gov (United States)

    Wahls, Wayne P.; Davidson, Mari K.

    2011-01-01

    Hotspots regulate the position and frequency of Spo11 (Rec12)-initiated meiotic recombination, but paradoxically they are suicidal and are somehow resurrected elsewhere in the genome. After the DNA sequence-dependent activation of hotspots was discovered in fission yeast, nearly two decades elapsed before the key realizations that (A) DNA site-dependent regulation is broadly conserved and (B) individual eukaryotes have multiple different DNA sequence motifs that activate hotspots. From our perspective, such findings provide a conceptually straightforward solution to the hotspot paradox and can explain other, seemingly complex features of meiotic recombination. We describe how a small number of single-base-pair substitutions can generate hotspots de novo and dramatically alter their distribution in the genome. This model also shows how equilibrium rate kinetics could maintain the presence of hotspots over evolutionary timescales, without strong selective pressures invoked previously, and explains why hotspots localize preferentially to intergenic regions and introns. The model is robust enough to account for all hotspots of humans and chimpanzees repositioned since their divergence from the latest common ancestor. PMID:22084420

  20. Dynamic Sequence Assignment.

    Science.gov (United States)

    1983-12-01

    D-136 548 DYNAMIC SEQUENCE ASSIGNMENT (U) ADVANCED INFORMATION AND DECISION SYSTEMS, MOUNTAIN VIEW, CA. C A O'REILLY ET AL. UNCLASSIFIED DEC 83 AI/DS... ADVANCED INFORMATION & DECISION SYSTEMS, Mountain View, CA 94040... Unclassified. SECURITY CLASSIFICATION OF THIS PAGE. REPORT...reviews some important heuristic algorithms developed for faster solution of the sequence assignment problem. 3.1. DYNAMIC PROGRAMMING FORMULATION FOR

  1. HIV Sequence Compendium 2010

    Energy Technology Data Exchange (ETDEWEB)

    Kuiken, Carla [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Foley, Brian [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Christian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Alabama, Tuscaloosa, AL (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2010-12-31

    This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  2. General LTE Sequence

    OpenAIRE

    Billal, Masum

    2015-01-01

    In this paper, we characterize sequences that have the property described in the Lifting the Exponent Lemma. The Lifting the Exponent Lemma is a very powerful tool in olympiad number theory and has recently become very popular. We generalize it to all sequences that maintain a similar property, i.e., if $p^{\alpha} \| a_k$ and $p^{\beta} \| n$, then $p^{\alpha+\beta} \| a_{nk}$.
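
    The property can be checked numerically on the classical case covered by the Lifting the Exponent Lemma, a_k = x^k - y^k with an odd prime p dividing x - y but neither x nor y. The sketch below is only an illustration: the choices x = 10, y = 3, p = 7 and the function names are assumptions for this example, not taken from the paper.

```python
def vp(p, m):
    """Exponent of the prime p in the nonzero integer m (so p**vp exactly divides m)."""
    if m == 0:
        raise ValueError("valuation of 0 is undefined here")
    m = abs(m)
    exponent = 0
    while m % p == 0:
        m //= p
        exponent += 1
    return exponent


def a(k, x=10, y=3):
    """Illustrative sequence a_k = x**k - y**k; 7 divides x - y for the defaults."""
    return x**k - y**k


def lte_property_holds(p, k, n):
    """Check v_p(a_{nk}) == v_p(a_k) + v_p(n) for the illustrative sequence."""
    return vp(p, a(n * k)) == vp(p, a(k)) + vp(p, n)
```

    For instance, a(1) = 7 has 7-adic valuation 1 and n = 49 has valuation 2, so the property predicts that 7^3 exactly divides a(49).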

  3. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
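
    For readers unfamiliar with the algorithms named above, the following is a minimal scalar sketch of the Needleman-Wunsch global alignment score. It does not use the parasail API and is not vectorized; the linear gap penalty, the scoring values, and the function name are assumptions chosen for brevity. The dependence of each cell on its left, upper, and diagonal neighbours is the kind of data dependency the abstract refers to.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    Only two rows of the dynamic-programming table are kept in memory.
    """
    # Row 0: aligning the empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of `a` against the empty string
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

    Smith-Waterman differs mainly in clamping each cell at zero and returning the table maximum, and semi-global alignment in not penalizing end gaps.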

  4. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

    Science.gov (United States)

    Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

    2009-01-01

    Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593

  5. Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

    Directory of Open Access Journals (Sweden)

    Bowman Sharen

    2008-05-01

    Full Text Available Abstract Background Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher

  6. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.; Bonny, Talal; Salama, Khaled N.

    2012-01-01

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
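    The length-based distribution described in this abstract can be illustrated with a minimal Python sketch. It is not the disclosed method: the names `split_by_length`, `match_score`, `best_scores` and the parameter `cpu_ratio` are hypothetical, the scorer is a trivial placeholder rather than a real alignment algorithm, and both portions are processed on the CPU here purely to show how a predetermined splitting ratio would send shorter sequences to one device and longer ones to the other.

```python
from typing import List, Tuple


def split_by_length(database: List[str], cpu_ratio: float) -> Tuple[List[str], List[str]]:
    """Sort sequences by length; the shortest fraction goes to the 'CPU' portion."""
    if not 0.0 <= cpu_ratio <= 1.0:
        raise ValueError("cpu_ratio must be between 0 and 1")
    ordered = sorted(database, key=len)
    cut = int(len(ordered) * cpu_ratio)  # hypothetical predetermined splitting ratio
    return ordered[:cut], ordered[cut:]


def match_score(query: str, target: str) -> int:
    """Placeholder scorer (assumption): count identical characters at the same position."""
    return sum(a == b for a, b in zip(query, target))


def best_scores(query: str, database: List[str], cpu_ratio: float = 0.25) -> Tuple[int, int]:
    """Return the best score found in the 'CPU' portion and in the 'GPU' portion."""
    cpu_part, gpu_part = split_by_length(database, cpu_ratio)
    # In a real system these two computations would run concurrently on different devices.
    cpu_best = max((match_score(query, s) for s in cpu_part), default=0)
    gpu_best = max((match_score(query, s) for s in gpu_part), default=0)
    return cpu_best, gpu_best
```

    Sorting by length before cutting guarantees that every sequence in the first portion is no longer than any sequence in the second, which matches the property stated in the abstract.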

  7. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.

  8. Comparative genomics and repetitive sequence divergence in the species of diploid Nicotiana section Alatae.

    Science.gov (United States)

    Lim, K Yoong; Kovarik, Ales; Matyasek, Roman; Chase, Mark W; Knapp, Sandra; McCarthy, Elizabeth; Clarkson, James J; Leitch, Andrew R

    2006-12-01

    Combining phylogenetic reconstructions of species relationships with comparative genomic approaches is a powerful way to decipher evolutionary events associated with genome divergence. Here, we reconstruct the history of karyotype and tandem repeat evolution in species of diploid Nicotiana section Alatae. By analysis of plastid DNA, we resolved two clades with high bootstrap support, one containing N. alata, N. langsdorffii, N. forgetiana and N. bonariensis (called the n = 9 group) and another containing N. plumbaginifolia and N. longiflora (called the n = 10 group). Despite little plastid DNA sequence divergence, we observed, via fluorescent in situ hybridization, substantial chromosomal repatterning, including altered chromosome numbers, structure and distribution of repeats. Effort was focussed on 35S and 5S nuclear ribosomal DNA (rDNA) and the HRS60 satellite family of tandem repeats comprising the elements HRS60, NP3R and NP4R. We compared divergence of these repeats in diploids and polyploids of Nicotiana. There are dramatic shifts in the distribution of the satellite repeats and complete replacement of intergenic spacers (IGSs) of 35S rDNA associated with divergence of the species in section Alatae. We suggest that sequence homogenization has replaced HRS60 family repeats at sub-telomeric regions, but that this process may not occur, or occurs more slowly, when the repeats are found at intercalary locations. Sequence homogenization acts more rapidly (at least two orders of magnitude) on 35S rDNA than 5S rDNA and sub-telomeric satellite sequences. This rapid rate of divergence is analogous to that found in polyploid species, and is therefore, in plants, not only associated with polyploidy.

  9. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus

    Directory of Open Access Journals (Sweden)

    Hanghui Kong

    2017-11-01

    Full Text Available The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58–62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.
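    The nucleotide variability (Pi) quoted in this abstract is usually defined as the average number of pairwise differences per site among aligned sequences. The sketch below shows that textbook definition only; the function name `nucleotide_diversity` is hypothetical, gaps are treated as ordinary characters, and nothing here reflects the software or settings the authors actually used.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(aligned: Sequence[str]) -> float:
    """Average pairwise differences per site over all pairs of aligned sequences."""
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned, 2))
    # Count mismatching columns for every pair, then normalise by pairs and sites.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)
```

    For three aligned 4-bp sequences in which one sequence differs from the other two at a single site, the pairwise difference counts are 1, 0 and 1, so Pi = 2 / (3 × 4) ≈ 0.167.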

  10. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus.

    Science.gov (United States)

    Kong, Hanghui; Liu, Wanzhen; Yao, Gang; Gong, Wei

    2017-01-01

    The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The lengths of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58-62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.

  11. Main sequence mass loss

    International Nuclear Information System (INIS)

    Brunish, W.M.; Guzik, J.A.; Willson, L.A.; Bowen, G.

    1987-01-01

    It has been hypothesized that variable stars may experience mass loss, driven, at least in part, by oscillations. The class of stars we are discussing here is the δ Scuti variables. These are variable stars with masses between about 1.2 and 2.25 M☉, lying on or very near the main sequence. According to this theory, high rotation rates enhance the rate of mass loss, so main sequence stars born in this mass range would have a range of mass loss rates, depending on their initial rotation velocity and the amplitude of the oscillations. The stars would evolve rapidly down the main sequence until (at about 1.25 M☉) a surface convection zone began to form. The presence of this convective region would slow the rotation, perhaps allowing magnetic braking to occur, and thus sharply reduce the mass loss rate. 7 refs

  12. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol

    2010-03-01

    The book covers: an introduction to control systems, including their classification and control signals; an introduction to electrical switches, such as push-buttons and inductive and capacitive detection sensors; control machinery and solenoid valves; ways of expressing sequences and types of electrical circuits using diagrams, time charts, symbols and terms; logic circuits such as Yes, No, AND, OR and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control, covering program selection and jumps; motor control; additional circuits, such as repeat circuits and pause circuits in a conveyor; and safety regulations, covering the classification of electrical accidents and protective devices for insulation.

  13. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    , Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...... information obtained allows well for statistical analysis of the data. This general approach can be integrated into current laboratory practice and has numerous applications. Besides DNA-based predictions of blood group phenotypes, platelet phenotypes, or sickle cell anemia, and the determination of zygosity...

  14. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  15. THE RHIC SEQUENCER

    International Nuclear Information System (INIS)

    VAN ZEIJTS, J.; DOTTAVIO, T.; FRAK, B.; MICHNOFF, R.

    2001-01-01

    The Relativistic Heavy Ion Collider (RHIC) has a high-level asynchronous time-line driven by a controlling program called the "Sequencer". Most high-level magnet and beam related issues are orchestrated by this system. The system also plays an important role in coordinated data acquisition and saving. We present the program, operator interface, operational impact and experience.

  16. Twin anemia polycythemia sequence

    NARCIS (Netherlands)

    Slaghekke, Femke

    2014-01-01

    In this thesis we describe that Twin Anemia Polycythemia Sequence (TAPS) is a form of chronic feto-fetal transfusion in monochorionic (identical) twins based on a small amount of blood transfusion through very small anastomoses. For the antenatal diagnosis of TAPS, Middle Cerebral Artery – Peak

  17. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  18. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  19. Almost convergence of triple sequences

    OpenAIRE

    Ayhan Esi; M.Necdet Catalbas

    2013-01-01

    In this paper we introduce and study the concepts of almost convergence and almost Cauchy for triple sequences. We show that the set of almost convergent triple sequences of 0's and 1's is of the first category and also almost every triple sequence of 0's and 1's is not almost convergent. Keywords: almost convergence, P-convergent, triple sequence.

  20. A few Smarandache Integer Sequences

    OpenAIRE

    Ibstedt, Henry

    2010-01-01

    This paper deals with the analysis of a few Smarandache Integer Sequences which first appeared in Properties of the Numbers, F. Smarandache, University of Craiova Archives, 1975. The first four sequences are recurrence generated sequences while the last three are concatenation sequences.
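    As an illustration of what a concatenation sequence looks like, the sketch below generates the Smarandache consecutive sequence 1, 12, 123, ..., in which the n-th term is formed by writing the integers 1 to n one after another. This is only an example of the general idea: the abstract does not say which concatenation sequences the paper analyses, and the function name `consecutive_sequence` is hypothetical.

```python
from typing import List


def consecutive_sequence(n_terms: int) -> List[int]:
    """Return the first n_terms terms obtained by concatenating 1, 2, ..., n."""
    terms: List[int] = []
    digits = ""
    for k in range(1, n_terms + 1):
        digits += str(k)  # append the decimal digits of k to the running string
        terms.append(int(digits))
    return terms
```

    Python integers have arbitrary precision, so the rapidly growing terms need no special handling.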

  1. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    Energy Technology Data Exchange (ETDEWEB)

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at a fine level and of RNA editing by comparison to available EST sequences.
In addition, since

  2. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  3. Enxertia intergenérica de cultivares de nespereira no porta-enxerto de marmeleiro 'japonês' Intergeneric grafting of loquat cultivars using 'Japonese' quince tree as rootstock

    Directory of Open Access Journals (Sweden)

    Rafael Pio

    2010-12-01

    Full Text Available In Brazil, some pioneering studies have used quince (Cydonia oblonga Mill.) as rootstock for loquat (Eriobotrya japonica (Thunb.) Lindl.). The success of this intergeneric grafting is related mainly to the reduction in plant size. The objective of this work was to study grafting techniques for loquat cultivars using the 'Japonês' quince (Chaenomeles sinensis (Thouin) Koehne) as a new rootstock option. One-year-old 'Japonês' quince seedlings (about 110 cm tall and 0.85 cm in diameter at the grafting region, 15 cm above the collar), kept in plastic bags measuring 18 x 30 cm (3 L capacity), were grafted by two methods, budding (borbulhia em placa) and cleft grafting (garfagem em fenda cheia), in two seasons: autumn (April) and winter (July). Five loquat cultivars of economic importance in Brazil were used: 'Mizuho', 'Néctar de Cristal' (IAC 866-7), 'Mizauto' (IAC 167-4), 'Mizumo' (IAC 1567-411) and 'Centenária' (IAC 1567-420). With budding, no buds sprouted when grafting was done in autumn, and only two buds of 'Mizauto', 'Néctar de Cristal' and 'Centenária' sprouted when it was done in winter, and these showed low growth. With cleft grafting, higher percentages of sprouting and graft growth were obtained when grafting was done in winter, notably for the loquats 'Mizuho', 'Centenária' and 'Néctar de Cristal'.

  4. Multilocus Sequence Typing

    OpenAIRE

    Belén, Ana; Pavón, Ibarz; Maiden, Martin C.J.

    2009-01-01

    Multilocus sequence typing (MLST) was first proposed in 1998 as a typing approach that enables the unambiguous characterization of bacterial isolates in a standardized, reproducible, and portable manner using the human pathogen Neisseria meningitidis as the exemplar organism. Since then, the approach has been applied to a large and growing number of organisms by public health laboratories and research institutions. MLST data, shared by investigators over the world via the Internet, have been ...

  5. Achalasia Carcinoma Sequence

    OpenAIRE

    Makmun, Dadang

    2001-01-01

    We report a case of carcinoma of the esophagus in a 58-year-old woman with achalasia, which was diagnosed 30 years ago and initially treated surgically (myotomy); the symptoms recurred 3 years ago. Given the progress of the disease, malignancy was strongly suspected due to prolonged stasis and mucosal irritation caused by achalasia (achalasia carcinoma sequence). Because of these contributing factors for the development of serious complications such as Malignan...

  6. Sequencing BPS spectra

    Energy Technology Data Exchange (ETDEWEB)

    Gukov, Sergei [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Max-Planck-Institut für Mathematik, Vivatsgasse 7, D-53111 Bonn (Germany)]; Nawata, Satoshi [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Centre for Quantum Geometry of Moduli Spaces, University of Aarhus, Nordre Ringgade 1, DK-8000 (Denmark)]; Saberi, Ingmar [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States)]; Stošić, Marko [CAMGSD, Departamento de Matemática, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001 Lisbon (Portugal); Mathematical Institute SANU, Knez Mihajlova 36, 11000 Belgrade (Serbia)]; Sułkowski, Piotr [Walter Burke Institute for Theoretical Physics, California Institute of Technology, 1200 E California Blvd, Pasadena, CA 91125 (United States); Faculty of Physics, University of Warsaw, ul. Pasteura 5, 02-093 Warsaw (Poland)]

    2016-03-02

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  7. Sequencing BPS spectra

    International Nuclear Information System (INIS)

    Gukov, Sergei; Nawata, Satoshi; Saberi, Ingmar; Stošić, Marko; Sułkowski, Piotr

    2016-01-01

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  8. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  9. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    Energy Technology Data Exchange (ETDEWEB)

    Bejerman, Nicolás, E-mail: n.bejerman@uq.edu.au [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina); Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia)]; Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina)]; Dietzgen, Ralf G. [Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia)]

    2015-09-15

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions, and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.

  10. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    International Nuclear Information System (INIS)

    Bejerman, Nicolás; Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio; Dietzgen, Ralf G.

    2015-01-01

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown by bimolecular fluorescence complementation to have homotypic nuclear interactions, and heterotypic nuclear interactions with N, P3 and M proteins. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.

  11. Foundations of Sequence-to-Sequence Modeling for Time Series

    OpenAIRE

    Kuznetsov, Vitaly; Mariet, Zelda

    2018-01-01

    The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practiti...

  12. Novel expressed sequence tag- simple sequence repeats (EST ...

    African Journals Online (AJOL)

    Using different bioinformatic criteria, the SUCEST database was used to mine for simple sequence repeat (SSR) markers. Among 42,189 clusters, 1,425 expressed sequence tag- simple sequence repeats (EST-SSRs) were identified in silico. Trinucleotide repeats were the most abundant SSRs detected. Of 212 primer pairs ...
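    In silico SSR mining of this kind typically scans each sequence for short motifs repeated in tandem a minimum number of times. The Python sketch below shows one simple way to do that with regular-expression backreferences. It is an illustration under stated assumptions, not the procedure used in the study: the thresholds in `DEFAULT_MIN_REPEATS` and the names `find_ssrs` and `is_primitive` are hypothetical, and only perfect repeats of di-, tri- and tetranucleotide motifs are reported.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical criteria: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS: Dict[int, int] = {2: 5, 3: 4, 4: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(
    seq: str, min_repeats: Dict[int, int] = DEFAULT_MIN_REPEATS
) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits: List[Tuple[int, str, int]] = []
    for size, minimum in sorted(min_repeats.items()):
        # ([ACGT]{size}) captures a motif; \1{minimum-1,} requires further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, minimum - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)
```

    Applied to the toy sequence `TTGAGAGAGAGAGCCATGATGATGATGTT`, this reports a (GA)5 repeat starting at position 2 and an (ATG)4 repeat starting at position 15 (0-based).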

  13. Infinite sequences and series

    CERN Document Server

    Knopp, Konrad

    1956-01-01

    One of the finest expositors in the field of modern mathematics, Dr. Konrad Knopp here concentrates on a topic that is of particular interest to 20th-century mathematicians and students. He develops the theory of infinite sequences and series from its beginnings to a point where the reader will be in a position to investigate more advanced stages on his own. The foundations of the theory are therefore presented with special care, while the developmental aspects are limited by the scope and purpose of the book. All definitions are clearly stated; all theorems are proved with enough detail to ma

  14. Fast global sequence alignment technique

    KAUST Repository

    Bonny, Mohamed Talal; Salama, Khaled N.

    2011-01-01

    fast alignment algorithm, called 'Alignment By Scanning' (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with the well-known sequence alignment algorithms, the 'GAP' (which is heuristic) and the 'Needleman

  15. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  16. Application of whole genome sequence data in analyzing the molecular epidemiology of Shiga toxin-producing Escherichia coli O157:H7/H.

    Science.gov (United States)

    Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi

    2018-01-02

    Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters included 136 strains containing strains from nine outbreaks, with each outbreak caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data of these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: short read data were directly mapped to a reference genome (mapping derived SNPs) and common SNPs between the mapping derived SNPs and SNPs in assembled data of short read data (common SNPs). Among both SNPs, those that were detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNPs was investigated among all the concatenated SNPs that were detected (whole SNP set); SNPs were divided into three categories based on the genes in which they were located (i.e., backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single source derived outbreaks were analyzed using an unweighted pair group method with arithmetic mean tree (UPGMA) and a minimum spanning tree (MST), the maximum pair-wise distances of the backbone SNP set of the mapping derived SNPs were significantly smaller than those of the whole and intergenic region SNP set on both UPGMAs and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs were examined (Steel-Dwass test, P≤0.01). When the maximum pair-wise distances were compared between the mapping derived and common SNPs, significant differences were observed in those of the whole, mobile element, and intergenic region SNP set (Wilcoxon signed rank test, P≤0.01). When all

  17. Epidemiology of transmissible diseases: Array hybridization and next generation sequencing as universal nucleic acid-mediated typing tools.

    Science.gov (United States)

    Michael Dunne, W; Pouseele, Hannes; Monecke, Stefan; Ehricht, Ralf; van Belkum, Alex

    2017-09-21

    The magnitude of interest in the epidemiology of transmissible human diseases is reflected in the vast number of tools and methods developed recently with the expressed purpose to characterize and track evolutionary changes that occur in agents of these diseases over time. Within the past decade a new suite of such tools has become available with the emergence of the so-called "omics" technologies. Among these, two are exponents of the ongoing genomic revolution. Firstly, high-density nucleic acid probe arrays have been proposed and developed using various chemical and physical approaches. Via hybridization-mediated detection of entire genes or genetic polymorphisms in such genes and intergenic regions these so called "DNA chips" have been successfully applied for distinguishing very closely related microbial species and strains. Second and even more phenomenal, next generation sequencing (NGS) has facilitated the assessment of the complete nucleotide sequence of entire microbial genomes. This technology currently provides the most detailed level of bacterial genotyping and hence allows for the resolution of microbial spread and short-term evolution in minute detail. We will here review the very recent history of these two technologies, sketch their usefulness in the elucidation of the spread and epidemiology of mostly hospital-acquired infections and discuss future developments. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Sequence and annotation of the 314-kb MT325 and the 321-kb FR483 viruses that infect Chlorella Pbi.

    Science.gov (United States)

    Fitzgerald, Lisa A; Graves, Michael V; Li, Xiao; Feldblyum, Tamara; Hartigan, James; Van Etten, James L

    2007-02-20

    Viruses MT325 and FR483, members of the family Phycodnaviridae, genus Chlorovirus, infect the fresh water, unicellular, eukaryotic, chlorella-like green alga, Chlorella Pbi. The 314,335-bp genome of MT325 and the 321,240-bp genome of FR483 are the first viruses that infect Chlorella Pbi to have their genomes sequenced and annotated. Furthermore, these genomes are the two smallest chlorella virus genomes sequenced to date. MT325 has 331 putative protein-encoding and 10 tRNA-encoding genes, and FR483 has 335 putative protein-encoding and 9 tRNA-encoding genes. The protein-encoding genes are almost evenly distributed on both strands, and intergenic space is minimal. Approximately 40% of the viral gene products resemble entries in public databases, including some that are the first of their kind to be detected in a virus. For example, these unique gene products include an aquaglyceroporin in MT325, a potassium ion transporter protein and an alkyl sulfatase in FR483, and a dTDP-glucose pyrophosphorylase in both viruses. Comparison of MT325 and FR483 protein-encoding genes with the prototype chlorella virus PBCV-1 indicates that approximately 82% of the genes are present in all three viruses.

  19. Rapid Polymer Sequencer

    Science.gov (United States)

    Stolc, Viktor (Inventor); Brock, Matthew W (Inventor)

    2013-01-01

    Method and system for rapid and accurate determination of each of a sequence of unknown polymer components, such as nucleic acid components. A self-assembling monolayer of a selected substance is optionally provided on an interior surface of a pipette tip, and the interior surface is immersed in a selected liquid. A selected electrical field is impressed in a longitudinal direction, or in a transverse direction, in the tip region, a polymer sequence is passed through the tip region, and a change in an electrical current signal is measured as each polymer component passes through the tip region. Each of the measured changes in electrical current signals is compared with a database of reference electrical change signals, with each reference signal corresponding to an identified polymer component, to identify the unknown polymer component with a reference polymer component. The nanopore preferably has a pore inner diameter of no more than about 40 nm and is prepared by heating and pulling a very small section of a glass tubing.

  20. The advantages of SMRT sequencing

    OpenAIRE

    Roberts, Richard J; Carneiro, Mauricio O; Schatz, Michael C

    2013-01-01

    Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.

  1. Putting instruction sequences into effect

    NARCIS (Netherlands)

    Bergstra, J.A.

    2011-01-01

    An attempt is made to define the concept of execution of an instruction sequence. It is found to be a special case of directly putting into effect of an instruction sequence. Directly putting into effect of an instruction sequence comprises interpretation as well as execution. Directly putting into

  2. Region segmentation along image sequence

    International Nuclear Information System (INIS)

    Monchal, L.; Aubry, P.

    1995-01-01

    A method to extract regions in a sequence of images is proposed. Regions are not matched from one image to the following one. The result of a region segmentation is used as an initialization to segment the following image and to track the region along the sequence. The image sequence is exploited as a spatio-temporal event. (authors). 12 refs., 8 figs

  3. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and Evolution Analysis within the Malvales Order.

    Science.gov (United States)

    Wang, Ying; Zhan, Di-Feng; Jia, Xian; Mei, Wen-Li; Dai, Hao-Fu; Chen, Xiong-Ting; Peng, Shi-Qing

    2016-01-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes enhanced the understanding about evolutionary relationships within plant families. In this study, we determined the complete cp genome sequences for A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven genes were duplicated in the protein-coding genes, whereas 11 genes were duplicated in the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Aquilaria sinensis cp genome bias ended with A/T on the basis of codon usage. The distribution of codon usage in A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from 29 species of cp genomes demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. Aquilaria sinensis cp genome presented the highest sequence similarity of >90% with the G. bancanus cp genome by using CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional medicinal

  4. Log-balanced combinatorial sequences

    Directory of Open Access Journals (Sweden)

    Tomislav Došlić

    2005-01-01

    Full Text Available We consider log-convex sequences that satisfy an additional constraint imposed on their rate of growth. We call such sequences log-balanced. It is shown that all such sequences satisfy a pair of double inequalities. Sufficient conditions for log-balancedness are given for the case when the sequence satisfies a two- (or more-) term linear recurrence. It is shown that many combinatorially interesting sequences belong to this class, and, as a consequence, that the above-mentioned double inequalities are valid for all of them.

  5. New MR pulse sequence

    International Nuclear Information System (INIS)

    Harms, S.E.; Flamig, D.P.; Griffey, R.H.

    1990-01-01

    This paper describes a method for fat suppression for three-dimensional MR imaging. The FATS (fat-suppressed acquisition with echo time shortened) sequence employs a pair of opposing adiabatic half-passage RF pulses tuned on fat resonance. The imaging parameters are as follows: TR, 20 msec; TE, 21.7-3.2 msec; 1,024 x 128 x 128 acquired matrix; imaging time, approximately 11 minutes. A series of 54 examinations were performed. Excellent fat suppression with water excitation is achieved in all cases. The orbital images demonstrate superior resolution of small orbital lesions. The high signal-to-noise ratio (SNR) in cranial studies demonstrates excellent petrous bone and internal auditory canal anatomy

  6. Strong conservation of rhoptry-associated-protein-1 (RAP-1) locus organization and sequence among Babesia isolates infecting sheep from China (Babesia motasi-like phylogenetic group).

    Science.gov (United States)

    Niu, Qingli; Valentin, Charlotte; Bonsergent, Claire; Malandrin, Laurence

    2014-12-01

    Rhoptry-associated-protein 1 (RAP-1) is considered as a potential vaccine candidate due to its involvement in red blood cell invasion by parasites in the genus Babesia. We examined its value as a vaccine candidate by studying RAP-1 conservation in isolates of Babesia sp. BQ1 Ningxian, Babesia sp. Tianzhu and Babesia sp. Hebei, responsible for ovine babesiosis in different regions of China. The rap-1 locus in these isolates has very similar features to those described for Babesia sp. BQ1 Lintan, another Chinese isolate also in the B. motasi-like phylogenetic group, namely the presence of three types of rap-1 genes (rap-1a, rap-1b and rap-1c), multiple conserved rap-1b copies (5) interspaced with more or less variable rap-1a copies (6), and the 3' localization of one rap-1c. The isolates Babesia sp. Tianzhu, Babesia sp. BQ1 Lintan and Ningxian were almost identical (average nucleotide identity of 99.9%) over a putative locus of about 31 Kb, including the intergenic regions. Babesia sp. Hebei showed a similar locus organization but differed in the rap-1 locus sequence, for each gene and intergenic region, with an average nucleotide identity of 78%. Our results are in agreement with 18S rDNA phylogenetic studies performed on these isolates. However, in extremely closely related isolates the rap-1 locus seems more conserved (99.9%) than the 18S rDNA (98.7%), whereas in still closely related isolates the identities are much lower (78%) compared with the 18S rDNA (97.7%). The particularities of the rap-1 locus in terms of evolution, phylogeny, diagnosis and vaccine development are discussed. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  7. Microbial analysis of bite marks by sequence comparison of streptococcal DNA.

    Directory of Open Access Journals (Sweden)

    Darnell M Kennedy

    Full Text Available Bite mark injuries often feature in violent crimes. Conventional morphometric methods for the forensic analysis of bite marks involve elements of subjective interpretation that threaten the credibility of this field. Human DNA recovered from bite marks has the highest evidentiary value; however, recovery can be compromised by salivary components. This study assessed the feasibility of matching bacterial DNA sequences amplified from experimental bite marks to those obtained from the teeth responsible, with the aim of evaluating the capability of three genomic regions of streptococcal DNA to discriminate between participant samples. Bite mark and teeth swabs were collected from 16 participants. Bacterial DNA was extracted to provide the template for PCR primers specific for streptococcal 16S ribosomal RNA (16S rRNA) gene, 16S-23S intergenic spacer (ITS) and RNA polymerase beta subunit (rpoB). High throughput sequencing (GS FLX 454), followed by stringent quality filtering, generated reads from bite marks for comparison to those generated from teeth samples. For all three regions, the greatest overlaps of identical reads were between bite mark samples and the corresponding teeth samples. The average proportions of reads identical between bite mark and corresponding teeth samples were 0.31, 0.41 and 0.31, and for non-corresponding samples were 0.11, 0.20 and 0.016, for 16S rRNA, ITS and rpoB, respectively. The probabilities of correctly distinguishing matching and non-matching teeth samples were 0.92 for ITS, 0.99 for 16S rRNA and 1.0 for rpoB. These findings strongly support the tenet that bacterial DNA amplified from bite marks and teeth can provide corroborating information in the identification of assailants.
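
    As an illustration of the read-overlap measure reported in this abstract, the proportion of bite mark reads that are identical to reads from a teeth sample could be computed roughly as below. This is only a sketch under stated assumptions: the function name, the exact definition of the proportion (fraction of bite mark reads with an exact match among the teeth reads), and the toy reads are hypothetical and are not taken from the authors' pipeline.

```python
def proportion_identical(bite_reads, teeth_reads):
    """Fraction of bite mark reads that have an exact sequence match
    among the reads of a teeth sample (assumed definition)."""
    if not bite_reads:
        return 0.0
    teeth_set = set(teeth_reads)  # set lookup keeps the comparison fast
    matches = sum(1 for read in bite_reads if read in teeth_set)
    return matches / len(bite_reads)


# Toy reads, purely illustrative
bite = ["ACGT", "ACGA", "TTGA", "GGCC"]
teeth_corresponding = ["ACGT", "TTGA", "CCCC"]
teeth_other = ["ACGA", "AAAA"]

overlap_corresponding = proportion_identical(bite, teeth_corresponding)
overlap_other = proportion_identical(bite, teeth_other)
```

    In this toy case the corresponding teeth sample shares more identical reads with the bite mark than the non-corresponding one, which is the pattern the abstract describes for all three regions.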

  8. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

    Science.gov (United States)

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

    2015-08-29

    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  9. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  10. Prunus necrotic ringspot ilarvirus: nucleotide sequence of RNA3 and the relationship to other ilarviruses based on coat protein comparison.

    Science.gov (United States)

    Guo, D; Maiss, E; Adam, G; Casper, R

    1995-05-01

    The RNA3 of prunus necrotic ringspot ilarvirus (PNRSV) has been cloned and its entire sequence determined. The RNA3 consists of 1943 nucleotides (nt) and possesses two large open reading frames (ORFs) separated by an intergenic region of 74 nt. The 5' proximal ORF is 855 nt in length and codes for a protein of molecular mass 31.4 kDa which has homologies with the putative movement protein of other members of the Bromoviridae. The 3' proximal ORF of 675 nt is the cistron for the coat protein (CP) and has a predicted molecular mass of 24.9 kDa. The sequence of the 3' non-coding region (NCR) of PNRSV RNA3 showed a high degree of similarity with those of tobacco streak virus (TSV), prune dwarf virus (PDV), apple mosaic virus (ApMV) and also alfalfa mosaic virus (AlMV). In addition it contained potential stem-loop structures with interspersed AUGC motifs characteristic for ilar- and alfamoviruses. This conserved primary and secondary structure in all 3' NCRs may be responsible for the interaction with homologous and heterologous CPs and subsequent activation of genome replication. The CP gene of an ApMV isolate (ApMV-G) of 657 nt has also been cloned and sequenced. Although ApMV and PNRSV have a distant serological relationship, the deduced amino acid sequences of their CPs have an identity of only 51.8%. The N termini of PNRSV and ApMV CPs have in common a zinc-finger motif and the potential to form an amphipathic helix.
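
    The ORF lengths and molecular masses quoted in this abstract can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes a rule-of-thumb average residue mass of about 110 Da and assumes that each quoted ORF length includes the stop codon; neither assumption comes from the abstract, and the function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumed rule-of-thumb average mass per amino acid


def approx_protein_mass_kda(orf_length_nt, includes_stop=True):
    """Rough protein mass (kDa) estimated from an ORF length in nucleotides."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_nt // 3
    # Drop the stop codon if the quoted length is assumed to include it
    residues = codons - 1 if includes_stop else codons
    return residues * AVG_RESIDUE_MASS_DA / 1000.0


movement_protein_kda = approx_protein_mass_kda(855)  # abstract reports 31.4 kDa
coat_protein_kda = approx_protein_mass_kda(675)      # abstract reports 24.9 kDa
```

    Under these assumptions both estimates land within a few percent of the reported values; an exact figure would require the actual amino acid composition.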

  11. Characterization of the Complete Mitochondrial Genome Sequence of the Globose Head Whiptail Cetonurus globiceps (Gadiformes: Macrouridae) and Its Phylogenetic Analysis.

    Directory of Open Access Journals (Sweden)

    Xiaofeng Shi

    Full Text Available The particular environmental characteristics of deep water, such as its immense scale and high pressure systems, present technological problems that have prevented research to broaden our knowledge of deep-sea fish. Here, we describe the mitogenome sequence of a deep-sea fish, Cetonurus globiceps. The genome is 17,137 bp in length, with a standard set of 22 transfer RNA genes (tRNAs), two ribosomal RNA genes, 13 protein-coding genes, and two typical non-coding control regions. Additionally, a 70 bp tRNA(Thr)-tRNA(Pro) intergenic spacer is present. The C. globiceps mitogenome exhibited strand-specific asymmetry in nucleotide composition. The AT-skew and GC-skew values in the whole genome of C. globiceps were 0 and -0.2877, respectively, revealing that the H-strand had equal amounts of A and T and that the overall nucleotide composition was C skewed. All of the tRNA genes could be folded into cloverleaf secondary structures, while the secondary structure of tRNA(Ser)(AGY) lacked a discernible dihydrouridine stem. By comparing this genome sequence with the recognition sites in teleost species, several conserved sequence blocks were identified in the control region. However, the GTGGG-box, the typical characteristic of conserved sequence block E (CSB-E), was absent. Notably, tandem repeats were identified in the 3' portion of the control region. No similar repetitive motifs are present in most of other gadiform species. Phylogenetic analysis based on 12 protein coding genes provided strong support that C. globiceps was the most derived in the clade. Some relationships, however, are in contrast with those presented in previous studies. This study enriches our knowledge of mitogenomes of the genus Cetonurus and provides valuable information on the evolution of Macrouridae mtDNA and deep-sea fish.
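
    The skew values quoted in this abstract are conventionally computed as AT-skew = (A - T)/(A + T) and GC-skew = (G - C)/(G + C). The abstract does not state its formula, so the sketch below assumes these commonly used definitions; the function name and the toy sequence are hypothetical.

```python
from collections import Counter


def nucleotide_skews(seq):
    """Return (AT-skew, GC-skew) for a DNA sequence, using the assumed
    definitions (A - T)/(A + T) and (G - C)/(G + C)."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Toy sequence: equal A and T, more C than G
at_skew, gc_skew = nucleotide_skews("AATTGCCC")
```

    An AT-skew of 0 means equal numbers of A and T on the strand analysed, and a negative GC-skew means C outnumbers G, which matches the interpretation given in the abstract.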

  12. Complete sequence and analysis of plastid genomes of two economically important red algae: Pyropia haitanensis and Pyropia yezoensis.

    Directory of Open Access Journals (Sweden)

    Li Wang

    Full Text Available Pyropia haitanensis and P. yezoensis are two economically important marine crops that are also considered to be research models to study the physiological ecology of intertidal seaweed communities, evolutionary biology of plastids, and the origins of sexual reproduction. This plastid genome information will facilitate study of breeding, population genetics and phylogenetics. We have fully sequenced, using next-generation sequencing, the circular plastid genomes of P. haitanensis (195,597 bp) and P. yezoensis (191,975 bp), the largest of all the plastid genomes of the red lineage sequenced to date. Organization and gene contents of the two plastids were similar, with 211-213 protein-coding genes (including 29-31 unknown-function ORFs), 37 tRNA genes, and 6 ribosomal RNA genes, suggesting the largest coding capacity in the red lineage. In each genome, 14 protein genes overlapped and no interrupted genes were found, indicating a high degree of genomic condensation. Pyropia maintain an ancient gene content and conserved gene clusters in their plastid genomes, containing nearly complete repertoires of the plastid genes known in photosynthetic eukaryotes. Similarity analysis based on the whole plastid genome sequences showed the distance between P. haitanensis and P. yezoensis (0.146) was much smaller than that of Porphyra purpurea and P. haitanensis (0.250), and P. yezoensis (0.251); this supports re-grouping the two species in a resurrected genus Pyropia while maintaining P. purpurea in genus Porphyra. Phylogenetic analysis supports a sister relationship between Bangiophyceae and Florideophyceae, though precise phylogenetic relationships between multicellular red algae and chromists were not fully resolved. These results indicate that Pyropia have compact plastid genomes. Large coding capacity and long intergenic regions contribute to the size of the largest plastid genomes reported for the red lineage. Possessing the largest coding capacity and ancient gene

  13. Universal sequence map (USM) of arbitrary discrete sequences

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2002-02-01

    Full Text Available Abstract Background For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would consist in being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.
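
    The Chaos Game Representation mentioned in this abstract can be sketched in a few lines: each base pulls the current point halfway toward its assigned corner of the unit square. The code below illustrates plain CGR only, not the USM procedure itself; the corner assignment is one common convention and, like the function name, is an assumption rather than something stated in the abstract.

```python
# One common corner convention for DNA (an assumption, not from the abstract)
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_coordinates(seq, start=(0.5, 0.5)):
    """Return the CGR point reached after each base of a DNA sequence."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


trajectory = cgr_coordinates("ACG")
# Two sequences that share their last three bases end close together
end_1 = cgr_coordinates("TTACG")[-1]
end_2 = cgr_coordinates("GGACG")[-1]
```

    Because every step halves the distance inherited from earlier positions, sequences sharing a suffix of length k end within 2^-k of each other in each coordinate, which is the sense in which map distance reflects local sequence similarity.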

  14. A Chromosome 7 Pericentric Inversion Defined at Single-Nucleotide Resolution Using Diagnostic Whole Genome Sequencing in a Patient with Hand-Foot-Genital Syndrome.

    Science.gov (United States)

    Watson, Christopher M; Crinnion, Laura A; Harrison, Sally M; Lascelles, Carolina; Antanaviciute, Agne; Carr, Ian M; Bonthron, David T; Sheridan, Eamonn

    2016-01-01

    Next generation sequencing methodologies are facilitating the rapid characterisation of novel structural variants at nucleotide resolution. These approaches are particularly applicable to variants initially identified using alternative molecular methods. We report a child born with bilateral postaxial syndactyly of the feet and bilateral fifth finger clinodactyly. This was presumed to be an autosomal recessive syndrome, due to the family history of consanguinity. Karyotype analysis revealed a homozygous pericentric inversion of chromosome 7 (46,XX,inv(7)(p15q21)x2) which was confirmed to be heterozygous in both unaffected parents. Since the resolution of the karyotype was insufficient to identify any putatively causative gene, we undertook medium-coverage whole genome sequencing using paired-end reads, in order to elucidate the molecular breakpoints. In a two-step analysis, we first narrowed down the region by identifying discordant read-pairs, and then determined the precise molecular breakpoint by analysing the mapping locations of "soft-clipped" breakpoint-spanning reads. PCR and Sanger sequencing confirmed the identified breakpoints, both of which were located in intergenic regions. Significantly, the 7p15 breakpoint was located 523 kb upstream of HOXA13, the locus for hand-foot-genital syndrome. By inference from studies of HOXA locus control in the mouse, we suggest that the inversion has delocalised a HOXA13 enhancer to produce the phenotype observed in our patient. This study demonstrates how a modern genetic diagnostic approach can characterise structural variants at nucleotide resolution and provide potential insights into functional regulation.
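
    The second step described in this abstract, locating a breakpoint from soft-clipped reads, can be illustrated by reading the CIGAR string of an aligned read. The sketch below is a minimal example under stated assumptions: it takes a 1-based SAM `POS` and a CIGAR string, and the function name and the `min_clip` threshold are hypothetical; it is not the authors' analysis pipeline.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar, min_clip=10):
    """Return candidate breakpoint positions (1-based reference coordinates)
    implied by soft clips of at least min_clip bases at either read end."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    core = [(n, op) for n, op in ops if op != "H"]  # ignore hard clips
    candidates = []
    if core and core[0][1] == "S" and core[0][0] >= min_clip:
        candidates.append(pos)  # first aligned base follows the clipped bases
    if len(core) > 1 and core[-1][1] == "S" and core[-1][0] >= min_clip:
        ref_len = sum(n for n, op in core if op in REF_CONSUMING)
        candidates.append(pos + ref_len - 1)  # last aligned base before the clip
    return candidates


left_clipped = soft_clip_breakpoints(1000, "30S70M")
right_clipped = soft_clip_breakpoints(1000, "60M2D10M30S")
```

    In practice many reads clipped at the same coordinate would be required before calling a breakpoint, and the clipped bases would be realigned to find the partner location.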

  15. Chloroplast DNA analysis of Tunisian cork oak populations (Quercus suber L.): sequence variations and molecular evolution of the trnL (UAA)-trnF (GAA) region.

    Science.gov (United States)

    Abdessamad, A; Baraket, G; Sakka, H; Ammari, Y; Ksontini, M; Hannachi, A Salhi

    2016-10-24

    Sequences of the trnL-trnF spacer and combined trnL-trnF region in chloroplast DNA of cork oak (Quercus suber L.) were analyzed to detect polymorphisms and to elucidate molecular evolution and demographic history. The aligned sequences varied in length and nucleotide composition. Overall transition/transversion (ti/tv) ratios of 0.724 for the intergenic spacer and 0.258 for the pooled sequences were estimated, indicating that transversions are more frequent than transitions. The molecular evolution and demographic history of Q. suber were investigated. Neutrality tests (Tajima's D and Fu and Li) ruled out the null hypothesis of a strictly neutral model, and Fu's Fs and Ramos-Onsins and Rozas' R2 confirmed the recent expansion of cork oak trees, validating its persistency in North Africa since the last glaciation during the Quaternary. The observed uni-modal mismatch distribution and the Harpending's raggedness index confirmed the demographic history model for cork oak. A phylogenetic dendrogram showed that the distribution of Q. suber trees occurs independently of geographical origin, the relief of the population site, and the bioclimatic stages. The molecular history and cytoplasmic diversity suggest that in situ and ex situ conservation strategies can be recommended for preserving landscape value and facing predictable future climatic changes.
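
    A ti/tv ratio below 1, as reported in this abstract, means transversions outnumber transitions. The sketch below counts both kinds of difference between two aligned sequences; it is a raw pairwise count, whereas published estimates are usually obtained under a substitution model, so it illustrates the concept rather than the authors' method. The function name and toy sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ti_tv_counts(seq1, seq2):
    """Count transitions and transversions between two aligned sequences,
    skipping gaps and ambiguous characters."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        # Purine<->purine or pyrimidine<->pyrimidine is a transition
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


ti, tv = ti_tv_counts("ACGTACGT", "GCGTTCGA")
ratio = ti / tv if tv else float("inf")
```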

  16. The complete nucleotide sequences of the 5 genetically distinct plastid genomes of Oenothera, subsection Oenothera: II. A microevolutionary view using bioinformatics and formal genetic data.

    Science.gov (United States)

    Greiner, Stephan; Wang, Xi; Herrmann, Reinhold G; Rauwolf, Uwe; Mayer, Klaus; Haberer, Georg; Meurer, Jörg

    2008-09-01

    A unique combination of genetic features and a rich stock of information make the flowering plant genus Oenothera an appealing model to explore the molecular basis of speciation processes including nucleus-organelle coevolution. From representative species, we have recently reported complete nucleotide sequences of the 5 basic and genetically distinguishable plastid chromosomes of subsection Oenothera (I-V). In nature, Oenothera plastid genomes are associated with 6 distinct, either homozygous or heterozygous, diploid nuclear genotypes of the 3 basic genomes A, B, or C. Artificially produced plastome-genome combinations that do not occur naturally often display interspecific plastome-genome incompatibility (PGI). In this study, we compare formal genetic data available from all 30 plastome-genome combinations with sequence differences between the plastomes to uncover potential determinants for interspecific PGI. Consistent with an active role in speciation, a remarkable number of genes have high Ka/Ks ratios. Different from the Solanacean cybrid model Atropa/tobacco, RNA editing seems not to be relevant for PGIs in Oenothera. However, predominantly sequence polymorphisms in intergenic segments are proposed as possible sources for PGI. A single locus, the bidirectional promoter region between psbB and clpP, is suggested to contribute to compartmental PGI in the interspecific AB hybrid containing plastome I (AB-I), consistent with its perturbed photosystem II activity.

  17. An intergenic non-coding rRNA correlated with expression of the rRNA and frequency of an rRNA single nucleotide polymorphism in lung cancer cells.

    Directory of Open Access Journals (Sweden)

    Yih-Horng Shiao

    Full Text Available BACKGROUND: Ribosomal RNA (rRNA) is a central regulator of cell growth and may control cancer development. A cis noncoding rRNA (nc-rRNA) upstream from the 45S rRNA transcription start site has recently been implicated in control of rRNA transcription in mouse fibroblasts. We investigated whether a similar nc-rRNA might be expressed in human cancer epithelial cells, and related to any genomic characteristics. METHODOLOGY/PRINCIPAL FINDINGS: Using quantitative rRNA measurement, we demonstrated that a nc-rRNA is transcribed in human lung epithelial and lung cancer cells, starting from approximately -1000 nucleotides upstream of the rRNA transcription start site (+1) and extending at least to +203. This nc-rRNA was significantly more abundant in the majority of lung cancer cell lines, relative to a nontransformed lung epithelial cell line. Its abundance correlated negatively with total 45S rRNA in 12 of 13 cell lines (P = 0.014). During sequence analysis from -388 to +306, we observed diverse, frequent intercopy single nucleotide polymorphisms (SNPs) in rRNA, with a frequency greater than predicted by chance at 12 sites. A SNP at +139 (U/C) in the 5' leader sequence varied among the cell lines and correlated negatively with level of the nc-rRNA (P = 0.014). Modelling of the secondary structure of the rRNA 5'-leader sequence indicated a small increase in structural stability due to the +139 U/C SNP and a minor shift in local configuration occurrences. CONCLUSIONS/SIGNIFICANCE: The results demonstrate occurrence of a sense nc-rRNA in human lung epithelial and cancer cells, and imply a role in regulation of the rRNA gene, which may be affected by a +139 SNP in the 5' leader sequence of the primary rRNA transcript.

  18. The Pattern and Distribution of Induced Mutations in J. curcas Using Reduced Representation Sequencing

    Directory of Open Access Journals (Sweden)

    Fatemeh Maghuly

    2018-04-01

    Full Text Available Mutagenesis in combination with Genotyping by Sequencing (GBS) is a powerful tool for introducing variation, studying gene function and identifying causal mutations underlying phenotypes of interest in crop plant genomes. About 400 million paired-end reads were obtained from 82 ethylmethane sulfonate (EMS) induced mutants and 14 wild-type accessions of Jatropha curcas for the detection of Single Nucleotide Polymorphisms (SNPs) and Insertion/Deletions (InDels) by two different approaches (nGBS and ddGBS) on an Illumina HiSeq 2000 sequencer. Using bioinformatics analyses, 1,452 induced SNPs and InDels were identified in coding regions, which were distributed across 995 genes. The predominantly observed mutations were G/C to A/T transitions (64%), while transversions were observed at a lower frequency (36%). Regarding the effect of mutations on gene function, 18% of the mutations were located in intergenic regions. In fact, mutants with the highest number of heterozygous SNPs were found in samples treated with 0.8% EMS for 3 h. Reconstruction of the metabolic pathways showed that in total 16 SNPs were located in six KEGG pathways by nGBS and two pathways by ddGBS. The most highly represented pathways were ether-lipid metabolism and glycerophospholipid metabolism, followed by starch and sucrose metabolism by nGBS and triterpenoid biosynthesis as well as steroid biosynthesis by ddGBS. Furthermore, high genome methylation was observed in J. curcas, which might help to understand the plasticity of the Jatropha genome in response to environmental factors. At last, the results showed that continuously vegetatively propagated tissue is a fast, efficient and accurate method to dissolve chimeras, especially for long-lived plants like J. curcas. Obtained data showed that allelic variations and in silico analyses of gene functions (gene function prediction), which control important traits, could be identified in mutant populations using nGBS and ddGBS.
However, the

  19. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by

  20. Genomic sequencing in clinical trials

    OpenAIRE

    Mestan, Karen K; Ilkhanoff, Leonard; Mouli, Samdeep; Lin, Simon

    2011-01-01

    Abstract Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to fin...

  1. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  2. ABS: Sequence alignment by scanning

    KAUST Repository

    Bonny, Mohamed Talal

    2011-08-01

    Sequence alignment is an essential tool in almost any computational biology research. It processes large sequence databases and is considered a heavy consumer of computation time. Heuristic algorithms are used to get approximate but fast results. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with well-known alignment algorithms: FASTA (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to a 76% enhancement in alignment score when compared with the FASTA algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.
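
    For context on the optimal baseline named in this abstract, the following is a minimal sketch of Needleman-Wunsch global alignment scoring. It does not implement ABS, whose details the abstract does not give; the function name and the scoring values (match, mismatch, linear gap penalty) are illustrative assumptions.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of strings a and b (linear gap penalty)."""
    # prev holds the scores for aligning a[:i-1] with every prefix of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Hypothetical toy inputs.
print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("AC", "A"))
```

    The table is filled row by row and only two rows are kept, so memory grows with the shorter dimension while time remains proportional to the product of the two lengths. That quadratic cost is what heuristic methods trade accuracy to avoid.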

  3. ABS: Sequence alignment by scanning

    KAUST Repository

    Bonny, Mohamed Talal; Salama, Khaled N.

    2011-01-01

    Sequence alignment is an essential tool in almost any computational biology research. It processes large sequence databases and is considered a heavy consumer of computation time. Heuristic algorithms are used to get approximate but fast results. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with well-known alignment algorithms: FASTA (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to a 76% enhancement in alignment score when compared with the FASTA algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  4. Fast global sequence alignment technique

    KAUST Repository

    Bonny, Mohamed Talal

    2011-11-01

    Bioinformatics databases are growing exponentially in size. Processing these large amounts of data may take hours even if supercomputers are used. One of the most important processing tools in bioinformatics is sequence alignment. We introduce a fast alignment algorithm, called 'Alignment By Scanning' (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with well-known sequence alignment algorithms: 'GAP' (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to a 51% enhancement in alignment score when compared with the GAP algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  5. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS...

  6. Phytoplasma phylogenetics based on analysis of secA and 23S rRNA gene sequences for improved resolution of candidate species of 'Candidatus Phytoplasma'.

    Science.gov (United States)

    Hodgetts, Jennifer; Boonham, Neil; Mumford, Rick; Harrison, Nigel; Dickinson, Matthew

    2008-08-01

    Phytoplasma phylogenetics has focused primarily on sequences of the non-coding 16S rRNA gene and the 16S-23S rRNA intergenic spacer region (16-23S ISR), and primers that enable amplification of these regions from all phytoplasmas by PCR are well established. In this study, primers based on the secA gene have been developed into a semi-nested PCR assay that results in a sequence of the expected size (about 480 bp) from all 34 phytoplasmas examined, including strains representative of 12 16Sr groups. Phylogenetic analysis of secA gene sequences showed similar clustering of phytoplasmas when compared with clusters resolved by similar sequence analyses of a 16-23S ISR-23S rRNA gene contig or of the 16S rRNA gene alone. The main differences between trees were in the branch lengths, which were elongated in the 16-23S ISR-23S rRNA gene tree when compared with the 16S rRNA gene tree and elongated still further in the secA gene tree, despite this being a shorter sequence. The improved resolution in the secA gene-derived phylogenetic tree resulted in the 16SrII group splitting into two distinct clusters, while phytoplasmas associated with coconut lethal yellowing-type diseases split into three distinct groups, thereby supporting past proposals that they represent different candidate species within 'Candidatus Phytoplasma'. The ability to differentiate 16Sr groups and subgroups by virtual RFLP analysis of secA gene sequences suggests that this gene may provide an informative alternative molecular marker for pathogen identification and diagnosis of phytoplasma diseases.

  7. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R2=0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.

  8. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Full Text Available Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  9. SVX Sequencer Board

    International Nuclear Information System (INIS)

    Utes, M.

    1997-01-01

    The SVX Sequencer boards are 9U by 280mm circuit boards that reside in slots 2 through 21 of each of eight Eurocard crates in the D0 Detector Platform. The basic purpose is to control the SVX chips for data acquisition and when a trigger occurs, to gather the SVX data and relay the data to the VRB boards in the Movable Counting House. Functions and features are as follows: (1) Initialization of eight SVX chip strings using the MIL-STD-1553 data bus; (2) Real time manipulation of the SVX control lines to effect data acquisition, digitization, and readout based on the NRZ/Clock signals from the Controller; (3) Conversion of 8-bit electrical SVX readout data to an optical signal operating at 1.062 Gbit/sec, sent to the VRB. Eight HDIs will be serviced per board; (4) Built-in logic analyzer which can record the most important control and data lines during a data acquisition cycle and put this recorded information onto the 1553 bus; (5) Identification header and end of data trailer tacked onto data stream; (6) 1553 register which can read the current values of the control and data lines; (7) 1553 register which can test the optical link; (8) 1553 registers for crossing pulse width, calibration pulse voltage, and calibration pipeline select; (9) 1553 register for reading the optical drivers status link; (10) 1553 register for power control of SVX chips and ignoring bad SVX strings; (11) Front panel displays and LEDs show the board status at a glance; (12) In-system programmable EPLDs are programmed via 1553 or Altera's 'Bitblaster'; (13) Automatic readout abort after 45us; (14) Supplies BUSY signal back to Trigger Framework; (15) Supports a heartbeat system to prevent excessive SVX current draw; and (16) Supports a SVX power trip feature if heartbeat failure occurs.

  10. Sequence Algebra, Sequence Decision Diagrams and Dynamic Fault Trees

    International Nuclear Information System (INIS)

    Rauzy, Antoine B.

    2011-01-01

    Dynamic Fault Trees have received considerable attention in the past few years. By adding new gates to static (regular) Fault Trees, Dynamic Fault Trees aim to take into account dependencies among events. Merle et al. recently proposed an algebraic framework to give a formal interpretation to these gates. In this article, we extend Merle et al.'s work by adopting a slightly different perspective. We introduce Sequence Algebras that can be seen as Algebras of Basic Events, representing failures of non-repairable components. We show how to interpret Dynamic Fault Trees within this framework. Finally, we propose a new data structure to encode sets of sequences of Basic Events: Sequence Decision Diagrams. Sequence Decision Diagrams are very much inspired by Minato's Zero-Suppressed Binary Decision Diagrams. We show that all operations of Sequence Algebras can be performed on this data structure.

  11. Chameleon sequences in neurodegenerative diseases

    International Nuclear Information System (INIS)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-01-01

    Chameleon sequences can adopt an alpha helix, a beta sheet, or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we benefited from the large PDB and performed a sequence analysis on chameleons, for which we developed an algorithm to extract peptide segments with identical sequences but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity less than 25%. Our data were classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that there are proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) with the largest numbers of chameleons (640 and 405 respectively). These proteins have known roles in neurodegenerative diseases. Therefore it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study on them can shed light on curing and preventing neurodegenerative diseases.
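
    The idea of extracting segments with identical sequence but different structure can be sketched as follows. This is not the authors' algorithm, which the abstract does not describe; the function name, the window length k, the one-letter H/E/C structure strings, and the rule that a window must have a uniform structure are all assumptions made for illustration.

```python
from collections import defaultdict


def find_chameleons(records, k=6):
    """Return segments of length k seen in more than one structural state.

    records: iterable of (sequence, structure) pairs of equal length, where
    structure has one letter per residue: H (helix), E (strand), C (coil).
    Only windows whose structure is uniform (for example all H) are counted.
    """
    states = defaultdict(set)
    for seq, ss in records:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for i in range(len(seq) - k + 1):
            window_ss = ss[i:i + k]
            if len(set(window_ss)) == 1:
                states[seq[i:i + k]].add(window_ss[0])
    # Keep only segments observed in at least two different states.
    return {seg: "".join(sorted(s)) for seg, s in states.items() if len(s) > 1}


# Hypothetical toy records, not entries from the PDB.
toy_records = [
    ("MKVLAAGIVG", "CCHHHHHHCC"),
    ("TTVLAAGITT", "CCEEEEEECC"),
]
chameleons = find_chameleons(toy_records, k=6)
print(chameleons)
```

    In this sketch the states are reported as sorted letters, so a helix/strand segment appears as "EH", which corresponds to the HE class named in the abstract.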

  12. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of 32P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T1, T2, and S1) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions.

  13. Chameleon sequences in neurodegenerative diseases.

    Science.gov (United States)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-03-25

    Chameleon sequences can adopt an alpha helix, a beta sheet, or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we benefited from the large PDB and performed a sequence analysis on chameleons, for which we developed an algorithm to extract peptide segments with identical sequences but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity less than 25%. Our data were classified into "helix to strand (HE)", "helix to coil (HC)" and "strand to coil (CE)" alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that there are proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) with the largest numbers of chameleons (640 and 405 respectively). These proteins have known roles in neurodegenerative diseases. Therefore it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study on them can shed light on curing and preventing neurodegenerative diseases. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Farey sequences and resistor networks

    Indian Academy of Sciences (India)

    Green's function, while the perturbation of a network is investigated in [3]. ... In Theorem 1 below, we employ the Farey sequence to establish a strict .... We next show that the Farey sequence method is applicable for circuits with n or fewer.

  15. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible by the advent of modern sequencing technologies that was a result of step by step advances with a contribution of academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of capillary electrophoresis in DNA sequencing based in part of several of our articles in this journal. PMID:19517496

  16. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  17. Chameleon sequences in neurodegenerative diseases

    Energy Technology Data Exchange (ETDEWEB)

    Bahramali, Golnaz [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Goliaei, Bahram, E-mail: goliaei@ut.ac.ir [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Minuchehr, Zarrin, E-mail: minuchehr@nigeb.ac.ir [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of); Salari, Ali [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of)

    2016-03-25

    Chameleon sequences can adopt an alpha helix, a beta sheet, or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we benefited from the large PDB and performed a sequence analysis on chameleons, for which we developed an algorithm to extract peptide segments with identical sequences but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity less than 25%. Our data were classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that there are proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) with the largest numbers of chameleons (640 and 405 respectively). These proteins have known roles in neurodegenerative diseases. Therefore it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study on them can shed light on curing and preventing neurodegenerative diseases.

  18. Commercial Art: Scope and Sequence.

    Science.gov (United States)

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This scope and sequence guide, developed for a commercial art vocational education program, represents an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System. It was developed as a result of needs expressed by teachers, parents, and the…

  19. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Directory of Open Access Journals (Sweden)

    Huaiyong Luo

    Full Text Available The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.

  20. Genome-Wide Analysis of Simple Sequence Repeats and Efficient Development of Polymorphic SSR Markers Based on Whole Genome Re-Sequencing of Multiple Isolates of the Wheat Stripe Rust Fungus.

    Science.gov (United States)

    Luo, Huaiyong; Wang, Xiaojie; Zhan, Gangming; Wei, Guorong; Zhou, Xinli; Zhao, Jing; Huang, Lili; Kang, Zhensheng

    2015-01-01

    The biotrophic parasitic fungus Puccinia striiformis f. sp. tritici (Pst) causes stripe rust, a devastating disease of wheat, endangering global food security. Because the Pst population is highly dynamic, it is difficult to develop wheat cultivars with durable and highly effective resistance. Simple sequence repeats (SSRs) are widely used as molecular markers in genetic studies to determine population structure in many organisms. However, only a small number of SSR markers have been developed for Pst. In this study, a total of 4,792 SSR loci were identified using the whole genome sequences of six isolates from different regions of the world, with a marker density of one SSR per 22.95 kb. The majority of the SSRs were di- and tri-nucleotide repeats. A database containing 1,113 SSR markers was established. Through in silico comparison, the previously reported SSR markers were found mainly in exons, whereas the SSR markers in the database were mostly in intergenic regions. Furthermore, 105 polymorphic SSR markers were confirmed in silico by their identical positions and nucleotide variations with INDELs identified among the six isolates. When 104 in silico polymorphic SSR markers were used to genotype 21 Pst isolates, 84 produced the target bands, and 82 of them were polymorphic and revealed the genetic relationships among the isolates. The results show that whole genome re-sequencing of multiple isolates provides an ideal resource for developing SSR markers, and the newly developed SSR markers are useful for genetic and population studies of the wheat stripe rust fungus.
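
    Genome-wide SSR identification of the kind described in this abstract can be approximated with a regular-expression scan. This is a minimal sketch, assuming perfect (uninterrupted) repeats only; the function name and the minimum repeat counts are illustrative assumptions, not the thresholds or the tool used in the study.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Locate perfect SSRs with motif lengths 2 to 6 in a DNA string.

    Returns sorted (start, motif, repeat_count) tuples with 0-based starts.
    The default minimum repeat counts are illustrative, not from the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}
    seq = seq.upper()
    hits = []
    for size, reps in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least reps - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, reps - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # mononucleotide run such as AAAAAA
            # Skip motifs that are themselves repeats of a shorter unit.
            if any(motif == motif[:d] * (size // d)
                   for d in range(2, size) if size % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


# Hypothetical toy sequence: (AC)7 and (CAG)5 embedded in flanking bases.
toy = "TTG" + "AC" * 7 + "GGT" + "CAG" * 5 + "TT"
ssrs = find_ssrs(toy)
print(ssrs)
```

    Dividing the scanned sequence length by the number of hits gives a marker density comparable in form to the one SSR per 22.95 kb reported above.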

  1. Rapid Diagnostics of Onboard Sequences

    Science.gov (United States)

    Starbird, Thomas W.; Morris, John R.; Shams, Khawaja S.; Maimone, Mark W.

    2012-01-01

    Keeping track of sequences onboard a spacecraft is challenging. When reviewing Event Verification Records (EVRs) of sequence executions on the Mars Exploration Rover (MER), operators often found themselves wondering which version of a named sequence the EVR corresponded to. The lack of this information drastically impacts the operators' diagnostic capabilities as well as their situational awareness with respect to the commands the spacecraft has executed, since the EVRs do not provide argument values or explanatory comments. Having this information immediately available can be instrumental in diagnosing critical events and can significantly enhance the overall safety of the spacecraft. This software provides auditing capability that can eliminate that uncertainty while diagnosing critical conditions. Furthermore, the RESTful interface provides a simple way for sequencing tools to automatically retrieve binary compiled sequence SCMFs (Space Command Message Files) on demand. It also enables developers to change the underlying database, while maintaining the same interface to the existing applications. The logging capabilities are also beneficial to operators when they are trying to recall how they solved a similar problem many days ago: this software enables automatic recovery of SCMF and RML (Robot Markup Language) sequence files directly from the command EVRs, eliminating the need for people to find and validate the corresponding sequences. To address the lack of auditing capability for sequences onboard a spacecraft during earlier missions, extensive logging support was added on the Mars Science Laboratory (MSL) sequencing server. This server is responsible for generating all MSL binary SCMFs from RML input sequences. The sequencing server logs every SCMF it generates into a MySQL database, as well as the high-level RML file and dictionary name inputs used to create the SCMF. The SCMF is then indexed by a hash value that is automatically included in all command

  2. Accident sequence quantification with KIRAP

    International Nuclear Information System (INIS)

    Kim, Tae Un; Han, Sang Hoon; Kim, Kil You; Yang, Jun Eon; Jeong, Won Dae; Chang, Seung Cheol; Sung, Tae Yong; Kang, Dae Il; Park, Jin Hee; Lee, Yoon Hwan; Hwang, Mi Jeong.

    1997-01-01

    The tasks of probabilistic safety assessment (PSA) consist of the identification of initiating events, the construction of an event tree for each initiating event, the construction of fault trees for event tree logics, the analysis of reliability data and finally the accident sequence quantification. In the PSA, accident sequence quantification calculates the core damage frequency and covers importance analysis and uncertainty analysis. Accident sequence quantification requires an understanding of the whole PSA model because it has to combine all event tree and fault tree models, and it requires an excellent computer code because of the long computation time. The Advanced Research Group of the Korea Atomic Energy Research Institute (KAERI) has developed the PSA workstation KIRAP (Korea Integrated Reliability Analysis Code Package) for PSA work. This report describes the procedures for performing accident sequence quantification, the method for using KIRAP's cut set generator, and the method for performing accident sequence quantification with KIRAP. (author). 6 refs

  3. Accident sequence quantification with KIRAP

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Tae Un; Han, Sang Hoon; Kim, Kil You; Yang, Jun Eon; Jeong, Won Dae; Chang, Seung Cheol; Sung, Tae Yong; Kang, Dae Il; Park, Jin Hee; Lee, Yoon Hwan; Hwang, Mi Jeong

    1997-01-01

    The tasks of probabilistic safety assessment (PSA) consist of the identification of initiating events, the construction of an event tree for each initiating event, the construction of fault trees for event tree logics, the analysis of reliability data and finally the accident sequence quantification. In the PSA, accident sequence quantification calculates the core damage frequency and covers importance analysis and uncertainty analysis. Accident sequence quantification requires an understanding of the whole PSA model because it has to combine all event tree and fault tree models, and it requires an excellent computer code because of the long computation time. The Advanced Research Group of the Korea Atomic Energy Research Institute (KAERI) has developed the PSA workstation KIRAP (Korea Integrated Reliability Analysis Code Package) for PSA work. This report describes the procedures for performing accident sequence quantification, the method for using KIRAP's cut set generator, and the method for performing accident sequence quantification with KIRAP. (author). 6 refs.

  4. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences, each of approximately 4 x 10^7 daltons piece size. Repeated DNA sequence homoduplexes showed on average a 5 °C difference in T_e50 (temperature at which 50 percent of duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  5. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study is to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and design more reasonable vaccines to prevent tuberculosis. We assembled the high-throughput sequencing data with SOAPdenovo software, obtaining many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and combining these with clone construction for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank of the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems and strategies during the finishing step of BCG Tice sequencing are described here, in the hope of offering some experience to those who are involved in the finishing step of genome sequencing. The microarray data were verified by our results.

  6. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    Directory of Open Access Journals (Sweden)

    Michael Allevato

    Full Text Available The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX) bind Enhancer box (E-box) DNA elements (CANNTG) and have the greatest affinity for the canonical MYC E-box (CME) CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87%) of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.
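
    The three motif classes named in the abstract (the canonical E-box CACGTG, other CANNTG E-box variants, and the non-E-box hexamer AACGTT) can be located in a sequence with a simple scan. The sketch below is only an illustration of that classification; the function name `scan_motifs` and the dictionary keys are hypothetical, and the study itself relied on protein-binding microarrays and binding data rather than on a string search.

```python
import re

# Lookahead so that overlapping hexamers would all be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
CANONICAL = "CACGTG"   # canonical MYC E-box (CME)
NON_EBOX = "AACGTT"    # low-affinity non-E-box motif from the abstract


def scan_motifs(seq):
    """Return 0-based start positions of each motif class in seq."""
    seq = seq.upper()
    sites = {"canonical_ebox": [], "variant_ebox": [], "non_ebox_AACGTT": []}
    for m in EBOX.finditer(seq):
        key = "canonical_ebox" if m.group(1) == CANONICAL else "variant_ebox"
        sites[key].append(m.start())
    start = seq.find(NON_EBOX)
    while start != -1:
        sites["non_ebox_AACGTT"].append(start)
        start = seq.find(NON_EBOX, start + 1)
    return sites


if __name__ == "__main__":
    print(scan_motifs("TTCACGTGAAAACGTTGGCAGCTGTT"))
```

    Because CANNTG as a pattern, CACGTG and AACGTT are each unchanged under reverse complementation, scanning one strand is enough to find the site positions.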

  7. Nuclear and cpDNA sequences combined provide strong inference of higher phylogenetic relationships in the phlox family (Polemoniaceae).

    Science.gov (United States)

    Johnson, Leigh A; Chan, Lauren M; Weese, Terri L; Busby, Lisa D; McMurry, Samuel

    2008-09-01

    Members of the phlox family (Polemoniaceae) serve as useful models for studying various evolutionary and biological processes. Despite its biological importance, no family-wide phylogenetic estimate based on multiple DNA regions with complete generic sampling is available. Here, we analyze one nuclear and five chloroplast DNA sequence regions (nuclear ITS, chloroplast matK, trnL intron plus trnL-trnF intergenic spacer, and the trnS-trnG, trnD-trnT, and psbM-trnD intergenic spacers) using parsimony and Bayesian methods, as well as assessments of congruence and long branch attraction, to explore phylogenetic relationships among 84 ingroup species representing all currently recognized Polemoniaceae genera. Relationships inferred from the ITS and concatenated chloroplast regions are similar overall. A combined analysis provides strong support for the monophyly of Polemoniaceae and subfamilies Acanthogilioideae, Cobaeoideae, and Polemonioideae. Relationships among subfamilies, and thus for the precise root of Polemoniaceae, remain poorly supported. Within the largest subfamily, Polemonioideae, four clades corresponding to tribes Polemonieae, Phlocideae, Gilieae, and Loeselieae receive strong support. The monogeneric Polemonieae appears sister to Phlocideae. Relationships within Polemonieae, Phlocideae, and Gilieae are mostly consistent between analyses and data permutations. Many relationships within Loeselieae remain uncertain. Overall, inferred phylogenetic relationships support a higher-level classification for Polemoniaceae proposed in 2000.

  8. Molecular characterization of Fasciola spp. from the endemic area of northern Iran based on nuclear ribosomal DNA sequences.

    Science.gov (United States)

    Amor, Nabil; Halajian, Ali; Farjallah, Sarra; Merella, Paolo; Said, Khaled; Ben Slimane, Badreddine

    2011-07-01

    Fasciolosis caused by Fasciola spp. (Platyhelminthes: Trematoda: Digenea) is considered the most important helminth infection of ruminants in tropical countries, causing considerable socioeconomic problems. In the endemic regions of the north of Iran, Fasciola hepatica and Fasciola gigantica have been previously characterized on the basis of morphometric differences, but the use of molecular markers is necessary to distinguish exactly between species and intermediate forms. Samples from buffaloes and goats from different localities of northern Iran were identified morphologically and then genetically characterized by sequences of the first (ITS-1) and second (ITS-2) Internal Transcribed Spacers (ITS) of nuclear ribosomal DNA (rDNA). Comparison of the ITS of the northern Iranian samples with sequences of Fasciola spp. from GenBank showed that the examined specimens had sequences identical to those of the most frequent haplotypes of F. hepatica (n=25, 48.1%) and F. gigantica (n=20, 38.45%), which differed from each other in different variable nucleotide positions of ITS region sequences, and their intermediate forms (n=7, 13.45%), which had overlapping nucleotides between the two Fasciola species in all the positions. The ITS sequences from populations of Fasciola isolates in buffaloes and goats had experienced introgression/hybridization as previously reported in isolates from other ruminants and humans. Based on ITS-1 and ITS-2 sequences, flukes are scattered in pure F. hepatica, F. gigantica and intermediate Fasciola clades, revealing that multiple genotypes of Fasciola are able to infect goats and buffaloes in the north of Iran. Furthermore, the phylogenetic trees based upon the ITS-1 and ITS-2 sequences showed a close relationship of the Iranian samples with isolates of F. hepatica and F. gigantica from different localities of Africa and Asia. In the present study, the intergenic transcribed spacers ITS-1 and ITS-2 proved to be reliable approaches for the genetic

  9. Deduced amino acid sequence of the small hydrophobic protein of US avian pneumovirus has greater identity with that of human metapneumovirus than those of non-US avian pneumoviruses.

    Science.gov (United States)

    Yunus, Abdul S; Govindarajan, Dhanasekaran; Huang, Zhuhui; Samal, Siba K

    2003-05-01

    We report here the nucleotide and deduced amino acid (aa) sequences of the small hydrophobic (SH) gene of the avian pneumovirus strain Colorado (APV/CO). The SH gene of APV/CO is 628 nucleotides in length from gene-start to gene-end. The longest ORF of the SH gene encoded a protein of 177 aas in length. Comparison of the deduced aa sequence of the SH protein of APV/CO with the corresponding published sequences of other members of the genus Metapneumovirus showed 28% identity with the newly discovered human metapneumovirus (hMPV), but no discernible identity with the APV subgroup A or B. Collectively, these data support the hypothesis that: (i) APV/CO is distinct from European APV subgroups and belongs to the novel subgroup APV/C (APV/US); (ii) APV/CO is more closely related to hMPV, a mammalian metapneumovirus, than to either APV subgroup A or B. The SH gene of APV/CO was cloned using a genomic walk strategy which initiated cDNA synthesis from genomic RNA that traversed the genes in the order 3'-M-F-M2-SH-G-5', thus confirming that the gene order of APV/CO conforms to that of the genus Metapneumovirus. We also provide the sequences of transcription signals and the M-F, F-M2, M2-SH and SH-G intergenic regions of APV/CO.

  10. GROUPING WEB ACCESS SEQUENCES USING SEQUENCE ALIGNMENT METHOD

    OpenAIRE

    BHUPENDRA S CHORDIA; KRISHNAKANT P ADHIYA

    2011-01-01

    In web usage mining, grouping of web access sequences can be used to determine the behavior or intent of a set of users. A key problem in grouping web sessions is how to measure the similarity between web sessions. There are many shortcomings in traditional measurement methods. The task of grouping web sessions based on similarity, which consists of maximizing the intra-group similarity while minimizing the inter-group similarity, is done using a sequence alignment method. This paper introduces a new method to group we...

  11. LPTAU, Quasi Random Sequence Generator

    International Nuclear Information System (INIS)

    Sobol, Ilya M.

    1993-01-01

    1 - Description of program or function: LPTAU generates quasi random sequences. These are uniformly distributed sets of L=M N points in the N-dimensional unit cube I^N = [0,1] x ... x [0,1]. These sequences are used as nodes for multidimensional integration; as searching points in global optimization; as trial points in multi-criteria decision making; as quasi-random points for quasi Monte Carlo algorithms. 2 - Method of solution: Uses LP-TAU sequence generation (see references). 3 - Restrictions on the complexity of the problem: The number of points that can be generated is L 30 . The dimension of the space cannot exceed 51.
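
    The record above describes quasi-random points used as integration nodes. LPTAU itself is not reproduced here; as a minimal one-dimensional illustration of the idea, the sketch below generates the base-2 van der Corput (radical inverse) sequence, a classic low-discrepancy sequence, and uses it for a quasi-Monte Carlo estimate of an integral over [0, 1]. The function names `van_der_corput` and `qmc_integrate` are hypothetical and are not part of LPTAU.

```python
def van_der_corput(n, base=2):
    """Radical inverse of the non-negative integer n in the given base."""
    q, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        q += digit / denom  # mirror the digits of n about the radix point
    return q


def qmc_integrate(f, n_points):
    """Estimate the integral of f over [0, 1] from quasi-random nodes."""
    total = sum(f(van_der_corput(i)) for i in range(n_points))
    return total / n_points


if __name__ == "__main__":
    print([van_der_corput(i) for i in range(1, 8)])
    print(qmc_integrate(lambda x: x * x, 1024))  # exact value is 1/3
```

    Each new point falls into the largest remaining gap, which is why such sequences cover the interval more evenly than pseudo-random draws; multidimensional generators of the kind the record describes extend this idea to the N-dimensional cube.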

  12. Weak disorder in Fibonacci sequences

    Energy Technology Data Exchange (ETDEWEB)

    Ben-Naim, E [Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); Krapivsky, P L [Department of Physics and Center for Molecular Cybernetics, Boston University, Boston, MA 02215 (United States)

    2006-05-19

    We study how weak disorder affects the growth of the Fibonacci series. We introduce a family of stochastic sequences that grow by the normal Fibonacci recursion with probability 1 - ε, but follow a different recursion rule with a small probability ε. We focus on the weak disorder limit and obtain the Lyapunov exponent that characterizes the typical growth of the sequence elements, using perturbation theory. The limiting distribution for the ratio of consecutive sequence elements is obtained as well. A number of variations to the basic Fibonacci recursion including shift, doubling and copying are considered. (letter to the editor)
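
    The Lyapunov exponent mentioned above is the average logarithmic growth rate of the sequence elements, and it can be estimated numerically. The sketch below does so by direct simulation rather than by the perturbation theory used in the paper. The abstract does not spell out the alternative recursion rules, so the rule used here (with probability ε the next element simply copies the current one) is an assumption made for illustration, as is the function name `lyapunov_estimate`.

```python
import math
import random


def lyapunov_estimate(eps, steps=100_000, seed=0):
    """Estimate the growth rate (1/n) * ln(x_n) of a disordered Fibonacci sequence.

    With probability 1 - eps the step is x_{n+1} = x_n + x_{n-1}; with
    probability eps it is x_{n+1} = x_n (an assumed "copying" rule).
    """
    rng = random.Random(seed)
    prev, cur = 1.0, 1.0
    log_growth = 0.0
    for _ in range(steps):
        nxt = cur if rng.random() < eps else prev + cur
        prev, cur = cur, nxt
        # Accumulate the log of the growth factor, then rescale so that
        # the values never overflow.
        log_growth += math.log(cur)
        prev, cur = prev / cur, 1.0
    return log_growth / steps


if __name__ == "__main__":
    print(lyapunov_estimate(0.0))   # close to ln of the golden ratio
    print(lyapunov_estimate(0.05))  # weak disorder slows the growth
```

    With ε = 0 the estimate approaches the logarithm of the golden ratio, about 0.4812, which gives a simple check on the simulation.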

  13. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease; one form is leukemia, better known as blood cancer. Cancer cells can be detected by analyzing DNA in a laboratory test. This study focused on local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm. The Smith-Waterman algorithm was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm tries to find the greatest possible similarity between a pair of sequences by giving a negative score to unequal base pairs (mismatches) and a positive score to equal base pairs (matches). The maximum positive score then marks the end of the alignment, and the minimum value marks its start. This study will use sequences of leukemia and 3 sequences of non leukemia.
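
    The scoring scheme described above (positive score for a match, negative score for a mismatch, alignment ending at the maximum score) can be sketched as a small Smith-Waterman implementation with a linear gap penalty. The scoring values (match +2, mismatch -1, gap -2) and the function name `smith_waterman` are assumptions for illustration; the abstract does not state the parameters used in the study.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, first row/column zero
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are never allowed to drop below zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum score until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("GGGACGTACCC", "TTACGTATT"))
```

    The matrix needs time and memory proportional to the product of the two sequence lengths, so this plain version suits short sequences; long genomic sequences usually call for banded or heuristic variants.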

  14. Variation in extragenic repetitive DNA sequences in Pseudomonas syringae and potential use of modified REP primers in the identification of closely related isolates

    Directory of Open Access Journals (Sweden)

    Elif Çepni

    2012-01-01

    Full Text Available In this study, Pseudomonas syringae pathovars isolated from olive, tomato and bean were identified by species-specific PCR and their genetic diversity was assessed by repetitive extragenic palindromic (REP)-PCR. Reverse universal primers for REP-PCR were designed by using the bases of A, T, G or C at the positions of 1, 4 and 11 to identify additional polymorphism in the banding patterns. Binding of the primers to different annealing sites in the genome revealed additional fingerprint patterns in eight isolates of P. savastanoi pv. savastanoi and two isolates of P. syringae pv. tomato. The use of four different bases in the primer sequences did not affect the PCR reproducibility and was very efficient in revealing intra-pathovar diversity, particularly in P. savastanoi pv. savastanoi. At the pathovar level, the primer BOX1AR yielded shared fragments, in addition to five bands that discriminated among the pathovars P. syringae pv. phaseolicola, P. savastanoi pv. savastanoi and P. syringae pv. tomato. REP-PCR with a modified primer containing C produced identical bands among the isolates in a pathovar but separated three pathovars more distinctly than four other primers. Although REP- and BOX-PCRs have been successfully used in the molecular identification of Pseudomonas isolates from Turkish flora, a PCR based on inter-enterobacterial repetitive intergenic consensus (ERIC) sequences failed to produce clear banding patterns in this study.

  15. Integrated sequence analysis. Final report

    International Nuclear Information System (INIS)

    Andersson, K.; Pyy, P.

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed in a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  16. Optimization of sequence alignment for simple sequence repeat regions

    Directory of Open Access Journals (Sweden)

    Ogbonnaya Francis C

    2011-07-01

    Full Text Available Abstract Background Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSRs have been used as molecular markers because they are easy to detect and serve a range of applications, including genetic diversity, genome mapping, and marker assisted selection. They are also very mutable because of slipping of the DNA polymerase during DNA replication. This unique mutation mechanism raises the insertion/deletion (INDEL) mutation frequency to a high level, higher than in other types of molecular markers such as single nucleotide polymorphisms (SNPs). Across the genome as a whole, however, SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. Findings To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using a PERL script with a Tk graphical interface. This program is based on aligning sequences after first determining the repeated units and the last SSR nucleotide positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. Conclusions The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic

  17. ADDRESS SEQUENCES FOR MULTI RUN RAM TESTING

    Directory of Open Access Journals (Sweden)

    V. N. Yarmolik

    2014-01-01

    Full Text Available A universal approach for generation of address sequences with specified properties is proposed and analyzed. A modified version of the Antonov and Saleev algorithm for Sobol sequences generation is chosen as a mathematical description of the proposed method. Within the framework of the proposed universal approach, the Sobol sequences form a subset of the address sequences. Other subsets are also formed, which are Gray sequences, anti-Gray sequences, counter sequences and sequences with specified properties.
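
    Two of the address-sequence families named above are easy to illustrate: a counter sequence visits addresses in increasing binary order, while a Gray sequence changes exactly one address bit per step. The sketch below generates both for an n-bit address space using the standard reflected binary Gray code. It does not implement the paper's modified Antonov and Saleev construction, and the function names are hypothetical.

```python
def counter_addresses(n_bits):
    """Addresses in plain counting order: 0, 1, 2, ..."""
    return list(range(1 << n_bits))


def gray_addresses(n_bits):
    """Addresses in reflected binary Gray code order."""
    return [i ^ (i >> 1) for i in range(1 << n_bits)]


def hamming(a, b):
    """Number of address bits that differ between a and b."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    seq = gray_addresses(3)
    print([format(addr, "03b") for addr in seq])
    print([hamming(x, y) for x, y in zip(seq, seq[1:])])
```

    Both orders visit every address exactly once; they differ only in how many address lines switch between consecutive accesses, which is the property varied across runs in multi-run memory testing.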

  18. Reduction of IgE binding and nonpromotion of Aspergillus flavus fungal growth by simultaneously silencing Ara h 2 and Ara h 6 in peanut.

    Science.gov (United States)

    The most potent peanut allergens, Ara h 2 and 6, were silenced in transgenic plants by RNA interference. Three independent transgenic lines were recovered after microprojectile bombardment, of which two contained single, integrated copies of the transgene. The third line contained multiple copies ...

  19. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  20. Decidability of uniform recurrence of morphic sequences

    OpenAIRE

    Durand, Fabien

    2012-01-01

    We prove that the uniform recurrence of morphic sequences is decidable. For this we show that the number of derived sequences of uniformly recurrent morphic sequences is bounded. As a corollary we obtain that uniformly recurrent morphic sequences are primitive substitutive sequences.

  1. [Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

    2009-08-01

    Based on sequencing the full-length genomes of two rabies viruses from Chinese ferret-badgers, we analyzed the genetic variation of rabies viruses at the molecular level to obtain information on the prevalence and variation of rabies viruses in Zhejiang, and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and full-length genomes were assembled; nucleotide and deduced protein similarities were analyzed, and phylogenetic analyses of the N genes from Chinese ferret-badger, sika deer, vole, dog and vaccine strains were performed. The two full-length genomes were completely sequenced and had the same genetic structure, with a length of 11,923 nts comprising a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP, 6386-nt LP, intergenic regions (IGRs) of 2, 5 and 5 nts, a 423-nt pseudogene-like sequence (Psi) and a 70-nt trailer. BLAST and multiple sequence alignment showed that the two full-length genomes were consistent with the properties of the genus Lyssavirus, family Rhabdoviridae. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Between the two full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so the nucleotide mutations in these two genomes were most probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no change in length and no recombination, only a few point mutations. The five proteins therefore appeared to be stable. The variation sites and types in the two ferret-badger genomes were similar to those of the reference vaccine or street strains. According to the multiple sequence and phylogenetic analyses, the two strains were genotype 1 and possessed the distinct geographic characteristics of China. All the evidence suggested that these two ferret badgers

  2. [Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

    2010-01-01

    Based on sequencing the full-length genomes of four rabies viruses from Chinese ferret-badgers and a dog, we analyzed the genetic variation of rabies viruses at the molecular level, obtained information on the prevalence and variation of rabies viruses in Zhejiang, and enriched the genome database of rabies virus street strains isolated in China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR and full-length genomes were assembled; nucleotide and deduced protein similarities were analyzed, and phylogenetic analyses of strains from Chinese ferret-badger, dog, sika deer, vole and the vaccine strain in use were performed. The four full-length genomes were completely sequenced and had the same genetic structure, with a length of 11,923 nts or 11,925 nts comprising a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP, 6386-nt LP, intergenic regions (IGRs) of 2, 5 and 5 nts, a 423-nt pseudogene-like sequence (psi) and a 70-nt trailer. BLAST and multiple sequence alignment showed that the four full-length genomes were consistent with the properties of the genus Lyssavirus, family Rhabdoviridae. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Among the four full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so most of the nucleotide mutations in these four genomes were synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no change in length and no recombination, only a few point mutations. The five proteins therefore appeared to be stable. The variation sites and types in the four genomes were similar to those of the reference vaccine or street strains. According to the multiple sequence and phylogenetic analyses, the four strains were genotype 1 and possessed the distinct regional characteristics of China. Therefore, these four rabies viruses are likely to be street viruses

  3. Killer Immunoglobulin-Like Receptor Allele Determination Using Next-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Bercelin Maniangou

    2017-05-01

    Full Text Available The impact of natural killer (NK) cell alloreactivity on hematopoietic stem cell transplantation (HSCT) outcome is still debated due to the complexity of graft parameters, HLA class I environment, the nature of killer cell immunoglobulin-like receptor (KIR)/KIR ligand genetic combinations studied, and KIR+ NK cell repertoire size. KIR genes are known to be polymorphic in terms of gene content, copy number variation, and number of alleles. These allelic polymorphisms may impact both the phenotype and function of KIR+ NK cells. We, therefore, speculate that polymorphisms may alter donor KIR+ NK cell phenotype/function thus modulating post-HSCT KIR+ NK cell alloreactivity. To investigate KIR allele polymorphisms of all KIR genes, we developed a next-generation sequencing (NGS) technology on a MiSeq platform. To ensure the reliability and specificity of our method, genomic DNA from well-characterized cell lines were used; high-resolution KIR typing results obtained were then compared to those previously reported. Two different bioinformatic pipelines were used allowing the attribution of sequencing reads to specific KIR genes and the assignment of KIR alleles for each KIR gene. Our results demonstrated successful long-range KIR gene amplifications of all reference samples using intergenic KIR primers. The alignment of reads to the human genome reference (hg19) using BiRD pipeline or visualization of data using Profiler software demonstrated that all KIR genes were completely sequenced with a sufficient read depth (mean 317× for all loci) and a high percentage of mapping (mean 93% for all loci). Comparison of high-resolution KIR typing obtained to those published data using exome capture resulted in a reported concordance rate of 95% for centromeric and telomeric KIR genes. Overall, our results suggest that NGS can be used to investigate the broad KIR allelic polymorphism. Hence, these data improve our knowledge, not only on KIR+ NK cell alloreactivity in

  4. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    Science.gov (United States)

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. coli operons are

  5. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects, because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between an input sequence and a reference sequence, have gained a lot of interest in this field. Highly similar sequences, e.g., human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for referential compression against multiple references: the factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (a few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
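The idea of factorizing a sequence against several references can be illustrated with a minimal sketch. It is not the algorithm or the heuristics evaluated in the paper: it is a naive greedy longest-match search, and the names `factorize`, `reconstruct` and `min_len` are hypothetical and introduced only for illustration. Each factor is either a copy instruction (reference index, start position, length) or a single literal character.

```python
def factorize(seq, refs, min_len=2):
    """Greedy factorization of seq against a list of reference strings.

    Hypothetical illustration only: uses a naive substring search,
    whereas real tools rely on index structures for speed.
    """
    factors = []
    i = 0
    while i < len(seq):
        best_len, best_ref, best_pos = 0, -1, -1
        for r, ref in enumerate(refs):
            # Only look for matches longer than the best one found so far.
            length = best_len + 1
            while i + length <= len(seq):
                pos = ref.find(seq[i:i + length])
                if pos < 0:
                    break
                best_len, best_ref, best_pos = length, r, pos
                length += 1
        if best_len >= min_len:
            factors.append(("copy", best_ref, best_pos, best_len))
            i += best_len
        else:
            # No sufficiently long match: store the character itself.
            factors.append(("lit", seq[i]))
            i += 1
    return factors


def reconstruct(factors, refs):
    """Rebuild the original sequence from its factors and the references."""
    parts = []
    for factor in factors:
        if factor[0] == "copy":
            _, r, pos, length = factor
            parts.append(refs[r][pos:pos + length])
        else:
            parts.append(factor[1])
    return "".join(parts)


refs = ["ACGTACGTTTGA", "GGGCCCAAATTT"]
seq = "ACGTTTGAXGGGCCC"
multi = factorize(seq, refs)        # both references available
single = factorize(seq, refs[:1])   # first reference only
print(len(multi), len(single))
```

In this toy input the second reference covers a stretch that the first cannot, so the multi-reference factorization needs fewer factors than the single-reference one. The trade-offs in time and memory that the abstract discusses are not modelled here.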

  6. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  7. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  8. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  9. Transformed composite sequences for improved qubit addressing

    Science.gov (United States)

    Merrill, J. True; Doret, S. Charles; Vittorini, Grahame; Addison, J. P.; Brown, Kenneth R.

    2014-10-01

    Selective laser addressing of a single atom or atomic ion qubit can be improved using narrow-band composite pulse sequences. We describe a Lie-algebraic technique to generalize known narrow-band sequences and introduce sequences related by dilation and rotation of sequence generators. Our method improves known narrow-band sequences by decreasing both the pulse time and the residual error. Finally, we experimentally demonstrate these composite sequences using 40Ca+ ions trapped in a surface-electrode ion trap.

  10. A powerful method for transcriptional profiling of specific cell types in eukaryotes: laser-assisted microdissection and RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Marc W Schmid

    The acquisition of distinct cell fates is central to the development of multicellular organisms and is largely mediated by gene expression patterns specific to individual cells and tissues. A spatially and temporally resolved analysis of gene expression facilitates the elucidation of transcriptional networks linked to cellular identity and function. We present an approach that allows cell type-specific transcriptional profiling of distinct target cells, which are rare and difficult to access, with unprecedented sensitivity and resolution. We combined laser-assisted microdissection (LAM), linear amplification starting from <1 ng of total RNA, and RNA-sequencing (RNA-Seq). As a model we used the central cell of the Arabidopsis thaliana female gametophyte, one of the female gametes harbored in the reproductive organs of the flower. We estimated the number of expressed genes to be more than twice the number reported previously in a study using LAM and ATH1 microarrays, and identified several classes of genes that were systematically underrepresented in the transcriptome measured with the ATH1 microarray. Among them are many genes that are likely to be important for developmental processes and specific cellular functions. In addition, we identified several intergenic regions, which are likely to be transcribed, and describe a considerable fraction of reads mapping to introns and regions flanking annotated loci, which may represent alternative transcript isoforms. Finally, we performed a de novo assembly of the transcriptome and show that the method is suitable for studying individual cell types of organisms lacking reference sequence information, demonstrating that this approach can be applied to most eukaryotic organisms.

  11. High-throughput sequencing identification and characterization of potentially adhesion-related small RNAs in Streptococcus mutans.

    Science.gov (United States)

    Zhu, Wenhui; Liu, Shanshan; Liu, Jia; Zhou, Yan; Lin, Huancai

    2018-05-01

    Adherence capacity is one of the principal virulence factors of Streptococcus mutans, and adhesion virulence factors are controlled by small RNAs (sRNAs) at the post-transcriptional level in various bacteria. Here, we aimed to identify and decipher putative adhesion-related sRNAs in clinical strains of S. mutans. RNA deep-sequencing was performed to identify potential sRNAs under different adhesion conditions. The expression of sRNAs was analysed by quantitative real-time PCR (qRT-PCR), and bioinformatic methods were used to predict the functional characteristics of sRNAs. A total of 736 differentially expressed candidate sRNAs were predicted, and these included 352 sRNAs located on the antisense to mRNA (AM) and 384 sRNAs in intergenic regions (IGRs). The top 7 differentially expressed sRNAs were successfully validated by qRT-PCR in UA159, and 2 of these were further confirmed in 100 clinical isolates. Moreover, the sequences of two sRNAs were conserved in other Streptococcus species, indicating a conserved role in such closely related species. A good correlation between the expression of sRNAs and the adhesion of 100 clinical strains was observed, which, combined with GO and KEGG, provides a perspective for the comprehension of sRNA function annotation. This study revealed a multitude of novel putative adhesion-related sRNAs in S. mutans and contributed to a better understanding of information concerning the transcriptional regulation of adhesion in S. mutans.

  12. The genome sequence of Geobacter metallireducens: features of metabolism, physiology and regulation common and dissimilar to Geobacter sulfurreducens

    Energy Technology Data Exchange (ETDEWEB)

    Aklujkar, Muktak; Krushkal, Julia; DiBartolo, Genevieve; Lapidus, Alla; Land, Miriam L.; Lovley, Derek R.

    2008-12-01

    Background: The genome sequence of Geobacter metallireducens is the second to be completed from the metal-respiring genus Geobacter, and is compared in this report to that of Geobacter sulfurreducens in order to understand their metabolic, physiological and regulatory similarities and differences. Results: The experimentally observed greater metabolic versatility of G. metallireducens versus G. sulfurreducens is borne out by the presence of more numerous genes for metabolism of organic acids including acetate, propionate, and pyruvate. Although G. metallireducens lacks a dicarboxylic acid transporter, it has acquired a second succinate dehydrogenase/fumarate reductase complex, suggesting that respiration of fumarate was important until recently in its evolutionary history. Vestiges of the molybdate (ModE) regulon of G. sulfurreducens can be detected in G. metallireducens, which has lost the global regulatory protein ModE but retained some putative ModE-binding sites and multiplied certain genes of molybdenum cofactor biosynthesis. Several enzymes of amino acid metabolism are of different origin in the two species, but significant patterns of gene organization are conserved. Whereas most Geobacteraceae are predicted to obtain biosynthetic reducing equivalents from electron transfer pathways via a ferredoxin oxidoreductase, G. metallireducens can derive them from the oxidative pentose phosphate pathway. In addition to the evidence of greater metabolic versatility, the G. metallireducens genome is also remarkable for the abundance of multicopy nucleotide sequences found in intergenic regions and even within genes. Conclusion: The genomic evidence suggests that metabolism, physiology and regulation of gene expression in G. metallireducens may be dramatically different from other Geobacteraceae.

  13. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Background: Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL). To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest, and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum) has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results: To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome) contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber-bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF) in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. Conclusion: Comparative sequence analysis revealed highly conserved collinear regions

  14. Restrição do 16S-23S DNAr intergênico para avaliação da diversidade de Azospirillum amazonense isolado de Brachiaria spp. Restriction of 16S-23S intergenic rDNA for diversity evaluation of Azospirillum amazonense isolated from different Brachiaria spp.

    Directory of Open Access Journals (Sweden)

    Fábio Bueno dos Reis Junior

    2006-03-01

    The aim of this work was to study the intra-specific diversity of Azospirillum amazonense isolates and to establish possible influences of different Brachiaria spp. and edaphoclimatic conditions. The diversity among the A. amazonense isolates was characterized using restriction analysis of the 16S-23S rDNA intergenic spacer region. The evaluated strains separated into two groups, defined at 56% similarity. Brachiaria species influenced strain diversity. Most of the isolates from B. decumbens and B. brizantha fall in the first group, while the B. humidicola isolates are concentrated in the second group.

  15. Sequences, groups, and number theory

    CERN Document Server

    Rigo, Michel

    2018-01-01

    This collaborative book presents recent trends on the study of sequences, including combinatorics on words and symbolic dynamics, and new interdisciplinary links to group theory and number theory. Other chapters branch out from those areas into subfields of theoretical computer science, such as complexity theory and theory of automata. The book is built around four general themes: number theory and sequences, word combinatorics, normal numbers, and group theory. Those topics are rounded out by investigations into automatic and regular sequences, tilings and theory of computation, discrete dynamical systems, ergodic theory, numeration systems, automaton semigroups, and amenable groups. This volume is intended for use by graduate students or research mathematicians, as well as computer scientists who are working in automata theory and formal language theory. With its organization around unified themes, it would also be appropriate as a supplemental text for graduate level courses.

  16. Explaining the harmonic sequence paradox.

    Science.gov (United States)

    Schmidt, Ulrich; Zimper, Alexander

    2012-05-01

    According to the harmonic sequence paradox, an expected utility decision maker's willingness to pay for a gamble whose expected payoffs evolve according to the harmonic series is finite if and only if his marginal utility of additional income becomes zero for rather low payoff levels. Since the assumption of zero marginal utility is implausible for finite payoff levels, expected utility theory - as well as its standard generalizations such as cumulative prospect theory - is apparently unable to explain a finite willingness to pay. This paper first presents an experimental study of the harmonic sequence paradox. Additionally, it demonstrates that the theoretical argument of the harmonic sequence paradox only applies to time-patient decision makers, whereas the paradox is easily avoided if time-impatience is introduced. ©2011 The British Psychological Society.

  17. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  18. Matrix transformations and sequence spaces

    International Nuclear Information System (INIS)

    Nanda, S.

    1983-06-01

    In most cases the most general linear operator from one sequence space into another is actually given by an infinite matrix and therefore the theory of matrix transformations has always been of great interest in the study of sequence spaces. The study of general theory of matrix transformations was motivated by the special results in summability theory. This paper is a review article which gives almost all known results on matrix transformations. This also suggests a number of open problems for further study and will be very useful for research workers. (author)

  19. Green's theorem and Gorenstein sequences

    OpenAIRE

    Ahn, Jeaman; Migliore, Juan C.; Shin, Yong-Su

    2016-01-01

    We study consequences, for a standard graded algebra, of extremal behavior in Green's Hyperplane Restriction Theorem. First, we extend his Theorem 4 from the case of a plane curve to the case of a hypersurface in a linear space. Second, assuming a certain Lefschetz condition, we give a connection to extremal behavior in Macaulay's theorem. We apply these results to show that $(1,19,17,19,1)$ is not a Gorenstein sequence, and as a result we classify the sequences of the form $(1,a,a-2,a,1)$ th...

  20. Sequences in language and text

    CERN Document Server

    Mikros, George K

    2015-01-01

    The aim of this volume is to present the diverse but highly interesting area of the quantitative analysis of the sequence of various linguistic structures. The collected articles present a wide spectrum of quantitative analyses of linguistic syntagmatic structures and explore novel sequential linguistic entities. This volume will be interesting to all researchers studying linguistics using quantitative methods.

  1. Probabilistic studies of accident sequences

    International Nuclear Information System (INIS)

    Villemeur, A.; Berger, J.P.

    1986-01-01

    For several years, Electricite de France has carried out probabilistic assessment of accident sequences for nuclear power plants. In the framework of this program many methods were developed. As the interest in these studies was increasing and as adapted methods were developed, Electricite de France has undertaken a probabilistic safety assessment of a nuclear power plant [fr

  2. MRI sequences and their parameters

    International Nuclear Information System (INIS)

    Teissier, J.M.

    1993-01-01

    Listing basic sequences and their present variants makes a synthetic classification of the various acquisition modes possible. The knowledge of the advantages of each of them, as well as of their disadvantages and restraints, seems to be an essential prerequisite to an optimal utilization of each magnetic resonance imaging system. (author)

  3. Degree sequence in message transfer

    Science.gov (United States)

    Yamuna, M.

    2017-11-01

    Message encryption is always an issue in the current communication scenario. Methods are being devised using various domains. Graphs satisfy numerous unique properties which can be used for message transfer. In this paper, I propose a message encryption method based on the degree sequence of graphs.
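The abstract does not describe the encryption scheme itself, so the sketch below only illustrates the underlying object: the degree sequence of a simple graph, together with the classical Erdős–Gallai test for whether a list of integers can be the degree sequence of some simple graph. The function names are hypothetical and are not taken from the paper.

```python
def degree_sequence(num_vertices, edges):
    """Return the degrees of a simple undirected graph, sorted descending."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree, reverse=True)


def is_graphical(seq):
    """Erdős–Gallai test: can seq be the degree sequence of a simple graph?"""
    d = sorted(seq, reverse=True)
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    n = len(d)
    for k in range(1, n + 1):
        left = sum(d[:k])
        right = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if left > right:
            return False
    return True


# A path on four vertices: 0 - 1 - 2 - 3
path_edges = [(0, 1), (1, 2), (2, 3)]
print(degree_sequence(4, path_edges))
print(is_graphical([2, 2, 1, 1]), is_graphical([3, 3, 1, 1]))
```

Many different graphs can share one degree sequence, which is one reason a degree sequence alone does not identify a graph. How the paper exploits this for message transfer is not stated in the abstract.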

  4. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how to distinguish coding and noncoding sequences, and the classification and evolutionary relationships of organisms, are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  5. On primes in Lucas sequences

    Czech Academy of Sciences Publication Activity Database

    Křížek, Michal; Somer, L.

    2015-01-01

    Vol. 53, No. 1 (2015), pp. 2-23. ISSN 0015-0517. R&D Projects: GA ČR GA14-02067S. Institutional support: RVO:67985840. Keywords: Lucas sequence; primes. Subject RIV: BA - General Mathematics. http://www.fq.math.ca/Abstracts/53-1/somer.pdf

  6. Is sequence awareness mandatory for perceptual sequence learning: An assessment using a pure perceptual sequence learning design.

    Science.gov (United States)

    Deroost, Natacha; Coomans, Daphné

    2018-02-01

    We examined the role of sequence awareness in a pure perceptual sequence learning design. Participants had to react to the target's colour that changed according to a perceptual sequence. By varying the mapping of the target's colour onto the response keys, motor responses changed randomly. The effect of sequence awareness on perceptual sequence learning was determined by manipulating the learning instructions (explicit versus implicit) and assessing the amount of sequence awareness after the experiment. In the explicit instruction condition (n = 15), participants were instructed to intentionally search for the colour sequence, whereas in the implicit instruction condition (n = 15), they were left uninformed about the sequenced nature of the task. Sequence awareness after the sequence learning task was tested by means of a questionnaire and the process-dissociation-procedure. The results showed that the instruction manipulation had no effect on the amount of perceptual sequence learning. Based on their report to have actively applied their sequence knowledge during the experiment, participants were subsequently regrouped in a sequence strategy group (n = 14, of which 4 participants from the implicit instruction condition and 10 participants from the explicit instruction condition) and a no-sequence strategy group (n = 16, of which 11 participants from the implicit instruction condition and 5 participants from the explicit instruction condition). Only participants of the sequence strategy group showed reliable perceptual sequence learning and sequence awareness. These results indicate that perceptual sequence learning depends upon the continuous employment of strategic cognitive control processes on sequence knowledge. Sequence awareness is suggested to be a necessary but not sufficient condition for perceptual learning to take place. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Teaching Task Sequencing via Verbal Mediation.

    Science.gov (United States)

    Rusch, Frank R.; And Others

    1987-01-01

    Verbal sequence training was used to teach a moderately mentally retarded woman to sequence job-related tasks. Learning to say the tasks in the proper sequence resulted in the employee performing her tasks in that sequence, and the employee was capable of mediating her own work behavior when scheduled changes occurred. (Author/JDD)

  8. Repdigits in k-Lucas sequences

    Indian Academy of Sciences (India)

    57(2) 2000 243-254) proved that 11 is the largest number with only one distinct digit (the so-called repdigit) in the sequence (L_n^(2))_n. In this paper, we address a similar problem in the family of k-Lucas sequences. We also show that the k-Lucas sequences have similar properties to those of k-Fibonacci sequences ...
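The repdigit statement for the classical case k = 2 can be checked numerically with a short sketch. The definition used here is an assumption, since the abstract does not spell it out: the k-Lucas sequence is taken to start from k - 2 zeros followed by 2 and 1, with every later term equal to the sum of the k preceding terms. For k = 2 this gives the ordinary Lucas numbers. The names `k_lucas` and `is_repdigit` are hypothetical.

```python
def k_lucas(k, count):
    """First `count` terms of a k-Lucas sequence, starting from the term 2.

    Assumed definition: initial window of k - 2 zeros, then 2, 1;
    each further term is the sum of the k preceding terms.
    """
    window = [0] * (k - 2) + [2, 1]
    terms = [2, 1]
    while len(terms) < count:
        nxt = sum(window[-k:])
        window.append(nxt)
        terms.append(nxt)
    return terms[:count]


def is_repdigit(n):
    """True if the decimal expansion of n uses a single distinct digit."""
    return len(set(str(n))) == 1


lucas = k_lucas(2, 200)
multi_digit_repdigits = [x for x in lucas if x >= 10 and is_repdigit(x)]
print(lucas[:8])
print(multi_digit_repdigits)
```

A finite search such as this one only confirms the cited result up to the chosen bound; it is not a proof, and it says nothing about k > 2.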

  9. Characterization of Erwinia amylovora strains from different host plants using repetitive-sequences PCR analysis, and restriction fragment length polymorphism and short-sequence DNA repeats of plasmid pEA29.

    Science.gov (United States)

    Barionovi, D; Giorgi, S; Stoeger, A R; Ruppitsch, W; Scortichini, M

    2006-05-01

    The three main aims of the study were the assessment of the genetic relationship between a deviating Erwinia amylovora strain isolated from Amelanchier sp. (Maloideae) grown in Canada and other strains from Maloideae and Rosoideae, the investigation of the variability of the PstI fragment of the pEA29 plasmid using restriction fragment length polymorphism (RFLP) analysis and the determination of the number of short-sequence DNA repeats (SSR) by DNA sequence analysis in representative strains. Ninety-three strains obtained from 12 plant genera and different geographical locations were examined by repetitive-sequences PCR using Enterobacterial Repetitive Intergenic Consensus, BOX and Repetitive Extragenic Palindromic primer sets. Upon the unweighted pair group method with arithmetic mean analysis, a deviating strain from Amelanchier sp. was analysed using amplified ribosomal DNA restriction analysis (ARDRA) analysis and the sequencing of the 16S rDNA gene. This strain showed 99% similarity to other E. amylovora strains in the 16S gene and the same banding pattern with ARDRA. The RFLP analysis of pEA29 plasmid using MspI and Sau3A restriction enzymes showed a higher variability than that previously observed and no clear-cut grouping of the strains was possible. The number of SSR units reiterated two to 12 times. The strains obtained from pear orchards showing for the first time symptoms of fire blight had a low number of SSR units. The strains from Maloideae exhibit a wider genetic variability than previously thought. The RFLP analysis of a fragment of the pEA29 plasmid would not seem a reliable method for typing E. amylovora strains. A low number of SSR units was observed with first epidemics of fire blight. The current detection techniques are mainly based on the genetic similarities observed within the strains from the cultivated tree-fruit crops. For a more reliable detection of the fire blight pathogen also in wild and ornamentals Rosaceous plants the genetic

  10. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
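
    The cross-validation idea in this abstract can be illustrated with a minimal sketch. It is not the article's estimator: the names cv_score and estimate_period, the squared-error criterion, and the rule that each observation is predicted by the mean of the other observations at the same phase are all assumptions made for illustration.

```python
def cv_score(y, q):
    """Mean squared error when each y[i] is predicted from the other
    observations that share its phase (i mod q), i.e. from the other cycles."""
    n = len(y)
    total = 0.0
    for i in range(n):
        others = [y[j] for j in range(i % q, n, q) if j != i]
        prediction = sum(others) / len(others)
        total += (y[i] - prediction) ** 2
    return total / n


def estimate_period(y, max_period=None):
    """Return the candidate integer period with the smallest CV score."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    if max_period is None:
        max_period = n // 2
    if max_period > n // 2:
        # Every phase needs at least two observations to leave one out.
        raise ValueError("max_period must not exceed len(y) // 2")
    # min() keeps the first minimiser, so ties go to the smallest period.
    return min(range(1, max_period + 1), key=lambda q: cv_score(y, q))


series = [1, 5, 2] * 4  # noise-free toy sequence with period 3
period = estimate_period(series)
```

    In this noise-free toy case the multiples of the true period tie at a score of zero and the smallest candidate is returned; the implicit penalty on multiples described in the abstract concerns noisy data.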

  11. Multi-qubit compensation sequences

    International Nuclear Information System (INIS)

    Tomita, Y; Merrill, J T; Brown, K R

    2010-01-01

    The Hamiltonian control of n qubits requires precision control of both the strength and timing of interactions. Compensation pulses relax the precision requirements by reducing unknown but systematic errors. Using composite pulse techniques designed for single qubits, we show that systematic errors for n-qubit systems can be corrected to arbitrary accuracy given either two non-commuting control Hamiltonians with identical systematic errors or one error-free control Hamiltonian. We also examine composite pulses in the context of quantum computers controlled by two-qubit interactions. For quantum computers based on the XY interaction, single-qubit composite pulse sequences naturally correct systematic errors. For quantum computers based on the Heisenberg or exchange interaction, the composite pulse sequences reduce the logical single-qubit gate errors but increase the errors for logical two-qubit gates.

  12. Cassini Mission Sequence Subsystem (MSS)

    Science.gov (United States)

    Alland, Robert

    2011-01-01

    This paper describes my work with the Cassini Mission Sequence Subsystem (MSS) team during the summer of 2011. It gives some background on the motivation for this project and describes the expected benefit to the Cassini program. It then introduces the two tasks that I worked on - an automatic system auditing tool and a series of corrections to the Cassini Sequence Generator (SEQ_GEN) - and the specific objectives these tasks were to accomplish. Next, it details the approach I took to meet these objectives and the results of this approach, followed by a discussion of how the outcome of the project compares with my initial expectations. The paper concludes with a summary of my experience working on this project, lists what the next steps are, and acknowledges the help of my Cassini colleagues.

  13. Sequence complexity and work extraction

    International Nuclear Information System (INIS)

    Merhav, Neri

    2015-01-01

    We consider a simplified version of a solvable model by Mandal and Jarzynski, which constructively demonstrates the interplay between work extraction and the increase of the Shannon entropy of an information reservoir which is in contact with a physical system. We extend Mandal and Jarzynski’s main findings in several directions: first, we allow sequences of correlated bits rather than just independent bits. Secondly, at least for the case of binary information, we show that, in fact, the Shannon entropy is only one measure of complexity of the information that must increase in order for work to be extracted. The extracted work can also be upper bounded in terms of the increase in other quantities that measure complexity, like the predictability of future bits from past ones. Third, we provide an extension to the case of non-binary information (i.e. a larger alphabet), and finally, we extend the scope to the case where the incoming bits (before the interaction) form an individual sequence, rather than a random one. In this case, the entropy before the interaction can be replaced by the Lempel–Ziv (LZ) complexity of the incoming sequence, a fact that gives rise to an entropic meaning of the LZ complexity, not only in information theory, but also in physics.
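
    As a rough illustration of the LZ complexity of an individual sequence, the sketch below counts the phrases produced by LZ78 incremental parsing. The abstract does not say which Lempel–Ziv variant is used, so the choice of LZ78 and the function name lz78_phrase_count are assumptions.

```python
def lz78_phrase_count(symbols):
    """Count the phrases produced by LZ78 incremental parsing of a string.

    Each phrase is the shortest prefix of the remaining input that has not
    been seen as a phrase before; a trailing incomplete phrase counts as one.
    """
    phrases = set()
    current = ""
    for symbol in symbols:
        current += symbol
        if current not in phrases:
            phrases.add(current)
            current = ""
    return len(phrases) + (1 if current else 0)


constant = "0" * 16            # parses as 0, 00, 000, 0000, 00000, plus a leftover 0
mixed = "0100011011000001"     # parses as 0, 1, 00, 01, 10, 11, 000, 001
constant_count = lz78_phrase_count(constant)
mixed_count = lz78_phrase_count(mixed)
```

    A highly regular sequence is parsed into fewer, longer phrases than a less predictable one of the same length, which is the sense in which the phrase count serves as a complexity measure.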

  14. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that, despite this reduction of information, LSE allows meaningful information to be extracted for the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
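
    A block-wise entropy of this kind can be sketched as follows. The sketch assumes non-overlapping blocks and single-base frequencies; the article's exact LSE definition (for example, the word length used inside a block) is not given in the abstract, and the name local_shannon_entropy is hypothetical.

```python
import math
from collections import Counter


def local_shannon_entropy(seq, block_size):
    """Shannon entropy in bits of the base frequencies in each
    non-overlapping block; a trailing partial block is ignored."""
    entropies = []
    for start in range(0, len(seq) - block_size + 1, block_size):
        block = seq[start:start + block_size]
        counts = Counter(block)
        entropy = sum(
            (c / block_size) * math.log2(block_size / c) for c in counts.values()
        )
        entropies.append(entropy)
    return entropies


# First block: a single repeated base. Second block: all four bases equally often.
dna = "AAAAAAAA" + "ACGTACGT"
lse = local_shannon_entropy(dna, 8)
```

    The single-base block gives an entropy of 0 bits and the balanced block gives the 2-bit maximum for a four-letter alphabet. Note that with single-base frequencies a tandem repeat of a mixed unit such as ACGT still scores high, so detecting such repeats would need longer words or the fluctuation analysis described in the abstract.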

  15. The Complete Plastid Genome Sequence of Madagascar Periwinkle Catharanthus roseus (L.) G. Don: Plastid Genome Evolution, Molecular Marker Identification, and Phylogenetic Implications in Asterids

    Science.gov (United States)

    Ku, Chuan; Chung, Wan-Chia; Chen, Ling-Ling; Kuo, Chih-Horng

    2013-01-01

    The Madagascar periwinkle (Catharanthus roseus in the family Apocynaceae) is an important medicinal plant and is the source of several widely marketed chemotherapeutic drugs. It is also commonly grown for its ornamental values and, due to ease of infection and distinctiveness of symptoms, is often used as the host for studies on phytoplasmas, an important group of uncultivated plant pathogens. To gain insights into the characteristics of apocynaceous plastid genomes (plastomes), we used a reference-assisted approach to assemble the complete plastome of C. roseus, which could be applied to other C. roseus-related studies. The C. roseus plastome is the second completely sequenced plastome in the asterid order Gentianales. We performed comparative analyses with two other representative sequences in the same order, including the complete plastome of Coffea arabica (from the basal Gentianales family Rubiaceae) and the nearly complete plastome of Asclepias syriaca (Apocynaceae). The results demonstrated considerable variations in gene content and plastome organization within Apocynaceae, including the presence/absence of three essential genes (i.e., accD, clpP, and ycf1) and large size changes in non-coding regions (e.g., rps2-rpoC2 and IRb-ndhF). To find plastome markers of potential utility for Catharanthus breeding and phylogenetic analyses, we identified 41 C. roseus-specific simple sequence repeats. Furthermore, five intergenic regions with high divergence between C. roseus and three other euasterids I taxa were identified as candidate markers. To resolve the euasterids I interordinal relationships, 82 plastome genes were used for phylogenetic inference. With the addition of representatives from Apocynaceae and sampling of most other asterid orders, a sister relationship between Gentianales and Solanales is supported. PMID:23825699

  16. Accurate and Practical Identification of 20 Fusarium Species by Seven-Locus Sequence Analysis and Reverse Line Blot Hybridization, and an In Vitro Antifungal Susceptibility Study

    Science.gov (United States)

    Wang, He; Xiao, Meng; Kong, Fanrong; Chen, Sharon; Dou, Hong-Tao; Sorrell, Tania; Li, Ruo-Yu; Xu, Ying-Chun

    2011-01-01

    Eleven reference and 25 clinical isolates of Fusarium were subject to multilocus DNA sequence analysis to determine the species and haplotypes of the fusarial isolates from Beijing and Shandong, China. Seven loci were analyzed: the translation elongation factor 1 alpha gene (EF-1α); the nuclear rRNA internal transcribed spacer (ITS), large subunit (LSU), and intergenic spacer (IGS) regions; the second largest subunit of the RNA polymerase gene (RPB2); the calmodulin gene (CAM); and the mitochondrial small subunit (mtSSU) rRNA gene. We also evaluated an IGS-targeted PCR/reverse line blot (RLB) assay for species/haplotype identification of Fusarium. Twenty Fusarium species and seven species complexes were identified. Of 25 clinical isolates (10 species), the Gibberella (Fusarium) fujikuroi species complex was the commonest (40%) and was followed by the Fusarium solani species complex (FSSC) (36%) and the F. incarnatum-F. equiseti species complex (12%). Six FSSC isolates were identified to the species level as FSSC-3+4, and three as FSSC-5. Twenty-nine IGS, 27 EF-1α, 26 RPB2, 24 CAM, 18 ITS, 19 LSU, and 18 mtSSU haplotypes were identified; 29 were unique, and haplotypes for 24 clinical strains were novel. By parsimony informative character analysis, the IGS locus was the most phylogenetically informative, and the rRNA gene regions were the least. Results by RLB were concordant with multilocus sequence analysis for all isolates. Amphotericin B was the most active drug against all species. Voriconazole MICs were high (>8 μg/ml) for 15 (42%) isolates, including FSSC. Analysis of larger numbers of isolates is required to determine the clinical utility of the seven-locus sequence analysis and RLB assay in species classification of fusaria. PMID:21389150

  17. Method and apparatus for biological sequence comparison

    Science.gov (United States)

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.

  18. Memory and learning with rapid audiovisual sequences

    Science.gov (United States)

    Keller, Arielle S.; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed. PMID:26575193

  20. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto

    2013-06-01

    Full Text Available Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  1. Static multiplicities in heterogeneous azeotropic distillation sequences

    DEFF Research Database (Denmark)

    Esbjerg, Klavs; Andersen, Torben Ravn; Jørgensen, Sten Bay

    1998-01-01

    In this paper the results of a bifurcation analysis on heterogeneous azeotropic distillation sequences are given. Two sequences suitable for ethanol dehydration are compared: the 'direct' and the 'indirect' sequence. It is shown that the two sequences, despite their similarities, exhibit very...... different static behavior. The method of Petlyuk and Avet'yan (1971), Bekiaris et al. (1993), which assumes infinite reflux and infinite number of stages, is extended to and applied to heterogeneous azeotropic distillation sequences. The predictions are substantiated through simulations. The static sequence...

  2. Blind sequence-length estimation of low-SNR cyclostationary sequences

    CSIR Research Space (South Africa)

    Vlok, JD

    2014-06-01

    Full Text Available Several existing direct-sequence spread spectrum (DSSS) detection and estimation algorithms assume prior knowledge of the symbol period or sequence length, although very few sequence-length estimation techniques are available in the literature...

  3. Infinite matrices and sequence spaces

    CERN Document Server

    Cooke, Richard G

    2014-01-01

    This clear and correct summation of basic results from a specialized field focuses on the behavior of infinite matrices in general, rather than on properties of special matrices. Three introductory chapters guide students to the manipulation of infinite matrices, covering definitions and preliminary ideas, reciprocals of infinite matrices, and linear equations involving infinite matrices. From the fourth chapter onward, the author treats the application of infinite matrices to the summability of divergent sequences and series from various points of view. Topics include consistency, mutual consi

  4. Parallel motif extraction from very long sequences

    KAUST Repository

    Sahli, Majed; Mansour, Essam; Kalnis, Panos

    2013-01-01

    Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that focuses on collections of many short sequences, modern

  5. Computational analysis of sequence selection mechanisms.

    Science.gov (United States)

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  6. The recurrence sequences via Sylvester matrices

    Science.gov (United States)

    Karaduman, Erdal; Deveci, Ömür

    2017-07-01

    In this work, we define the Pell-Jacobsthal-Sylvester sequence and the Jacobsthal-Pell-Sylvester sequence by using the Sylvester matrices which are obtained from the characteristic polynomials of the Pell and Jacobsthal sequences, and then we study the sequences defined modulo m. Also, we obtain the cyclic groups and the semigroups from the generating matrices of these sequences when read modulo m, and then we derive the relationships among the orders of the cyclic groups and the periods of the sequences. Furthermore, we redefine the Pell-Jacobsthal-Sylvester sequence and the Jacobsthal-Pell-Sylvester sequence by means of the elements of the groups, and then we examine them in the finite groups.
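
    The notion of the period of a recurrence sequence read modulo m can be illustrated with the ordinary Pell sequence. This is only a sketch of the general idea: the abstract does not define the Pell-Jacobsthal-Sylvester recurrence, so it is not reproduced here, and the function name pell_period is hypothetical.

```python
def pell_period(m):
    """Period of the Pell sequence P(n) = 2*P(n-1) + P(n-2), with
    P(0) = 0 and P(1) = 1, when every term is reduced modulo m."""
    if m < 2:
        raise ValueError("m must be at least 2")
    a, b = 0, 1
    steps = 0
    while True:
        a, b = b, (2 * b + a) % m
        steps += 1
        # The sequence repeats once the starting pair (0, 1) comes back.
        if (a, b) == (0, 1):
            return steps


periods = {m: pell_period(m) for m in (2, 3, 5)}
```

    The loop always terminates because the generating matrix of the Pell recurrence has determinant -1, which is invertible modulo every m, so the pair of consecutive residues must return to its starting value.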

  7. ON SOME RECURRENCE TYPE SMARANDACHE SEQUENCES

    OpenAIRE

    MAJUMDAR, A.A.K.; GUNARTO, H.

    2000-01-01

    In this paper, we study some properties of ten recurrence type Smarandache sequences, namely, the Smarandache odd, even, prime product, square product, higher-power product, permutation, consecutive, reverse, symmetric, and pierced chain sequences.

  8. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  9. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...... diseases in Europe. As part of the EURL proficiency test for fish diseases it is required to sequence any RANA virus isolates found in any of the samples. It is also highly recommended to sequence the ISA virus to determine whether it be HPRΔ or HPR0. Furthermore, it is recommended that any VHSV and IHNV...... isolates be genotyped. As part of the evaluation of the proficiency results it was decided this year to look into the quality and similarity of the sequence results for selected viruses. Ampoule III in the proficiency test 2013 contained an EHNV isolate. The EURL received 43 sequences from 41 laboratories...

  10. Perfect sequences over the real quaternions

    OpenAIRE

    Kuznetsov, Oleg

    2017-01-01

    In this Thesis, perfect sequences over the real quaternions are first considered. Definitions for the right and left periodic autocorrelation functions are given, and right and left perfect sequences introduced. It is shown that the right (left) perfection of any sequence implies the left (right) perfection, so concepts of right and left perfect sequences over the real quaternions are equivalent. Unitary transformations of the quaternion space ℍ are then considered. Using the equivalence of t...
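
    For orientation, the sketch below checks perfection in the simpler complex-valued case, where multiplication commutes and the right and left autocorrelations coincide. It does not implement the quaternion definitions of the thesis; the function names and the tolerance parameter are assumptions, and a sequence is taken to be perfect when every off-peak periodic autocorrelation value is zero.

```python
def periodic_autocorrelation(x, shift):
    """Periodic autocorrelation of a complex sequence at the given shift."""
    n = len(x)
    return sum(x[i] * complex(x[(i + shift) % n]).conjugate() for i in range(n))


def is_perfect(x, tol=1e-9):
    """True when all off-peak periodic autocorrelation values vanish."""
    return all(
        abs(periodic_autocorrelation(x, shift)) < tol for shift in range(1, len(x))
    )


perfect_example = [1, 1, 1, -1]   # off-peak values are all zero
constant_example = [1, 1, 1, 1]   # every shift gives 4
```

    Calling is_perfect on the first example returns True and on the second returns False. Over the quaternions the order of the factors in each product matters, which is why the thesis distinguishes right and left autocorrelation.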

  11. Information decomposition method to analyze symbolical sequences

    International Nuclear Information System (INIS)

    Korotkov, E.V.; Korotkova, M.A.; Kudryashov, N.A.

    2003-01-01

    The information decomposition (ID) method to analyze symbolical sequences is presented. This method allows us to reveal a latent periodicity of any symbolical sequence. The ID method is shown to have advantages in comparison with application of the Fourier transformation, the wavelet transform and the dynamic programming method to look for latent periodicity. Examples of the latent periods for poetic texts, DNA sequences and amino acids are presented. Possible origin of a latent periodicity for different symbolical sequences is discussed.

  12. Parallel sequencing lives, or what makes large sequencing projects successful.

    Science.gov (United States)

    Quilez, Javier; Vidal, Enrique; Dily, François Le; Serra, François; Cuartero, Yasmina; Stadhouders, Ralph; Graf, Thomas; Marti-Renom, Marc A; Beato, Miguel; Filion, Guillaume

    2017-11-01

    T47D_rep2 and b1913e6c1_51720e9cf were 2 Hi-C samples. They were born and processed at the same time, yet their fates were very different. The life of b1913e6c1_51720e9cf was simple and fruitful, while that of T47D_rep2 was full of accidents and sorrow. At the heart of these differences lies the fact that b1913e6c1_51720e9cf was born under a lab culture of Documentation, Automation, Traceability, and Autonomy and compliance with the FAIR Principles. Their lives are a lesson for those who wish to embark on the journey of managing high-throughput sequencing data. © The Author 2017. Published by Oxford University Press.

  13. MatrixPlot: visualizing sequence constraints

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Stærfeldt, Hans Henrik; Lund, Ole

    1999-01-01

    MatrixPlot is a program for making high-quality matrix plots, such as mutual information plots of sequence alignments and distance matrices of sequences with known three-dimensional coordinates. The user can add information...

  14. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  15. DNA sequence modeling based on context trees

    NARCIS (Netherlands)

    Kusters, C.J.; Ignatenko, T.; Roland, J.; Horlin, F.

    2015-01-01

    Genomic sequences contain instructions for protein and cell production. Therefore understanding and identification of biologically and functionally meaningful patterns in DNA sequences is of paramount importance. Modeling of DNA sequences in its turn can help to better understand and identify such

  16. Compact flow diagrams for state sequences

    NARCIS (Netherlands)

    Buchin, K.A.; Buchin, M.E.; Gudmundsson, J.; Horton, M.J.; Sijben, S.

    2016-01-01

    We introduce the concept of compactly representing a large number of state sequences, e.g., sequences of activities, as a flow diagram. We argue that the flow diagram representation gives an intuitive summary that allows the user to detect patterns among large sets of state sequences. Simplified,

  17. Blazar Sequence in Fermi Era

    Indian Academy of Sciences (India)

    Chen, Liang

    In this paper, we review the latest research results on the topic of blazar sequence. It seems that the blazar sequence is phenomenologically ruled out, while the theoretical blazar sequence still holds. We point out that black hole mass is a dominant parameter accounting for high-power-high-synchrotron-peaked and ...

  18. Tidying up international nucleotide sequence databases: ecological, geographical and sequence quality annotation of ITS sequences of mycorrhizal fungi.

    Science.gov (United States)

    Tedersoo, Leho; Abarenkov, Kessy; Nilsson, R Henrik; Schüssler, Arthur; Grelet, Gwen-Aëlle; Kohout, Petr; Oja, Jane; Bonito, Gregory M; Veldre, Vilmar; Jairus, Teele; Ryberg, Martin; Larsson, Karl-Henrik; Kõljalg, Urmas

    2011-01-01

    Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source were supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi.

  19. Intergenic and intragenic conjugal transfer of multiple antibiotic ...

    African Journals Online (AJOL)

    STORAGESEVER

    2009-01-19

    Jan 19, 2009 ... antibiotic resistance determinants among bacteria in the aquatic ... loci of antibiotic resistant gene among bacteria in the surface water of Bangladesh. ..... bial communities is in assessing the risk of genetically engineered ...

  20. Intergenic and intragenic conjugal transfer of multiple antibiotic ...

    African Journals Online (AJOL)

    Conjugation process was conducted to determine the means of transferring ... In this study, it was surprisingly observed that tetracycline resistant gene was ... among pathogenic bacteria, particularly since antibiotics are indiscriminately used in ...

  1. Characteristics of binding sites of intergenic, intronic and exonic ...

    African Journals Online (AJOL)

    user

    2013-03-06

    Mar 6, 2013 ... miR-1587). Such part of mRNA is very important for its regulation via several miRNA. Interaction of intronic miRNAs with mRNAs genes coding in-miRNA. Oncogenes (51) are host genes and target genes for in-miRNAs. Majority of these in-miRNAs are encoded in intron. Five of the studied genes (ATF2, ...

  2. Permutation Entropy for Random Binary Sequences

    Directory of Open Access Journals (Sweden)

    Lingfeng Liu

    2015-12-01

    Full Text Available In this paper, we generalize the permutation entropy (PE) measure, which is based on Shannon’s entropy, to binary sequences, and theoretically analyze this measure for random binary sequences. We deduce the theoretical value of PE for random binary sequences, which can be used to measure the randomness of binary sequences. We also reveal the relationship between this PE measure and other randomness measures, such as Shannon’s entropy and Lempel–Ziv complexity. The results show that PE is consistent with these two measures. Furthermore, we use PE as one of the randomness measures to evaluate the randomness of chaotic binary sequences.

  3. The 2016 Kumamoto earthquake sequence.

    Science.gov (United States)

    Kato, Aitaro; Nakamura, Kouji; Hiyama, Yohei

    2016-01-01

    Beginning in April 2016, a series of shallow, moderate to large earthquakes with associated strong aftershocks struck the Kumamoto area of Kyushu, SW Japan. An Mj 7.3 mainshock occurred on 16 April 2016, close to the epicenter of an Mj 6.5 foreshock that occurred about 28 hours earlier. The intense seismicity released the accumulated elastic energy by right-lateral strike slip, mainly along two known, active faults. The mainshock rupture propagated along multiple fault segments with different geometries. The faulting style is reasonably consistent with regional deformation observed on geologic timescales and with the stress field estimated from seismic observations. One striking feature of this sequence is intense seismic activity, including a dynamically triggered earthquake in the Oita region. Following the mainshock rupture, postseismic deformation has been observed, as well as expansion of the seismicity front toward the southwest and northwest.

  4. Data selector group sequencer interface

    International Nuclear Information System (INIS)

    Zizka, G.; Turko, B.

    1984-01-01

    A CAMAC-based module for high-rate data selection and transfer to a Tracor Northern TN-1700 multichannel analysis system is described. The module can select any group of 4096 consecutive event addresses within a 24-bit range. This module solves the problem of connecting a number of time digitizing systems to the memory of a multichannel analyzer. A continuous processing rate of up to 200,000 events per second, along with the live display, makes the testing of the above systems very efficient and relatively inexpensive. The module can also be programmed to store the preset group of addresses in more than one section of the memory. The events are analyzed in each section of the memory during the preset time. Multiple spectra can thus be taken automatically in a sequence.
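
    The module described is hardware, but its selection rule can be illustrated in software. This is a hedged sketch only: the function `select_group` and its list-based spectrum are hypothetical, and only the window size (4096) and address width (24 bits) come from the abstract.

```python
WINDOW = 4096        # consecutive addresses per selected group (from the abstract)
ADDRESS_BITS = 24    # width of the event address range (from the abstract)

def select_group(events, base):
    """Histogram the events whose address lies in [base, base + WINDOW).

    Hypothetical software analogue of the selector; events outside the
    window are simply ignored.
    """
    if not 0 <= base <= (1 << ADDRESS_BITS) - WINDOW:
        raise ValueError("window must lie inside the 24-bit address range")
    spectrum = [0] * WINDOW
    for address in events:
        offset = address - base
        if 0 <= offset < WINDOW:
            spectrum[offset] += 1   # offset plays the role of the analyzer channel
    return spectrum
```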

  5. A main sequence for quasars

    Science.gov (United States)

    Marziani, Paola; Dultzin, Deborah; Sulentic, Jack W.; Del Olmo, Ascensión; Negrete, C. A.; Martínez-Aldama, Mary L.; D'Onofrio, Mauro; Bon, Edi; Bon, Natasa; Stirpe, Giovanna M.

    2018-03-01

    The last 25 years saw a major step forward in the analysis of optical and UV spectroscopic data of large quasar samples. Multivariate statistical approaches have led to the definition of systematic trends in observational properties that are the basis of physical and dynamical modeling of quasar structure. We discuss the empirical correlates of the so-called “main sequence” associated with the quasar Eigenvector 1, its governing physical parameters and several implications on our view of the quasar structure, as well as some luminosity effects associated with the virialized component of the line emitting regions. We also briefly discuss quasars in a segment of the main sequence that includes the strongest FeII emitters. These sources show a small dispersion around a well-defined Eddington ratio value, a property which makes them potential Eddington standard candles.

  6. The 2016 Kumamoto earthquake sequence

    Science.gov (United States)

    KATO, Aitaro; NAKAMURA, Kouji; HIYAMA, Yohei

    2016-01-01

    Beginning in April 2016, a series of shallow, moderate to large earthquakes with associated strong aftershocks struck the Kumamoto area of Kyushu, SW Japan. An Mj 7.3 mainshock occurred on 16 April 2016, close to the epicenter of an Mj 6.5 foreshock that occurred about 28 hours earlier. The intense seismicity released the accumulated elastic energy by right-lateral strike slip, mainly along two known, active faults. The mainshock rupture propagated along multiple fault segments with different geometries. The faulting style is reasonably consistent with regional deformation observed on geologic timescales and with the stress field estimated from seismic observations. One striking feature of this sequence is intense seismic activity, including a dynamically triggered earthquake in the Oita region. Following the mainshock rupture, postseismic deformation has been observed, as well as expansion of the seismicity front toward the southwest and northwest. PMID:27725474

  7. A Main Sequence for Quasars

    Directory of Open Access Journals (Sweden)

    Paola Marziani

    2018-03-01

    Full Text Available The last 25 years saw a major step forward in the analysis of optical and UV spectroscopic data of large quasar samples. Multivariate statistical approaches have led to the definition of systematic trends in observational properties that are the basis of physical and dynamical modeling of quasar structure. We discuss the empirical correlates of the so-called “main sequence” associated with the quasar Eigenvector 1, its governing physical parameters and several implications on our view of the quasar structure, as well as some luminosity effects associated with the virialized component of the line emitting regions. We also briefly discuss quasars in a segment of the main sequence that includes the strongest FeII emitters. These sources show a small dispersion around a well-defined Eddington ratio value, a property which makes them potential Eddington standard candles.

  8. Locomotor sequence learning in visually guided walking

    DEFF Research Database (Denmark)

    Choi, Julia T; Jensen, Peter; Nielsen, Jens Bo

    2016-01-01

    walking. In addition, we determined how age (i.e., healthy young adults vs. children) and biomechanical factors (i.e., walking speed) affected the rate and magnitude of locomotor sequence learning. The results showed that healthy young adults (age 24 ± 5 years, N = 20) could learn a specific sequence...... of step lengths over 300 training steps. Younger children (age 6-10 years, N = 8) have lower baseline performance, but their magnitude and rate of sequence learning was the same compared to older children (11-16 years, N = 10) and healthy adults. In addition, learning capacity may be more limited...... to modify step length from one trial to the next. Our sequence learning paradigm is derived from the serial reaction-time (SRT) task that has been used in upper limb studies. Both random and ordered sequences of step lengths were used to measure sequence-specific and sequence non-specific learning during...

  9. The RNA world, automatic sequences and oncogenetics

    Energy Technology Data Exchange (ETDEWEB)

    Tahir Shah, K

    1993-04-01

    We construct a model of the RNA world in terms of naturally evolving nucleotide sequences assuming only Crick-Watson base pairing and self-cleaving/splicing capability. These sequences have the following properties. (1) They are recognizable by an automaton (or automata). That is, for each k-sequence, there exists a k-automaton which accepts, recognizes or generates the k-sequence. These are known as automatic sequences. Fibonacci and Morse-Thue sequences are the most natural outcome of pre-biotic chemical conditions. (2) Infinite (resp. large) sequences are self-similar (resp. nearly self-similar) under certain rewrite rules and consequently give rise to fractal (resp. fractal-like) structures. Computationally, such sequences can also be generated by their corresponding deterministic parallel rewrite system, known as a DOL system. The self-similar sequences are fixed points of their respective rewrite rules. Some of these automatic sequences can read or "accept" other sequences, while others can detect errors and trigger error-correcting mechanisms. They can be enlarged and have block and/or palindrome structure. Linear recurring sequences such as the Fibonacci sequence are simply feedback shift registers, a well-known model of information processing machines. We show that a mutation of any rewrite rule can cause a combinatorial explosion of error and relate this to oncogenetical behavior. On the other hand, a mutation of sequences that are not rewrite rules leads to normal evolutionary change. Known experimental results support our hypothesis. (author). Refs.
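
    The deterministic parallel rewrite system mentioned in the abstract can be sketched in a few lines. The two rule sets below are the standard textbook substitutions for the Fibonacci word and the Thue-Morse sequence; the abstract does not state which rules the author uses, and the function name `d0l_iterate` is illustrative.

```python
def d0l_iterate(axiom, rules, steps):
    """Apply a deterministic, context-free rewrite rule to every symbol in parallel."""
    word = axiom
    for _ in range(steps):
        word = "".join(rules[symbol] for symbol in word)
    return word

# Standard substitutions (assumed, not taken from the paper).
FIBONACCI_RULES = {"a": "ab", "b": "a"}
THUE_MORSE_RULES = {"0": "01", "1": "10"}

fibonacci_word = d0l_iterate("a", FIBONACCI_RULES, 4)
thue_morse_word = d0l_iterate("0", THUE_MORSE_RULES, 3)
```

    Each iterate is a prefix of the next one, which is the sense in which the infinite sequence is a fixed point of its rewrite rule.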

  10. The RNA world, automatic sequences and oncogenetics

    International Nuclear Information System (INIS)

    Tahir Shah, K.

    1993-04-01

    We construct a model of the RNA world in terms of naturally evolving nucleotide sequences assuming only Crick-Watson base pairing and self-cleaving/splicing capability. These sequences have the following properties. (1) They are recognizable by an automaton (or automata). That is, for each k-sequence, there exists a k-automaton which accepts, recognizes or generates the k-sequence. These are known as automatic sequences. Fibonacci and Morse-Thue sequences are the most natural outcome of pre-biotic chemical conditions. (2) Infinite (resp. large) sequences are self-similar (resp. nearly self-similar) under certain rewrite rules and consequently give rise to fractal (resp. fractal-like) structures. Computationally, such sequences can also be generated by their corresponding deterministic parallel rewrite system, known as a DOL system. The self-similar sequences are fixed points of their respective rewrite rules. Some of these automatic sequences can read or 'accept' other sequences, while others can detect errors and trigger error-correcting mechanisms. They can be enlarged and have block and/or palindrome structure. Linear recurring sequences such as the Fibonacci sequence are simply feedback shift registers, a well-known model of information processing machines. We show that a mutation of any rewrite rule can cause a combinatorial explosion of error and relate this to oncogenetical behavior. On the other hand, a mutation of sequences that are not rewrite rules leads to normal evolutionary change. Known experimental results support our hypothesis. (author). Refs
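
    The remark that linear recurring sequences are feedback shift registers can be made concrete with a small sketch. The register below is a generic illustration, not the author's construction: the function name, the tap convention, and the `modulus` parameter are all assumptions introduced here.

```python
def feedback_shift_register(state, taps, steps, modulus=2):
    """Emit `steps` outputs of a linear feedback shift register.

    Each step outputs the oldest cell, then shifts in the sum of the tapped
    cells reduced modulo `modulus`.
    """
    state = list(state)
    output = []
    for _ in range(steps):
        output.append(state[0])
        feedback = sum(state[i] for i in taps) % modulus
        state = state[1:] + [feedback]
    return output

# Two cells with both tapped implement x[n+2] = x[n+1] + x[n].
fibonacci_numbers = feedback_shift_register([0, 1], taps=(0, 1), steps=8, modulus=10**9)
fibonacci_mod_2 = feedback_shift_register([0, 1], taps=(0, 1), steps=6)
```

    With a large modulus the register reproduces the Fibonacci numbers; over two symbols it yields a periodic binary sequence.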

  11. Targeted assembly of short sequence reads.

    Directory of Open Access Journals (Sweden)

    René L Warren

    Full Text Available As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
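
    The read-recruitment step described above (exact matches to short words of the target) might be sketched as follows. This is not TASR's code or interface: the function names, the word length `k=15`, and the forward-strand-only matching are assumptions made for illustration, and the stringent assembly step is not shown.

```python
def kmers(sequence, k):
    """Return the set of all length-k words in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}

def recruit_reads(reads, target, k=15):
    """Keep reads that share at least one exact k-mer with the target.

    Assumption: forward strand only; a real tool would likely also
    consider the reverse complement.
    """
    target_words = kmers(target, k)
    return [read for read in reads if kmers(read, k) & target_words]
```

    Restricting attention to recruited reads is what lets a targeted approach avoid assembling the whole data set.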

  12. The chloroplast genome sequence of the green alga Leptosira terrestris: multiple losses of the inverted repeat and extensive genome rearrangements within the Trebouxiophyceae

    Directory of Open Access Journals (Sweden)

    Turmel Monique

    2007-07-01

    Full Text Available Abstract Background In the Chlorophyta – the green algal phylum comprising the classes Prasinophyceae, Ulvophyceae, Trebouxiophyceae and Chlorophyceae – the chloroplast genome displays a highly variable architecture. While chlorophycean chloroplast DNAs (cpDNAs) deviate considerably from the ancestral pattern described for the prasinophyte Nephroselmis olivacea, the degree of remodelling sustained by the two ulvophyte cpDNAs completely sequenced to date is intermediate relative to those observed for chlorophycean and trebouxiophyte cpDNAs. Chlorella vulgaris (Chlorellales) is currently the only photosynthetic trebouxiophyte whose complete cpDNA sequence has been reported. To gain insights into the evolutionary trends of the chloroplast genome in the Trebouxiophyceae, we sequenced cpDNA from the filamentous alga Leptosira terrestris (Ctenocladales). Results The 195,081-bp Leptosira chloroplast genome resembles the 150,613-bp Chlorella genome in lacking a large inverted repeat (IR) but differs greatly in gene order. Six of the conserved genes present in Chlorella cpDNA are missing from the Leptosira gene repertoire. The 106 conserved genes, four introns and 11 free standing open reading frames (ORFs) account for 48.3% of the genome sequence. This is the lowest gene density yet observed among chlorophyte cpDNAs. Contrary to the situation in Chlorella but similar to that in the chlorophycean Scenedesmus obliquus, the gene distribution is highly biased over the two DNA strands in Leptosira. Nine genes, compared to only three in Chlorella, have significantly expanded coding regions relative to their homologues in ancestral-type green algal cpDNAs. As observed in chlorophycean genomes, the rpoB gene is fragmented into two ORFs. Short repeats account for 5.1% of the Leptosira genome sequence and are present mainly in intergenic regions. Conclusion Our results highlight the great plasticity of the chloroplast genome in the Trebouxiophyceae and indicate

  13. Inconsistencies of genome annotations in apicomplexan parasites revealed by 5'-end-one-pass and full-length sequences of oligo-capped cDNAs

    Directory of Open Access Journals (Sweden)

    Sugano Sumio

    2009-07-01

    Full Text Available Abstract Background Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexan parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance of physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexan parasites. Results In this study, we used a total of 61,056 5'-end-single-pass cDNA sequences from Plasmodium falciparum, P. vivax, P. yoelii, P. berghei, Cryptosporidium parvum, and Toxoplasma gondii. We compared these partially sequenced cDNA sequences with the currently annotated gene models and observed significant inconsistencies between the two datasets. In particular, we found that on average 14% of the exons in the current gene models were not supported by any cDNA evidence, and that 16% of the current gene models may contain at least one mis-annotation and should be re-evaluated. We also identified a large number of transcripts that had been previously unidentified. For 732 cDNAs in T. gondii, the entire sequences were determined in order to evaluate the annotated gene models at the complete full-length transcript level. We found that 41% of the T. gondii gene models contained at least one inconsistency. We also identified and confirmed by RT-PCR 140 previously unidentified transcripts found in the intergenic regions of the current gene annotations. We show that the majority of these discrepancies are due to questionable predictions of one or two extra exons in the upstream or downstream regions of the genes. Conclusion Our data indicate that the current gene models are likely to still be incomplete and have much room for improvement. Our unique full-length cDNA information is especially useful for further refinement of the annotations for the genomes of

  14. Sequence and organization of the rhoptry-associated-protein-1 (rap-1) locus for the sheep hemoprotozoan Babesia sp. BQ1 Lintan (B. motasi phylogenetic group).

    Science.gov (United States)

    Niu, Qingli; Bonsergent, Claire; Guan, Guiquan; Yin, Hong; Malandrin, Laurence

    2013-11-15

    Babesiosis is a frequent infection of animals worldwide by the tick-borne pathogen Babesia, and several species are responsible for ovine babesiosis. Recently, several Babesia motasi-like isolates were described in sheep in China. In this study, we sequenced the multigenic rap-1 gene locus of one of these isolates, Babesia sp. BQ1 Lintan. The RAP-1 proteins are involved in the process of red blood cell invasion and thus represent a potential target for vaccine development. A complex composition and organization of the rap-1 locus was discovered with: (1) the presence of 3 different types of rap-1 sequences (rap-1a, rap-1b and rap-1c); (2) the presence of multiple copies of rap-1a and rap-1b; (3) polymorphism among the rap-1a copies, with two classes (named rap-1a61 and rap-1a67) having a similarity of 95.7%, each class represented by two close variants; (4) polymorphism between rap-1a61-1 and rap-1a61-2 limited to three nucleotide positions; (5) a difference of eight nucleotides between rap-1a67-1 and rap-1a67-2 from position 1270 to the putative stop site of rap-1a67-1 which might produce two putative proteins of slightly different sizes; (6) the ratio of rap-1a copies corresponding to one rap-1a67, one rap-1a61-1 and one rap-1a61-2; (7) the presence of three different intergenic regions separating rap-1a, rap-1b and rap-1c; (8) interspacing of the rap-1a copies with rap-1b copies; and (9) the terminal position of rap-1c in the locus. A 31 kb locus composed of 6 rap-1a sequences interspaced with 5 rap-1b sequences and with a terminal rap-1c copy was hypothesized. A strikingly similar sequence composition (rap-1a, rap-1b and rap-1c), as well as strong gene identities and similar locus organization with B. bigemina, were found and highlight the conservation of synteny at this locus in this phylogenetic clade. Copyright © 2013 Elsevier B.V. All rights reserved.

  15. Pareto optimal pairwise sequence alignment.

    Science.gov (United States)

    DeRonne, Kevin W; Karypis, George

    2013-01-01

    Sequence alignment using evolutionary profiles is a commonly employed tool when investigating a protein. Many profile-profile scoring functions have been developed for use in such alignments, but there has not yet been a comprehensive study of Pareto optimal pairwise alignments for combining multiple such functions. We show that the problem of generating Pareto optimal pairwise alignments has an optimal substructure property, and develop an efficient algorithm for generating Pareto optimal frontiers of pairwise alignments. All possible sets of two, three, and four profile scoring functions are used from a pool of 11 functions and applied to 588 pairs of proteins in the ce_ref data set. The performance of the best objective combinations on ce_ref is also evaluated on an independent set of 913 protein pairs extracted from the BAliBASE RV11 data set. Our dynamic-programming-based heuristic approach produces approximated Pareto optimal frontiers of pairwise alignments that contain comparable alignments to those on the exact frontier, but on average in less than 1/58th the time in the case of four objectives. Our results show that the Pareto frontiers contain alignments whose quality is better than the alignments obtained by single objectives. However, the task of identifying a single high-quality alignment among those in the Pareto frontier remains challenging.
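
    The notion of a Pareto frontier over several scoring functions can be illustrated with a brute-force filter. This is not the authors' dynamic-programming algorithm; it only shows what "Pareto optimal" means for a finite set of candidate score vectors, assuming higher scores are better. The function names are illustrative.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points not dominated by any other point (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

    For example, among the score pairs (1, 5), (2, 4), (3, 3), (2, 2) and (1, 1), the last two are dominated and the first three form the frontier. Picking a single alignment from such a frontier is the step the abstract describes as still challenging.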

  16. Hierarchically nested river landform sequences

    Science.gov (United States)

    Pasternack, G. B.; Weber, M. D.; Brown, R. A.; Baig, D.

    2017-12-01

    River corridors exhibit landforms nested within landforms repeatedly down spatial scales. In this study we developed, tested, and implemented a new way to create river classifications by mapping domains of fluvial processes with respect to the hierarchical organization of topographic complexity that drives fluvial dynamism. We tested this approach on flow convergence routing, a morphodynamic mechanism with different states depending on the structure of nondimensional topographic variability. Five nondimensional landform types with unique functionality (nozzle, wide bar, normal channel, constricted pool, and oversized) represent this process at any flow. When this typology is nested at base flow, bankfull, and floodprone scales it creates a system with up to 125 functional types. This shows how a single mechanism produces complex dynamism via nesting. Given the classification, we answered nine specific scientific questions to investigate the abundance, sequencing, and hierarchical nesting of these new landform types using a 35-km gravel/cobble river segment of the Yuba River in California. The nested structure of flow convergence routing landforms found in this study revealed that bankfull landforms are nested within specific floodprone valley landform types, and these types control bankfull morphodynamics during moderate to large floods. As a result, this study calls into question the prevailing theory that the bankfull channel of a gravel/cobble river is controlled by in-channel, bankfull, and/or small flood flows. Such flows are too small to initiate widespread sediment transport in a gravel/cobble river with topographic complexity.
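
    The figure of up to 125 functional types follows from nesting five landform types at three scales (5 × 5 × 5). A short enumeration makes the count explicit; the landform and scale names are taken from the abstract, while the variable names are illustrative.

```python
from itertools import product

LANDFORMS = ["nozzle", "wide bar", "normal channel", "constricted pool", "oversized"]
SCALES = ["base flow", "bankfull", "floodprone"]

# One landform type is assigned at each scale, so the nested types are
# all ordered triples of landform types.
nested_types = list(product(LANDFORMS, repeat=len(SCALES)))
type_count = len(nested_types)
```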

  17. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled "intractable", resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  18. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult-to-sequence regions in microbial genomes are ruled intractable, resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  19. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    Science.gov (United States)

    de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by GENSCAN (http://genes.mit.edu/GENSCAN.html). PMID:11070084
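
    The proportions quoted in the abstract can be recomputed from its raw counts. The sketch below uses only numbers that appear in the text; the helper `percent` and the variable names are illustrative.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

contig_hit_rate = percent(1181, 81429)    # contigs matching chromosome 22
known_genes_hit = percent(162, 247)       # known genes with an ORESTES contig
related_genes_hit = percent(67, 150)      # related genes with an ORESTES contig
est_predicted_hit = percent(45, 148)      # EST-predicted genes with a contig

# Transcribed sequences supported by ORESTES alone:
orestes_only = 219 - 171
```

    The counts reproduce the reported 1.45%, 65.6% and 30.4%, and the difference 219 − 171 gives the 48 sequences defined only by ORESTES. The ratio 67/150 evaluates to roughly 44.67%, which the abstract reports as 44.6%.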

  20. cis sequence effects on gene expression

    Directory of Open Access Journals (Sweden)

    Jacobs Kevin

    2007-08-01

    Full Text Available Abstract Background Sequence and transcriptional variability within and between individuals are typically studied independently. The joint analysis of sequence and gene expression variation (genetical genomics) provides insight into the role of linked sequence variation in the regulation of gene expression. We investigated the role of sequence variation in cis on gene expression (cis sequence effects) in a group of genes commonly studied in cancer research in lymphoblastoid cell lines. We estimated the proportion of genes exhibiting cis sequence effects and the proportion of gene expression variation explained by cis sequence effects using three different analytical approaches, and compared our results to the literature. Results We generated gene expression profiling data at N = 697 candidate genes from N = 30 lymphoblastoid cell lines for this study and used available candidate gene resequencing data at N = 552 candidate genes to identify N = 30 candidate genes with sufficient variance in both datasets for the investigation of cis sequence effects. We used two additive models and the haplotype phylogeny scanning approach of Templeton (Tree Scanning) to evaluate association between individual SNPs, all SNPs at a gene, and diplotypes, with log-transformed gene expression. SNPs and diplotypes at eight candidate genes exhibited statistically significant (p cis sequence effects in our study, respectively. Conclusion Based on analysis of our results and the extant literature, one in four genes exhibits significant cis sequence effects, and for these genes, about 30% of gene expression variation is accounted for by cis sequence variation. Despite diverse experimental approaches, the presence or absence of significant cis sequence effects is largely supported by previously published studies.
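
    An additive single-SNP model of the kind mentioned above can be sketched as an ordinary least-squares regression of log-transformed expression on allele count. This is a generic illustration, not the study's analysis code: the function name, the 0/1/2 genotype coding, and the choice of log base 2 are assumptions (the abstract says only "log-transformed").

```python
import math

def additive_fit(genotypes, expression):
    """Regress log2(expression) on allele count (0, 1 or 2).

    Returns (slope, intercept, r_squared); r_squared is the share of
    expression variance explained by the SNP under this model.
    """
    y = [math.log2(value) for value in expression]
    n = len(y)
    mean_x = sum(genotypes) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(genotypes, y))
    syy = sum((v - mean_y) ** 2 for v in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

    In this framing, the abstract's estimate that about 30% of expression variation is accounted for by cis sequence variation corresponds to an explained-variance figure of roughly 0.3 for the affected genes.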

  1. Multiple tag labeling method for DNA sequencing

    Science.gov (United States)

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.

  2. Exome sequencing and genetic testing for MODY.

    Directory of Open Access Journals (Sweden)

    Stefan Johansson

    Full Text Available Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive. The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results. We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism. On average, we obtained 45 X median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0-4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively), thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes. Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.
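
    The variant-filtering step (exclude common, non-coding and synonymous variants, then restrict to a candidate gene list) could look roughly like the sketch below. The `Variant` record, the consequence labels, and the 1% frequency cutoff are hypothetical; the abstract does not state the thresholds or annotation scheme the authors used.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str
    consequence: str          # hypothetical labels, e.g. "missense", "synonymous"
    allele_frequency: float   # population frequency of the variant allele

# Assumed set of retained consequence classes (non-synonymous and nonsense).
KEPT_CONSEQUENCES = {"missense", "nonsense"}

def filter_variants(variants, candidate_genes, max_frequency=0.01):
    """Keep rare, protein-altering variants that fall in the candidate gene set."""
    return [
        v for v in variants
        if v.allele_frequency < max_frequency
        and v.consequence in KEPT_CONSEQUENCES
        and v.gene in candidate_genes
    ]
```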

  3. Identifying driver mutations in sequenced cancer genomes

    DEFF Research Database (Denmark)

    Raphael, Benjamin J; Dobson, Jason R; Oesper, Layla

    2014-01-01

    High-throughput DNA sequencing is revolutionizing the study of cancer and enabling the measurement of the somatic mutations that drive cancer development. However, the resulting sequencing datasets are large and complex, obscuring the clinically important mutations in a background of errors, nois...... patterns of mutual exclusivity. These techniques, coupled with advances in high-throughput DNA sequencing, are enabling precision medicine approaches to the diagnosis and treatment of cancer....

  4. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Full Text Available Abstract Background Molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of "fraying" of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
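
    The intrastrand constraints listed in this abstract (GC content, G/C at both ends, no self-complementary stretches) can be illustrated with a small brute-force filter. This is only a sketch under stated assumptions, not the EGNAS algorithm or its C++ code: the function names, the default thresholds and the stretch length `k` are hypothetical, and interstrand checks such as cross hybridization are left out.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_self_complementarity(seq, k):
    """True if some k-mer of seq has its reverse complement in seq as well."""
    rc = reverse_complement(seq)
    return any(seq[i:i + k] in rc for i in range(len(seq) - k + 1))


def is_valid(seq, gc_min=0.4, gc_max=0.6, k=2):
    """Apply the three illustrative intrastrand constraints (assumed defaults)."""
    return (
        seq[0] in "GC"
        and seq[-1] in "GC"
        and gc_min <= gc_content(seq) <= gc_max
        and not has_self_complementarity(seq, k)
    )


def design(length, **constraints):
    """Exhaustively enumerate all sequences of a given length and keep valid ones."""
    return [
        "".join(bases)
        for bases in product("ACGT", repeat=length)
        if is_valid("".join(bases), **constraints)
    ]
```

    Enumerating all 4^n candidates is only feasible for short lengths; it is used here because it makes the meaning of each constraint easy to check by hand.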

  5. Genome Sequencing and Analysis Conference IV

    Energy Technology Data Exchange (ETDEWEB)

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV, held at Hilton Head, South Carolina, from September 26-30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present, four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  6. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios way beyond state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
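
    The idea of referential compression described here, storing only how an input differs from a known reference, can be sketched in a few lines. The code below is an illustrative toy and not FRESCO's algorithm or file format: the greedy k-mer matching, the `("M", start, length)` / `("L", char)` entry layout and the `min_match` parameter are all assumptions made for the example.

```python
def compress(reference, target, min_match=4):
    """Encode target as matches against reference plus literal characters.

    Output entries are ("M", ref_start, length) for a copied stretch and
    ("L", char) for a single character not covered by a match.
    """
    # Index every k-mer of the reference by its start positions.
    index = {}
    for i in range(len(reference) - min_match + 1):
        index.setdefault(reference[i:i + min_match], []).append(i)

    ops, pos = [], 0
    while pos < len(target):
        best_start, best_len = -1, 0
        for start in index.get(target[pos:pos + min_match], []):
            length = 0
            while (start + length < len(reference)
                   and pos + length < len(target)
                   and reference[start + length] == target[pos + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len >= min_match:
            ops.append(("M", best_start, best_len))
            pos += best_len
        else:
            ops.append(("L", target[pos]))
            pos += 1
    return ops


def decompress(reference, ops):
    """Rebuild the target from the reference and the entry list."""
    parts = []
    for op in ops:
        if op[0] == "M":
            _, start, length = op
            parts.append(reference[start:start + length])
        else:
            parts.append(op[1])
    return "".join(parts)
```

    For two sequences that differ in a single base, the entry list holds a match, one literal and a second match, which is why highly similar inputs compress so well under this kind of scheme.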

  7. Robustness analysis of chiller sequencing control

    International Nuclear Information System (INIS)

    Liao, Yundan; Sun, Yongjun; Huang, Gongsheng

    2015-01-01

    Highlights: • Uncertainties with chiller sequencing control were systematically quantified. • Robustness of chiller sequencing control was systematically analyzed. • Different sequencing control strategies were sensitive to different uncertainties. • A numerical method was developed for easy selection of chiller sequencing control. - Abstract: Multiple-chiller plants are commonly employed in heating, ventilating and air-conditioning systems to increase operational feasibility and energy efficiency under part-load conditions. In a multiple-chiller plant, chiller sequencing control plays a key role in achieving overall energy efficiency without sacrificing the cooling sufficiency needed for indoor thermal comfort. Various sequencing control strategies have been developed and implemented in practice. Uncertainty, which cannot be avoided in chiller sequencing control, has a significant impact on control performance and may cause the control to fail to achieve the expected control and/or energy performance, yet few studies in the current literature have systematically addressed this issue. This paper therefore presents a robustness analysis of chiller sequencing control in order to understand the robustness of various chiller sequencing control strategies under different types of uncertainty. Based on the robustness analysis, a simple and applicable method is developed to select the most robust control strategy for a given chiller plant in the presence of uncertainties, which will be verified using case studies.

  8. Multiplexed microsatellite recovery using massively parallel sequencing

    Science.gov (United States)

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
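
    Screening reads for di- or trinucleotide microsatellites, as described above, amounts to searching for a short motif repeated in tandem. The following is a minimal sketch of such a screen, not the authors' pipeline: the function name, the `min_repeats` threshold and the choice to skip single-base runs are assumptions for illustration.

```python
import re


def find_microsatellites(read, min_repeats=5):
    """Return (start, motif, repeat_count) for di-/trinucleotide tandem repeats.

    A motif of 2 or 3 bases must occur at least min_repeats times in a row.
    Runs of a single base (e.g. AAAAAA) are skipped as homopolymers.
    """
    pattern = re.compile(r"([ACGT]{2,3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(read):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer, not a di-/trinucleotide repeat
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits
```

    The reported motif depends on where the scan first finds a repeat, so rotations of the same repeat unit (CAG, AGC, GCA) may need to be normalized before counting distinct loci.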

  9. Digital Recovery Sequencer - Advanced Concept Ejection Seats

    National Research Council Canada - National Science Library

    Ross, David A; Cotter, Lee; Culhane, David; Press, Matthew J

    2005-01-01

    .... Continued usage of the Analog Sequencer is undesirable due to limitations with respect to its installed life, electronic component obsolescence, flexibility to accommodate seat safety improvements...

  10. Quantitative phenotyping via deep barcode sequencing.

    Science.gov (United States)

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.

  11. Hardware Accelerated Sequence Alignment with Traceback

    Directory of Open Access Journals (Sweden)

    Scott Lloyd

    2009-01-01

    in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
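
    For readers unfamiliar with the forward scan and traceback mentioned in this abstract, a plain software version of global alignment looks roughly like the sketch below. It is the textbook Needleman-Wunsch procedure with a full score matrix, offered only as an illustration; it is not the space-efficient FPGA architecture of the paper, and the scoring values are assumed defaults.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment with traceback (textbook Needleman-Wunsch).

    Returns (score, aligned_a, aligned_b), using '-' for gaps.
    """
    n, m = len(a), len(b)

    # Forward scan: fill the (n+1) x (m+1) score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Traceback: walk from the bottom-right corner back to the origin.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

    The full matrix costs memory proportional to the product of the two sequence lengths, which is the kind of memory bottleneck that motivates space-efficient designs for long sequences.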

  12. Multipliers on Generalized Mixed Norm Sequence Spaces

    Directory of Open Access Journals (Sweden)

    Oscar Blasco

    2014-01-01

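    Full Text Available Given $1\le p,q\le\infty$ and sequences of integers $(n_k)_k$ and $(n'_k)_k$ such that $n_k\le n'_k\le n_{k+1}$, the generalized mixed norm space $\ell^{\mathcal{I}}(p,q)$ is defined as those sequences $(a_j)_j$ such that $\big(\big(\sum_{j\in I_k}|a_j|^p\big)^{1/p}\big)_k\in\ell^q$, where $I_k=\{j\in\mathbb{N}_0 \text{ s.t. } n_k\le j<n'_k\}$. Conditions for a sequence $\lambda=(\lambda_j)_j$ to belong to the space of multipliers $(\ell^{\mathcal{I}}(r,s),\ell^{\mathcal{J}}(u,v))$, for different sequences $\mathcal{I}$ and $\mathcal{J}$ of intervals in $\mathbb{N}_0$, are determined.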

  13. Recursive sequences in first-year calculus

    Science.gov (United States)

    Krainer, Thomas

    2016-02-01

    This article provides ready-to-use supplementary material on recursive sequences for a second-semester calculus class. It equips first-year calculus students with a basic methodical procedure based on which they can conduct a rigorous convergence or divergence analysis of many simple recursive sequences on their own without the need to invoke inductive arguments as is typically required in calculus textbooks. The sequences that are accessible to this kind of analysis are predominantly (eventually) monotonic, but also certain recursive sequences that alternate around their limit point as they converge can be considered.

  14. A measurement of disorder in binary sequences

    Science.gov (United States)

    Gong, Longyan; Wang, Haihong; Cheng, Weiwen; Zhao, Shengmei

    2015-03-01

    We propose a complex quantity, $A_L$, to characterize the degree of disorder of L-length binary symbolic sequences. As examples, we apply it to typical random and deterministic sequences. One kind of random sequence is generated from a periodic binary sequence and the other is generated from the logistic map. The deterministic sequences are the Fibonacci and Thue-Morse sequences. In these analyzed sequences, we find that the modulus of $A_L$, denoted by $|A_L|$, is a (statistically) equivalent quantity to the Boltzmann entropy, the metric entropy, the conditional block entropy and/or other quantities, so it is a useful quantitative measure of disorder. It can serve as a fruitful index to discern which sequence is more disordered. Moreover, there is one and only one value of $|A_L|$ for the overall disorder characteristics. It has extremely low computational cost and can easily be realized experimentally. For these reasons, we believe that the proposed measure of disorder is a valuable complement to existing ones for symbolic sequences.
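
    The two deterministic test sequences named in this abstract are easy to generate. The sketch below uses their standard definitions (bit-count parity for Thue-Morse, the substitution 0 → 01, 1 → 0 for the Fibonacci word); the function names are hypothetical, and the disorder measure itself is not implemented because the abstract does not define it.

```python
def thue_morse(length):
    """First `length` terms of the Thue-Morse sequence.

    Term n is the parity of the number of 1 bits in the binary form of n.
    """
    return [bin(n).count("1") % 2 for n in range(length)]


def fibonacci_word(length):
    """First `length` symbols of the Fibonacci word.

    Built by repeatedly applying the substitution 0 -> 01, 1 -> 0 to "0".
    """
    word = "0"
    while len(word) < length:
        word = "".join("01" if symbol == "0" else "0" for symbol in word)
    return [int(symbol) for symbol in word[:length]]


print(thue_morse(8))      # [0, 1, 1, 0, 1, 0, 0, 1]
print(fibonacci_word(8))  # [0, 1, 0, 0, 1, 0, 1, 0]
```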

  15. Polynomial sequences generated by infinite Hessenberg matrices

    Directory of Open Access Journals (Sweden)

    Verde-Star Luis

    2017-01-01

    Full Text Available We show that an infinite lower Hessenberg matrix generates polynomial sequences that correspond to the rows of infinite lower triangular invertible matrices. Orthogonal polynomial sequences are obtained when the Hessenberg matrix is tridiagonal. We study properties of the polynomial sequences and their corresponding matrices which are related to recurrence relations, companion matrices, matrix similarity, construction algorithms, and generating functions. When the Hessenberg matrix is also Toeplitz the polynomial sequences turn out to be of interpolatory type and we obtain additional results. For example, we show that every nonderogatory finite square matrix is similar to a unique Toeplitz-Hessenberg matrix.

  16. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  17. Design of Long Period Pseudo-Random Sequences from the Addition of m-Sequences over

    Directory of Open Access Journals (Sweden)

    Ren Jian

    2004-01-01

    Full Text Available Pseudo-random sequences with good correlation properties and large linear span are widely used in code division multiple access (CDMA) communication systems and cryptology for reliable and secure information transmission. In this paper, sequences with long period, large complexity, balanced statistics, and low cross-correlation are constructed from the addition of m-sequences with pairwise-prime linear spans (AMPLS). Using m-sequences as building blocks, the proposed method proved to be an efficient and flexible approach to construct long period pseudo-random sequences with desirable properties from short period sequences. Applying the proposed method to , a signal set is constructed.
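
    The core idea, adding short-period sequences to obtain a long-period one, can be checked on a toy case. The sketch below assumes that the building blocks are maximal-length (m-) sequences and works over the binary field, both of which are assumptions since the field symbol is missing from this record; the tap lists, seeds and function names are illustrative and are not taken from the paper.

```python
def lfsr_sequence(taps, state, length):
    """Binary linear recurrence s[n+k] = sum(taps[i] * s[n+i]) mod 2.

    `state` holds the first k terms; `taps` holds the k feedback coefficients.
    """
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[0])
        feedback = sum(t * s for t, s in zip(taps, state)) % 2
        state = state[1:] + [feedback]
    return out


def least_period(seq):
    """Smallest shift p under which the sampled sequence repeats itself."""
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if all(seq[i] == seq[i + p] for i in range(n - p)):
            return p
    return None


# Recurrences for x^2 + x + 1 (span 2) and x^3 + x + 1 (span 3).
a = lfsr_sequence([1, 1], [1, 0], 84)
b = lfsr_sequence([1, 1, 0], [1, 0, 0], 84)
c = [(x + y) % 2 for x, y in zip(a, b)]  # termwise addition mod 2

print(least_period(a), least_period(b), least_period(c))  # 3 7 21
```

    With spans 2 and 3 the component periods 3 and 7 are coprime, so the sum has period 3 × 7 = 21, which shows on a small scale how short building blocks combine into a longer-period sequence.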

  18. A modification to the SCAR (Sequence Characterized Amplified Region) method provides phylogenetic insights within Ceratozamia (Zamiaceae)

    Directory of Open Access Journals (Sweden)

    Dolores González

    2012-12-01

    Full Text Available Phylogenetic relationships among closely related plant species are still problematic. DNA intergenic regions often are insufficiently variable to provide desired resolution or support. In this study, a modification to the Sequence Characterized Amplified Region (SCAR) method was used to find polymorphic loci for phylogenetic analyses within Ceratozamia. RAPD markers were first used to detect variation in 5 species. Then, equal length fragments found in 2 or more species were excised from the gel, purified and digested with frequent cutter restriction enzymes for isolating both ends, which have the same primer site. Digested fragments were sequenced with the RAPD primer. Variable sequences were used to design specific primers for amplifying and sequencing in all species for phylogenetic analyses. Our results confirmed the previously known high genome sequence resemblance within this genus that contrasts with its high morphological variation. Only 7 parsimony informative characters were found with this approach. Nonetheless, the Digested-SCAR (D-SCAR) method provided some phylogenetic insights. Four main clades consistent with distribution ranges of the species were detected. The approach presented here was effective in resolving some relationships within the genus and can potentially be implemented in other organisms to find polymorphic loci for phylogenetic studies at any taxonomic level.

  19. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Full Text Available Abstract Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  20. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  1. Step out - Step in Sequencing Games

    NARCIS (Netherlands)

    Musegaas, M.; Borm, P.E.M.; Quant, M.

    2014-01-01

    In this paper a new class of relaxed sequencing games is introduced: the class of Step out - Step in sequencing games. In this relaxation any player within a coalition is allowed to step out from his position in the processing order and to step in at any position later in the processing order.

  2. Step out-step in sequencing games

    NARCIS (Netherlands)

    Musegaas, Marieke; Borm, Peter; Quant, Marieke

    2015-01-01

    In this paper a new class of relaxed sequencing games is introduced: the class of Step out–Step in sequencing games. In this relaxation any player within a coalition is allowed to step out from his position in the processing order and to step in at any position later in the processing order. First,

  3. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.

  4. Thread extraction for polyadic instruction sequences

    NARCIS (Netherlands)

    Bergstra, J.; Middelburg, C.

    2011-01-01

    In this paper, we study the phenomenon that instruction sequences are split into fragments which somehow produce a joint behaviour. In order to bring this phenomenon better into the picture, we formalize a simple mechanism by which several instruction sequence fragments can produce a joint

  5. Genome sequence of Lactobacillus rhamnosus ATCC 8530.

    Science.gov (United States)

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R; Ziola, Barry

    2012-02-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  6. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    OpenAIRE

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.; Ziola, Barry

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  7. Question-answer sequences in survey interviews

    NARCIS (Netherlands)

    Dijkstra, W.; Ongena, Y.P.

    2006-01-01

    Interaction analysis was used to analyze a total of 14,265 question-answer sequences (Q-A sequences) of 80 questions that originated from two face-to-face and three telephone surveys. The analysis was directed towards the causes and effects of particular interactional problems. Our results showed

  8. Trace maps for arbitrary substitution sequences

    International Nuclear Information System (INIS)

    Avishai, Y.

    1993-01-01

    The discovery of quasi-crystals and their 1-dimensional modeling have led to a deep mathematical study of Schrödinger operators with an arbitrary deterministic potential sequence. In this work we address this problem and find trace maps for an arbitrary substitution sequence. Our trace maps have lower dimensionality than those of Kolar and Nori, which makes them quite attractive for actual applications. (authors)

  9. Stochastic modelling of daily rainfall sequences

    NARCIS (Netherlands)

    Buishand, T.A.

    1977-01-01

    Rainfall series of different climatic regions were analysed with the aim of generating daily rainfall sequences. A survey of the data is given in I, 1. When analysing daily rainfall sequences one must be aware of the following points:
    a. Seasonality. Because of seasonal variation

  10. Learning of Sensory Sequences in Cerebellar Patients

    Science.gov (United States)

    Frings, Markus; Boenisch, Raoul; Gerwig, Marcus; Diener, Hans-Christoph; Timmann, Dagmar

    2004-01-01

    A possible role of the cerebellum in detecting and recognizing event sequences has been proposed. The present study sought to determine whether patients with cerebellar lesions are impaired in the acquisition and discrimination of sequences of sensory stimuli of different modalities. A group of 26 cerebellar patients and 26 controls matched for…

  11. On peculiar Šindel sequences

    Czech Academy of Sciences Publication Activity Database

    Křížek, Michal; Somer, L.

    2010-01-01

    Vol. 17, No. 2 (2010), pp. 129-140. ISSN 0972-5555. R&D Projects: GA AV ČR(CZ) IAA100190803. Institutional research plan: CEZ:AV0Z10190503. Keywords: quadratic residue * Chinese remainder theorem * primitive Šindel sequences * Prague clock sequence. Subject RIV: BA - General Mathematics. http://www.pphmj.com/abstract/5095.htm

  12. Protecting genomic sequence anonymity with generalization lattices.

    Science.gov (United States)

    Malin, B A

    2005-01-01

    Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual's identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
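
    The generalization step mentioned above (adenine and guanine both generalize to "purine") can be illustrated with the standard IUPAC nucleotide ambiguity codes. This is a sketch under stated assumptions and not the DNALA algorithm: using the IUPAC codes as the lattice, the position-by-position merge and the function name are all choices made for the example, and the clustering that decides which sequences to merge is omitted.

```python
# IUPAC ambiguity codes, keyed by the set of bases each code stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R",   # purine
    frozenset("CT"): "Y",   # pyrimidine
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def generalize(seq_a, seq_b):
    """Merge two equal-length A/C/G/T strings into one generalized string.

    Each position becomes the most specific IUPAC code covering both bases,
    so the two inputs are indistinguishable in the result (k = 2).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return "".join(IUPAC[frozenset((x, y))] for x, y in zip(seq_a, seq_b))


print(generalize("ACGT", "GCTT"))  # RCKT
```

    Positions where the two sequences agree are kept unchanged, so the more similar the paired sequences are, the less information the generalization discards.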

  13. Occupational Sequences: Auto Engines 1. AT 121.

    Science.gov (United States)

    Korb, A. W.; And Others

    In an attempt to individualize an automotive course, the Vocational-Technical Division of Northern Montana College has developed Occupational Sequences for an engine rebuilding course. Occupational Sequences, a learning or teaching aid, is an analysis of numbered operations involved in engine rebuilding. Job sheets, included in the book, provide a…

  14. Sequencing Events: Exploring Art and Art Jobs.

    Science.gov (United States)

    Stephens, Pamela Geiger; Shaddix, Robin K.

    2000-01-01

    Presents an activity for upper-elementary students that correlates the actions of archaeologists, patrons, and artists with the sequencing of events in a logical order. Features ancient Egyptian art images. Discusses the preparation of materials, motivation, a pre-writing activity, and writing a story in sequence. (CMK)

  15. Wijsman Orlicz Asymptotically Ideal -Statistical Equivalent Sequences

    Directory of Open Access Journals (Sweden)

    Bipan Hazarika

    2013-01-01

    in Wijsman sense and present some definitions which are the natural combination of the definition of asymptotic equivalence, statistical equivalent, -statistical equivalent sequences in Wijsman sense. Finally, we introduce the notion of Cesaro Orlicz asymptotically -equivalent sequences in Wijsman sense and establish their relationship with other classes.

  16. Nitrogen chronology of massive main sequence stars

    NARCIS (Netherlands)

    Köhler, K.; Borzyszkowski, M.; Brott, I.; Langer, N.; de Koter, A.

    2012-01-01

    Context. Rotational mixing in massive main sequence stars is predicted to monotonically increase their surface nitrogen abundance with time. Aims. We use this effect to design a method for constraining the age and the inclination angle of massive main sequence stars, given their observed luminosity,

  17. Massively parallel sequencing of forensic STRs

    DEFF Research Database (Denmark)

    Parson, Walther; Ballard, David; Budowle, Bruce

    2016-01-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...

  18. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology's paradigm is that this order of amino acids determines the protein's architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  19. Cyprinus carpio Genome sequencing and assembly

    NARCIS (Netherlands)

    Kolder, I.C.R.M.; Plas-Duivesteijn, van der Suzanne J.; Tan, G.; Wiegertjes, G.; Forlenza, M.; Guler, A.T.; Travin, D.Y.; Nakao, M.; Moritomo, T.; Irnazarow, I.; Jansen, H.J.

    2013-01-01

    Sequencing of the common carp (Cyprinus carpio carpio Linnaeus, 1758) genome, with the objective of establishing carp as a model organism to supplement the closely related zebrafish (Danio rerio). The sequenced individual is a homozygous female (by gynogenesis) of R3 x R8 carp, the heterozygous

  20. Sequence Comparison: Close and Open problems

    NARCIS (Netherlands)

    Lenzini, Gabriele; Cerrai, P.; Freguglia, P.

    Comparing sequences is a very important activity both in computer science and in many other areas as well. For example, thanks to text editors, everyone knows the particular instance of a sequence comparison problem known as the ``string matching problem''. It consists in searching a given word

  1. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  2. Swab-to-Sequence: Real-time Data Analysis Platform for the Biomolecule Sequencer

    Data.gov (United States)

    National Aeronautics and Space Administration — DNA was successfully sequenced on the ISS in 2016, but the DNA sequenced was prepared on the ground. With FY’16 IRAD funds, the same team developed a...

  3. From Sequence to Morphology - Long-Range Correlations in Complete Sequenced Genomes

    NARCIS (Netherlands)

    T.A. Knoch (Tobias)

    2004-01-01

    The largely unresolved sequential organization, i.e. the relations within DNA sequences, and its connection to the three-dimensional organization of genomes was investigated by correlation analyses of completely sequenced chromosomes from Viroids, Archaea, Bacteria, Arabidopsis

  4. Reclassification of Borrelia spp. isolated in South Korea using Multilocus Sequence Typing.

    Science.gov (United States)

    Park, Kyung-Hee; Choi, Yeon-Joo; Kim, Jeoungyeon; Park, Hye-Jin; Song, Dayoung; Jang, Won-Jong

    2018-05-31

    Using Borrelia isolated from South Korea, we performed MLST and typing of three intergenic genes (16S rRNA, ospA, and 5S-23S IGS) to analyze the relationship between host and vector and the molecular background. Using the MLST analysis, we identified B. afzelii, B. yangtzensis, B. garinii, and B. bavariensis. This study is the first report of the identification of B. yangtzensis using MLST in South Korea.

  5. Nucleotide sequence preservation of human mitochondrial DNA

    International Nuclear Information System (INIS)

    Monnat, R.J. Jr.; Loeb, L.A.

    1985-01-01

    Recombinant DNA techniques have been used to quantitate the amount of nucleotide sequence divergence in the mitochondrial DNA population of individual normal humans. Mitochondrial DNA was isolated from the peripheral blood lymphocytes of five normal humans and cloned in M13 mp11; 49 kilobases of nucleotide sequence information was obtained from 248 independently isolated clones from the five normal donors. Both between- and within-individual differences were identified. Between-individual differences were identified in approximately 1/200 nucleotides. In contrast, only one within-individual difference was identified in 49 kilobases of nucleotide sequence information. This high degree of mitochondrial nucleotide sequence homogeneity in human somatic cells is in marked contrast to the rapid evolutionary divergence of human mitochondrial DNA and suggests the existence of mechanisms for the concerted preservation of mammalian mitochondrial DNA sequences in single organisms.
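
    The contrast drawn in this abstract can be put into rough numbers. The sketch below uses only the figures quoted above (49 kilobases, about 1/200, one difference). Treating them as directly comparable per-nucleotide rates is a simplifying assumption of this example, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
sequenced_nt = 49_000        # 49 kilobases of sequence information
between_rate = 1 / 200       # between-individual differences per nucleotide (approximate)
within_differences = 1       # within-individual differences found in the 49 kb

# Assumption: both figures can be read as per-nucleotide rates over the same 49 kb.
within_rate = within_differences / sequenced_nt
fold_difference = between_rate / within_rate

if __name__ == "__main__":
    print(f"within-individual rate: {within_rate:.2e} per nucleotide")
    print(f"between/within ratio:   {fold_difference:.0f}")
```

    Under that assumption the between-individual rate is roughly 245 times the within-individual rate, which is the scale of homogeneity the authors point to.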

  6. Snake Genome Sequencing: Results and Future Prospects.

    Science.gov (United States)

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  7. Sequencing Cyclic Peptides by Multistage Mass Spectrometry

    Science.gov (United States)

    Mohimani, Hosein; Yang, Yu-Liang; Liu, Wei-Ting; Hsieh, Pei-Wen; Dorrestein, Pieter C.; Pevzner, Pavel A.

    2012-01-01

    Some of the most effective antibiotics (e.g., Vancomycin and Daptomycin) are cyclic peptides produced by non-ribosomal biosynthetic pathways. While hundreds of biomedically important cyclic peptides have been sequenced, the computational techniques for sequencing cyclic peptides are still in their infancy. Previous methods for sequencing peptide antibiotics and other cyclic peptides are based on Nuclear Magnetic Resonance spectroscopy, and require large amounts (milligrams) of purified materials that, for most compounds, are not possible to obtain. Recently, development of mass spectrometry based methods has provided some hope for accurate sequencing of cyclic peptides using picograms of materials. In this paper we develop a method for sequencing of cyclic peptides by multistage mass spectrometry, and show its advantages over single stage mass spectrometry. The method is tested on known and new cyclic peptides from Bacillus brevis, Dianthus superbus and Streptomyces griseus, as well as a new family of cyclic peptides produced by marine bacteria. PMID:21751357

  8. Snake Genome Sequencing: Results and Future Prospects

    Directory of Open Access Journals (Sweden)

    Harald M. I. Kerkkamp

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  9. Sequencing and comparing whole mitochondrial genomes of animals

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.; Macey, J. Robert; Medina, Monica

    2005-04-22

    Comparing complete animal mitochondrial genome sequences is becoming increasingly common for phylogenetic reconstruction and as a model for genome evolution. Not only are they much more informative than shorter sequences of individual genes for inferring evolutionary relatedness, but these data also provide sets of genome-level characters, such as the relative arrangements of genes, that can be especially powerful. We describe here the protocols commonly used for physically isolating mtDNA, for amplifying these by PCR or RCA, for cloning, sequencing, assembly, validation, and gene annotation, and for comparing both sequences and gene arrangements. On several topics, we offer general observations based on our experiences to date with determining and comparing complete mtDNA sequences.

  10. Divide and conquer: enriching environmental sequencing data.

    Directory of Open Access Journals (Sweden)

    Anne Bergeron

    2007-09-01

    In environmental sequencing projects, a mix of DNA from a whole microbial community is fragmented and sequenced, with one of the possible goals being to reconstruct partial or complete genomes of members of the community. In communities with high diversity of species, a significant proportion of the sequences do not overlap any other fragment in the sample. This problem will arise not only in situations with a relatively even distribution of many species, but also when the community in a particular environment is routinely dominated by the same few species. In the former case, no genomes may be assembled at all, while in the latter case a few dominant species in an environment will always be sequenced at high coverage to the detriment of coverage of the greater number of sparse species. Here we show that, with the same global sequencing effort, separating the species into two or more sub-communities prior to sequencing can yield a much higher proportion of sequences that can be assembled. We first use the Lander-Waterman model to show that, if the expected percentage of singleton sequences is higher than 25%, then, under the uniform distribution hypothesis, splitting the community is always a wise choice. We then construct simulated microbial communities to show that the results hold for highly non-uniform distributions. We also show that, for the distributions considered in the experiments, it is possible to estimate quite accurately the relative diversity of the two sub-communities. Given the fact that several methods exist to split microbial communities based on physical properties such as size, density, surface biochemistry, or optical properties, we strongly suggest that groups involved in environmental sequencing, and expecting high diversity, consider splitting their communities in order to maximize the information content of their sequencing effort.
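
    The singleton percentage mentioned in this abstract can be illustrated with the standard Lander-Waterman expression, in which a read overlaps no other read with probability exp(-2cσ), where c is the coverage and σ is one minus the fraction of a read required to detect an overlap. The sketch below covers a single genome only and does not reproduce the authors' splitting argument; the function names and the example numbers are assumptions.

```python
import math


def singleton_fraction(n_reads, read_len, genome_len, min_overlap=0):
    """Expected fraction of reads overlapping no other read (Lander-Waterman)."""
    coverage = n_reads * read_len / genome_len
    sigma = 1 - min_overlap / read_len
    return math.exp(-2 * coverage * sigma)


def coverage_for_singleton_fraction(fraction):
    """Coverage at which the expected singleton fraction equals `fraction` (sigma = 1)."""
    return -math.log(fraction) / 2


if __name__ == "__main__":
    # Hypothetical example: 1000 reads of 500 nt from a 1 Mb genome (coverage 0.5).
    print(round(singleton_fraction(1000, 500, 1_000_000), 3))
    # Coverage below which more than 25% of reads are expected to be singletons.
    print(round(coverage_for_singleton_fraction(0.25), 3))
```

    With σ = 1, the 25% level corresponds to a coverage of ln(4)/2, about 0.69, so under this simplified model a genome sequenced at lower coverage is expected to yield more than a quarter of its reads as singletons.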

  11. Deep sequencing analysis of small noncoding RNA and mRNA targets of the global post-transcriptional regulator, Hfq

    DEFF Research Database (Denmark)

    Sittka, A; Lucchini, S; Papenfort, K

    2008-01-01

    would be rescued by overexpression of HilD and FlhDC, and we proved this to be correct. The combination of epitope-tagging and HTPS of immunoprecipitated RNA detected the expression of many intergenic chromosomal regions of Salmonella. Our approach overcomes the limited availability of high...

  12. Sequence Matters but How Exactly? A Method for Evaluating Activity Sequences from Data

    Science.gov (United States)

    Doroudi, Shayan; Holstein, Kenneth; Aleven, Vincent; Brunskill, Emma

    2016-01-01

    How should a wide variety of educational activities be sequenced to maximize student learning? Although some experimental studies have addressed this question, educational data mining methods may be able to evaluate a wider range of possibilities and better handle many simultaneous sequencing constraints. We introduce Sequencing Constraint…

  13. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    DEFF Research Database (Denmark)

    de Souza, S J; Camargo, A A; Briones, M R

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central ...

  14. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  15. Final Report for Grant No. DE-FG02-98ER62583 ''Functional Analysis of the Genome Sequence of Deinococcus radiodurans''

    International Nuclear Information System (INIS)

    Daly, Michael J.

    2003-01-01

    because of the positive correlation between desiccation- and radiation-resistance. Further, the D. radiodurans genome is very rich in repetitive sequences, namely IS-like transposons and small intergenic repeats. In combination, these observations suggest that several different biological mechanisms contribute to the multiple DNA repair-dependent phenotypes of this organism. The genetic mechanisms underlying the extreme radiation resistance of this organism are now being characterized experimentally using a whole genome microarray

  16. Phylogenetic relationships in Taxodiaceae and Cupressaceae sensu stricto based on matK gene, chlL gene, trnL-trnF IGS region, and trnL intron sequences.

    Science.gov (United States)

    Kusumi, J; Tsumura, Y; Yoshimaru, H; Tachida, H

    2000-10-01

    Nucleotide sequences from four chloroplast genes, the matK, chlL, intergenic spacer (IGS) region between trnL and trnF, and an intron of trnL, were determined from all species of Taxodiaceae and five species of Cupressaceae sensu stricto (s.s.). Phylogenetic trees were constructed using the maximum parsimony and the neighbor-joining methods with Cunninghamia as an outgroup. These analyses provided greater resolution of relationships among genera and higher bootstrap supports for clades compared to previous analyses. Results indicate that Taiwania diverged first, and then Athrotaxis diverged from the remaining genera. Metasequoia, Sequoia, and Sequoiadendron form a clade. Taxodium and Glyptostrobus form a clade, which is the sister to Cryptomeria. Cupressaceae s.s. are derived from within Taxodiaceae, being the most closely related to the Cryptomeria/Taxodium/Glyptostrobus clade. These relationships are consistent with previous morphological groupings and the analyses of molecular data. In addition, we found acceleration of evolutionary rates in Cupressaceae s.s. Possible causes for the acceleration are discussed.

  17. A comparative evaluation of sequence classification programs

    Directory of Open Access Journals (Sweden)

    Bazinet Adam L

    2012-05-01

    Background: A fundamental problem in modern genomics is to taxonomically or functionally classify DNA sequence fragments derived from environmental sampling (i.e., metagenomics). Several different methods have been proposed for doing this effectively and efficiently, and many have been implemented in software. In addition to varying their basic algorithmic approach to classification, some methods screen sequence reads for 'barcoding genes' like 16S rRNA, or various types of protein-coding genes. Due to the sheer number and complexity of methods, it can be difficult for a researcher to choose one that is well-suited for a particular analysis. Results: We divided the very large number of programs that have been released in recent years for solving the sequence classification problem into three main categories based on the general algorithm they use to compare a query sequence against a database of sequences. We also evaluated the performance of the leading programs in each category on data sets whose taxonomic and functional composition is known. Conclusions: We found significant variability in classification accuracy, precision, and resource consumption of sequence classification programs when used to analyze various metagenomics data sets. However, we observe some general trends and patterns that will be useful to researchers who use sequence classification programs.

  18. CATEGORIZATION OF EVENT SEQUENCES FOR LICENSE APPLICATION

    Energy Technology Data Exchange (ETDEWEB)

    G.E. Ragan; P. Mecheret; D. Dexheimer

    2005-04-14

    The purposes of this analysis are: (1) Categorize (as Category 1, Category 2, or Beyond Category 2) internal event sequences that may occur before permanent closure of the repository at Yucca Mountain. (2) Categorize external event sequences that may occur before permanent closure of the repository at Yucca Mountain. This includes examining DBGM-1 seismic classifications and upgrading to DBGM-2, if appropriate, to ensure Beyond Category 2 categorization. (3) State the design and operational requirements that are invoked to make the categorization assignments valid. (4) Indicate the amount of material put at risk by Category 1 and Category 2 event sequences. (5) Estimate frequencies of Category 1 event sequences at the maximum capacity and receipt rate of the repository. (6) Distinguish occurrences associated with normal operations from event sequences. It is beyond the scope of the analysis to propose design requirements that may be required to control radiological exposure associated with normal operations. (7) Provide a convenient compilation of the results of the analysis in tabular form. The results of this analysis are used as inputs to the consequence analyses in an iterative design process that is depicted in Figure 1. Categorization of event sequences for permanent retrieval of waste from the repository is beyond the scope of this analysis. Cleanup activities that take place after an event sequence and other responses to abnormal events are also beyond the scope of the analysis.

  19. Exploration of noncoding sequences in metagenomes.

    Directory of Open Access Journals (Sweden)

    Fabián Tobar-Tosse

    Environment-dependent genomic features have been defined for different metagenomes, whose genes and their associated processes are related to specific environments. Identification of ORFs and their functional categories are the most common methods for association between functional and environmental features. However, this analysis based on finding ORFs misses noncoding sequences and, therefore, some metagenome regulatory or structural information could be discarded. In this work we analyzed 23 whole metagenomes, including coding and noncoding sequences using the following sequence patterns: (G+C) content, Codon Usage (Cd), Trinucleotide Usage (Tn), and functional assignments for ORF prediction. Herein, we present evidence of a high proportion of noncoding sequences discarded in common similarity-based methods in metagenomics, and the kind of relevant information present in those. We found a high density of trinucleotide repeat sequences (TRS) in noncoding sequences, with a regulatory and adaptive function for metagenome communities. We present associations between trinucleotide values and gene function, where metagenome clustering correlates with microorganism adaptations and kinds of metagenomes. We propose here that noncoding sequences have relevant information to describe metagenomes that could be considered in a whole metagenome analysis in order to improve their organization, classification protocols, and their relation with the environment.
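
    Two of the sequence patterns named in this abstract, (G+C) content and trinucleotide usage, can be computed as in the sketch below. The abstract does not say how the authors define trinucleotide usage; counting every overlapping three-base window and ignoring windows with ambiguous bases are assumptions of this example, as are the function names.

```python
from collections import Counter


def gc_content(seq):
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / total


def trinucleotide_usage(seq):
    """Relative frequency of each trinucleotide over all overlapping windows."""
    seq = seq.upper()
    windows = (seq[i:i + 3] for i in range(len(seq) - 2))
    counts = Counter(w for w in windows if set(w) <= set("ACGT"))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


if __name__ == "__main__":
    fragment = "CAGCAGCAG"  # hypothetical fragment containing a CAG repeat
    print(round(gc_content(fragment), 3))
    print(trinucleotide_usage(fragment))
```

    Because these measures need no ORF prediction, they apply equally to coding and noncoding fragments, which is the property the abstract relies on.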

  20. CATEGORIZATION OF EVENT SEQUENCES FOR LICENSE APPLICATION

    International Nuclear Information System (INIS)

    G.E. Ragan; P. Mecheret; D. Dexheimer

    2005-01-01

    The purposes of this analysis are: (1) Categorize (as Category 1, Category 2, or Beyond Category 2) internal event sequences that may occur before permanent closure of the repository at Yucca Mountain. (2) Categorize external event sequences that may occur before permanent closure of the repository at Yucca Mountain. This includes examining DBGM-1 seismic classifications and upgrading to DBGM-2, if appropriate, to ensure Beyond Category 2 categorization. (3) State the design and operational requirements that are invoked to make the categorization assignments valid. (4) Indicate the amount of material put at risk by Category 1 and Category 2 event sequences. (5) Estimate frequencies of Category 1 event sequences at the maximum capacity and receipt rate of the repository. (6) Distinguish occurrences associated with normal operations from event sequences. It is beyond the scope of the analysis to propose design requirements that may be required to control radiological exposure associated with normal operations. (7) Provide a convenient compilation of the results of the analysis in tabular form. The results of this analysis are used as inputs to the consequence analyses in an iterative design process that is depicted in Figure 1. Categorization of event sequences for permanent retrieval of waste from the repository is beyond the scope of this analysis. Cleanup activities that take place after an event sequence and other responses to abnormal events are also beyond the scope of the analysis

  1. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers, i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a simulated tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time on-site DNA sequencing, thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  2. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...

  3. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb, and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  4. Rapid and Accurate Sequencing of Enterovirus Genomes Using MinION Nanopore Sequencer.

    Science.gov (United States)

    Wang, Ji; Ke, Yue Hua; Zhang, Yong; Huang, Ke Qiang; Wang, Lei; Shen, Xin Xin; Dong, Xiao Ping; Xu, Wen Bo; Ma, Xue Jun

    2017-10-01

    Knowledge of an enterovirus genome sequence is very important in epidemiological investigation to identify transmission patterns and ascertain the extent of an outbreak. The MinION sequencer is increasingly used to sequence various viral pathogens in many clinical situations because of its long reads, portability, real-time accessibility of sequenced data, and very low initial costs. However, information is lacking on MinION sequencing of enterovirus genomes. In this proof-of-concept study using Enterovirus 71 (EV71) and Coxsackievirus A16 (CA16) strains as examples, we established an amplicon-based whole genome sequencing method using MinION. We explored the accuracy, minimum sequencing time, discrimination and high-throughput sequencing ability of MinION, and compared its performance with Sanger sequencing. Within the first minute (min) of sequencing, the accuracy of MinION was 98.5% for the single EV71 strain and 94.12%-97.33% for 10 genetically-related CA16 strains. In as little as 14 min, 99% identity was reached for the single EV71 strain, and in 17 min (on average), 99% identity was achieved for 10 CA16 strains in a single run. MinION is suitable for whole genome sequencing of enteroviruses with sufficient accuracy and fine discrimination and has the potential as a fast, reliable and convenient method for routine use. Copyright © 2017 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.

  5. Computer simulation of replacement sequences in copper

    International Nuclear Information System (INIS)

    Schiffgens, J.O.; Schwartz, D.W.; Ariyasu, R.G.; Cascadden, S.E.

    1978-01-01

    Results of computer simulations of , , and replacement sequences in copper are presented, including displacement thresholds, focusing energies, energy losses per replacement, and replacement sequence lengths. These parameters are tabulated for six interatomic potentials and shown to vary in a systematic way with potential stiffness and range. Comparisons of results from calculations made with ADDES, a quasi-dynamical code, and COMENT, a dynamical code, show excellent agreement, demonstrating that the former can be calibrated and used satisfactorily in the analysis of low energy displacement cascades. Upper limits on , , and replacement sequences were found to be approximately 10, approximately 30, and approximately 14 replacements, respectively. (author)

  6. Whole-genome sequencing of veterinary pathogens

    DEFF Research Database (Denmark)

    Ronco, Troels

    -electrophoresis and single-locus sequencing has been widely used to characterize such types of veterinary pathogens. However, DNA sequencing techniques have become fast and cost effective in recent years and whole-genome sequencing data provide a much higher discriminative power and reproducibility than any...... genetic background. This indicates that dairy cows can be natural carriers of S. aureus subtypes that in certain cases lead to CM. A group of isolates that mostly belonged to ST151 carried three pathogenicity islands that were primarily found in this group. The prevalence of resistance genes was generally...

  7. Spreadsheet macros for coloring sequence alignments.

    Science.gov (United States)

    Haygood, M G

    1993-12-01

    This article describes a set of Microsoft Excel macros designed to color amino acid and nucleotide sequence alignments for review and preparation of visual aids. The colored alignments can then be modified to emphasize features of interest. Procedures for importing and coloring sequences are described. The macro file adds a new menu to the menu bar containing sequence-related commands to enable users unfamiliar with Excel to use the macros more readily. The macros were designed for use with Macintosh computers but will also run with the DOS version of Excel.

  8. Tensor products of higher almost split sequences

    OpenAIRE

    Pasquali, Andrea

    2015-01-01

    We investigate how the higher almost split sequences over a tensor product of algebras are related to those over each factor. Herschend and Iyama gave a precise criterion for when the tensor product of an $n$-representation finite algebra and an $m$-representation finite algebra is $(n+m)$-representation finite. In this case we give a complete description of the higher almost split sequences over the tensor product by expressing every higher almost split sequence as the mapping cone of a suit...

  9. Arbitrarily accurate twin composite π-pulse sequences

    Science.gov (United States)

    Torosov, Boyan T.; Vitanov, Nikolay V.

    2018-04-01

    We present three classes of symmetric broadband composite pulse sequences. The composite phases are given by analytic formulas (rational fractions of π) valid for any number of constituent pulses. The transition probability is expressed by simple analytic formulas and the order of pulse area error compensation grows linearly with the number of pulses. Therefore, any desired compensation order can be produced by an appropriate composite sequence; in this sense, they are arbitrarily accurate. These composite pulses perform equally well as or better than previously published ones. Moreover, the current sequences are more flexible as they allow total pulse areas of arbitrary integer multiples of π.

  10. Modeling of Prepregs during Automated Draping Sequences

    DEFF Research Database (Denmark)

    Krogh, Christian; Glud, Jens Ammitzbøll; Jakobsen, Johnny

    2017-01-01

    algorithm used to generate target points on the mold which are used as input to a draping sequence planner. The draping sequence planner prescribes the displacement history for each gripper in the drape tool and these displacements are then applied to each gripper in a transient model of the draping...... sequence. The model is based on a transient finite element analysis with the material’s constitutive behavior currently being approximated as linear elastic orthotropic. In-plane tensile and bias-extension tests as well as bending tests are conducted and used as input for the model. The virtual draping...

  11. Deep-sequencing protocols influence the results obtained in small-RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Joern Toedling

    Full Text Available Second-generation sequencing is a powerful method for identifying and quantifying small-RNA components of cells. However, little attention has been paid to the effects of the choice of sequencing platform and library preparation protocol on the results obtained. We present a thorough comparison of small-RNA sequencing libraries generated from the same embryonic stem cell lines, using different sequencing platforms, which represent the three major second-generation sequencing technologies, and protocols. We have analysed and compared the expression of microRNAs, as well as populations of small RNAs derived from repetitive elements. Despite the fact that different libraries display a good correlation between sequencing platforms, qualitative and quantitative variations in the results were found, depending on the protocol used. Thus, when comparing libraries from different biological samples, it is strongly recommended to use the same sequencing platform and protocol in order to ensure the biological relevance of the comparisons.

  12. Rapid Multiplex Small DNA Sequencing on the MinION Nanopore Sequencing Platform

    Directory of Open Access Journals (Sweden)

    Shan Wei

    2018-05-01

    Full Text Available Real-time sequencing of short DNA reads has a wide variety of clinical and research applications including screening for mutations, target sequences and aneuploidy. We recently demonstrated that MinION, a nanopore-based DNA sequencing device the size of a USB drive, could be used for short-read DNA sequencing. In this study, an ultra-rapid multiplex library preparation and sequencing method for the MinION is presented and applied to accurately test normal diploid and aneuploidy samples’ genomic DNA in under three hours, including library preparation and sequencing. This novel method shows great promise as a clinical diagnostic test for applications requiring rapid short-read DNA sequencing.

  13. Probabilistic Motor Sequence Yields Greater Offline and Less Online Learning than Fixed Sequence.

    Science.gov (United States)

    Du, Yue; Prashad, Shikha; Schoenbrun, Ilana; Clark, Jane E

    2016-01-01

    It is well acknowledged that motor sequences can be learned quickly through online learning. Subsequently, the initial acquisition of a motor sequence is boosted or consolidated by offline learning. However, little is known whether offline learning can drive the fast learning of motor sequences (i.e., initial sequence learning in the first training session). To examine offline learning in the fast learning stage, we asked four groups of young adults to perform the serial reaction time (SRT) task with either a fixed or probabilistic sequence and with or without preliminary knowledge (PK) of the presence of a sequence. The sequence and PK were manipulated to emphasize either procedural (probabilistic sequence; no preliminary knowledge (NPK)) or declarative (fixed sequence; with PK) memory that were found to either facilitate or inhibit offline learning. In the SRT task, there were six learning blocks with a 2 min break between each consecutive block. Throughout the session, stimuli followed the same fixed or probabilistic pattern except in Block 5, in which stimuli appeared in a random order. We found that PK facilitated the learning of a fixed sequence, but not a probabilistic sequence. In addition to overall learning measured by the mean reaction time (RT), we examined the progressive changes in RT within and between blocks (i.e., online and offline learning, respectively). It was found that the two groups who performed the fixed sequence, regardless of PK, showed greater online learning than the other two groups who performed the probabilistic sequence. The groups who performed the probabilistic sequence, regardless of PK, did not display online learning, as indicated by a decline in performance within the learning blocks. However, they did demonstrate remarkably greater offline improvement in RT, which suggests that they are learning the probabilistic sequence offline. 
These results suggest that in the SRT task, the fast acquisition of a motor sequence is driven
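    The within-block and between-block reaction-time changes described in this abstract can be illustrated with a short sketch. It is only an illustration under stated assumptions: the abstract does not say how the changes were computed, so the function name `online_offline`, the `window` parameter, and the choice to compare mean RTs of the first and last few trials are all hypothetical.

```python
from statistics import mean


def online_offline(blocks, window=2):
    """Return (online, offline) RT improvements in milliseconds.

    blocks : list of lists, each holding the reaction times of one block.
    window : number of trials averaged at the start and end of a block
             (an assumed choice, not taken from the study).

    Positive values mean the reaction time got shorter (improvement).
    """
    # Online learning: change from the start to the end of the same block.
    online = [mean(b[:window]) - mean(b[-window:]) for b in blocks]
    # Offline learning: change across the break between consecutive blocks.
    offline = [
        mean(prev[-window:]) - mean(nxt[:window])
        for prev, nxt in zip(blocks, blocks[1:])
    ]
    return online, offline


# Made-up reaction times for two blocks, purely for demonstration.
example_blocks = [[500, 480, 470, 460], [440, 430, 450, 460]]
online_gain, offline_gain = online_offline(example_blocks)
```

    With these made-up numbers the second block shows a negative online value (performance declines within the block) and a positive offline value across the break, which is the qualitative pattern the abstract reports for the probabilistic-sequence groups.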

  14. Optimization of a sequence of reactors

    DEFF Research Database (Denmark)

    Vidal, Rene Victor Valqui

    1991-01-01

    Concerns the optimal production of sulphuric acid in a sequence of reactors. Using a suitable approximation to the objective function, this problem can easily be solved using the maximum principle. A numerical example documents the applicability of the suggested approach...

  15. Fluency First: Reversing the Traditional ESL Sequence.

    Science.gov (United States)

    MacGowan-Gilhooly, Adele

    1991-01-01

    Describes an ESL department's whole language approach to writing and reading, replacing its traditional grammar-based ESL instructional sequence. Reports the positive quantitative and qualitative results of the first three years of using the new approach. (KEH)

  16. Expressed sequence tags (ESTs) and single nucleotide ...

    African Journals Online (AJOL)

    SERVER

    2008-02-19

    Feb 19, 2008 ... the discovery of the DNA, a new area of modern plant biotechnology begun. In plant ... Marker Assisted Breeding and Sequence Tagged Sites. (STS) are all in use in modern ...... and behaviour in the honey bee. Genome Res.

  17. DNA Replication Profiling Using Deep Sequencing.

    Science.gov (United States)

    Saayman, Xanita; Ramos-Pérez, Cristina; Brown, Grant W

    2018-01-01

    Profiling of DNA replication during progression through S phase allows a quantitative snap-shot of replication origin usage and DNA replication fork progression. We present a method for using deep sequencing data to profile DNA replication in S. cerevisiae.

  18. Supervised Sequence Labelling with Recurrent Neural Networks

    CERN Document Server

    Graves, Alex

    2012-01-01

    Supervised sequence labelling is a vital area of machine learning, encompassing tasks such as speech, handwriting and gesture recognition, protein secondary structure prediction and part-of-speech tagging. Recurrent neural networks are powerful sequence learning tools—robust to input noise and distortion, able to exploit long-range contextual information—that would seem ideally suited to such problems. However their role in large-scale sequence labelling systems has so far been auxiliary.    The goal of this book is a complete framework for classifying and transcribing sequential data with recurrent neural networks only. Three main innovations are introduced in order to realise this goal. Firstly, the connectionist temporal classification output layer allows the framework to be trained with unsegmented target sequences, such as phoneme-level speech transcriptions; this is in contrast to previous connectionist approaches, which were dependent on error-prone prior segmentation. Secondly, multidimensional...

  19. The International Nucleotide Sequence Database Collaboration.

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

    2011-01-01

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

  20. Interference management using direct sequence spread spectrum ...

    African Journals Online (AJOL)

    Interference management using direct sequence spread spectrum (DSSS) technique ... Journal of Fundamental and Applied Sciences ... Keywords: DSSS, LTE network; Wi-Fi network; SINR; interference management and interference power.

  1. Characterizing leader sequences of CRISPR loci

    DEFF Research Database (Denmark)

    Alkhnbashi, Omer; Shah, Shiraz Ali; Garrett, Roger Antony

    2016-01-01

    The CRISPR-Cas system is an adaptive immune system in many archaea and bacteria, which provides resistance against invading genetic elements. The first phase of CRISPR-Cas immunity is called adaptation, in which small DNA fragments are excised from genetic elements and are inserted into a CRISPR...... array generally adjacent to its so called leader sequence at one end of the array. It has been shown that transcription initiation and adaptation signals of the CRISPR array are located within the leader. However, apart from promoters, there is very little knowledge of sequence or structural motifs...... sequences by focusing on the consensus repeat of the adjacent CRISPR array and weak upstream conservation signals. We applied our tool to the analysis of a comprehensive genomic database and identified several characteristic properties of leader sequences specific to archaea and bacteria, ranging from...

  2. Sequencing Information Management System (SIMS). Final report

    Energy Technology Data Exchange (ETDEWEB)

    Fields, C.

    1996-02-15

    A feasibility study to develop a requirements analysis and functional specification for a data management system for large-scale DNA sequencing laboratories resulted in a functional specification for a Sequencing Information Management System (SIMS). This document reports the results of this feasibility study, and includes a functional specification for a SIMS relational schema. The SIMS is an integrated information management system that supports data acquisition, management, analysis, and distribution for DNA sequencing laboratories. The SIMS provides ad hoc query access to information on the sequencing process and its results, and partially automates the transfer of data between laboratory instruments, analysis programs, technical personnel, and managers. The SIMS user interfaces are designed for use by laboratory technicians, laboratory managers, and scientists. The SIMS is designed to run in a heterogeneous, multiplatform environment in a client/server mode. The SIMS communicates with external computational and data resources via the internet.

  3. Galaxy LIMS for next-generation sequencing

    NARCIS (Netherlands)

    Scholtalbers, J.; Rossler, J.; Sorn, P.; Graaf, J. de; Boisguerin, V.; Castle, J.; Sahin, U.

    2013-01-01

    SUMMARY: We have developed a laboratory information management system (LIMS) for a next-generation sequencing (NGS) laboratory within the existing Galaxy platform. The system provides lab technicians standard and customizable sample information forms, barcoded submission forms, tracking of input

  4. Sequencing Closterium moniliferum: Future prospects in nuclear ...

    African Journals Online (AJOL)

    Akanksha Pandey

    2012-10-06

    Oct 6, 2012 ... Abstract Genome sequencing can play a vital role in health and several other domains such as in .... ter have imposed some serious questions over the security is- ... strontium and barium from aqueous environment and store.

  5. Generalized locally Toeplitz sequences theory and applications

    CERN Document Server

    Garoni, Carlo

    2017-01-01

    Based on their research experience, the authors propose a reference textbook in two volumes on the theory of generalized locally Toeplitz sequences and their applications. This first volume focuses on the univariate version of the theory and the related applications in the unidimensional setting, while the second volume, which addresses the multivariate case, is mainly devoted to concrete PDE applications. This book systematically develops the theory of generalized locally Toeplitz (GLT) sequences and presents some of its main applications, with a particular focus on the numerical discretization of differential equations (DEs). It is the first book to address the relatively new field of GLT sequences, which occur in numerous scientific applications and are especially dominant in the context of DE discretizations. Written for applied mathematicians, engineers, physicists, and scientists who (perhaps unknowingly) encounter GLT sequences in their research, it is also of interest to those working in the fields of...

  6. High resolution sequence stratigraphy in China

    International Nuclear Information System (INIS)

    Zhang Shangfeng; Zhang Changmin; Yin Yanshi; Yin Taiju

    2008-01-01

    Since high resolution sequence stratigraphy was introduced into China by DENG Hong-wen in 1995, it has passed through two development stages there: an initial stage of theoretical research combined with the development of theory and application, and a stage of theoretical maturity and wide application, which it is now entering. Practice has shown that high resolution sequence stratigraphy plays an increasingly important role in the exploration and development of oil and gas in China's continental oil-bearing basins, and the research field has spread to the exploration of coal, uranium and other stratabound deposits. However, the theory of high resolution sequence stratigraphy still has some shortcomings and should be improved in many respects. The authors point out that high resolution sequence stratigraphy should be characterized quantitatively and modelled using computer techniques. (authors)

  7. Genetic sequences derived from suppression subtractive ...

    African Journals Online (AJOL)

    STORAGESEVER

    2008-06-17

    Jun 17, 2008 ... their possible roles in Xanthomonas albilineans ... Technology, P. O. Box 1334, Durban 4000, Republic of South Africa. Accepted 4 ... Clones selected were sequenced (using a Perkin Elmer ABI PRISM Dye terminator cycle.

  8. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base level incorporation event across a massive number of reactions (on the order of millions versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  9. Nursing Student Perceptions Regarding Simulation Experience Sequencing.

    Science.gov (United States)

    Woda, Aimee A; Gruenke, Theresa; Alt-Gehrman, Penny; Hansen, Jamie

    2016-09-01

    The use of simulated learning experiences (SLEs) has increased within nursing curricula with positive learning outcomes for nursing students. The purpose of this study is to explore nursing students' perceptions of their clinical decision making (CDM) related to the block sequencing of different patient care experiences, SLEs versus hospital-based learning experiences (HLEs). A qualitative descriptive design used open-ended survey questions to generate information about the block sequencing of SLEs and its impact on nursing students' perceived CDM. Three themes emerged from the data: Preexperience Anxiety, Real-Time Decision Making, and Increased Patient Care Experiences. Nursing students identified that having SLEs prior to HLEs provided several benefits. Even when students preferred SLEs prior to HLEs, the sequence did not impact their CDM. This suggests that alternating block sequencing can be used without impacting the students' perceptions of their ability to make decisions. [J Nurs Educ. 2016;55(9):528-532.]. Copyright 2016, SLACK Incorporated.

  10. Sequence finishing and mapping of Drosophila melanogaster heterochromatin

    Energy Technology Data Exchange (ETDEWEB)

    Hoskins, Roger A.; Carlson, Joseph W.; Kennedy, Cameron; Acevedo, David; Evans-Holm, Martha; Frise, Erwin; Wan, Kenneth H.; Park, Soo; Mendez-Lago, Maria; Rossi, Fabrizio; Villasante, Alfredo; Dimitri, Patrizio; Karpen, Gary H.; Celniker, Susan E.

    2007-06-15

    Genome sequences for most metazoans are incomplete due to the presence of repeated DNA in the pericentromeric heterochromatin. The heterochromatic regions of D. melanogaster contain 20 Mb of sequence amenable to mapping, sequence assembly and finishing. Here we describe the generation of 15 Mb of finished or improved heterochromatic sequence using available clone resources and assembly and mapping methods. We also constructed a BAC-based physical map that spans approximately 13 Mb of the pericentromeric heterochromatin, and a cytogenetic map that positions approximately 11 Mb of BAC contigs and sequence scaffolds in specific chromosomal locations. The integrated sequence assembly and maps greatly improve our understanding of the structure and composition of this poorly understood fraction of a metazoan genome and provide a framework for functional analyses.

  11. Phylogeography of Quercus variabilis Based on Chloroplast DNA Sequence in East Asia: Multiple Glacial Refugia and Mainland-Migrated Island Populations

    Science.gov (United States)

    Kang, Hongzhang; Sun, Xiao; Yin, Shan; Du, Hongmei; Yamanaka, Norikazu; Gapare, Washington; Wu, Harry X.; Liu, Chunjiang

    2012-01-01

    The biogeographical relationships between far-separated populations, in particular, those in the mainland and islands, remain unclear for widespread species in eastern Asia where the current distribution of plants was greatly influenced by the Quaternary climate. Deciduous Oriental oak (Quercus variabilis) is one of the most widely distributed species in eastern Asia. In this study, leaf material of 528 Q. variabilis trees from 50 populations across the whole distribution (Mainland China, Korea Peninsular as well as Japan, Zhoushan and Taiwan Islands) was collected, and three cpDNA intergenic spacer fragments were sequenced using universal primers. A total of 26 haplotypes were detected, and it showed a weak phylogeographical structure in eastern Asia populations at species level, however, in the central-eastern region of Mainland China, the populations had more haplotypes than those in other regions, with a significant phylogeographical structure (N_ST = 0.751 > G_ST = 0.690, P < 0.05). Q. variabilis displayed high interpopulation and low intrapopulation genetic diversity across the distribution range. Both unimodal mismatch distribution and significant negative Fu's F_S indicated a demographic expansion of Q. variabilis populations in East Asia. A fossil calibrated phylogenetic tree showed a rapid speciation during Pleistocene, with a population augment occurred in Middle Pleistocene. Both diversity patterns and ecological niche modelling indicated there could be multiple glacial refugia and possible bottleneck or founder effects occurred in the southern Japan. We dated major spatial expansion of Q. variabilis population in eastern Asia to the last glacial cycle(s), a period with sea-level fluctuations and land bridges in East China Sea as possible dispersal corridors. This study showed that geographical heterogeneity combined with climate and sea-level changes have shaped the genetic structure of this wide-ranging tree species in East Asia. PMID:23115642

  12. Phylogeography of Quercus variabilis based on chloroplast DNA sequence in East Asia: multiple glacial refugia and Mainland-migrated island populations.

    Directory of Open Access Journals (Sweden)

    Dongmei Chen

    Full Text Available The biogeographical relationships between far-separated populations, in particular, those in the mainland and islands, remain unclear for widespread species in eastern Asia where the current distribution of plants was greatly influenced by the Quaternary climate. Deciduous Oriental oak (Quercus variabilis) is one of the most widely distributed species in eastern Asia. In this study, leaf material of 528 Q. variabilis trees from 50 populations across the whole distribution (Mainland China, Korea Peninsular as well as Japan, Zhoushan and Taiwan Islands) was collected, and three cpDNA intergenic spacer fragments were sequenced using universal primers. A total of 26 haplotypes were detected, and it showed a weak phylogeographical structure in eastern Asia populations at species level, however, in the central-eastern region of Mainland China, the populations had more haplotypes than those in other regions, with a significant phylogeographical structure (N_ST = 0.751 > G_ST = 0.690, P < 0.05). Q. variabilis displayed high interpopulation and low intrapopulation genetic diversity across the distribution range. Both unimodal mismatch distribution and significant negative Fu's F_S indicated a demographic expansion of Q. variabilis populations in East Asia. A fossil calibrated phylogenetic tree showed a rapid speciation during Pleistocene, with a population augment occurred in Middle Pleistocene. Both diversity patterns and ecological niche modelling indicated there could be multiple glacial refugia and possible bottleneck or founder effects occurred in the southern Japan. We dated major spatial expansion of Q. variabilis population in eastern Asia to the last glacial cycle(s), a period with sea-level fluctuations and land bridges in East China Sea as possible dispersal corridors. This study showed that geographical heterogeneity combined with climate and sea-level changes have shaped the genetic structure of this wide-ranging tree species in East Asia.

  13. The effect of crop sequences on soil microbial, chemical and physical indicators and its relationship with soybean sudden death syndrome (complex of Fusarium species)

    Directory of Open Access Journals (Sweden)

    Carolina Perez-Brandan

    2013-12-01

    Full Text Available The effect of crop sequences on soil quality indicators and its relationship with sudden death syndrome (SDS, a complex of Fusarium species) was evaluated by physical, chemical, biochemical and molecular techniques. Regarding physical aspects, soybean/maize and maize monoculture exhibited the highest stable aggregate level, with values 41% and 43% higher than in soybean monoculture, respectively, and 133% higher than in bean monoculture. Bulk density (BD) was higher in soybean monoculture, being 4% higher than in bean monoculture. The chemical parameters organic matter, total N, P, K, Mg, Ca, and water holding capacity also indicated that soybean/maize and maize monoculture improved soil quality. Fungal and bacterial community fingerprints generated using Terminal Restriction Fragment Length Polymorphism analysis of intergenic transcribed spacer regions of rRNA genes and 16S rRNA genes, respectively, indicated a clear separation between the rotations. Fatty acid profiles evaluated by FAME showed that bean monoculture had higher biomass of Gram (+) bacteria and stress indicators than maize monoculture, while the soybean/maize system showed a significant increase in total microbial biomass (total FAMEs content) in comparison with soybean and bean monoculture. The incidence of SDS (Fusarium crassistipitatum) was markedly higher (15%) under soybean monoculture than when soybean was grown in rotation with maize. In the present work, soil microbial properties were improved under soybean/maize relative to continuous soybean. The improvement of soil health was one of the main causes for the reduction of disease pressure and crop yield improvement due to the benefits that crop rotation produces for soil quality.

  14. Sequencing and annotation of the chloroplast DNAs and identification of polymorphisms distinguishing normal male-fertile and male-sterile cytoplasms of onion.

    Science.gov (United States)

    von Kohn, Christopher; Kiełkowska, Agnieszka; Havey, Michael J

    2013-12-01

    Male-sterile (S) cytoplasm of onion is an alien cytoplasm introgressed into onion in antiquity and is widely used for hybrid seed production. Owing to the biennial generation time of onion, classical crossing takes at least 4 years to classify cytoplasms as S or normal (N) male-fertile. Molecular markers in the organellar DNAs that distinguish N and S cytoplasms are useful to reduce the time required to classify onion cytoplasms. In this research, we completed next-generation sequencing of the chloroplast DNAs of N- and S-cytoplasmic onions; we assembled and annotated the genomes in addition to identifying polymorphisms that distinguish these cytoplasms. The sizes (153 538 and 153 355 base pairs) and GC contents (36.8%) were very similar for the chloroplast DNAs of N and S cytoplasms, respectively, as expected given their close phylogenetic relationship. The size difference was primarily due to small indels in intergenic regions and a deletion in the accD gene of N-cytoplasmic onion. The structures of the onion chloroplast DNAs were similar to those of most land plants with large and small single copy regions separated by inverted repeats. Twenty-eight single nucleotide polymorphisms, two polymorphic restriction-enzyme sites, and one indel distributed across 20 chloroplast genes in the large and small single copy regions were selected and validated using diverse onion populations previously classified as N or S cytoplasmic using restriction fragment length polymorphisms. Although cytoplasmic male sterility is likely associated with the mitochondrial DNA, maternal transmission of the mitochondrial and chloroplast DNAs allows for polymorphisms in either genome to be useful for classifying onion cytoplasms to aid the development of hybrid onion cultivars.

  15. The effect of crop sequences on soil microbial, chemical and physical indicators and its relationship with soybean sudden death syndrome (complex of Fusarium species)

    Energy Technology Data Exchange (ETDEWEB)

    Perez-Brandan, C.; Arzeno, J. L.; Huidobro, J.; Conforto, C.; Grumberg, B.; Hilton, S.; Bending, G. D.; Meriles, J. M.; Vargas-Gil, S.

    2014-06-01

    The effect of crop sequences on soil quality indicators and its relationship with sudden death syndrome (SDS, a complex of Fusarium species) was evaluated by physical, chemical, biochemical and molecular techniques. Regarding physical aspects, soybean/maize and maize monoculture exhibited the highest stable aggregate level, with values 41% and 43% higher than in soybean monoculture, respectively, and 133% higher than in bean monoculture. Bulk density (BD) was higher in soybean monoculture, being 4% higher than in bean monoculture. The chemical parameters organic matter, total N, P, K, Mg, Ca, and water holding capacity also indicated that soybean/maize and maize monoculture improved soil quality. Fungal and bacterial community fingerprints generated using Terminal Restriction Fragment Length Polymorphism analysis of intergenic transcribed spacer regions of rRNA genes and 16S rRNA genes, respectively, indicated a clear separation between the rotations. Fatty acid profiles evaluated by FAME showed that bean monoculture had higher biomass of Gram (+) bacteria and stress indicators than maize monoculture, while the soybean/maize system showed a significant increase in total microbial biomass (total FAMEs content) in comparison with soybean and bean monoculture. The incidence of SDS (Fusarium crassistipitatum) was markedly higher (15%) under soybean monoculture than when soybean was grown in rotation with maize. In the present work, soil microbial properties were improved under soybean/maize relative to continuous soybean. The improvement of soil health was one of the main causes for the reduction of disease pressure and crop yield improvement due to the benefits that crop rotation produces for soil quality. (Author)

  16. Ancestral sequence alignment under optimal conditions

    Directory of Open Access Journals (Sweden)

    Brown Daniel G

    2005-11-01

    Full Text Available Abstract Background Multiple genome alignment is an important problem in bioinformatics. An important subproblem used by many multiple alignment approaches is that of aligning two multiple alignments. Many popular alignment algorithms for DNA use the sum-of-pairs heuristic, where the score of a multiple alignment is the sum of its induced pairwise alignment scores. However, the biological meaning of the sum-of-pairs of pairs heuristic is not obvious. Additionally, many algorithms based on the sum-of-pairs heuristic are complicated and slow, compared to pairwise alignment algorithms. An alternative approach to aligning alignments is to first infer ancestral sequences for each alignment, and then align the two ancestral sequences. In addition to being fast, this method has a clear biological basis that takes into account the evolution implied by an underlying phylogenetic tree. In this study we explore the accuracy of aligning alignments by ancestral sequence alignment. We examine the use of both maximum likelihood and parsimony to infer ancestral sequences. Additionally, we investigate the effect on accuracy of allowing ambiguity in our ancestral sequences. Results We use synthetic sequence data that we generate by simulating evolution on a phylogenetic tree. We use two different types of phylogenetic trees: trees with a period of rapid growth followed by a period of slow growth, and trees with a period of slow growth followed by a period of rapid growth. We examine the alignment accuracy of four ancestral sequence reconstruction and alignment methods: parsimony, maximum likelihood, ambiguous parsimony, and ambiguous maximum likelihood. Additionally, we compare against the alignment accuracy of two sum-of-pairs algorithms: ClustalW and the heuristic of Ma, Zhang, and Wang. Conclusion We find that allowing ambiguity in ancestral sequences does not lead to better multiple alignments. 
Regardless of whether we use parsimony or maximum likelihood, the
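
    The sum-of-pairs score described in this abstract can be illustrated with a minimal sketch. The scoring values (match, mismatch, gap) and the function names below are assumptions chosen for illustration; they are not taken from the paper, ClustalW, or the heuristic of Ma, Zhang, and Wang.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score the pairwise alignment induced by two rows of a multiple alignment."""
    score = 0
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # a column that is gap-only for this pair contributes nothing
        if x == "-" or y == "-":
            score += gap
        elif x == y:
            score += match
        else:
            score += mismatch
    return score


def sum_of_pairs(rows, **scoring):
    """Sum the induced pairwise scores over every pair of rows."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    return sum(pair_score(a, b, **scoring) for a, b in combinations(rows, 2))


# Hypothetical three-row alignment used only to exercise the functions.
alignment = ["AC-GT", "ACAGT", "A--GT"]
print(sum_of_pairs(alignment))
```

    The number of row pairs grows quadratically with the number of sequences, which is one reason sum-of-pairs methods tend to be slower than a single pairwise alignment of two ancestral sequences.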

  17. Fibonacci difference sequence spaces for modulus functions

    Directory of Open Access Journals (Sweden)

    Kuldip Raj

    2015-05-01

    Full Text Available In the present paper we introduce Fibonacci difference sequence spaces l(F, Ƒ, p, u) and l_∞(F, Ƒ, p, u) by using a sequence of modulus functions and a new band matrix F. We also make an effort to study some inclusion relations, topological and geometric properties of these spaces. Furthermore, the alpha, beta, gamma duals and matrix transformation of the space l(F, Ƒ, p, u) are determined.

  18. Parallel motif extraction from very long sequences

    KAUST Repository

    Sahli, Majed

    2013-01-01

    Motifs are frequent patterns used to identify biological functionality in genomic sequences, periodicity in time series, or user trends in web logs. In contrast to a lot of existing work that focuses on collections of many short sequences, modern applications require mining of motifs in one very long sequence (i.e., in the order of several gigabytes). For this case, there exist statistical approaches that are fast but inaccurate; or combinatorial methods that are sound and complete. Unfortunately, existing combinatorial methods are serial and very slow. Consequently, they are limited to very short sequences (i.e., a few megabytes), small alphabets (typically 4 symbols for DNA sequences), and restricted types of motifs. This paper presents ACME, a combinatorial method for extracting motifs from a single very long sequence. ACME arranges the search space in contiguous blocks that take advantage of the cache hierarchy in modern architectures, and achieves almost an order of magnitude performance gain in serial execution. It also decomposes the search space in a smart way that allows scalability to thousands of processors with more than 90% speedup. ACME is the only method that: (i) scales to gigabyte-long sequences; (ii) handles large alphabets; (iii) supports interesting types of motifs with minimal additional cost; and (iv) is optimized for a variety of architectures such as multi-core systems, clusters in the cloud, and supercomputers. ACME reduces the extraction time for an exact-length query from 4 hours to 7 minutes on a typical workstation; handles 3 orders of magnitude longer sequences; and scales up to 16,384 cores on a supercomputer. Copyright is held by the owner/author(s).
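
    For orientation, the simplest combinatorial formulation of the problem, counting every exact substring of length k in a single sequence, can be sketched as below. This is a brute-force baseline written for illustration only; it is not ACME, and the function name and parameters are assumptions.

```python
from collections import Counter


def frequent_kmers(sequence, k, min_count):
    """Return every length-k substring that occurs at least min_count times."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: c for kmer, c in counts.items() if c >= min_count}


# Hypothetical toy sequence; real inputs in the abstract are gigabytes long.
print(frequent_kmers("ACGTACGTAC", k=4, min_count=2))
```

    This baseline is sound and complete for exact motifs, but its memory use grows with the number of distinct substrings, which hints at why careful search-space organization matters for very long sequences.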

  19. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology, which explores the capacity of computers to gain knowledge from biological data (bioinformatics). Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the regions of similarity and relationship between se...

  20. Trace maps of general substitutional sequences

    International Nuclear Information System (INIS)

    Kolar, M.; Nori, F.

    1990-01-01

    It is shown that for arbitrary n, there exists a trace map for any n-letter substitutional sequence. Trace maps are explicitly obtained for the well-known circle and Rudin-Shapiro sequences, which can be defined by means of substitution rules on three and four letters, respectively. The properties of the two trace maps and their consequences for various spectral properties are briefly discussed.

  1. Human Chromosome 7: DNA Sequence and Biology

    OpenAIRE

    Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.

    2003-01-01

    DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate gene...

  2. DESCRIPTION OF THE RHIC SEQUENCER SYSTEM

    International Nuclear Information System (INIS)

    DOTTAVIO, T.; FRAK, B.; MORRIS, J.; SATOGATA, T.; VAN ZEIJTS, J.

    2001-01-01

    The movement of the Relativistic Heavy Ion Collider (RHIC) through its various states (e.g., injection, acceleration, storage, collisions) is controlled by an application called the Sequencer. This program orchestrates most magnet and instrumentation systems and is responsible for the coordinated acquisition and saving of data from various systems. The Sequencer system, its software infrastructure, support programs, and the language used to drive it are discussed in this paper. Initial operational experience is also described.

  3. Complete Genome Sequence of Ikoma Lyssavirus

    OpenAIRE

    Marston, Denise A.; Ellis, Richard J.; Horton, Daniel L.; Kuzmin, Ivan V.; Wise, Emma L.; McElhinney, Lorraine M.; Banyard, Ashley C.; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E.; Fooks, Anthony R.

    2012-01-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isol...

  4. Chromatid interchanges at intrachromosomal telomeric DNA sequences

    International Nuclear Information System (INIS)

    Fernandez, J.L.; Vazquez-Gundin, F.; Bilbao, A.; Gosalvez, J.; Goyanes, V.

    1997-01-01

    Chinese hamster Don cells were exposed to X-rays, mitomycin C and teniposide (VM-26) to induce chromatid exchanges (quadriradials and triradials). After fluorescence in situ hybridization (FISH) of telomere sequences it was found that interstitial telomere-like DNA sequence arrays presented around five times more breakage-rearrangements than the genome overall. This high recombinogenic capacity was independent of the clastogen, suggesting that this susceptibility is not related to the initial mechanisms of DNA damage. (author)

  5. On statistical acceleration convergence of double sequences

    Directory of Open Access Journals (Sweden)

    Bipan Hazarika

    2017-04-01

    Full Text Available In this article the notion of statistical acceleration convergence of double sequences in Pringsheim's sense is introduced. We prove decomposition theorems for statistical acceleration convergence of double sequences, and some theorems related to that concept are established using four-dimensional matrix transformations. We provide some examples where the results on acceleration convergence fail to hold in the statistical case.

  6. Filovirus Glycoprotein Sequence, Structure and Virulence

    OpenAIRE

    Phillips, J. C.

    2014-01-01

    Leading Ebola subtypes exhibit a wide mortality range, here explained at the molecular level by using fractal hydropathic scaling of amino acid sequences based on protein self-organized criticality. Specific hydrophobic features in the hydrophilic mucin-like domain suffice to account for the wide mortality range. Significance statement: Ebola virus is spreading rapidly in Africa. The connection between protein amino acid sequence and mortality is identified here.

  7. Aspects of coverage in medical DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wilson Richard K

    2008-05-01

    Full Text Available Abstract Background DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.
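
    The "standard mathematical theory of haploid coverage" that the abstract extends is commonly modeled with a Poisson distribution of read depth. The sketch below shows that baseline and one deliberately simplified diploid variant. The assumption that each allele independently receives half of the reads is ours, not the paper's model, so the code should not be expected to reproduce the 26× and 21× figures quoted above.

```python
import math


def prob_depth_at_least(k, redundancy):
    """P(depth >= k) when depth is Poisson-distributed with mean `redundancy`."""
    below = sum(
        math.exp(-redundancy) * redundancy ** i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below


def haploid_coverage(redundancy):
    """Expected fraction of the sequence covered at least once."""
    return prob_depth_at_least(1, redundancy)


def both_alleles_at_least(k, redundancy):
    """Simplified diploid case (assumption): each allele independently
    receives half of the reads, so per-allele depth has mean redundancy / 2."""
    return prob_depth_at_least(k, redundancy / 2.0) ** 2


for rho in (8, 10):
    print(rho, round(haploid_coverage(rho), 5))
```

    At 8× to 10× redundancy the haploid expression exceeds 0.99, which is consistent with the calibration point mentioned in the abstract.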

  8. Determinant Representations of Sequences: A Survey

    Directory of Open Access Journals (Sweden)

    Moghaddamfar A. R.

    2014-01-01

    Full Text Available This is a survey of recent results concerning (integer) matrices whose leading principal minors are well-known sequences such as Fibonacci, Lucas, Jacobsthal and Pell (sub)sequences. There are different ways for constructing such matrices. Some of these matrices are constructed by homogeneous or nonhomogeneous recurrence relations, and others are constructed by convolution of two sequences. In this article, we will illustrate the idea of these methods by constructing some integer matrices of this type.
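
    A classical instance of the idea is the tridiagonal matrix with 1 on the diagonal, 1 on the superdiagonal and -1 on the subdiagonal: expanding its determinant along the last row gives D_n = D_{n-1} + D_{n-2}, so the leading principal minors follow the Fibonacci recurrence. The sketch below checks this numerically. The matrix is a textbook example chosen by us and is not necessarily one of the constructions covered in the survey.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant of an integer matrix via elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # a zero column below the diagonal means a singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return int(det)


def fibonacci_tridiagonal(n):
    """n x n matrix with 1 on the diagonal and superdiagonal, -1 on the subdiagonal."""
    return [
        [1 if j in (i, i + 1) else -1 if j == i - 1 else 0 for j in range(n)]
        for i in range(n)
    ]


def leading_principal_minors(matrix):
    """Determinants of the top-left k x k blocks for k = 1..n."""
    return [
        determinant([row[:k] for row in matrix[:k]])
        for k in range(1, len(matrix) + 1)
    ]


print(leading_principal_minors(fibonacci_tridiagonal(8)))
# [1, 2, 3, 5, 8, 13, 21, 34]
```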

  9. Normal form theory and spectral sequences

    OpenAIRE

    Sanders, Jan A.

    2003-01-01

    The concept of unique normal form is formulated in terms of a spectral sequence. As an illustration of this technique some results of Baider and Churchill concerning the normal form of the anharmonic oscillator are reproduced. The aim of this paper is to show that spectral sequences give us a natural framework in which to formulate normal form theory. © 2003 Elsevier Science (USA). All rights reserved.

  10. Bunches of random cross-correlated sequences

    International Nuclear Information System (INIS)

    Maystrenko, A A; Melnik, S S; Pritula, G M; Usatenko, O V

    2013-01-01

    The statistical properties of random cross-correlated sequences constructed by the convolution method (likewise referred to as the Rice or the inverse Fourier transformation) are examined. We clarify the meaning of the filtering function—the kernel of the convolution operator—and show that it is the value of the cross-correlation function which describes correlations between the initial white noise and constructed correlated sequences. The matrix generalization of this method for constructing a bunch of N cross-correlated sequences is presented. Algorithms for their generation are reduced to solving the problem of decomposition of the Fourier transform of the correlation matrix into a product of two mutually conjugate matrices. Different decompositions are considered. The limits of weak and strong correlations for the one-point probability and pair correlation functions of sequences generated by the method under consideration are studied. Special cases of heavy-tailed distributions of the generated sequences are analyzed. We show that, if the filtering function is rather smooth, the distribution function of generated variables has the Gaussian or Lévy form depending on the analytical properties of the distribution (or characteristic) functions of the initial white noise. Anisotropic properties of statistically homogeneous random sequences related to the asymmetry of a filtering function are revealed and studied. These asymmetry properties are expressed in terms of the third- or fourth-order correlation functions. Several examples of the construction of correlated chains with a predefined correlation matrix are given. (paper)
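
    The basic single-sequence version of the convolution method can be sketched as follows: unit-variance white noise is convolved with a filtering function F, and the resulting pair correlation at lag r is the sum over k of F(k)F(k+r). The kernel values, function names and use of Gaussian noise are assumptions for illustration; the matrix generalization to N cross-correlated sequences discussed in the abstract is not implemented here.

```python
import random


def correlated_sequence(kernel, length, rng):
    """Convolve Gaussian white noise with a filtering function (the kernel)."""
    noise = [rng.gauss(0.0, 1.0) for _ in range(length + len(kernel) - 1)]
    return [
        sum(f * noise[n + k] for k, f in enumerate(kernel))
        for n in range(length)
    ]


def theoretical_correlation(kernel, lag):
    """C(lag) = sum_k F(k) * F(k + lag) for unit-variance white noise."""
    lag = abs(lag)
    return sum(kernel[k] * kernel[k + lag] for k in range(len(kernel) - lag))


def sample_correlation(seq, lag):
    """Empirical autocovariance of a sequence at the given non-negative lag."""
    mean = sum(seq) / len(seq)
    n = len(seq) - lag
    return sum((seq[i] - mean) * (seq[i + lag] - mean) for i in range(n)) / n


kernel = [0.5, 0.3, 0.2]  # hypothetical filtering function
seq = correlated_sequence(kernel, 20000, random.Random(0))
for lag in range(4):
    print(lag, theoretical_correlation(kernel, lag), sample_correlation(seq, lag))
```

    With a finite kernel the correlations vanish beyond the kernel length, so longer-range correlations require a longer or more slowly decaying filtering function.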

  11. Leaf sequencing algorithms for segmented multileaf collimation

    International Nuclear Information System (INIS)

    Kamath, Srijit; Sahni, Sartaj; Li, Jonathan; Palta, Jatinder; Ranka, Sanjay

    2003-01-01

    The delivery of intensity-modulated radiation therapy (IMRT) with a multileaf collimator (MLC) requires the conversion of a radiation fluence map into a leaf sequence file that controls the movement of the MLC during radiation delivery. It is imperative that the fluence map delivered using the leaf sequence file is as close as possible to the fluence map generated by the dose optimization algorithm, while satisfying hardware constraints of the delivery system. Optimization of the leaf sequencing algorithm has been the subject of several recent investigations. In this work, we present a systematic study of the optimization of leaf sequencing algorithms for segmental multileaf collimator beam delivery and provide rigorous mathematical proofs of optimized leaf sequence settings in terms of monitor unit (MU) efficiency under most common leaf movement constraints that include minimum leaf separation constraint and leaf interdigitation constraint. Our analytical analysis shows that leaf sequencing based on unidirectional movement of the MLC leaves is as MU efficient as bidirectional movement of the MLC leaves
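
    For a single leaf pair and a one-dimensional fluence profile, the unidirectional sweep can be sketched as below. Each sample point is irradiated from the moment the leading leaf passes it until the trailing leaf reaches it, and the total monitor units equal the sum of the positive increments of the profile. This sketch ignores the minimum-separation and interdigitation constraints treated in the paper; the function and variable names are assumptions.

```python
def sweep_leaf_sequence(profile):
    """Return cumulative MU values at which the trailing and leading leaves
    pass each sample point during a one-directional sweep."""
    trailing, leading = [], []
    t = l = 0
    previous = 0
    for value in profile:
        if value >= previous:
            t += value - previous   # fluence rises: delay the trailing leaf
        else:
            l += previous - value   # fluence falls: delay the leading leaf
        trailing.append(t)
        leading.append(l)
        previous = value
    return trailing, leading


profile = [2, 5, 3, 4, 1]  # hypothetical fluence samples, in MU
trailing, leading = sweep_leaf_sequence(profile)
delivered = [t - l for t, l in zip(trailing, leading)]
print(trailing, leading, delivered, trailing[-1])
```

    Both lists are non-decreasing, so neither leaf ever has to reverse direction, and `trailing[-1]` is the total beam-on time of the sweep.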

  12. Leaf sequencing algorithms for segmented multileaf collimation

    Energy Technology Data Exchange (ETDEWEB)

    Kamath, Srijit [Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL (United States); Sahni, Sartaj [Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL (United States); Li, Jonathan [Department of Radiation Oncology, University of Florida, Gainesville, FL (United States); Palta, Jatinder [Department of Radiation Oncology, University of Florida, Gainesville, FL (United States); Ranka, Sanjay [Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL (United States)

    2003-02-07

    The delivery of intensity-modulated radiation therapy (IMRT) with a multileaf collimator (MLC) requires the conversion of a radiation fluence map into a leaf sequence file that controls the movement of the MLC during radiation delivery. It is imperative that the fluence map delivered using the leaf sequence file is as close as possible to the fluence map generated by the dose optimization algorithm, while satisfying hardware constraints of the delivery system. Optimization of the leaf sequencing algorithm has been the subject of several recent investigations. In this work, we present a systematic study of the optimization of leaf sequencing algorithms for segmental multileaf collimator beam delivery and provide rigorous mathematical proofs of optimized leaf sequence settings in terms of monitor unit (MU) efficiency under most common leaf movement constraints that include minimum leaf separation constraint and leaf interdigitation constraint. Our analytical analysis shows that leaf sequencing based on unidirectional movement of the MLC leaves is as MU efficient as bidirectional movement of the MLC leaves.

  13. A neurocomputational model of automatic sequence production.

    Science.gov (United States)

    Helie, Sebastien; Roeder, Jessica L; Vucovich, Lauren; Rünger, Dennis; Ashby, F Gregory

    2015-07-01

    Most behaviors unfold in time and include a sequence of submovements or cognitive activities. In addition, most behaviors are automatic and repeated daily throughout life. Yet, relatively little is known about the neurobiology of automatic sequence production. Past research suggests a gradual transfer from the associative striatum to the sensorimotor striatum, but a number of more recent studies challenge this role of the basal ganglia (BG) in automatic sequence production. In this article, we propose a new neurocomputational model of automatic sequence production in which the main role of the BG is to train cortical-cortical connections within the premotor areas that are responsible for automatic sequence production. The new model is used to simulate four different data sets from human and nonhuman animals, including (1) behavioral data (e.g., RTs), (2) electrophysiology data (e.g., single-neuron recordings), (3) macrostructure data (e.g., TMS), and (4) neurological circuit data (e.g., inactivation studies). We conclude with a comparison of the new model with existing models of automatic sequence production and discuss a possible new role for the BG in automaticity and its implication for Parkinson's disease.

  14. A Unified Theoretical Framework for Cognitive Sequencing.

    Science.gov (United States)

    Savalia, Tejas; Shukla, Anuj; Bapi, Raju S

    2016-01-01

    The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit vs. explicit and goal-directed vs. habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus, attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex loops) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.

  15. A Unified Theoretical Framework for Cognitive Sequencing

    Directory of Open Access Journals (Sweden)

    Tejas Savalia

    2016-11-01

    Full Text Available The capacity to sequence information is central to human performance. Sequencing ability forms the foundation stone for higher order cognition related to language and goal-directed planning. Information related to the order of items, their timing, chunking and hierarchical organization are important aspects in sequencing. Past research on sequencing has emphasized two distinct and independent dichotomies: implicit versus explicit and goal-directed versus habits. We propose a theoretical framework unifying these two streams. Our proposal relies on the brain's ability to implicitly extract statistical regularities from the stream of stimuli and, with attentional engagement, to organize sequences explicitly and hierarchically. Similarly, sequences that need to be assembled purposively to accomplish a goal require engagement of attentional processes. With repetition, these goal-directed plans become habits with concomitant disengagement of attention. Thus attention and awareness play a crucial role in the implicit-to-explicit transition as well as in how goal-directed plans become automatic habits. Cortico-subcortical loops (basal ganglia-frontal cortex and hippocampus-frontal cortex loops) mediate the transition process. We show how the computational principles of model-free and model-based learning paradigms, along with a pivotal role for attention and awareness, offer a unifying framework for these two dichotomies. Based on this framework, we make testable predictions related to the potential influence of response-to-stimulus interval (RSI) on developing awareness in implicit learning tasks.

  16. Sequence determinants of human microsatellite variability

    Directory of Open Access Journals (Sweden)

    Jakobsson Mattias

    2009-12-01

    Full Text Available Abstract Background Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. Results Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. Conclusions These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
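
    The calibration step described in the Results can be sketched as follows, under the abstract's stated assumption that differences in PCR fragment length reflect differences in repeat number. The function names, the reference values and the toy genotypes are hypothetical, and the heterozygosity shown is the plain expected heterozygosity without a small-sample correction, which may differ from the estimator used in the paper.

```python
from collections import Counter


def repeat_number(fragment_length, ref_fragment_length, ref_repeats, unit_size):
    """Convert a PCR fragment length to a repeat count, calibrated on a
    reference sequence with known fragment length and repeat count."""
    diff = fragment_length - ref_fragment_length
    if diff % unit_size != 0:
        raise ValueError("length difference is not a whole number of repeat units")
    return ref_repeats + diff // unit_size


def expected_heterozygosity(alleles):
    """1 minus the sum of squared allele frequencies."""
    n = len(alleles)
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical tetranucleotide locus: reference fragment of 200 bp with 12 repeats.
fragment_lengths = [192, 192, 200, 208]
repeats = [repeat_number(f, 200, 12, 4) for f in fragment_lengths]
print(repeats, sum(repeats) / len(repeats), expected_heterozygosity(repeats))
```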

  17. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  18. Synaptotagmin gene content of the sequenced genomes

    Directory of Open Access Journals (Sweden)

    Craxton Molly

    2004-07-01

    Full Text Available Abstract Background Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences make such studies worthwhile. Results I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts. Conclusions I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. 
Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their

  19. Molecular characterization and complete genome sequence of avian paramyxovirus type 4 prototype strain duck/Hong Kong/D3/75

    Directory of Open Access Journals (Sweden)

    Collins Peter L

    2008-10-01

    Full Text Available Abstract Background Avian paramyxoviruses (APMVs) are frequently isolated from domestic and wild birds throughout the world. All APMVs, except avian metapneumovirus, are classified in the genus Avulavirus of the family Paramyxoviridae. At present, the APMVs of genus Avulavirus are divided into nine serological types (APMV 1–9). Newcastle disease virus represents APMV-1 and is the most characterized among all APMV types. Very little is known about the molecular characteristics and pathogenicity of APMV 2–9. Results As a first step towards understanding the molecular genetics and pathogenicity of APMV-4, we have sequenced the complete genome of APMV-4 strain duck/Hong Kong/D3/75 and determined its pathogenicity in embryonated chicken eggs. The genome of APMV-4 is 15,054 nucleotides (nt) in length, which is consistent with the "rule of six". The genome contains six non-overlapping genes in the order 3'-N-P/V-M-F-HN-L-5'. The genes are flanked on either side by highly conserved transcription start and stop signals and have intergenic sequences varying in length from 9 to 42 nt. The genome contains a 55 nt leader region at the 3' end. The 5' trailer region is 17 nt, which is the shortest in the family Paramyxoviridae. Analysis of mRNAs transcribed from the P gene showed that 35% of the transcripts were edited by insertion of one non-templated G residue at an editing site leading to production of V mRNAs. No message was detected that contained insertion of two non-templated G residues, indicating that the W mRNAs are inefficiently produced in APMV-4 infected cells. The cleavage site of the F protein (DIPQR↓F) does not conform to the preferred cleavage site of the ubiquitous intracellular protease furin. However, exogenous proteases were not required for the growth of APMV-4 in cell culture, indicating that the cleavage does not depend on a furin site. Conclusion Phylogenetic analysis of the nucleotide sequences of viruses of all five genera of the family
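
    The "rule of six" mentioned in the Results states that the genome length is an even multiple of six nucleotides. A minimal check of the reported length is sketched below; the function and constant names are ours.

```python
def obeys_rule_of_six(genome_length_nt):
    """True when the genome length is an exact multiple of six nucleotides."""
    return genome_length_nt % 6 == 0


APMV4_GENOME_NT = 15054  # length reported in the abstract
print(obeys_rule_of_six(APMV4_GENOME_NT), APMV4_GENOME_NT // 6)
# True 2509
```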

  20. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Full Text Available Abstract Background Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  1. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities.

    Science.gov (United States)

    Troshin, Peter V; Postis, Vincent Lg; Ashworth, Denise; Baldwin, Stephen A; McPherson, Michael J; Barton, Geoffrey J

    2011-03-07

    Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  2. Bellerophon: a program to detect chimeric sequences in multiple sequence alignments.

    Science.gov (United States)

    Huber, Thomas; Faulkner, Geoffrey; Hugenholtz, Philip

    2004-09-22

    Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments. Bellerophon is available as an interactive web server at http://foo.maths.uq.edu.au/~huber/bellerophon.pl

  3. Clinical evaluation of further-developed MRCP sequences in comparison with standard MRCP sequences

    International Nuclear Information System (INIS)

    Hundt, W.; Scheidler, J.; Reiser, M.; Petsch, R.

    2002-01-01

    The purpose of this study was the comparison of technically improved single-shot magnetic resonance cholangiopancreatography (MRCP) sequences with standard single-shot rapid acquisition with relaxation enhancement (RARE) and half-Fourier acquired single-shot turbo spin-echo (HASTE) sequences in evaluating the normal and abnormal biliary duct system. The bile duct system of 45 patients was prospectively investigated on a 1.5-T MRI system. The investigation was performed with RARE and HASTE MR cholangiography sequences with standard and high spatial resolutions, and with a delayed-echo half-Fourier RARE (HASTE) sequence. Findings of the improved MRCP sequences were compared with the standard MRCP sequences. The level of confidence in assessing the diagnosis was divided into five groups. The Wilcoxon signed-rank test at a level of p<0.05 was applied. In 15 patients no pathology was found. The MRCP showed stenoses of the bile duct system in 10 patients and choledocholithiasis and cholecystolithiasis in 16 patients. In 12 patients a dilatation of the bile duct system was found. Comparison of the low- and high spatial resolution sequences and the short and long TE times of the half-Fourier RARE (HASTE) sequence revealed no statistically significant differences regarding accuracy of the examination. The diagnostic confidence level in assessing normal or pathological findings for the high-resolution RARE and half-Fourier RARE (HASTE) was significantly better than for the standard sequences. For the delayed-echo half-Fourier RARE (HASTE) sequence no statistically significant difference was seen. The high-resolution RARE and half-Fourier RARE (HASTE) sequences had a higher confidence level, but there was no significant difference in diagnosis in terms of detection and assessment of pathological changes in the biliary duct system compared with standard sequences. (orig.)

  4. Preliminary hazard analysis using sequence tree method

    International Nuclear Information System (INIS)

    Huang Huiwen; Shih Chunkuan; Hung Hungchih; Chen Minghuei; Yih Swu; Lin Jiinming

    2007-01-01

    A system-level PHA using the sequence tree method was developed to perform Safety Related digital I and C system SSA. The conventional PHA is a brainstorming session among experts on various portions of the system to identify hazards through discussions. However, the conventional PHA is not a systematic technique: the analysis results depend strongly on the experts' subjective opinions, and the analysis quality cannot be appropriately controlled. This research therefore developed a system-level, sequence-tree-based PHA, which can clarify the relationships among the major digital I and C systems. The technique comprises two major phases. The first phase uses a table to analyze each event in SAR Chapter 15 for a specific safety-related I and C system, such as RPS. The second phase uses a sequence tree to recognize which I and C systems are involved in the event, how the safety-related systems work, and how the backup systems can be activated to mitigate the consequences if the primary safety systems fail. In the sequence tree, the defense-in-depth echelons, including the Control echelon, Reactor trip echelon, ESFAS echelon, and Indication and display echelon, are arranged to construct the sequence tree structure. All the related I and C systems, including the digital systems and the analog back-up systems, are allocated to their specific echelons. With this system-centric, sequence-tree-based analysis, not only can preliminary hazards be identified systematically, but the vulnerabilities of the nuclear power plant can also be recognized. Therefore, an effective simplified D3 evaluation can be performed as well. (author)

  5. Inverted temperature sequences: role of deformation partitioning

    Science.gov (United States)

    Grujic, D.; Ashley, K. T.; Coble, M. A.; Coutand, I.; Kellett, D.; Whynot, N.

    2015-12-01

    The inverted metamorphism associated with the Main Central thrust zone in the Himalaya has been historically attributed to a number of tectonic processes. Here we show that there is actually a composite peak and deformation temperature sequence that formed in succession via different tectonic processes. The deformation partitioning seems to have played a key role, and the magnitude of each process has varied along strike of the orogen. To explain the formation of the inverted metamorphic sequence across the Lesser Himalayan Sequence (LHS) in eastern Bhutan, we used Raman spectroscopy of carbonaceous material (RSCM) to determine the peak metamorphic temperatures and Ti-in-quartz thermobarometry to determine the deformation temperatures combined with thermochronology including published apatite and zircon U-Th/He and fission-track data and new 40Ar/39Ar dating of muscovite. The dataset was inverted using 3D-thermal-kinematic modeling to constrain the ranges of geological parameters such as fault geometry and slip rates, location and rates of localized basal accretion, and thermal properties of the crust. RSCM results indicate that there are two peak temperature sequences separated by a major thrust within the LHS. The internal temperature sequence shows an inverted peak temperature gradient of 12 °C/km; in the external (southern) sequence, the peak temperatures are constant across the structural sequence. Thermo-kinematic modeling suggests that the thermochronologic and thermobarometric data are compatible with a two-stage scenario: an Early-Middle Miocene phase of fast overthrusting of a hot hanging wall over a downgoing footwall and inversion of the synkinematic isotherms, followed by the formation of the external duplex developed by dominant underthrusting and basal accretion.
To reconcile our observations with the experimental data, we suggest that pervasive ductile deformation within the upper LHS and along the Main Central thrust zone at its top stopped at

  6. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  7. Enrichment of target sequences for next-generation sequencing applications in research and diagnostics.

    Science.gov (United States)

    Altmüller, Janine; Budde, Birgit S; Nürnberg, Peter

    2014-02-01

    Abstract Targeted re-sequencing such as gene panel sequencing (GPS) has become very popular in medical genetics, both for research projects and in diagnostic settings. The technical principles of the different enrichment methods have been reviewed several times before; however, new enrichment products are constantly entering the market, and researchers are often puzzled about the requirement to take decisions about long-term commitments, both for the enrichment product and the sequencing technology. This review summarizes important considerations for the experimental design and provides helpful recommendations in choosing the best sequencing strategy for various research projects and diagnostic applications.

  8. Detection of M-Sequences from Spike Sequence in Neuronal Networks

    Directory of Open Access Journals (Sweden)

    Yoshi Nishitani

    2012-01-01

    Full Text Available In circuit theory, it is well known that a linear feedback shift register (LFSR) circuit generates pseudorandom bit sequences (PRBS), including an M-sequence with the maximum period of length. In this study, we tried to detect M-sequences, known as pseudorandom sequences generated by the LFSR circuit, from time series patterns of stimulated action potentials. Stimulated action potentials were recorded from dissociated cultures of hippocampal neurons grown on a multielectrode array. We could find several M-sequences from a 3-stage LFSR circuit (M3). These results show the possibility of assembling LFSR circuits or their equivalents in a neuronal network. However, since the M3 pattern was composed of only four spike intervals, the possibility of an accidental detection was not zero. We therefore detected M-sequences from random spike sequences that were not generated from an LFSR circuit and compared the result with the number of M-sequences from the originally observed raster data. As a result, a significant difference was confirmed: a greater number of “0–1” reversed 3-stage M-sequences occurred than would have been detected by accident. This result suggests that some LFSR equivalent circuits are assembled in neuronal networks.
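
    As an illustration of the M-sequence mentioned above, the following is a minimal sketch of a 3-stage Fibonacci LFSR. The tap positions (stages 2 and 3), the initial state, and the function name lfsr_sequence are assumptions chosen for the example; the abstract does not state which feedback configuration the authors used.

```python
def lfsr_sequence(taps, state, length):
    """Generate `length` output bits from a Fibonacci LFSR.

    taps  -- 1-based stage indices XORed together to form the feedback bit
    state -- initial register contents (must not be all zeros)
    """
    state = list(state)
    out = []
    for _ in range(length):
        out.append(state[-1])            # output is taken from the last stage
        feedback = 0
        for t in taps:
            feedback ^= state[t - 1]     # XOR of the tapped stages
        state = [feedback] + state[:-1]  # shift by one stage, insert feedback
    return out


# Hypothetical 3-stage register with taps on stages 2 and 3.
m3 = lfsr_sequence(taps=(2, 3), state=[1, 0, 0], length=14)
print(m3[:7])  # one full period: [0, 0, 1, 0, 1, 1, 1]
```

    With these assumed taps the register visits all seven non-zero states before repeating, so the output has the maximum period 2^3 - 1 = 7 for three stages.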

  9. A safe an easy method for building consensus HIV sequences from 454 massively parallel sequencing data.

    Science.gov (United States)

    Fernández-Caballero Rico, Jose Ángel; Chueca Porcuna, Natalia; Álvarez Estévez, Marta; Mosquera Gutiérrez, María Del Mar; Marcos Maeso, María Ángeles; García, Federico

    2018-02-01

    To show how to generate a consensus sequence from the information of massive parallel sequences data obtained from routine HIV anti-retroviral resistance studies, and that may be suitable for molecular epidemiology studies. Paired Sanger (Trugene-Siemens) and next-generation sequencing (NGS) (454 GSJunior-Roche) HIV RT and protease sequences from 62 patients were studied. NGS consensus sequences were generated using Mesquite, using 10%, 15%, and 20% thresholds. Molecular evolutionary genetics analysis (MEGA) was used for phylogenetic studies. At a 10% threshold, NGS-Sanger sequences from 17/62 patients were phylogenetically related, with a median bootstrap value of 88% (IQR 83.5-95.5). Association increased to 36/62 sequences, with a median bootstrap value of 94% (IQR 85.5-98), using a 15% threshold. Maximum association was at the 20% threshold, with 61/62 sequences associated, and a median bootstrap value of 99% (IQR 98-100). A safe method is presented to generate consensus sequences from HIV-NGS data at 20% threshold, which will prove useful for molecular epidemiological studies. Copyright © 2016 Elsevier España, S.L.U. and Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.
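
    The abstract does not describe how the thresholds were applied inside Mesquite. The sketch below shows one common way a frequency threshold can shape a consensus, under the assumption that every base whose column frequency reaches the threshold is kept and that mixtures are written as IUPAC ambiguity codes. The function name consensus and the toy reads are hypothetical.

```python
from collections import Counter

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(reads, threshold):
    """Column-wise consensus of equal-length, already aligned reads.

    A base is kept at a position when its frequency among the A/C/G/T
    calls in that column is at least `threshold` (a fraction, e.g. 0.20).
    """
    result = []
    for column in zip(*reads):
        counts = Counter(b for b in column if b in "ACGT")
        total = sum(counts.values())
        kept = frozenset(
            b for b, c in counts.items() if total and c / total >= threshold
        )
        result.append(IUPAC.get(kept, "N"))  # no base passes: report N
    return "".join(result)


# Toy alignment: 12% minority A at position 2, 16% minority T at position 3.
reads = ["AGC"] * 18 + ["AAC"] * 3 + ["AGT"] * 4
for t in (0.10, 0.15, 0.20):
    print(t, consensus(reads, t))
```

    In this toy case a higher threshold removes low-frequency variants from the consensus, which is one plausible reason a stricter threshold could bring NGS consensus sequences closer to Sanger sequences.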

  10. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Full Text Available Abstract Background Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding and whether barcoding of BACs is dispensable. Results Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  11. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    STORAGESEVER

    2010-07-19

    Jul 19, 2010 ... and antisense primers, a single band of 573 base pairs .... Amino acid sequence alignment of Cluster I and Cluster II of phylogenetic tree. First ten sequences ... sequence weighting, position-specific gap penalties and weight.

  12. Axioms for behavioural congruence of single-pass instruction sequences

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2017-01-01

    In program algebra, an algebraic theory of single-pass instruction sequences, three congruences on instruction sequences are paid attention to: instruction sequence congruence, structural congruence, and behavioural congruence. Sound and complete axiom systems for the first two congruences were

  13. Tournaments, oriented graphs and football sequences

    Directory of Open Access Journals (Sweden)

    Pirzada S.

    2017-08-01

    Full Text Available Consider the result of a soccer league competition where n teams play each other exactly once. A team gets three points for each win and one point for each draw. The total score obtained by each team $v_i$ is called the f-score of $v_i$ and is denoted by $f_i$. The sequence of all f-scores $[f_i]_{i=1}^{n}$ arranged in non-decreasing order is called the f-score sequence of the competition. We raise the following problem: which sequences of non-negative integers in non-decreasing order are football sequences, that is, outcomes of a soccer league competition? We model such a competition by an oriented graph with teams represented by vertices in which the teams play each other once, with an arc from team u to team v if and only if u defeats v. We obtain some necessary conditions for football sequences and some characterizations under restrictions.
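
    To make the definition concrete, the sketch below enumerates every possible league outcome for a small number of teams and collects the resulting f-score sequences. It is a brute-force illustration only, not the characterization developed in the paper, and the function name football_sequences is hypothetical. One elementary necessary condition follows directly from the scoring rule: each match contributes 3 points (a win) or 2 points (a draw), so the scores must sum to a value between n(n-1) and 3n(n-1)/2.

```python
from itertools import combinations, product


def football_sequences(n):
    """Return the set of all f-score sequences for n teams (brute force).

    Only practical for small n: there are 3 ** (n * (n - 1) // 2) outcomes.
    """
    matches = list(combinations(range(n), 2))
    found = set()
    # Per match: 0 = first team wins, 1 = second team wins, 2 = draw.
    for outcome in product(range(3), repeat=len(matches)):
        score = [0] * n
        for (u, v), result in zip(matches, outcome):
            if result == 0:
                score[u] += 3
            elif result == 1:
                score[v] += 3
            else:
                score[u] += 1
                score[v] += 1
        found.add(tuple(sorted(score)))  # non-decreasing order
    return found


three_teams = football_sequences(3)
print((2, 2, 2) in three_teams)  # all three matches drawn
print((1, 1, 1) in three_teams)  # sum 3 is below n(n-1) = 6
```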

  14. SNAD: sequence name annotation-based designer

    Directory of Open Access Journals (Sweden)

    Gorbalenya Alexander E

    2009-08-01

    Full Text Available Abstract Background A growing diversity of biological data is tagged with unique identifiers (UIDs) associated with polynucleotides and proteins to ensure efficient computer-mediated data storage, maintenance, and processing. These identifiers, which are not informative for most people, are often substituted by biologically meaningful names in various presentations to facilitate utilization and dissemination of sequence-based knowledge. This substitution is commonly done manually, which can be a tedious exercise prone to mistakes and omissions. Results Here we introduce SNAD (Sequence Name Annotation-based Designer), which mediates automatic conversion of sequence UIDs (associated with a multiple alignment or phylogenetic tree, or supplied as a plain text list) into biologically meaningful names and acronyms. This conversion is directed by precompiled or user-defined templates that exploit the wealth of annotation available in cognate entries of external databases. Using examples, we demonstrate how this tool can be used to generate names for practical purposes, particularly in virology. Conclusion A tool for controllable annotation-based conversion of sequence UIDs into biologically meaningful names and acronyms has been developed and placed into service, fostering links between quality of sequence annotation, and efficiency of communication and knowledge dissemination among researchers.

  15. Variable depth recursion algorithm for leaf sequencing

    International Nuclear Information System (INIS)

    Siochi, R. Alfredo C.

    2007-01-01

    The processes of extraction and sweep are basic segmentation steps that are used in leaf sequencing algorithms. A modified version of a commercial leaf sequencer changed the way that the extracts are selected and expanded the search space, but the modification maintained the basic search paradigm of evaluating multiple solutions, each one consisting of up to 12 extracts and a sweep sequence. While it generated the best solutions compared to other published algorithms, it used more computation time. A new, faster algorithm selects one extract at a time but calls itself as an evaluation function a user-specified number of times, after which it uses the bidirectional sweeping window algorithm as the final evaluation function. To achieve a performance comparable to that of the modified commercial leaf sequencer, 2-3 calls were needed, and in all test cases, there were only slight improvements beyond two calls. For the 13 clinical test maps, computation speeds improved by a factor between 12 and 43, depending on the constraints, namely the ability to interdigitate and the avoidance of the tongue-and-groove underdose. The new algorithm was compared to the original and modified versions of the commercial leaf sequencer. It was also compared to other published algorithms for 1400 random 15x15 test maps with 3-16 intensity levels. In every single case the new algorithm provided the best solution.

  16. Modeling of prepregs during automated draping sequences

    Science.gov (United States)

    Krogh, Christian; Glud, Jens A.; Jakobsen, Johnny

    2017-10-01

    The behavior of woven prepreg fabric during automated draping sequences is investigated. A drape tool under development with an arrangement of grippers facilitates the placement of a woven prepreg fabric in a mold. It is essential that the draped configuration is free from wrinkles and other defects. The present study aims at setting up a virtual draping framework capable of modeling the draping process from the initial flat fabric to the final double curved shape and aims at assisting the development of an automated drape tool. The virtual draping framework consists of a kinematic mapping algorithm used to generate target points on the mold which are used as input to a draping sequence planner. The draping sequence planner prescribes the displacement history for each gripper in the drape tool and these displacements are then applied to each gripper in a transient model of the draping sequence. The model is based on a transient finite element analysis with the material's constitutive behavior currently being approximated as linear elastic orthotropic. In-plane tensile and bias-extension tests as well as bending tests are conducted and used as input for the model. The virtual draping framework shows good potential for improving the understanding of the drape process and for guiding the development of the drape tool. However, results obtained from using the framework on a simple test case indicate that the generation of draping sequences is non-trivial.

  17. Extended sequence diagram for human system interaction

    International Nuclear Information System (INIS)

    Hwang, Jong Rok; Choi, Sun Woo; Ko, Hee Ran; Kim, Jong Hyun

    2012-01-01

    Unified Modeling Language (UML) is a modeling language in the field of object oriented software engineering. The sequence diagram is a kind of interaction diagram that shows how processes operate with one another and in what order. It is a construct of a message sequence chart. It depicts the objects and classes involved in the scenario and the sequence of messages exchanged between the objects needed to carry out the functionality of the scenario. This paper proposes the Extended Sequence Diagram (ESD), which is capable of depicting human system interaction for nuclear power plants and of supporting analysis of operators' cognitive processes. The conventional sequence diagram is limited to identifying the activities of human and system interactions. The ESD is extended to describe operators' cognitive processes in more detail. The ESD is expected to be used as a task analysis method for describing human system interaction. The ESD can also present key steps causing abnormal operations or failures and diverse human errors based on cognitive condition.

  18. Sequence analysis by iterated maps, a review.

    Science.gov (United States)

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.
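
    The Chaos Game Representation named above can be sketched in a few lines. In this minimal version each nucleotide is assigned a corner of the unit square and every step moves halfway from the current point toward the corner of the next symbol. The particular corner assignment, the starting point, and the function name cgr are assumptions made for the example; published implementations may differ in these conventions.

```python
# Assumed corner assignment for the four nucleotides on the unit square.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr(sequence, start=(0.5, 0.5)):
    """Return the list of CGR coordinates, one point per symbol."""
    x, y = start
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]
        # Iterated map: move halfway toward the corner of the current base.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


print(cgr("ACGT"))
```

    Because each step halves the distance to a corner, the last k symbols of a sequence determine which sub-square of side 2^-k contains the final point. This is the sense in which a single coordinate encodes suffixes of every length at once, without fixing a word size in advance.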

  19. Harnessing Whole Genome Sequencing in Medical Mycology.

    Science.gov (United States)

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  20. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Full Text Available Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm if the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences: a 16S rRNA gene fragment having a length >150 bp provides the same accuracy as a full-length 16S rRNA sequence using our proposed pipeline. This could serve as a good starting point for experimental design and make the comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.