WorldWideScience

Sample records for intergenic non-coding sequence

  1. Genome-wide identification and characterization of long intergenic non-coding RNAs in Ganoderma lucidum.

    Directory of Open Access Journals (Sweden)

    Jianqin Li

    Full Text Available Ganoderma lucidum is a white-rot fungus best-known for its medicinal activities. We have previously sequenced its genome and annotated the protein coding genes. However, long non-coding RNAs in G. lucidum genome have not been analyzed. In this study, we have identified and characterized long intergenic non-coding RNAs (lincRNA in G. lucidum systematically. We developed a computational pipeline, which was used to analyze RNA-Seq data derived from G. lucidum samples collected from three developmental stages. A total of 402 lincRNA candidates were identified, with an average length of 609 bp. Analysis of their adjacent protein-coding genes (apcGenes revealed that 46 apcGenes belong to the pathways of triterpenoid biosynthesis and lignin degradation, or families of cytochrome P450, mating type B genes, and carbohydrate-active enzymes. To determine if lincRNAs and these apcGenes have any interactions, the corresponding pairs of lincRNAs and apcGenes were analyzed in detail. We developed a modified 3' RACE method to analyze the transcriptional direction of a transcript. Among the 46 lincRNAs, 37 were found unidirectionally transcribed, and 9 were found bidirectionally transcribed. The expression profiles of 16 of these 37 lincRNAs were found to be highly correlated with those of the apcGenes across the three developmental stages. Among them, 11 are positively correlated (r>0.8 and 5 are negatively correlated (r<-0.8. The co-localization and co-expression of lincRNAs and those apcGenes playing important functions is consistent with the notion that lincRNAs might be important regulators for cellular processes. In summary, this represents the very first study to identify and characterize lincRNAs in the genomes of basidiomycetes. The results obtained here have laid the foundation for study of potential lincRNA-mediated expression regulation of genes in G. lucidum.

  2. Annotating long intergenic non-coding RNAs under artificial selection during chicken domestication.

    Science.gov (United States)

    Wang, Yun-Mei; Xu, Hai-Bo; Wang, Ming-Shan; Otecko, Newton Otieno; Ye, Ling-Qun; Wu, Dong-Dong; Zhang, Ya-Ping

    2017-08-15

    Numerous biological functions of long intergenic non-coding RNAs (lincRNAs) have been identified. However, the contribution of lincRNAs to the domestication process has remained elusive. Following domestication from their wild ancestors, animals display substantial changes in many phenotypic traits. Therefore, it is possible that diverse molecular drivers play important roles in this process. We analyzed 821 transcriptomes in this study and annotated 4754 lincRNA genes in the chicken genome. Our population genomic analysis indicates that 419 lincRNAs potentially evolved during artificial selection related to the domestication of chicken, while a comparative transcriptomic analysis identified 68 lincRNAs that were differentially expressed under different conditions. We also found 47 lincRNAs linked to special phenotypes. Our study provides a comprehensive view of the genome-wide landscape of lincRNAs in chicken. This will promote a better understanding of the roles of lincRNAs in domestication, and the genetic mechanisms associated with the artificial selection of domestic animals.

  3. Properties of non-coding DNA and identification of putative cis-regulatory elements in Theileria parva

    Directory of Open Access Journals (Sweden)

    Guo Xiang

    2008-12-01

    Full Text Available Abstract Background Parasites in the genus Theileria cause lymphoproliferative diseases in cattle, resulting in enormous socio-economic losses. The availability of the genome sequences and annotation for T. parva and T. annulata has facilitated the study of parasite biology and their relationship with host cell transformation and tropism. However, the mechanism of transcriptional regulation in this genus, which may be key to understanding fundamental aspects of its parasitology, remains poorly understood. In this study, we analyze the evolution of non-coding sequences in the Theileria genome and identify conserved sequence elements that may be involved in gene regulation of these parasitic species. Results Intergenic regions and introns in Theileria are short, and their length distributions are considerably right-skewed. Intergenic regions flanked by genes in 5'-5' orientation tend to be longer and slightly more AT-rich than those flanked by two stop codons; intergenic regions flanked by genes in 3'-5' orientation have intermediate values of length and AT composition. Intron position is negatively correlated with intron length, and positively correlated with GC content. Using stringent criteria, we identified a set of high-quality orthologous non-coding sequences between T. parva and T. annulata, and determined the distribution of selective constraints across regions, which are shown to be higher close to translation start sites. A positive correlation between constraint and length in both intergenic regions and introns suggests a tight control over length expansion of non-coding regions. Genome-wide searches for functional elements revealed several conserved motifs in intergenic regions of Theileria genomes. Two such motifs are preferentially located within the first 60 base pairs upstream of transcription start sites in T. parva, are preferentially associated with specific protein functional categories, and have significant similarity to know

  4. Genome-wide identification and functional prediction of nitrogen-responsive intergenic and intronic long non-coding RNAs in maize (Zea mays L.).

    Science.gov (United States)

    Lv, Yuanda; Liang, Zhikai; Ge, Min; Qi, Weicong; Zhang, Tifu; Lin, Feng; Peng, Zhaohua; Zhao, Han

    2016-05-11

    Nitrogen (N) is an essential and often limiting nutrient to plant growth and development. Previous studies have shown that the mRNA expressions of numerous genes are regulated by nitrogen supplies; however, little is known about the expressed non-coding elements, for example long non-coding RNAs (lncRNAs) that control the response of maize (Zea mays L.) to nitrogen. LncRNAs are a class of non-coding RNAs larger than 200 bp, which have emerged as key regulators in gene expression. In this study, we surveyed the intergenic/intronic lncRNAs in maize B73 leaves at the V7 stage under conditions of N-deficiency and N-sufficiency using ribosomal RNA depletion and ultra-deep total RNA sequencing approaches. By integration with mRNA expression profiles and physiological evaluations, 7245 lncRNAs and 637 nitrogen-responsive lncRNAs were identified that exhibited unique expression patterns. Co-expression network analysis showed that the nitrogen-responsive lncRNAs were enriched mainly in one of the three co-expressed modules. The genes in the enriched module are mainly involved in NADH dehydrogenase activity, oxidative phosphorylation and the nitrogen compounds metabolic process. We identified a large number of lncRNAs in maize and illustrated their potential regulatory roles in response to N stress. The results lay the foundation for further in-depth understanding of the molecular mechanisms of lncRNAs' role in response to nitrogen stresses.

  5. Detecting non-coding selective pressure in coding regions

    Directory of Open Access Journals (Sweden)

    Blanchette Mathieu

    2007-02-01

    Full Text Available Abstract Background Comparative genomics approaches, where orthologous DNA regions are compared and inter-species conserved regions are identified, have proven extremely powerful for identifying non-coding regulatory regions located in intergenic or intronic regions. However, non-coding functional elements can also be located within coding region, as is common for exonic splicing enhancers, some transcription factor binding sites, and RNA secondary structure elements affecting mRNA stability, localization, or translation. Since these functional elements are located in regions that are themselves highly conserved because they are coding for a protein, they generally escaped detection by comparative genomics approaches. Results We introduce a comparative genomics approach for detecting non-coding functional elements located within coding regions. Codon evolution is modeled as a mixture of codon substitution models, where each component of the mixture describes the evolution of codons under a specific type of coding selective pressure. We show how to compute the posterior distribution of the entropy and parsimony scores under this null model of codon evolution. The method is applied to a set of growth hormone 1 orthologous mRNA sequences and a known exonic splicing elements is detected. The analysis of a set of CORTBP2 orthologous genes reveals a region of several hundred base pairs under strong non-coding selective pressure whose function remains unknown. Conclusion Non-coding functional elements, in particular those involved in post-transcriptional regulation, are likely to be much more prevalent than is currently known. With the numerous genome sequencing projects underway, comparative genomics approaches like that proposed here are likely to become increasingly powerful at detecting such elements.

  6. Purifying selection acts on coding and non-coding sequences of paralogous genes in Arabidopsis thaliana.

    Science.gov (United States)

    Hoffmann, Robert D; Palmgren, Michael

    2016-06-13

    Whole-genome duplications in the ancestors of many diverse species provided the genetic material for evolutionary novelty. Several models explain the retention of paralogous genes. However, how these models are reflected in the evolution of coding and non-coding sequences of paralogous genes is unknown. Here, we analyzed the coding and non-coding sequences of paralogous genes in Arabidopsis thaliana and compared these sequences with those of orthologous genes in Arabidopsis lyrata. Paralogs with lower expression than their duplicate had more nonsynonymous substitutions, were more likely to fractionate, and exhibited less similar expression patterns with their orthologs in the other species. Also, lower-expressed genes had greater tissue specificity. Orthologous conserved non-coding sequences in the promoters, introns, and 3' untranslated regions were less abundant at lower-expressed genes compared to their higher-expressed paralogs. A gene ontology (GO) term enrichment analysis showed that paralogs with similar expression levels were enriched in GO terms related to ribosomes, whereas paralogs with different expression levels were enriched in terms associated with stress responses. Loss of conserved non-coding sequences in one gene of a paralogous gene pair correlates with reduced expression levels that are more tissue specific. Together with increased mutation rates in the coding sequences, this suggests that similar forces of purifying selection act on coding and non-coding sequences. We propose that coding and non-coding sequences evolve concurrently following gene duplication.

  7. Long intergenic non-coding RNA TUG1 is overexpressed in urothelial carcinoma of the bladder.

    Science.gov (United States)

    Han, Yonghua; Liu, Yuchen; Gui, Yaoting; Cai, Zhiming

    2013-04-01

    Long intergenic non-coding RNAs (lincRNAs) are a class of non-coding RNAs that regulate gene expression via chromatin reprogramming. Taurine Up-regulated Gene 1 (TUG1) is a lincRNA that is associated with chromatin-modifying complexes and plays roles in gene regulation. In this study, we determined the expression patterns of TUG1 and the cell proliferation inhibition and apoptosis induced by silencing TUG1 in urothelial carcinoma of the bladder. The expression levels of TUG1 were determined using Real-Time qPCR in a total of 44 patients with bladder urothelial carcinomas. Bladder urothelial carcinoma T24 and 5637 cells were transfected with TUG1 siRNA or negative control siRNA. Cell proliferation was evaluated using MTT assay. Apoptosis was determined using ELISA assay. TUG1 was up-regulated in bladder urothelial carcinoma compared to paired normal urothelium. High TUG1 expression levels were associated with high grade and stage carcinomas. Cell proliferation inhibition and apoptosis induction were observed in TUG1 siRNA-transfected bladder urothelial carcinoma T24 and 5637 cells. Our data suggest that lincRNA TUG1 is emerging as a novel player in the disease state of bladder urothelial carcinoma. TUG1 may have potential roles as a biomarker and/or a therapeutic target in bladder urothelial carcinoma. Copyright © 2012 Wiley Periodicals, Inc.

  8. Systematically profiling and annotating long intergenic non-coding RNAs in human embryonic stem cell.

    Science.gov (United States)

    Tang, Xing; Hou, Mei; Ding, Yang; Li, Zhaohui; Ren, Lichen; Gao, Ge

    2013-01-01

    While more and more long intergenic non-coding RNAs (lincRNAs) were identified to take important roles in both maintaining pluripotency and regulating differentiation, how these lincRNAs may define and drive cell fate decisions on a global scale are still mostly elusive. Systematical profiling and comprehensive annotation of embryonic stem cells lincRNAs may not only bring a clearer big picture of these novel regulators but also shed light on their functionalities. Based on multiple RNA-Seq datasets, we systematically identified 300 human embryonic stem cell lincRNAs (hES lincRNAs). Of which, one forth (78 out of 300) hES lincRNAs were further identified to be biasedly expressed in human ES cells. Functional analysis showed that they were preferentially involved in several early-development related biological processes. Comparative genomics analysis further suggested that around half of the identified hES lincRNAs were conserved in mouse. To facilitate further investigation of these hES lincRNAs, we constructed an online portal for biologists to access all their sequences and annotations interactively. In addition to navigation through a genome browse interface, users can also locate lincRNAs through an advanced query interface based on both keywords and expression profiles, and analyze results through multiple tools. By integrating multiple RNA-Seq datasets, we systematically characterized and annotated 300 hES lincRNAs. A full functional web portal is available freely at http://scbrowse.cbi.pku.edu.cn. As the first global profiling and annotating of human embryonic stem cell lincRNAs, this work aims to provide a valuable resource for both experimental biologists and bioinformaticians.

  9. Highly conserved non-coding sequences are associated with vertebrate development.

    Directory of Open Access Journals (Sweden)

    Adam Woolfe

    2005-01-01

    Full Text Available In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH, in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development

  10. Identification and Functional Analysis of Long Intergenic Non-coding RNAs Underlying Intramuscular Fat Content in Pigs

    Directory of Open Access Journals (Sweden)

    Cheng Zou

    2018-03-01

    Full Text Available Intramuscular fat (IMF content is an important trait that can affect pork quality. Previous studies have identified many genes that can regulate IMF. Long intergenic non-coding RNAs (lincRNAs are emerging as key regulators in various biological processes. However, lincRNAs related to IMF in pig are largely unknown, and the mechanisms by which they regulate IMF are yet to be elucidated. Here we reconstructed 105,687 transcripts and identified 1,032 lincRNAs in pig longissimus dorsi muscle (LDM of four stages with different IMF contents based on published RNA-seq. These lincRNAs show typical characteristics such as shorter length and lower expression compared with protein-coding genes. Combined with methylation data, we found that both the promoter and genebody methylation of lincRNAs can negatively regulate lincRNA expression. We found that lincRNAs exhibit high correlation with their protein-coding neighbors in expression. Co-expression network analysis resulted in eight stage-specific modules, gene ontology and pathway analysis of them suggested that some lincRNAs were involved in IMF-related processes, such as fatty acid metabolism and peroxisome proliferator-activated receptor signaling pathway. Furthermore, we identified hub lincRNAs and found six of them may play important roles in IMF development. This work detailed some lincRNAs which may affect of IMF development in pig, and facilitated future research on these lincRNAs and molecular assisted breeding for pig.

  11. Large Intergenic Non-coding RNA-RoR Inhibits Aerobic Glycolysis of Glioblastoma Cells via Akt Pathway

    Science.gov (United States)

    Li, Yong; He, Zhi-Cheng; Liu, Qing; Zhou, Kai; Shi, Yu; Yao, Xiao-Hong; Zhang, Xia; Kung, Hsiang-Fu; Ping, Yi-Fang; Bian, Xiu-Wu

    2018-01-01

    Reprogramming energy metabolism is a hallmark of malignant tumors, including glioblastoma (GBM). Aerobic glycolysis is often utilized by tumor cells to maintain survival and proliferation. However, the underlying mechanisms of aerobic glycolysis in GBM remain elusive. Herein, we demonstrated that large intergenic non-coding RNA-RoR (LincRNA-RoR) functioned as a critical suppressor to inhibit the aerobic glycolysis and viability of GBM cells. We found that LincRNA-RoR was markedly reduced in GBM tissues compared with adjacent non-tumor tissues from 10 cases of GBM patients. Consistently, LincRNA-RoR expression in GBM cells was significantly lower than that in normal glial cells. The aerobic glycolysis of GBM cells, as determined by the measurement of glucose uptake and lactate production, was impaired by LincRNA-RoR overexpression. Mechanistically, LincRNA-RoR inhibited the expression of Rictor, the key component of mTORC2 (mammalian target of rapamycin complex 2), to suppress the activity of Akt pathway and impair the expression of glycolytic effectors, including Glut1, HK2, PKM2 and LDHA. Finally, enforced expression of LincRNA-RoR reduced the proliferation of GBM cells in vitro, restrained tumor growth in vivo, and repressed the expression of glycolytic molecules in GBM xenografts. Collectively, our results underscore LincRNA-RoR as a new suppressor of GBM aerobic glycolysis with therapeutic potential. PMID:29581766

  12. Genome-wide identification of coding and non-coding conserved sequence tags in human and mouse genomes

    Directory of Open Access Journals (Sweden)

    Maggi Giorgio P

    2008-06-01

    Full Text Available Abstract Background The accurate detection of genes and the identification of functional regions is still an open issue in the annotation of genomic sequences. This problem affects new genomes but also those of very well studied organisms such as human and mouse where, despite the great efforts, the inventory of genes and regulatory regions is far from complete. Comparative genomics is an effective approach to address this problem. Unfortunately it is limited by the computational requirements needed to perform genome-wide comparisons and by the problem of discriminating between conserved coding and non-coding sequences. This discrimination is often based (thus dependent on the availability of annotated proteins. Results In this paper we present the results of a comprehensive comparison of human and mouse genomes performed with a new high throughput grid-based system which allows the rapid detection of conserved sequences and accurate assessment of their coding potential. By detecting clusters of coding conserved sequences the system is also suitable to accurately identify potential gene loci. Following this analysis we created a collection of human-mouse conserved sequence tags and carefully compared our results to reliable annotations in order to benchmark the reliability of our classifications. Strikingly we were able to detect several potential gene loci supported by EST sequences but not corresponding to as yet annotated genes. Conclusion Here we present a new system which allows comprehensive comparison of genomes to detect conserved coding and non-coding sequences and the identification of potential gene loci. Our system does not require the availability of any annotated sequence thus is suitable for the analysis of new or poorly annotated genomes.

  13. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

    Science.gov (United States)

    Francis, Warren R; Wörheide, Gert

    2017-06-01

    One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  14. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Full Text Available Abstract Background DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  15. Identification of novel growth phase- and media-dependent small non-coding RNAs in Streptococcus pyogenes M49 using intergenic tiling arrays

    Directory of Open Access Journals (Sweden)

    Patenge Nadja

    2012-10-01

    Full Text Available Abstract Background Small non-coding RNAs (sRNAs have attracted attention as a new class of gene regulators in both eukaryotes and bacteria. Genome-wide screening methods have been successfully applied in Gram-negative bacteria to identify sRNA regulators. Many sRNAs are well characterized, including their target mRNAs and mode of action. In comparison, little is known about sRNAs in Gram-positive pathogens. In this study, we identified novel sRNAs in the exclusively human pathogen Streptococcus pyogenes M49 (Group A Streptococcus, GAS M49, employing a whole genome intergenic tiling array approach. GAS is an important pathogen that causes diseases ranging from mild superficial infections of the skin and mucous membranes of the naso-pharynx, to severe toxic and invasive diseases. Results We identified 55 putative sRNAs in GAS M49 that were expressed during growth. Of these, 42 were novel. Some of the newly-identified sRNAs belonged to one of the common non-coding RNA families described in the Rfam database. Comparison of the results of our screen with the outcome of two recently published bioinformatics tools showed a low level of overlap between putative sRNA genes. Previously, 40 potential sRNAs have been reported to be expressed in a GAS M1T1 serotype, as detected by a whole genome intergenic tiling array approach. Our screen detected 12 putative sRNA genes that were expressed in both strains. Twenty sRNA candidates appeared to be regulated in a medium-dependent fashion, while eight sRNA genes were regulated throughout growth in chemically defined medium. Expression of candidate genes was verified by reverse transcriptase-qPCR. For a subset of sRNAs, the transcriptional start was determined by 5′ rapid amplification of cDNA ends-PCR (RACE-PCR analysis. Conclusions In accord with the results of previous studies, we found little overlap between different screening methods, which underlines the fact that a comprehensive analysis of s

  16. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants.

    Science.gov (United States)

    Vieira, Lucas Maciel; Grativol, Clicia; Thiebaut, Flavia; Carvalho, Thais G; Hardoim, Pablo R; Hemerly, Adriana; Lifschitz, Sergio; Ferreira, Paulo Cavalcanti Gomes; Walter, Maria Emilia M T

    2017-03-04

    Non-coding RNAs (ncRNAs) constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs), which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM). We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane ( Saccharum spp.) and in maize ( Zea mays ). From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  17. PlantRNA_Sniffer: A SVM-Based Workflow to Predict Long Intergenic Non-Coding RNAs in Plants

    Directory of Open Access Journals (Sweden)

    Lucas Maciel Vieira

    2017-03-01

    Full Text Available Non-coding RNAs (ncRNAs constitute an important set of transcripts produced in the cells of organisms. Among them, there is a large amount of a particular class of long ncRNAs that are difficult to predict, the so-called long intergenic ncRNAs (lincRNAs, which might play essential roles in gene regulation and other cellular processes. Despite the importance of these lincRNAs, there is still a lack of biological knowledge and, currently, the few computational methods considered are so specific that they cannot be successfully applied to other species different from those that they have been originally designed to. Prediction of lncRNAs have been performed with machine learning techniques. Particularly, for lincRNA prediction, supervised learning methods have been explored in recent literature. As far as we know, there are no methods nor workflows specially designed to predict lincRNAs in plants. In this context, this work proposes a workflow to predict lincRNAs on plants, considering a workflow that includes known bioinformatics tools together with machine learning techniques, here a support vector machine (SVM. We discuss two case studies that allowed to identify novel lincRNAs, in sugarcane (Saccharum spp. and in maize (Zea mays. From the results, we also could identify differentially-expressed lincRNAs in sugarcane and maize plants submitted to pathogenic and beneficial microorganisms.

  18. Genome wide discovery of long intergenic non-coding RNAs in Diamondback moth (Plutella xylostella) and their expression in insecticide resistant strains

    Science.gov (United States)

    Etebari, Kayvan; Furlong, Michael J.; Asgari, Sassan

    2015-01-01

    Long non-coding RNAs (lncRNAs) play important roles in genomic imprinting, cancer, differentiation and regulation of gene expression. Here, we identified 3844 long intergenic ncRNAs (lincRNA) in Plutella xylostella, which is a notorious pest of cruciferous plants that has developed field resistance to all classes of insecticides, including Bacillus thuringiensis (Bt) endotoxins. Further, we found that some of those lincRNAs may potentially serve as precursors for the production of small ncRNAs. We found 280 and 350 lincRNAs that are differentially expressed in Chlorpyrifos and Fipronil resistant larvae. A survey on P. xylostella midgut transcriptome data from Bt-resistant populations revealed 59 altered lincRNA in two resistant strains compared with the susceptible population. We validated the transcript levels of a number of putative lincRNAs in deltamethrin-resistant larvae that were exposed to deltamethrin, which indicated that this group of lincRNAs might be involved in the response to xenobiotics in this insect. To functionally characterize DBM lincRNAs, gene ontology (GO) enrichment of their associated protein-coding genes was extracted and showed over representation of protein, DNA and RNA binding GO terms. The data presented here will facilitate future studies to unravel the function of lincRNAs in insecticide resistance or the response to xenobiotics of eukaryotic cells. PMID:26411386

  19. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences.

    LENUS (Irish Health Repository)

    Ivanov, Ivaylo P

    2011-05-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5\\' cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized--both for increased coding capacity and potentially also for novel regulatory mechanisms--remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5\\' untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data.

  20. Sub-grouping of Plasmodium falciparum 3D7 var genes based on sequence analysis of coding and non-coding regions

    DEFF Research Database (Denmark)

    Lavstsen, Thomas; Salanti, Ali; Jensen, Anja T R

    2003-01-01

    and organization of the 3D7 PfEMP1 repertoire was investigated on the basis of the complete genome sequence. METHODS: Using two tree-building methods we analysed the coding and non-coding sequences of 3D7 var and rif genes as well as var genes of other parasite strains. RESULTS: var genes can be sub...

  1. Chromosome-wide mapping of DNA methylation patterns in normal and malignant prostate cells reveals pervasive methylation of gene-associated and conserved intergenic sequences

    Directory of Open Access Journals (Sweden)

    De Marzo Angelo M

    2011-06-01

    Full Text Available Abstract Background DNA methylation has been linked to genome regulation and dysregulation in health and disease respectively, and methods for characterizing genomic DNA methylation patterns are rapidly emerging. We have developed/refined methods for enrichment of methylated genomic fragments using the methyl-binding domain of the human MBD2 protein (MBD2-MBD followed by analysis with high-density tiling microarrays. This MBD-chip approach was used to characterize DNA methylation patterns across all non-repetitive sequences of human chromosomes 21 and 22 at high-resolution in normal and malignant prostate cells. Results Examining this data using computational methods that were designed specifically for DNA methylation tiling array data revealed widespread methylation of both gene promoter and non-promoter regions in cancer and normal cells. In addition to identifying several novel cancer hypermethylated 5' gene upstream regions that mediated epigenetic gene silencing, we also found several hypermethylated 3' gene downstream, intragenic and intergenic regions. The hypermethylated intragenic regions were highly enriched for overlap with intron-exon boundaries, suggesting a possible role in regulation of alternative transcriptional start sites, exon usage and/or splicing. The hypermethylated intergenic regions showed significant enrichment for conservation across vertebrate species. A sampling of these newly identified promoter (ADAMTS1 and SCARF2 genes and non-promoter (downstream or within DSCR9, C21orf57 and HLCS genes hypermethylated regions were effective in distinguishing malignant from normal prostate tissues and/or cell lines. Conclusions Comparison of chromosome-wide DNA methylation patterns in normal and malignant prostate cells revealed significant methylation of gene-proximal and conserved intergenic sequences. Such analyses can be easily extended for genome-wide methylation analysis in health and disease.

  2. The Non-Coding Regulatory RNA Revolution in Archaea

    Directory of Open Access Journals (Sweden)

    Diego Rivera Gelsinger

    2018-03-01

    Full Text Available Small non-coding RNAs (sRNAs are ubiquitously found in the three domains of life playing large-scale roles in gene regulation, transposable element silencing and defense against foreign elements. While a substantial body of experimental work has been done to uncover function of sRNAs in Bacteria and Eukarya, the functional roles of sRNAs in Archaea are still poorly understood. Recently, high throughput studies using RNA-sequencing revealed that sRNAs are broadly expressed in the Archaea, comprising thousands of transcripts within the transcriptome during non-challenged and stressed conditions. Antisense sRNAs, which overlap a portion of a gene on the opposite strand (cis-acting, are the most abundantly expressed non-coding RNAs and they can be classified based on their binding patterns to mRNAs (3′ untranslated region (UTR, 5′ UTR, CDS-binding. These antisense sRNAs target many genes and pathways, suggesting extensive roles in gene regulation. Intergenic sRNAs are less abundantly expressed and their targets are difficult to find because of a lack of complete overlap between sRNAs and target mRNAs (trans-acting. While many sRNAs have been validated experimentally, a regulatory role has only been reported for very few of them. Further work is needed to elucidate sRNA-RNA binding mechanisms, the molecular determinants of sRNA-mediated regulation, whether protein components are involved and how sRNAs integrate with complex regulatory networks.

  3. Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin

    2009-05-01

    DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3 % to 2.3 % and an average of 1.2 %. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0 % to 0.1 %. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0 % to 3.1 %, with an average of 2.5 %. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.

  4. Intergenic and repeat transcription in human, chimpanzee and macaque brains measured by RNA-Seq.

    Directory of Open Access Journals (Sweden)

    Augix Guohua Xu

    Full Text Available Transcription is the first step connecting genetic information with an organism's phenotype. While expression of annotated genes in the human brain has been characterized extensively, our knowledge about the scope and the conservation of transcripts located outside of the known genes' boundaries is limited. Here, we use high-throughput transcriptome sequencing (RNA-Seq to characterize the total non-ribosomal transcriptome of human, chimpanzee, and rhesus macaque brain. In all species, only 20-28% of non-ribosomal transcripts correspond to annotated exons and 20-23% to introns. By contrast, transcripts originating within intronic and intergenic repetitive sequences constitute 40-48% of the total brain transcriptome. Notably, some repeat families show elevated transcription. In non-repetitive intergenic regions, we identify and characterize 1,093 distinct regions highly expressed in the human brain. These regions are conserved at the RNA expression level across primates studied and at the DNA sequence level across mammals. A large proportion of these transcripts (20% represents 3'UTR extensions of known genes and may play roles in alternative microRNA-directed regulation. Finally, we show that while transcriptome divergence between species increases with evolutionary time, intergenic transcripts show more expression differences among species and exons show less. Our results show that many yet uncharacterized evolutionary conserved transcripts exist in the human brain. Some of these transcripts may play roles in transcriptional regulation and contribute to evolution of human-specific phenotypic traits.

  5. Non-coding RNA in Deinococcus radiodurans

    International Nuclear Information System (INIS)

    Chen Zhongzhong; Wang Liangyan; Lin Jun; Tian Bing; Hua Yuejin

    2006-01-01

    Researches on DNA damage and repair pathways of Deinococcus radiodurans show its extreme resistance to ionizing radiation, ultraviolet radiation and reactive oxygen species. Non-coding (ncRNA) RNAs are involved in a variety of processes such as transcriptional regulations, RNA processing and modification, mRNA translation, protein transportation and stability. The conserved secondary structures of intergenic regions of Deinococcus radiodurans R1 were predicted using Stochastic Context Free Grammar (SCFG) scan strategy. Results showed that 28 ncRNA families were present in the non-coding regions of the genome of Deinococcus radiodurans R1. Among these families, IRE is the largest family, followed by Histone3, tRNA, SECIS. DicF, ctRNA-pGA1 and tmRNA are one discovered in bacteria. Results from the comparison with other organisms showed that these ncRNA can be applied to the study of biological function of Deinococcus radiodurans and supply reference for the further study of DNA damage and repair mechanisms of this bacterium. (authors)

  6. RNA-Seq analysis of D. radiodurans find non coding RNAs expressed in response to radiation stress

    International Nuclear Information System (INIS)

    Gadewal, Nikhil; Mukhopadhyaya, Rita

    2015-01-01

    In bacteria discovery of functional RNA molecules that are not translated into protein, noncoding RNAs, became possible with advent of Next Generation Sequencing technology. Bacterial non coding RNAs are typically 50-300 nucleotides long and work as internal signals controlling various levels of gene expression. Deep sequencing of total cellular RNA captures all coding and noncoding transcripts with their differential levels of expression in the transcriptome. It provides a powerful approach to study bacterial gene expression and mechanisms of gene regulation. We subjected the 3 h transcriptome of Deinococcus radiodurans R1 cells post exposure to 6 KGy gamma radiation to 100 x 2 cycles of deep sequencing on the Illumina HiSeq 2000 to look for ncRNA transcripts. Bioinformatics pipeline for analysis and interpretation of RNA Seq data was done in house using Softwares available in public domains. Our sequence data aligned with 21 putative ncRNAs expressed in the intergenic regions of annotated genome of D radiodurans. Verification of 2 ncRNA candidates and 3 transcription factor genes by Real Time PCR confirmed presence of these transcripts in the 3 h transcriptome sequenced by us. Any relationship between ncRNAs and control of radiation induced gene expression in D radiodurans can be proved only after specific gene knock outs in future. (author)

  7. Genome-wide characterization of long intergenic non-coding RNAs (lincRNAs) provides new insight into viral diseases in honey bees Apis cerana and Apis mellifera.

    Science.gov (United States)

    Jayakodi, Murukarthick; Jung, Je Won; Park, Doori; Ahn, Young-Joon; Lee, Sang-Choon; Shin, Sang-Yoon; Shin, Chanseok; Yang, Tae-Jin; Kwon, Hyung Wook

    2015-09-04

    Long non-coding RNAs (lncRNAs) are a class of RNAs that do not encode proteins. Recently, lncRNAs have gained special attention for their roles in various biological process and diseases. In an attempt to identify long intergenic non-coding RNAs (lincRNAs) and their possible involvement in honey bee development and diseases, we analyzed RNA-seq datasets generated from Asian honey bee (Apis cerana) and western honey bee (Apis mellifera). We identified 2470 lincRNAs with an average length of 1011 bp from A. cerana and 1514 lincRNAs with an average length of 790 bp in A. mellifera. Comparative analysis revealed that 5 % of the total lincRNAs derived from both species are unique in each species. Our comparative digital gene expression analysis revealed a high degree of tissue-specific expression among the seven major tissues of honey bee, different from mRNA expression patterns. A total of 863 (57 %) and 464 (18 %) lincRNAs showed tissue-dependent expression in A. mellifera and A. cerana, respectively, most preferentially in ovary and fat body tissues. Importantly, we identified 11 lincRNAs that are specifically regulated upon viral infection in honey bees, and 10 of them appear to play roles during infection with various viruses. This study provides the first comprehensive set of lincRNAs for honey bees and opens the door to discover lincRNAs associated with biological and hormone signaling pathways as well as various diseases of honey bee.

  8. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Yongyan [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Ai, Zhiying [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Yao, Kezhen [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Cao, Lixia; Du, Juan; Shi, Xiaoyan [Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); College of Life Sciences, Northwest A and F University, Yangling 712100, Shaanxi (China); Guo, Zekun, E-mail: gzk@nwsuaf.edu.cn [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China); Zhang, Yong, E-mail: zhylab@hotmail.com [College of Veterinary Medicine, Northwest A and F University, Yangling 712100, Shaanxi (China); Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi (China)

    2013-10-15

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR.

  9. CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

    International Nuclear Information System (INIS)

    Wu, Yongyan; Ai, Zhiying; Yao, Kezhen; Cao, Lixia; Du, Juan; Shi, Xiaoyan; Guo, Zekun; Zhang, Yong

    2013-01-01

    Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR

  10. Short sequence motifs, overrepresented in mammalian conservednon-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov,Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNAsequences of multicellular eukaryotes is under selective constraint. Inparticular, ~;5 percent of the human genome consists of conservednon-coding sequences (CNSs). CNSs differ from other genomic sequences intheir nucleotide composition and must play important functional roles,which mostly remain obscure.Results: We investigated relative abundancesof short sequence motifs in all human CNSs present in the human/mousewhole-genome alignments vs. three background sets of sequences: (i)weakly conserved or unconserved non-coding sequences (non-CNSs); (ii)near-promoter sequences (located between nucleotides -500 and -1500,relative to a start of transcription); and (iii) random sequences withthe same nucleotide composition as that of CNSs. When compared tonon-CNSs and near-promoter sequences, CNSs possess an excess of AT-richmotifs, often containing runs of identical nucleotides. In contrast, whencompared to random sequences, CNSs contain an excess of GC-rich motifswhich, however, lack CpG dinucleotides. Thus, abundance of short sequencemotifs in human CNSs, taken as a whole, is mostly determined by theiroverall compositional properties and not by overrepresentation of anyspecific short motifs. These properties are: (i) high AT-content of CNSs,(ii) a tendency, probably due to context-dependent mutation, of A's andT's to clump, (iii) presence of short GC-rich regions, and (iv) avoidanceof CpG contexts, due to their hypermutability. Only a small number ofshort motifs, overrepresented in all human CNSs are similar to bindingsites of transcription factors from the FOX family.Conclusion: Human CNSsas a whole appear to be too broad a class of sequences to possess strongfootprints of any short sequence-specific functions. Such footprintsshould be studied at the level of functional subclasses of CNSs, such asthose which flank genes with a particular pattern of expression. Overallproperties of CNSs are affected by

  11. Intergenic disease-associated regions are abundant in novel transcripts.

    Science.gov (United States)

    Bartonicek, N; Clark, M B; Quek, X C; Torpy, J R; Pritchard, A L; Maag, J L V; Gloss, B S; Crawford, J; Taft, R J; Hayward, N K; Montgomery, G W; Mattick, J S; Mercer, T R; Dinger, M E

    2017-12-28

    Genotyping of large populations through genome-wide association studies (GWAS) has successfully identified many genomic variants associated with traits or disease risk. Unexpectedly, a large proportion of GWAS single nucleotide polymorphisms (SNPs) and associated haplotype blocks are in intronic and intergenic regions, hindering their functional evaluation. While some of these risk-susceptibility regions encompass cis-regulatory sites, their transcriptional potential has never been systematically explored. To detect rare tissue-specific expression, we employed the transcript-enrichment method CaptureSeq on 21 human tissues to identify 1775 multi-exonic transcripts from 561 intronic and intergenic haploblocks associated with 392 traits and diseases, covering 73.9 Mb (2.2%) of the human genome. We show that a large proportion (85%) of disease-associated haploblocks express novel multi-exonic non-coding transcripts that are tissue-specific and enriched for GWAS SNPs as well as epigenetic markers of active transcription and enhancer activity. Similarly, we captured transcriptomes from 13 melanomas, targeting nine melanoma-associated haploblocks, and characterized 31 novel melanoma-specific transcripts that include fusion proteins, novel exons and non-coding RNAs, one-third of which showed allelically imbalanced expression. This resource of previously unreported transcripts in disease-associated regions ( http://gwas-captureseq.dingerlab.org ) should provide an important starting point for the translational community in search of novel biomarkers, disease mechanisms, and drug targets.

  12. Non-codingRNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.

    Science.gov (United States)

    Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A

    2010-02-01

    Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.

  13. Roles of Non-Coding RNA in Sugarcane-Microbe Interaction.

    Science.gov (United States)

    Thiebaut, Flávia; Rojas, Cristian A; Grativol, Clícia; Calixto, Edmundo P da R; Motta, Mariana R; Ballesteros, Helkin G F; Peixoto, Barbara; de Lima, Berenice N S; Vieira, Lucas M; Walter, Maria Emilia; de Armas, Elvismary M; Entenza, Júlio O P; Lifschitz, Sergio; Farinelli, Laurent; Hemerly, Adriana S; Ferreira, Paulo C G

    2017-12-20

    Studies have highlighted the importance of non-coding RNA regulation in plant-microbe interaction. However, the roles of sugarcane microRNAs (miRNAs) in the regulation of disease responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane infected with Acidovorax avenae . Conserved and novel miRNAs were identified. Additionally, small interfering RNAs (siRNAs) were aligned to differentially expressed sequences from the sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter gene whose expression was induced in the presence of A. avenae , while the siRNAs were repressed in the presence of A. avenae . Moreover, a long intergenic non-coding RNA was identified as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out independent inoculations and the expression patterns of six miRNAs were validated by quantitative reverse transcription-PCR (qRT-PCR). Among these miRNAs, miR408-a copper-microRNA-was downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified 5'RACE (rapid amplification of cDNA ends) assay. MiR408 was also downregulated in samples infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic bacteria. Our results suggest that regulation by miR408 is important in sugarcane sensing whether microorganisms are either pathogenic or beneficial, triggering specific miRNA-mediated regulatory mechanisms accordingly.

  14. Roles of Non-Coding RNA in Sugarcane-Microbe Interaction

    Science.gov (United States)

    Grativol, Clícia; Motta, Mariana R.; Ballesteros, Helkin G. F.; Peixoto, Barbara; Vieira, Lucas M.; Walter, Maria Emilia; de Armas, Elvismary M.; Entenza, Júlio O. P.; Lifschitz, Sergio; Farinelli, Laurent; Hemerly, Adriana S.

    2017-01-01

    Studies have highlighted the importance of non-coding RNA regulation in plant-microbe interaction. However, the roles of sugarcane microRNAs (miRNAs) in the regulation of disease responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane infected with Acidovorax avenae. Conserved and novel miRNAs were identified. Additionally, small interfering RNAs (siRNAs) were aligned to differentially expressed sequences from the sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter gene whose expression was induced in the presence of A. avenae, while the siRNAs were repressed in the presence of A. avenae. Moreover, a long intergenic non-coding RNA was identified as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out independent inoculations and the expression patterns of six miRNAs were validated by quantitative reverse transcription-PCR (qRT-PCR). Among these miRNAs, miR408—a copper-microRNA—was downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified 5′RACE (rapid amplification of cDNA ends) assay. MiR408 was also downregulated in samples infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic bacteria. Our results suggest that regulation by miR408 is important in sugarcane sensing whether microorganisms are either pathogenic or beneficial, triggering specific miRNA-mediated regulatory mechanisms accordingly. PMID:29657296

  15. Roles of Non-Coding RNA in Sugarcane-Microbe Interaction

    Directory of Open Access Journals (Sweden)

    Flávia Thiebaut

    2017-12-01

    Full Text Available Studies have highlighted the importance of non-coding RNA regulation in plant-microbe interaction. However, the roles of sugarcane microRNAs (miRNAs in the regulation of disease responses have not been investigated. Firstly, we screened the sRNA transcriptome of sugarcane infected with Acidovorax avenae. Conserved and novel miRNAs were identified. Additionally, small interfering RNAs (siRNAs were aligned to differentially expressed sequences from the sugarcane transcriptome. Interestingly, many siRNAs aligned to a transcript encoding a copper-transporter gene whose expression was induced in the presence of A. avenae, while the siRNAs were repressed in the presence of A. avenae. Moreover, a long intergenic non-coding RNA was identified as a potential target or decoy of miR408. To extend the bioinformatics analysis, we carried out independent inoculations and the expression patterns of six miRNAs were validated by quantitative reverse transcription-PCR (qRT-PCR. Among these miRNAs, miR408—a copper-microRNA—was downregulated. The cleavage of a putative miR408 target, a laccase, was confirmed by a modified 5′RACE (rapid amplification of cDNA ends assay. MiR408 was also downregulated in samples infected with other pathogens, but it was upregulated in the presence of a beneficial diazotrophic bacteria. Our results suggest that regulation by miR408 is important in sugarcane sensing whether microorganisms are either pathogenic or beneficial, triggering specific miRNA-mediated regulatory mechanisms accordingly.

  16. Estradiol-Induced Transcriptional Regulation of Long Non-Coding RNA, HOTAIR.

    Science.gov (United States)

    Bhan, Arunoday; Mandal, Subhrangsu S

    2016-01-01

    HOTAIR (HOX antisense intergenic RNA) is a 2.2 kb long non-coding RNA (lncRNA), transcribed from the antisense strand of homeobox C (HOXC) gene locus in chromosome 12. HOTAIR acts as a scaffolding lncRNA. It interacts and guides various chromatin-modifying complexes such as PRC2 (polycomb-repressive complex 2) and LSD1 (lysine-specific demethylase 1) to the target gene promoters leading to their gene silencing. Various studies have demonstrated that HOTAIR overexpression is associated with breast cancer. Recent studies from our laboratory demonstrate that HOTAIR is required for viability of breast cancer cells and is transcriptionally regulated by estradiol (E2) in vitro and in vivo. This chapter describes protocols for analysis of the HOTAIR promoter, cloning, transfection and dual luciferase assays, knockdown of protein synthesis by antisense oligonucleotides, and chromatin immunoprecipitation (ChIP) assay. These protocols are useful for studying the estrogen-mediated transcriptional regulation of lncRNA HOTAIR, as well as other protein coding genes and non-coding RNAs.

  17. An efficient genetic algorithm for structural RNA pairwise alignment and its application to non-coding RNA discovery in yeast

    Directory of Open Access Journals (Sweden)

    Taneda Akito

    2008-12-01

    Full Text Available Abstract Background Aligning RNA sequences with low sequence identity has been a challenging problem since such a computation essentially needs an algorithm with high complexities for taking structural conservation into account. Although many sophisticated algorithms for the purpose have been proposed to date, further improvement in efficiency is necessary to accelerate its large-scale applications including non-coding RNA (ncRNA discovery. Results We developed a new genetic algorithm, Cofolga2, for simultaneously computing pairwise RNA sequence alignment and consensus folding, and benchmarked it using BRAliBase 2.1. The benchmark results showed that our new algorithm is accurate and efficient in both time and memory usage. Then, combining with the originally trained SVM, we applied the new algorithm to novel ncRNA discovery where we compared S. cerevisiae genome with six related genomes in a pairwise manner. By focusing our search to the relatively short regions (50 bp to 2,000 bp sandwiched by conserved sequences, we successfully predict 714 intergenic and 1,311 sense or antisense ncRNA candidates, which were found in the pairwise alignments with stable consensus secondary structure and low sequence identity (≤ 50%. By comparing with the previous predictions, we found that > 92% of the candidates is novel candidates. The estimated rate of false positives in the predicted candidates is 51%. Twenty-five percent of the intergenic candidates has supports for expression in cell, i.e. their genomic positions overlap those of the experimentally determined transcripts in literature. By manual inspection of the results, moreover, we obtained four multiple alignments with low sequence identity which reveal consensus structures shared by three species/sequences. Conclusion The present method gives an efficient tool complementary to sequence-alignment-based ncRNA finders.

  18. Enterobacterial repetitive intergenic consensus sequences and the PCR to generate fingerprints of genomic DNAs from Vibrio cholerae O1, O139, and non-O1 strains.

    OpenAIRE

    Rivera, I G; Chowdhury, M A; Huq, A; Jacobs, D; Martins, M T; Colwell, R R

    1995-01-01

    Enterobacterial repetitive intergenic consensus (ERIC) sequence polymorphism was studied in Vibrio Cholerae strains isolated before and after the cholera epidemic in Brazil (in 1991), along with epidemic strains from Peru, Mexico, and India, by PCR. A total of 17 fingerprint patterns (FPs) were detected in the V. cholerae strains examined; 96.7% of the toxigenic V. cholerae O1 strains and 100% of the O139 serogroup strains were found to belong to the same FP group comprising four fragments (F...

  19. Phylogenetic relationships in Solanaceae and related species based on cpDNA sequence from plastid trnE-trnT region

    Directory of Open Access Journals (Sweden)

    Danila Montewka Melotto-Passarin

    2008-01-01

    Full Text Available Intergenic spacers of chloroplast DNA (cpDNA are very useful in phylogenetic and population genetic studiesof plant species, to study their potential integration in phylogenetic analysis. The non-coding trnE-trnT intergenic spacer ofcpDNA was analyzed to assess the nucleotide sequence polymorphism of 16 Solanaceae species and to estimate its ability tocontribute to the resolution of phylogenetic studies of this group. Multiple alignments of DNA sequences of trnE-trnT intergenicspacer made the identification of nucleotide variability in this region possible and the phylogeny was estimated by maximumparsimony and rooted with Convolvulaceae Ipomoea batatas, the most closely related family. Besides, this intergenic spacerwas tested for the phylogenetic ability to differentiate taxonomic levels. For this purpose, species from four other families wereanalyzed and compared with Solanaceae species. Results confirmed polymorphism in the trnE-trnT region at different taxonomiclevels.

  20. A New Intergenic α-Globin Deletion (α-αΔ125) Found in a Kabyle Population.

    Science.gov (United States)

    Singh, Amrathlal Rabbind; Lacan, Philippe; Cadet, Estelle; Bignet, Patricia; Dumesnil, Cécile; Vannier, Jean-Pierre; Joly, Philippe; Rochette, Jacques

    2016-01-01

    We have identified a deletion of 125 bp (α-α(Δ125)) (NG_000006.1: g.37040_37164del) in the α-globin gene cluster in a Kabyle population. A combination of singlex and multiplex polymerase chain reaction (PCR)-based assays have been used to identify the molecular defect. Sequencing of the abnormal PCR amplification product revealed a novel α1-globin promoter deletion. The endpoints of the deletion were characterized by sequencing the deletion junctions of the mutated allele. The observed deletion was located 378 bp upstream of the α1-globin gene transcription initiation site and leaves the α2 gene intact. In some patients, the α-α(Δ125) deletion was shown to segregate with Hb S (HBB: c.20A>T) and/or Hb C (HBB: c.19G>A) or a β-thalassemic allele. The α-α(Δ125) deletion has no discernible effect on red cell indices when inherited with no other abnormal globin genes. The family study demonstrated that the deletion is heritable. This is the only example of an intergenic α2-α1 non coding DNA deletion, leaving the α2-globin gene and the α1 coding part intact.

  1. SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing.

    Science.gov (United States)

    Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

    2016-06-15

    Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns can be detected called read mapping profiles, which are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. yasu@bio.keio.ac.jp Supplementary data are available

  2. Identification of coding and non-coding mutational hotspots in cancer genomes.

    Science.gov (United States)

    Piraino, Scott W; Furney, Simon J

    2017-01-05

    The identification of mutations that play a causal role in tumour development, so called "driver" mutations, is of critical importance for understanding how cancers form and how they might be treated. Several large cancer sequencing projects have identified genes that are recurrently mutated in cancer patients, suggesting a role in tumourigenesis. While the landscape of coding drivers has been extensively studied and many of the most prominent driver genes are well characterised, comparatively less is known about the role of mutations in the non-coding regions of the genome in cancer development. The continuing fall in genome sequencing costs has resulted in a concomitant increase in the number of cancer whole genome sequences being produced, facilitating systematic interrogation of both the coding and non-coding regions of cancer genomes. To examine the mutational landscapes of tumour genomes we have developed a novel method to identify mutational hotspots in tumour genomes using both mutational data and information on evolutionary conservation. We have applied our methodology to over 1300 whole cancer genomes and show that it identifies prominent coding and non-coding regions that are known or highly suspected to play a role in cancer. Importantly, we applied our method to the entire genome, rather than relying on predefined annotations (e.g. promoter regions) and we highlight recurrently mutated regions that may have resulted from increased exposure to mutational processes rather than selection, some of which have been identified previously as targets of selection. Finally, we implicate several pan-cancer and cancer-specific candidate non-coding regions, which could be involved in tumourigenesis. We have developed a framework to identify mutational hotspots in cancer genomes, which is applicable to the entire genome. This framework identifies known and novel coding and non-coding mutional hotspots and can be used to differentiate candidate driver regions from

  3. nocoRNAc: Characterization of non-coding RNAs in prokaryotes

    Directory of Open Access Journals (Sweden)

    Nieselt Kay

    2011-01-01

    Full Text Available Abstract Background The interest in non-coding RNAs (ncRNAs constantly rose during the past few years because of the wide spectrum of biological processes in which they are involved. This led to the discovery of numerous ncRNA genes across many species. However, for most organisms the non-coding transcriptome still remains unexplored to a great extent. Various experimental techniques for the identification of ncRNA transcripts are available, but as these methods are costly and time-consuming, there is a need for computational methods that allow the detection of functional RNAs in complete genomes in order to suggest elements for further experiments. Several programs for the genome-wide prediction of functional RNAs have been developed but most of them predict a genomic locus with no indication whether the element is transcribed or not. Results We present NOCORNAc, a program for the genome-wide prediction of ncRNA transcripts in bacteria. NOCORNAc incorporates various procedures for the detection of transcriptional features which are then integrated with functional ncRNA loci to determine the transcript coordinates. We applied RNAz and NOCORNAc to the genome of Streptomyces coelicolor and detected more than 800 putative ncRNA transcripts most of them located antisense to protein-coding regions. Using a custom design microarray we profiled the expression of about 400 of these elements and found more than 300 to be transcribed, 38 of them are predicted novel ncRNA genes in intergenic regions. The expression patterns of many ncRNAs are similarly complex as those of the protein-coding genes, in particular many antisense ncRNAs show a high expression correlation with their protein-coding partner. Conclusions We have developed NOCORNAc, a framework that facilitates the automated characterization of functional ncRNAs. NOCORNAc increases the confidence of predicted ncRNA loci, especially if they contain transcribed ncRNAs. NOCORNAc is not restricted to

  4. Non-Coding RNAs and Endometrial Cancer

    Directory of Open Access Journals (Sweden)

    Cristina Vallone

    2018-03-01

    Full Text Available Non-coding RNAs (ncRNAs are involved in the regulation of cell metabolism and neoplastic transformation. Recent studies have tried to clarify the significance of these information carriers in the genesis and progression of various cancers and their use as biomarkers for the disease; possible targets for the inhibition of growth and invasion by the neoplastic cells have been suggested. The significance of ncRNAs in lung cancer, bladder cancer, kidney cancer, and melanoma has been amply investigated with important results. Recently, the role of long non-coding RNAs (lncRNAs has also been included in cancer studies. Studies on the relation between endometrial cancer (EC and ncRNAs, such as small ncRNAs or micro RNAs (miRNAs, transfer RNAs (tRNAs, ribosomal RNAs (rRNAs, antisense RNAs (asRNAs, small nuclear RNAs (snRNAs, Piwi-interacting RNAs (piRNAs, small nucleolar RNAs (snoRNAs, competing endogenous RNAs (ceRNAs, lncRNAs, and long intergenic ncRNAs (lincRNAs have been published. The recent literature produced in the last three years was extracted from PubMed by two independent readers, which was then selected for the possible relation between ncRNAs, oncogenesis in general, and EC in particular.

  5. Analysis of a new strain of Euphorbia mosaic virus with distinct replication specificity unveils a lineage of begomoviruses with short Rep sequences in the DNA-B intergenic region

    Directory of Open Access Journals (Sweden)

    Argüello-Astorga Gerardo R

    2010-10-01

    Full Text Available Abstract Background Euphorbia mosaic virus (EuMV is a member of the SLCV clade, a lineage of New World begomoviruses that display distinctive features in their replication-associated protein (Rep and virion-strand replication origin. The first entirely characterized EuMV isolate is native from Yucatan Peninsula, Mexico; subsequently, EuMV was detected in weeds and pepper plants from another region of Mexico, and partial DNA-A sequences revealed significant differences in their putative replication specificity determinants with respect to EuMV-YP. This study was aimed to investigate the replication compatibility between two EuMV isolates from the same country. Results A new isolate of EuMV was obtained from pepper plants collected at Jalisco, Mexico. Full-length clones of both genomic components of EuMV-Jal were biolistically inoculated into plants of three different species, which developed symptoms indistinguishable from those induced by EuMV-YP. Pseudorecombination experiments with EuMV-Jal and EuMV-YP genomic components demonstrated that these viruses do not form infectious reassortants in Nicotiana benthamiana, presumably because of Rep-iteron incompatibility. Sequence analysis of the EuMV-Jal DNA-B intergenic region (IR led to the unexpected discovery of a 35-nt-long sequence that is identical to a segment of the rep gene in the cognate viral DNA-A. Similar short rep sequences ranging from 35- to 51-nt in length were identified in all EuMV isolates and in three distinct viruses from South America related to EuMV. These short rep sequences in the DNA-B IR are positioned downstream to a ~160-nt non-coding domain highly similar to the CP promoter of begomoviruses belonging to the SLCV clade. Conclusions EuMV strains are not compatible in replication, indicating that this begomovirus species probably is not a replicating lineage in nature. The genomic analysis of EuMV-Jal led to the discovery of a subgroup of SLCV clade viruses that contain in

  6. Sequence-based heuristics for faster annotation of non-coding RNA families.

    Science.gov (United States)

    Weinberg, Zasha; Ruzzo, Walter L

    2006-01-01

    Non-coding RNAs (ncRNAs) are functional RNA molecules that do not code for proteins. Covariance Models (CMs) are a useful statistical tool to find new members of an ncRNA gene family in a large genome database, using both sequence and, importantly, RNA secondary structure information. Unfortunately, CM searches are extremely slow. Previously, we created rigorous filters, which provably sacrifice none of a CM's accuracy, while making searches significantly faster for virtually all ncRNA families. However, these rigorous filters make searches slower than heuristics could be. In this paper we introduce profile HMM-based heuristic filters. We show that their accuracy is usually superior to heuristics based on BLAST. Moreover, we compared our heuristics with those used in tRNAscan-SE, whose heuristics incorporate a significant amount of work specific to tRNAs, where our heuristics are generic to any ncRNA. Performance was roughly comparable, so we expect that our heuristics provide a high-quality solution that--unlike family-specific solutions--can scale to hundreds of ncRNA families. The source code is available under GNU Public License at the supplementary web site.

  7. Sequencing illustrates the transcriptional response of Legionella pneumophila during infection and identifies seventy novel small non-coding RNAs.

    LENUS (Irish Health Repository)

    Weissenmayer, Barbara A

    2011-01-01

    Second generation sequencing has prompted a number of groups to re-interrogate the transcriptomes of several bacterial and archaeal species. One of the central findings has been the identification of complex networks of small non-coding RNAs that play central roles in transcriptional regulation in all growth conditions and for the pathogen\\'s interaction with and survival within host cells. Legionella pneumophila is a gram-negative facultative intracellular human pathogen with a distinct biphasic lifestyle. One of its primary environmental hosts in the free-living amoeba Acanthamoeba castellanii and its infection by L. pneumophila mimics that seen in human macrophages. Here we present analysis of strand specific sequencing of the transcriptional response of L. pneumophila during exponential and post-exponential broth growth and during the replicative and transmissive phase of infection inside A. castellanii. We extend previous microarray based studies as well as uncovering evidence of a complex regulatory architecture underpinned by numerous non-coding RNAs. Over seventy new non-coding RNAs could be identified; many of them appear to be strain specific and in configurations not previously reported. We discover a family of non-coding RNAs preferentially expressed during infection conditions and identify a second copy of 6S RNA in L. pneumophila. We show that the newly discovered putative 6S RNA as well as a number of other non-coding RNAs show evidence for antisense transcription. The nature and extent of the non-coding RNAs and their expression patterns suggests that these may well play central roles in the regulation of Legionella spp. specific traits and offer clues as to how L. pneumophila adapts to its intracellular niche. The expression profiles outlined in the study have been deposited into Genbank\\'s Gene Expression Omnibus (GEO) database under the series accession GSE27232.

  8. Genomic Variability of Haemophilus influenzae Isolated from Mexican Children Determined by Using Enterobacterial Repetitive Intergenic Consensus Sequences and PCR

    OpenAIRE

    Gomez-De-Leon, Patricia; Santos, Jose I.; Caballero, Javier; Gomez, Demostenes; Espinosa, Luz E.; Moreno, Isabel; Piñero, Daniel; Cravioto, Alejandro

    2000-01-01

    Genomic fingerprints from 92 capsulated and noncapsulated strains of Haemophilus influenzae from Mexican children with different diseases and healthy carriers were generated by PCR using the enterobacterial repetitive intergenic consensus (ERIC) sequences. A cluster analysis by the unweighted pair-group method with arithmetic averages based on the overall similarity as estimated from the characteristics of the genomic fingerprints, was conducted to group the strains. A total of 69 fingerprint...

  9. Sequence analysis of two alleles reveals that intra-and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo

    Directory of Open Access Journals (Sweden)

    Budar Françoise

    2010-02-01

    Full Text Available Abstract Background Land plant genomes contain multiple members of a eukaryote-specific gene family encoding proteins with pentatricopeptide repeat (PPR motifs. Some PPR proteins were shown to participate in post-transcriptional events involved in organellar gene expression, and this type of function is now thought to be their main biological role. Among PPR genes, restorers of fertility (Rf of cytoplasmic male sterility systems constitute a peculiar subgroup that is thought to evolve in response to the presence of mitochondrial sterility-inducing genes. Rf genes encoding PPR proteins are associated with very close relatives on complex loci. Results We sequenced a non-restoring allele (L7rfo of the Rfo radish locus whose restoring allele (D81Rfo was previously described, and compared the two alleles and their PPR genes. We identified a ca 13 kb long fragment, likely originating from another part of the radish genome, inserted into the L7rfo sequence. The L7rfo allele carries two genes (PPR-1 and PPR-2 closely related to the three previously described PPR genes of the restorer D81Rfo allele (PPR-A, PPR-B, and PPR-C. Our results indicate that alleles of the Rfo locus have experienced complex evolutionary events, including recombination and insertion of extra-locus sequences, since they diverged. Our analyses strongly suggest that present coding sequences of Rfo PPR genes result from intragenic recombination. We found that the 10 C-terminal PPR repeats in Rfo PPR gene encoded proteins result from the tandem duplication of a 5 PPR repeat block. Conclusions The Rfo locus appears to experience more complex evolution than its flanking sequences. The Rfo locus and PPR genes therein are likely to evolve as a result of intergenic and intragenic recombination. It is therefore not possible to determine which genes on the two alleles are direct orthologs. Our observations recall some previously reported data on pathogen resistance complex loci.

  10. Annotating pathogenic non-coding variants in genic regions.

    Science.gov (United States)

    Gelfman, Sahar; Wang, Quanli; McSweeney, K Melodi; Ren, Zhong; La Carpia, Francesca; Halvorsen, Matt; Schoch, Kelly; Ratzon, Fanni; Heinzen, Erin L; Boland, Michael J; Petrovski, Slavé; Goldstein, David B

    2017-08-09

    Identifying the underlying causes of disease requires accurate interpretation of genetic variants. Current methods ineffectively capture pathogenic non-coding variants in genic regions, resulting in overlooking synonymous and intronic variants when searching for disease risk. Here we present the Transcript-inferred Pathogenicity (TraP) score, which uses sequence context alterations to reliably identify non-coding variation that causes disease. High TraP scores single out extremely rare variants with lower minor allele frequencies than missense variants. TraP accurately distinguishes known pathogenic and benign variants in synonymous (AUC = 0.88) and intronic (AUC = 0.83) public datasets, dismissing benign variants with exceptionally high specificity. TraP analysis of 843 exomes from epilepsy family trios identifies synonymous variants in known epilepsy genes, thus pinpointing risk factors of disease from non-coding sequence data. TraP outperforms leading methods in identifying non-coding variants that are pathogenic and is therefore a valuable tool for use in gene discovery and the interpretation of personal genomes.While non-coding synonymous and intronic variants are often not under strong selective constraint, they can be pathogenic through affecting splicing or transcription. Here, the authors develop a score that uses sequence context alterations to predict pathogenicity of synonymous and non-coding genetic variants, and provide a web server of pre-computed scores.

  11. Intergenic DNA sequences from the human X chromosome reveal high rates of global gene flow

    Directory of Open Access Journals (Sweden)

    Wall Jeffrey D

    2008-11-01

    Full Text Available Abstract Background Despite intensive efforts devoted to collecting human polymorphism data, little is known about the role of gene flow in the ancestry of human populations. This is partly because most analyses have applied one of two simple models of population structure, the island model or the splitting model, which make unrealistic biological assumptions. Results Here, we analyze 98-kb of DNA sequence from 20 independently evolving intergenic regions on the X chromosome in a sample of 90 humans from six globally diverse populations. We employ an isolation-with-migration (IM model, which assumes that populations split and subsequently exchange migrants, to independently estimate effective population sizes and migration rates. While the maximum effective size of modern humans is estimated at ~10,000, individual populations vary substantially in size, with African populations tending to be larger (2,300–9,000 than non-African populations (300–3,300. We estimate mean rates of bidirectional gene flow at 4.8 × 10-4/generation. Bidirectional migration rates are ~5-fold higher among non-African populations (1.5 × 10-3 than among African populations (2.7 × 10-4. Interestingly, because effective sizes and migration rates are inversely related in African and non-African populations, population migration rates are similar within Africa and Eurasia (e.g., global mean Nm = 2.4. Conclusion We conclude that gene flow has played an important role in structuring global human populations and that migration rates should be incorporated as critical parameters in models of human demography.

  12. The long non-coding RNA TUG1 indicates a poor prognosis for colorectal cancer and promotes metastasis by affecting epithelial-mesenchymal transition

    OpenAIRE

    Sun, Junfeng; Ding, Chaohui; Yang, Zhen; Liu, Tao; Zhang, Xiefu; Zhao, Chunlin; Wang, Jiaxiang

    2016-01-01

    Background Long intergenic non-coding RNAs (lncRNAs) are a class of non-coding RNAs that are involved in gene expression regulation. Taurine up-regulated gene 1 (TUG1) is a cancer progression related lncRNA in some tumor oncogenesis; however, its role in colorectal cancer (CRC) remains unclear. In this study, we determined the expression patterns of TUG1 in CRC patients and explored its effect on CRC cell metastasis using cultured representative CRC cell lines. Methods The expression levels o...

  13. Functional interrogation of non-coding DNA through CRISPR genome editing.

    Science.gov (United States)

    Canver, Matthew C; Bauer, Daniel E; Orkin, Stuart H

    2017-05-15

    Methodologies to interrogate non-coding regions have lagged behind coding regions despite comprising the vast majority of the genome. However, the rapid evolution of clustered regularly interspaced short palindromic repeats (CRISPR)-based genome editing has provided a multitude of novel techniques for laboratory investigation including significant contributions to the toolbox for studying non-coding DNA. CRISPR-mediated loss-of-function strategies rely on direct disruption of the underlying sequence or repression of transcription without modifying the targeted DNA sequence. CRISPR-mediated gain-of-function approaches similarly benefit from methods to alter the targeted sequence through integration of customized sequence into the genome as well as methods to activate transcription. Here we review CRISPR-based loss- and gain-of-function techniques for the interrogation of non-coding DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Annotating non-coding regions of the genome.

    Science.gov (United States)

    Alexander, Roger P; Fang, Gang; Rozowsky, Joel; Snyder, Michael; Gerstein, Mark B

    2010-08-01

    Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.

  15. Adaptation of the short intergenic spacers between co-directional genes to the Shine-Dalgarno motif among prokaryote genomes

    DEFF Research Database (Denmark)

    Caro, Albert Pallejà; García-Vallvé, Santiago; Romeu, Antoni

    2009-01-01

    ABSTRACT: BACKGROUND: In prokaryote genomes most of the co-directional genes are in close proximity. Even the coding sequence or the stop codon of a gene can overlap with the Shine-Dalgarno (SD) sequence of the downstream co-directional gene. In this paper we analyze how the presence of SD may...... influence the stop codon usage or the spacing lengths between co-directional genes. RESULTS: The SD sequences for 530 prokaryote genomes have been predicted using computer calculations of the base-pairing free energy between translation initiation regions and the 16S rRNA 3' tail. Genomes with a large...... to the discussion of which factors affect the intergenic lengths, which cannot be totally explained by the pressure to compact the prokaryote genomes....

  16. Hybridization-based reconstruction of small non-coding RNA transcripts from deep sequencing data.

    Science.gov (United States)

    Ragan, Chikako; Mowry, Bryan J; Bauer, Denis C

    2012-09-01

    Recent advances in RNA sequencing technology (RNA-Seq) enables comprehensive profiling of RNAs by producing millions of short sequence reads from size-fractionated RNA libraries. Although conventional tools for detecting and distinguishing non-coding RNAs (ncRNAs) from reference-genome data can be applied to sequence data, ncRNA detection can be improved by harnessing the full information content provided by this new technology. Here we present NorahDesk, the first unbiased and universally applicable method for small ncRNAs detection from RNA-Seq data. NorahDesk utilizes the coverage-distribution of small RNA sequence data as well as thermodynamic assessments of secondary structure to reliably predict and annotate ncRNA classes. Using publicly available mouse sequence data from brain, skeletal muscle, testis and ovary, we evaluated our method with an emphasis on the performance for microRNAs (miRNAs) and piwi-interacting small RNA (piRNA). We compared our method with Dario and mirDeep2 and found that NorahDesk produces longer transcripts with higher read coverage. This feature makes it the first method particularly suitable for the prediction of both known and novel piRNAs.

  17. [Identification of medicinal plant Dendrobium based on the chloroplast psbK-psbI intergenic spacer].

    Science.gov (United States)

    Yao, Hui; Yang, Pei; Zhou, Hong; Ma, Shuang-jiao; Song, Jing-yuan; Chen, Shi-lin

    2015-06-01

    In this paper, the chloroplast psbK-psbI intergenic spacers of 18 species of Dendrobium and their adulterants were amplified and sequenced, and then the sequence characteristics were analyzed. The sequence lengths of chloroplast psbK-psbI regions of Dendrobium ranged from 474 to 513 bp and the GC contents were 25.4%-27.6%. The variable sites were 71 while the informative sites were 46. The inter-specific genetic distances calculated by Kimura 2-parameter (K2P) of Dendrobium were 0.006 1-0.058 1, with an average of 0.028 4. The K2P genetic distances between Dendrobium species and Bulbophyllum odoratissimum were 0.093 2-0.120 4. The NJ tree showed that the Dendrobium species can be easily differentiated from each other and 6 samples of the inspected Dendrobium species were identified successfully through sequencing the psbK-psbI intergenic spacer. Therefore, the chloroplast psbK-psbI intergenic spacer can be used as a candidate marker to identify Dendrobium species and its adulterants.

  18. Nuclear RNA sequencing of the mouse erythroid cell transcriptome.

    Directory of Open Access Journals (Sweden)

    Jennifer A Mitchell

    Full Text Available In addition to protein coding genes a substantial proportion of mammalian genomes are transcribed. However, most transcriptome studies investigate steady-state mRNA levels, ignoring a considerable fraction of the transcribed genome. In addition, steady-state mRNA levels are influenced by both transcriptional and posttranscriptional mechanisms, and thus do not provide a clear picture of transcriptional output. Here, using deep sequencing of nuclear RNAs (nucRNA-Seq in parallel with chromatin immunoprecipitation sequencing (ChIP-Seq of active RNA polymerase II, we compared the nuclear transcriptome of mouse anemic spleen erythroid cells with polymerase occupancy on a genome-wide scale. We demonstrate that unspliced transcripts quantified by nucRNA-seq correlate with primary transcript frequencies measured by RNA FISH, but differ from steady-state mRNA levels measured by poly(A-enriched RNA-seq. Highly expressed protein coding genes showed good correlation between RNAPII occupancy and transcriptional output; however, genome-wide we observed a poor correlation between transcriptional output and RNAPII association. This poor correlation is due to intergenic regions associated with RNAPII which correspond with transcription factor bound regulatory regions and a group of stable, nuclear-retained long non-coding transcripts. In conclusion, sequencing the nuclear transcriptome provides an opportunity to investigate the transcriptional landscape in a given cell type through quantification of unspliced primary transcripts and the identification of nuclear-retained long non-coding RNAs.

  19. On Coding Non-Contiguous Letter Combinations

    Directory of Open Access Journals (Sweden)

    Frédéric eDandurand

    2011-06-01

    Full Text Available Starting from the hypothesis that printed word identification initially involves the parallel mapping of visual features onto location-specific letter identities, we analyze the type of information that would be involved in optimally mapping this location-specific orthographic code onto a location-invariant lexical code. We assume that some intermediate level of coding exists between individual letters and whole words, and that this involves the representation of letter combinations. We then investigate the nature of this intermediate level of coding given the constraints of optimality. This intermediate level of coding is expected to compress data while retaining as much information as possible about word identity. Information conveyed by letters is a function of how much they constrain word identity and how visible they are. Optimization of this coding is a combination of minimizing resources (using the most compact representations and maximizing information. We show that in a large proportion of cases, non-contiguous letter sequences contain more information than contiguous sequences, while at the same time requiring less precise coding. Moreover, we found that the best predictor of human performance in orthographic priming experiments was within-word ranking of conditional probabilities, rather than average conditional probabilities. We conclude that from an optimality perspective, readers learn to select certain contiguous and non-contiguous letter combinations as information that provides the best cue to word identity.

  20. Long non-coding RNAs: Mechanism of action and functional utility

    OpenAIRE

    Bhat, Shakil Ahmad; Ahmad, Syed Mudasir; Mumtaz, Peerzada Tajamul; Malik, Abrar Ahad; Dar, Mashooq Ahmad; Urwat, Uneeb; Shah, Riaz Ahmad; Ganai, Nazir Ahmad

    2016-01-01

    Recent RNA sequencing studies have revealed that most of the human genome is transcribed, but very little of the total transcriptomes has the ability to encode proteins. Long non-coding RNAs (lncRNAs) are non-coding transcripts longer than 200 nucleotides. Members of the non-coding genome include microRNA (miRNA), small regulatory RNAs and other short RNAs. Most of long non-coding RNA (lncRNAs) are poorly annotated. Recent recognition about lncRNAs highlights their effects in many biological ...

  1. Enterobacterial repetitive intergenic consensus sequences and the PCR to generate fingerprints of genomic DNAs from Vibrio cholerae O1, O139, and non-O1 strains.

    Science.gov (United States)

    Rivera, I G; Chowdhury, M A; Huq, A; Jacobs, D; Martins, M T; Colwell, R R

    1995-08-01

    Enterobacterial repetitive intergenic consensus (ERIC) sequence polymorphism was studied in Vibrio Cholerae strains isolated before and after the cholera epidemic in Brazil (in 1991), along with epidemic strains from Peru, Mexico, and India, by PCR. A total of 17 fingerprint patterns (FPs) were detected in the V. cholerae strains examined; 96.7% of the toxigenic V. cholerae O1 strains and 100% of the O139 serogroup strains were found to belong to the same FP group comprising four fragments (FP1). The nontoxigenic V. cholerae O1 also yielded four fragments but constituted a different FP group (FP2). A total of 15 different patterns were observed among the V. cholerae non-O1 strains. Two patterns were observed most frequently for V. cholerae non-01 strains, 25% of which have FP3, with five fragments, and 16.7% of which have FP4, with two fragments. Three fragments, 1.75, 0.79, and 0.5 kb, were found to be common to both toxigenic and nontoxigenic V. cholerae O1 strains as well as to group FP3, containing V. cholerae non-O1 strains. Two fragments of group FP3, 1.3 and 1.0 kb, were present in FP1 and FP2 respectively. The 0.5-kb fragment was common to all strains and serogroups of V. cholerae analyzed. It is concluded from the results of this study, based on DNA FPs of environmental isolates, that it is possible to detect an emerging virulent strain in a cholera-endemic region. ERIC-PCR constitutes a powerful tool for determination of the virulence potential of V. cholerae O1 strains isolated in surveillance programs and for molecular epidemiological investigations.

  2. Retrotransposons and non-protein coding RNAs

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2009-01-01

    does not merely represent spurious transcription. We review examples of functional RNAs transcribed from retrotransposons, and address the collection of non-protein coding RNAs derived from transposable element sequences, including numerous human microRNAs and the neuronal BC RNAs. Finally, we review...

  3. Golay sequences coded coherent optical OFDM for long-haul transmission

    Science.gov (United States)

    Qin, Cui; Ma, Xiangrong; Hua, Tao; Zhao, Jing; Yu, Huilong; Zhang, Jian

    2017-09-01

    We propose to use binary Golay sequences in coherent optical orthogonal frequency division multiplexing (CO-OFDM) to improve the long-haul transmission performance. The Golay sequences are generated by binary Reed-Muller codes, which have low peak-to-average power ratio and certain error correction capability. A low-complexity decoding algorithm for the Golay sequences is then proposed to recover the signal. Under same spectral efficiency, the QPSK modulated OFDM with binary Golay sequences coding with and without discrete Fourier transform (DFT) spreading (DFTS-QPSK-GOFDM and QPSK-GOFDM) are compared with the normal BPSK modulated OFDM with and without DFT spreading (DFTS-BPSK-OFDM and BPSK-OFDM) after long-haul transmission. At a 7% forward error correction code threshold (Q2 factor of 8.5 dB), it is shown that DFTS-QPSK-GOFDM outperforms DFTS-BPSK-OFDM by extending the transmission distance by 29% and 18%, in non-dispersion managed and dispersion managed links, respectively.

  4. CRISPR/Cas9-Mediated Immunity to Geminiviruses: Differential Interference and Evasion

    KAUST Repository

    Ali, Zahir; Ali, Shawkat; Tashkandi, Manal; Zaidi, Syed Shan-e-Ali; Mahfouz, Magdy M.

    2016-01-01

    The CRISPR/Cas9 system has recently been used to confer molecular immunity against several eukaryotic viruses, including plant DNA geminiviruses. Here, we provide a detailed analysis of the efficiencies of targeting different coding and non-coding sequences in the genomes of multiple geminiviruses. Moreover, we analyze the ability of geminiviruses to evade the CRISPR/Cas9 machinery. Our results demonstrate that the CRISPR/Cas9 machinery can efficiently target coding and non-coding sequences and interfere with various geminiviruses. Furthermore, targeting the coding sequences of different geminiviruses resulted in the generation of viral variants capable of replication and systemic movement. By contrast, targeting the noncoding intergenic region sequences of geminiviruses resulted in interference, but with inefficient recovery of mutated viral variants, which thus limited the generation of variants capable of replication and movement. Taken together, our results indicate that targeting noncoding, intergenic sequences provides viral interference activity and significantly limits the generation of viral variants capable of replication and systemic infection, which is essential for developing durable resistance strategies for long-term virus control.

  5. CRISPR/Cas9-Mediated Immunity to Geminiviruses: Differential Interference and Evasion

    KAUST Repository

    Ali, Zahir

    2016-05-26

    The CRISPR/Cas9 system has recently been used to confer molecular immunity against several eukaryotic viruses, including plant DNA geminiviruses. Here, we provide a detailed analysis of the efficiencies of targeting different coding and non-coding sequences in the genomes of multiple geminiviruses. Moreover, we analyze the ability of geminiviruses to evade the CRISPR/Cas9 machinery. Our results demonstrate that the CRISPR/Cas9 machinery can efficiently target coding and non-coding sequences and interfere with various geminiviruses. Furthermore, targeting the coding sequences of different geminiviruses resulted in the generation of viral variants capable of replication and systemic movement. By contrast, targeting the noncoding intergenic region sequences of geminiviruses resulted in interference, but with inefficient recovery of mutated viral variants, which thus limited the generation of variants capable of replication and movement. Taken together, our results indicate that targeting noncoding, intergenic sequences provides viral interference activity and significantly limits the generation of viral variants capable of replication and systemic infection, which is essential for developing durable resistance strategies for long-term virus control.

  6. Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

    Science.gov (United States)

    Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

    2005-02-01

    We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.

  7. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Smith, David R.; Lee, Robert W.; Cushman, John C.; Magnuson, Jon K.; Tran, Duc; Polle, Juergen E.

    2010-05-07

    Abstract Background: Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results: The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA) sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions: These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the development of a viable

  8. The Dunaliella salina organelle genomes: large sequences, inflated with intronic and intergenic DNA

    Directory of Open Access Journals (Sweden)

    Tran Duc

    2010-05-01

    Full Text Available Abstract Background Dunaliella salina Teodoresco, a unicellular, halophilic green alga belonging to the Chlorophyceae, is among the most industrially important microalgae. This is because D. salina can produce massive amounts of β-carotene, which can be collected for commercial purposes, and because of its potential as a feedstock for biofuels production. Although the biochemistry and physiology of D. salina have been studied in great detail, virtually nothing is known about the genomes it carries, especially those within its mitochondrion and plastid. This study presents the complete mitochondrial and plastid genome sequences of D. salina and compares them with those of the model green algae Chlamydomonas reinhardtii and Volvox carteri. Results The D. salina organelle genomes are large, circular-mapping molecules with ~60% noncoding DNA, placing them among the most inflated organelle DNAs sampled from the Chlorophyta. In fact, the D. salina plastid genome, at 269 kb, is the largest complete plastid DNA (ptDNA sequence currently deposited in GenBank, and both the mitochondrial and plastid genomes have unprecedentedly high intron densities for organelle DNA: ~1.5 and ~0.4 introns per gene, respectively. Moreover, what appear to be the relics of genes, introns, and intronic open reading frames are found scattered throughout the intergenic ptDNA regions -- a trait without parallel in other characterized organelle genomes and one that gives insight into the mechanisms and modes of expansion of the D. salina ptDNA. Conclusions These findings confirm the notion that chlamydomonadalean algae have some of the most extreme organelle genomes of all eukaryotes. They also suggest that the events giving rise to the expanded ptDNA architecture of D. salina and other Chlamydomonadales may have occurred early in the evolution of this lineage. Although interesting from a genome evolution standpoint, the D. salina organelle DNA sequences will aid in the

  9. Prevalence of transcription promoters within archaeal operons and coding sequences.

    Science.gov (United States)

    Koide, Tie; Reiss, David J; Bare, J Christopher; Pang, Wyming Lee; Facciotti, Marc T; Schmid, Amy K; Pan, Min; Marzolf, Bruz; Van, Phu T; Lo, Fang-Yin; Pratap, Abhishek; Deutsch, Eric W; Peterson, Amelia; Martin, Dan; Baliga, Nitin S

    2009-01-01

    Despite the knowledge of complex prokaryotic-transcription mechanisms, generalized rules, such as the simplified organization of genes into operons with well-defined promoters and terminators, have had a significant role in systems analysis of regulatory logic in both bacteria and archaea. Here, we have investigated the prevalence of alternate regulatory mechanisms through genome-wide characterization of transcript structures of approximately 64% of all genes, including putative non-coding RNAs in Halobacterium salinarum NRC-1. Our integrative analysis of transcriptome dynamics and protein-DNA interaction data sets showed widespread environment-dependent modulation of operon architectures, transcription initiation and termination inside coding sequences, and extensive overlap in 3' ends of transcripts for many convergently transcribed genes. A significant fraction of these alternate transcriptional events correlate to binding locations of 11 transcription factors and regulators (TFs) inside operons and annotated genes-events usually considered spurious or non-functional. Using experimental validation, we illustrate the prevalence of overlapping genomic signals in archaeal transcription, casting doubt on the general perception of rigid boundaries between coding sequences and regulatory elements.

  10. Sequence Analysis of Mitochondrial Genome of Toxascaris leonina from a South China Tiger.

    Science.gov (United States)

    Li, Kangxin; Yang, Fang; Abdullahi, A Y; Song, Meiran; Shi, Xianli; Wang, Minwei; Fu, Yeqi; Pan, Weida; Shan, Fang; Chen, Wu; Li, Guoqing

    2016-12-01

    Toxascaris leonina is a common parasitic nematode of wild mammals and has significant impacts on the protection of rare wild animals. To analyze population genetic characteristics of T. leonina from South China tiger, its mitochondrial (mt) genome was sequenced. Its complete circular mt genome was 14,277 bp in length, including 12 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 non-coding regions. The nucleotide composition was biased toward A and T. The most common start codon and stop codon were TTG and TAG, and 4 genes ended with an incomplete stop codon. There were 13 intergenic regions ranging 1 to 10 bp in size. Phylogenetically, T. leonina from a South China tiger was close to canine T. leonina . This study reports for the first time a complete mt genome sequence of T. leonina from the South China tiger, and provides a scientific basis for studying the genetic diversity of nematodes between different hosts.

  11. Species identification of medicinal pteridophytes by a DNA barcode marker, the chloroplast psbA-trnH intergenic region.

    Science.gov (United States)

    Ma, Xin-Ye; Xie, Cai-Xiang; Liu, Chang; Song, Jing-Yuan; Yao, Hui; Luo, Kun; Zhu, Ying-Jie; Gao, Ting; Pang, Xiao-Hui; Qian, Jun; Chen, Shi-Lin

    2010-01-01

    Medicinal pteridophytes are an important group used in traditional Chinese medicine; however, there is no simple and universal way to differentiate various species of this group by morphological traits. A novel technology termed "DNA barcoding" could discriminate species by a standard DNA sequence with universal primers and sufficient variation. To determine whether DNA barcoding would be effective for differentiating pteridophyte species, we first analyzed five DNA sequence markers (psbA-trnH intergenic region, rbcL, rpoB, rpoC1, and matK) using six chloroplast genomic sequences from GeneBank and found psbA-trnH intergenic region the best candidate for availability of universal primers. Next, we amplified the psbA-trnH region from 79 samples of medicinal pteridophyte plants. These samples represented 51 species from 24 families, including all the authentic pteridophyte species listed in the Chinese pharmacopoeia (2005 version) and some commonly used adulterants. We found that the sequence of the psbA-trnH intergenic region can be determined with both high polymerase chain reaction (PCR) amplification efficiency (94.1%) and high direct sequencing success rate (81.3%). Combined with GeneBank data (54 species cross 12 pteridophyte families), species discriminative power analysis showed that 90.2% of species could be separated/identified successfully by the TaxonGap method in conjunction with the Basic Local Alignment Search Tool 1 (BLAST1) method. The TaxonGap method results further showed that, for 37 out of 39 separable species with at least two samples each, between-species variation was higher than the relevant within-species variation. Thus, the psbA-trnH intergenic region is a suitable DNA marker for species identification in medicinal pteridophytes.

  12. Potentials and limitations of histone repeat sequences for phylogenetic reconstruction of Sophophora.

    Science.gov (United States)

    Baldo, A M; Les, D H; Strausbaugh, L D

    1999-11-01

    Simplified DNA sequence acquisition has provided many new data sets that are useful for phylogenetic reconstruction, including single- and multiple-copy nuclear and organellar genes. Although transcribed regions receive much attention, nontranscribed regions have recently been added to the repertoire of sequences suitable for phylogenetic studies, especially for closely related taxa. We evaluated the efficacy of a small portion of the histone repeat for phylogenetic reconstruction among Drosophila species. Histone repeats in invertebrates offer distinct advantages similar to those of widely used ribosomal repeats. First, the units are tandemly repeated and undergo concerted evolution. Second, histone repeats include both highly conserved coding and variable intergenic regions. This composition facilitates application of "universal" primers spanning potentially informative sites. We examined a small region of the histone repeat, including the intergenic spacer segments of coding regions from the divergently transcribed H2A and H2B histone genes. The spacer (about 230 bp) exists as a mosaic with highly conserved functional motifs interspersed with rapidly diverging regions; the former aid in alignment of the spacer. There are no ambiguities in alignment of coding regions. Coding and noncoding regions were analyzed together and separately for phylogenetic information. Parsimony, distance, and maximum-likelihood methods successfully retrieve the corroborated phylogeny for the taxa examined. This study demonstrates the resolving power of a small histone region which may now be added to the growing collection of phylogenetically useful DNA sequences.

  13. On the classification of long non-coding RNAs

    KAUST Repository

    Ma, Lina; Bajic, Vladimir B.; Zhang, Zhang

    2013-01-01

    Long non-coding RNAs (lncRNAs) have been found to perform various functions in a wide variety of important biological processes. To make easier interpretation of lncRNA functionality and conduct deep mining on these transcribed sequences

  14. Long non-coding RNA HOTAIR is an independent prognostic marker of metastasis in estrogen receptor-positive primary breast cancer

    DEFF Research Database (Denmark)

    Sørensen, Kristina P; Thomassen, Mads; Tan, Qihua

    2013-01-01

    Expression of HOX transcript antisense intergenic RNA (HOTAIR)-a long non-coding RNA-has been examined in a variety of human cancers, and overexpression of HOTAIR is correlated with poor survival among breast, colon, and liver cancer patients. In this retrospective study, we examine HOTAIR......-negative tumor samples, we are not able to detect a prognostic value of HOTAIR expression, probably due to the limited sample size. These results are successfully validated in an independent dataset with similar associations (P = 0.018, HR 1.825). In conclusion, our findings suggest that HOTAIR expression may...

  15. Expression of the Long Intergenic Non-Protein Coding RNA 665 (LINC00665) Gene and the Cell Cycle in Hepatocellular Carcinoma Using The Cancer Genome Atlas, the Gene Expression Omnibus, and Quantitative Real-Time Polymerase Chain Reaction.

    Science.gov (United States)

    Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong

    2018-05-05

    BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.47795%; CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA, LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.

  16. Some Algebraic Aspects of MorseCode Sequences

    OpenAIRE

    Johann Cigler

    2003-01-01

    Morse code sequences are very useful to give combinatorial interpretations of various properties of Fibonacci numbers. In this note we study some algebraic and combinatorial aspects of Morse code sequences and obtain several q-analogues of Fibonacci numbers and Fibonacci polynomials and their generalizations.

  17. Some Algebraic Aspects of MorseCode Sequences

    Directory of Open Access Journals (Sweden)

    Johann Cigler

    2003-06-01

    Full Text Available Morse code sequences are very useful to give combinatorial interpretations of various properties of Fibonacci numbers. In this note we study some algebraic and combinatorial aspects of Morse code sequences and obtain several q-analogues of Fibonacci numbers and Fibonacci polynomials and their generalizations.

  18. TFIIS-Dependent Non-coding Transcription Regulates Developmental Genome Rearrangements.

    Directory of Open Access Journals (Sweden)

    Kamila Maliszewska-Olejniczak

    2015-07-01

    Full Text Available Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs. Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS-paralogs, representing four distinct families, can be found in P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non coding transcription to the control of genome plasticity in Paramecium

  19. Genomic relationships of Actinobacillus pleuropneumoniae serotype 2 strains evaluated by ribotyping, sequence analysis of ribosomal intergenic regions, and pulsed-field gel electrophoresis

    DEFF Research Database (Denmark)

    Fussing, V.

    1998-01-01

    The aim of the present study was to examine the genomic relationship among 112 Actinobacillus pleuropneumoniae serotype 2 strains obtained throughout Europe and North America. HindIII ribotyping of the strains resulted in five ribotypes of high similarity (87-98%). Sequence analysis of the riboso......The aim of the present study was to examine the genomic relationship among 112 Actinobacillus pleuropneumoniae serotype 2 strains obtained throughout Europe and North America. HindIII ribotyping of the strains resulted in five ribotypes of high similarity (87-98%). Sequence analysis...... of the ribosomal intergenic region of strains representing each ribotype and each country showed no differences. A common ribotype was further characterized by PFGE of 12 strains representing all countries. The resultant five PFGE patterns of European strains showed a similarity of more than 91%, to which the two...

  20. De novo ORFs in Drosophila are important to organismal fitness and evolved rapidly from previously non-coding sequences.

    Directory of Open Access Journals (Sweden)

    Josephine A Reinhardt

    Full Text Available How non-coding DNA gives rise to new protein-coding genes (de novo genes is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs, while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important.

  1. Identification and characterization of long intergenic noncoding RNAs in bovine mammary glands.

    Science.gov (United States)

    Tong, Chao; Chen, Qiaoling; Zhao, Lili; Ma, Junfei; Ibeagha-Awemu, Eveline M; Zhao, Xin

    2017-06-19

    Mammary glands of dairy cattle produce milk for the newborn offspring and for human consumption. Long intergenic noncoding RNAs (lincRNAs) play various functions in eukaryotic cells. However, types and roles of lincRNAs in bovine mammary glands are still poorly understood. Using computational methods, 886 unknown intergenic transcripts (UITs) were identified from five RNA-seq datasets from bovine mammary glands. Their non-coding potentials were predicted by using the combination of four software programs (CPAT, CNCI, CPC and hmmscan), with 184 lincRNAs identified. By comparison to the NONCODE2016 database and a domestic-animal long noncoding RNA database (ALDB), 112 novel lincRNAs were revealed in bovine mammary glands. Many lincRNAs were found to be located in quantitative trait loci (QTL). In particular, 36 lincRNAs were found in 172 milk related QTLs, whereas one lincRNA was within clinical mastitis QTL region. In addition, targeted genes for 10 lincRNAs with the highest fragments per kilobase of transcript per million fragments mapped (FPKM) were predicted by LncTar for forecasting potential biological functions of these lincRNAs. Further analyses indicate involvement of lincRNAs in several biological functions and different pathways. Our study has provided a panoramic view of lincRNAs in bovine mammary glands and suggested their involvement in many biological functions including susceptibility to clinical mastitis as well as milk quality and production. This integrative annotation of mammary gland lincRNAs broadens and deepens our understanding of bovine mammary gland biology.

  2. Long non-coding RNAs: Mechanism of action and functional utility

    Directory of Open Access Journals (Sweden)

    Shakil Ahmad Bhat

    2016-10-01

    Full Text Available Recent RNA sequencing studies have revealed that most of the human genome is transcribed, but very little of the total transcriptomes has the ability to encode proteins. Long non-coding RNAs (lncRNAs are non-coding transcripts longer than 200 nucleotides. Members of the non-coding genome include microRNA (miRNA, small regulatory RNAs and other short RNAs. Most of long non-coding RNA (lncRNAs are poorly annotated. Recent recognition about lncRNAs highlights their effects in many biological and pathological processes. LncRNAs are dysfunctional in a variety of human diseases varying from cancerous to non-cancerous diseases. Characterization of these lncRNA genes and their modes of action may allow their use for diagnosis, monitoring of progression and targeted therapies in various diseases. In this review, we summarize the functional perspectives as well as the mechanism of action of lncRNAs. Keywords: LncRNA, X-chromosome inactivation, Genome imprinting, Transcription regulation, Cancer, Immunity

  3. A two-locus global DNA barcode for land plants: the coding rbcL gene complements the non-coding trnH-psbA spacer region.

    Science.gov (United States)

    Kress, W John; Erickson, David L

    2007-06-06

    A useful DNA barcode requires sufficient sequence variation to distinguish between species and ease of application across a broad range of taxa. Discovery of a DNA barcode for land plants has been limited by intrinsically lower rates of sequence evolution in plant genomes than that observed in animals. This low rate has complicated the trade-off in finding a locus that is universal and readily sequenced and has sufficiently high sequence divergence at the species-level. Here, a global plant DNA barcode system is evaluated by comparing universal application and degree of sequence divergence for nine putative barcode loci, including coding and non-coding regions, singly and in pairs across a phylogenetically diverse set of 48 genera (two species per genus). No single locus could discriminate among species in a pair in more than 79% of genera, whereas discrimination increased to nearly 88% when the non-coding trnH-psbA spacer was paired with one of three coding loci, including rbcL. In silico trials were conducted in which DNA sequences from GenBank were used to further evaluate the discriminatory power of a subset of these loci. These trials supported the earlier observation that trnH-psbA coupled with rbcL can correctly identify and discriminate among related species. A combination of the non-coding trnH-psbA spacer region and a portion of the coding rbcL gene is recommended as a two-locus global land plant barcode that provides the necessary universality and species discrimination.

  4. The PRC2-binding long non-coding RNAs in human and mouse genomes are associated with predictive sequence features

    Science.gov (United States)

    Tu, Shiqi; Yuan, Guo-Cheng; Shao, Zhen

    2017-01-01

    Recently, long non-coding RNAs (lncRNAs) have emerged as an important class of molecules involved in many cellular processes. One of their primary functions is to shape epigenetic landscape through interactions with chromatin modifying proteins. However, mechanisms contributing to the specificity of such interactions remain poorly understood. Here we took the human and mouse lncRNAs that were experimentally determined to have physical interactions with Polycomb repressive complex 2 (PRC2), and systematically investigated the sequence features of these lncRNAs by developing a new computational pipeline for sequences composition analysis, in which each sequence is considered as a series of transitions between adjacent nucleotides. Through that, PRC2-binding lncRNAs were found to be associated with a set of distinctive and evolutionarily conserved sequence features, which can be utilized to distinguish them from the others with considerable accuracy. We further identified fragments of PRC2-binding lncRNAs that are enriched with these sequence features, and found they show strong PRC2-binding signals and are more highly conserved across species than the other parts, implying their functional importance.

  5. Differential effect of UV irradiation on induction of intragenic and intergenic recombination during commitment to meiosis in Saccharomyces cerevisiae

    International Nuclear Information System (INIS)

    Machida, I.; Nakai, S.

    1980-01-01

    A comparison was made between the induction of intragenic and intergenic recombinations during meiosis in a wild-type diploid of Saccharomyces cerevisiae. Under non-irradiated normal conditions, production of both intragenic and intergenic recombinants greatly increased in the cells with commitment to meiosis. The susceptibility of cells to the induction ob both the spontaneous intra- and intergenic recombinations in meiotic cells was similar. However, under condition of UV irradiation, there were striking differences between intra- and intergenic recombinations. Susceptibility to induction of intragenic recombination by UV irradiation was not enhanced at meiosis compared with mitosis, and was not altered through commitment to meiotic processes. In contrast, however, susceptibility to the induction of intergenic recombination by UV irradiation was enhanced markedly during commitment to meiosis compared with mitosis. Genetic analysis suggested that the enhanced susceptibility to recombination during meiosis is specifically concerned with reciprocal-type recombination (crossing-over) but not non-reciprocal-type recombination (gene conversion). Hence it is concluded that the meiotic that the meiotic process appears to be intimately concerned with the mechanism(s) of induction of recombination, especially reciprocal-type recombination. (orig.)

  6. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    Science.gov (United States)

    Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong

    2012-01-01

    Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  7. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    Directory of Open Access Journals (Sweden)

    Ai-bing Zhang

    Full Text Available Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish and two representing non-coding ITS barcodes (rust fungi and brown algae. Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ and Maximum likelihood (ML methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40% for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37% for 1094 brown algae queries, both using ITS barcodes.

  8. An RNA-Seq strategy to detect the complete coding and non-coding transcriptome including full-length imprinted macro ncRNAs.

    Directory of Open Access Journals (Sweden)

    Ru Huang

    Full Text Available Imprinted macro non-protein-coding (nc RNAs are cis-repressor transcripts that silence multiple genes in at least three imprinted gene clusters in the mouse genome. Similar macro or long ncRNAs are abundant in the mammalian genome. Here we present the full coding and non-coding transcriptome of two mouse tissues: differentiated ES cells and fetal head using an optimized RNA-Seq strategy. The data produced is highly reproducible in different sequencing locations and is able to detect the full length of imprinted macro ncRNAs such as Airn and Kcnq1ot1, whose length ranges between 80-118 kb. Transcripts show a more uniform read coverage when RNA is fragmented with RNA hydrolysis compared with cDNA fragmentation by shearing. Irrespective of the fragmentation method, all coding and non-coding transcripts longer than 8 kb show a gradual loss of sequencing tags towards the 3' end. Comparisons to published RNA-Seq datasets show that the strategy presented here is more efficient in detecting known functional imprinted macro ncRNAs and also indicate that standardization of RNA preparation protocols would increase the comparability of the transcriptome between different RNA-Seq datasets.

  9. Novel Bacteriocinogenic Lactobacillus plantarum Strains and Their Differentiation by Sequence Analysis of 16S rDNA, 16S-23S and 23S-5S Intergenic Spacer Regions and Randomly Amplified Polymorphic DNA Analysis

    Directory of Open Access Journals (Sweden)

    Morteza Shojaei Moghadam

    2010-01-01

    Full Text Available Six strains of bacteriocinogenic Lactobacillus plantarum (TL1, RG11, RS5, UL4, RG14 and RI11 isolated from Malaysian foods were investigated for their structural bacteriocin genes. A new combination of plantaricin EF and plantaricin W bacteriocin structural genes was successfully amplified from all studied strains, suggesting that they were novel bacteriocin-producing L. plantarum strains. A four-base pair variable region was detected in the short 16S-23S intergenic spacer regions of the studied strains by a comparative analysis with 17 L. plantarum strains deposited in the GenBank, implying they were new genotypes. The studied L. plantarum strains were subsequently differentiated into four groups on the basis of the detected four-base pair variable region of the short 16S-23S intergenic spacer region. Further analysis of the DNA sequence of 23S-5S intergenic spacer region revealed only one type of 23S-5S intergenic spacer region present in the studied strains, indicating it was highly conserved among the studied L. plantarum strains. Three randomly amplified polymorphic DNA experiments using three different combinations of arbitrary primers successfully differentiated the studied L. plantarum strains from each other, confirming they were different strains. In conclusion, the studied L. plantarum strains were shown to be novel bacteriocin producers and high level of strain discrimination could be achieved with a combination of randomly amplified polymorphic DNA analysis and the analysis of the variable region of short 16S-23S intergenic spacer region present in L. plantarum strains.

  10. Genetic Code Analysis Toolkit: A novel tool to explore the coding properties of the genetic code and DNA sequences

    Science.gov (United States)

    Kraljić, K.; Strüngmann, L.; Fimmel, E.; Gumbel, M.

    2018-01-01

    The genetic code is degenerated and it is assumed that redundancy provides error detection and correction mechanisms in the translation process. However, the biological meaning of the code's structure is still under current research. This paper presents a Genetic Code Analysis Toolkit (GCAT) which provides workflows and algorithms for the analysis of the structure of nucleotide sequences. In particular, sets or sequences of codons can be transformed and tested for circularity, comma-freeness, dichotomic partitions and others. GCAT comes with a fertile editor custom-built to work with the genetic code and a batch mode for multi-sequence processing. With the ability to read FASTA files or load sequences from GenBank, the tool can be used for the mathematical and statistical analysis of existing sequence data. GCAT is Java-based and provides a plug-in concept for extensibility. Availability: Open source Homepage:http://www.gcat.bio/

  11. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil

    NARCIS (Netherlands)

    de Oliveira, VM; Manfio, GP; Coutinho, HLD; Keijzer-Wolters, AC; van Elsas, JD

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was

  12. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil

    NARCIS (Netherlands)

    Oliveira, de V.M.; Manfio, G.P.; Coutinho, H.L.D.; Keijzer-Wolters, A.C.; Elsas, van J.D.

    2006-01-01

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was

  13. Intergenic sequence between Arabidopsis caseinolytic protease B-cytoplasmic/heat shock protein100 and choline kinase genes functions as a heat-inducible bidirectional promoter.

    Science.gov (United States)

    Mishra, Ratnesh Chandra; Grover, Anil

    2014-11-01

    In Arabidopsis (Arabidopsis thaliana), the At1g74310 locus encodes for caseinolytic protease B-cytoplasmic (ClpB-C)/heat shock protein100 protein (AtClpB-C), which is critical for the acquisition of thermotolerance, and At1g74320 encodes for choline kinase (AtCK2) that catalyzes the first reaction in the Kennedy pathway for phosphatidylcholine biosynthesis. Previous work has established that the knockout mutants of these genes display heat-sensitive phenotypes. While analyzing the AtClpB-C promoter and upstream genomic regions in this study, we noted that AtClpB-C and AtCK2 genes are head-to-head oriented on chromosome 1 of the Arabidopsis genome. Expression analysis showed that transcripts of these genes are rapidly induced in response to heat stress treatment. In stably transformed Arabidopsis plants harboring this intergenic sequence between head-to-head oriented green fluorescent protein and β-glucuronidase reporter genes, both transcripts and proteins of the two reporters were up-regulated upon heat stress. Four heat shock elements were noted in the intergenic region by in silico analysis. In the homozygous transfer DNA insertion mutant Salk_014505, 4,393-bp transfer DNA is inserted at position -517 upstream of ATG of the AtClpB-C gene. As a result, AtCk2 loses proximity to three of the four heat shock elements in the mutant line. Heat-inducible expression of the AtCK2 transcript was completely lost, whereas the expression of AtClpB-C was not affected in the mutant plants. Our results suggest that the 1,329-bp intergenic fragment functions as a heat-inducible bidirectional promoter and the region governing the heat inducibility is possibly shared between the two genes. We propose a model in which AtClpB-C shares its regulatory region with heat-induced choline kinase, which has a possible role in heat signaling. © 2014 American Society of Plant Biologists. All Rights Reserved.

  14. SRComp: short read sequence compression using burstsort and Elias omega coding.

    Directory of Open Access Journals (Sweden)

    Jeremy John Selva

    Full Text Available Next-generation sequencing (NGS technologies permit the rapid production of vast amounts of data at low cost. Economical data storage and transmission hence becomes an increasingly important challenge for NGS experiments. In this paper, we introduce a new non-reference based read sequence compression tool called SRComp. It works by first employing a fast string-sorting algorithm called burstsort to sort read sequences in lexicographical order and then Elias omega-based integer coding to encode the sorted read sequences. SRComp has been benchmarked on four large NGS datasets, where experimental results show that it can run 5-35 times faster than current state-of-the-art read sequence compression tools such as BEETL and SCALCE, while retaining comparable compression efficiency for large collections of short read sequences. SRComp is a read sequence compression tool that is particularly valuable in certain applications where compression time is of major concern.

  15. An Auto sequence Code to Integrate a Neutron Unfolding Code with thePC-MCA Accuspec

    International Nuclear Information System (INIS)

    Darsono

    2000-01-01

    In a neutron spectrometry using proton recoil method, the neutronunfolding code is needed to unfold the measured proton spectrum to become theneutron spectrum. The process of the unfolding neutron in the existingneutron spectrometry which was successfully installed last year was doneseparately. This manuscript reports that the auto sequence code to integratethe neutron unfolding code UNFSPEC.EXE with the software facility of thePC-MCA Accuspec has been made and run successfully so that the new neutronspectrometry become compact. The auto sequence code was written based on therules in application program facility of PC-MCA Accuspec and then it wascompiled using AC-EXE. Result of the test of the auto sequence code showedthat for binning width 20, 30, and 40 giving a little different spectrumshape. The binning width around 30 gives a better spectrum in mean of givingsmall error compared to the others. (author)

  16. Non-Protein Coding RNAs

    CERN Document Server

    Walter, Nils G; Batey, Robert T

    2009-01-01

    This book assembles chapters from experts in the Biophysics of RNA to provide a broadly accessible snapshot of the current status of this rapidly expanding field. The 2006 Nobel Prize in Physiology or Medicine was awarded to the discoverers of RNA interference, highlighting just one example of a large number of non-protein coding RNAs. Because non-protein coding RNAs outnumber protein coding genes in mammals and other higher eukaryotes, it is now thought that the complexity of organisms is correlated with the fraction of their genome that encodes non-protein coding RNAs. Essential biological processes as diverse as cell differentiation, suppression of infecting viruses and parasitic transposons, higher-level organization of eukaryotic chromosomes, and gene expression itself are found to largely be directed by non-protein coding RNAs. The biophysical study of these RNAs employs X-ray crystallography, NMR, ensemble and single molecule fluorescence spectroscopy, optical tweezers, cryo-electron microscopy, and ot...

  17. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Science.gov (United States)

    Fu, Jianmin; Liu, Huimin; Hu, Jingjing; Liang, Yuqin; Liang, Jinjun; Wuyun, Tana; Tan, Xiaofeng

    2016-01-01

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  18. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  19. Quantitative Profiling of Peptides from RNAs classified as non-coding

    Science.gov (United States)

    Prabakaran, Sudhakaran; Hemberg, Martin; Chauhan, Ruchi; Winter, Dominic; Tweedie-Cullen, Ry Y.; Dittrich, Christian; Hong, Elizabeth; Gunawardena, Jeremy; Steen, Hanno; Kreiman, Gabriel; Steen, Judith A.

    2014-01-01

    Only a small fraction of the mammalian genome codes for messenger RNAs destined to be translated into proteins, and it is generally assumed that a large portion of transcribed sequences - including introns and several classes of non-coding RNAs (ncRNAs) do not give rise to peptide products. A systematic examination of translation and physiological regulation of ncRNAs has not been conducted. Here, we use computational methods to identify the products of non-canonical translation in mouse neurons by analyzing unannotated transcripts in combination with proteomic data. This study supports the existence of non-canonical translation products from both intragenic and extragenic genomic regions, including peptides derived from anti-sense transcripts and introns. Moreover, the studied novel translation products exhibit temporal regulation similar to that of proteins known to be involved in neuronal activity processes. These observations highlight a potentially large and complex set of biologically regulated translational events from transcripts formerly thought to lack coding potential. PMID:25403355

  20. Performance Analysis for Cooperative Communication System with QC-LDPC Codes Constructed with Integer Sequences

    Directory of Open Access Journals (Sweden)

    Yan Zhang

    2015-01-01

    Full Text Available This paper presents four different integer sequences to construct quasi-cyclic low-density parity-check (QC-LDPC codes with mathematical theory. The paper introduces the procedure of the coding principle and coding. Four different integer sequences constructing QC-LDPC code are compared with LDPC codes by using PEG algorithm, array codes, and the Mackey codes, respectively. Then, the integer sequence QC-LDPC codes are used in coded cooperative communication. Simulation results show that the integer sequence constructed QC-LDPC codes are effective, and overall performance is better than that of other types of LDPC codes in the coded cooperative communication. The performance of Dayan integer sequence constructed QC-LDPC is the most excellent performance.

  1. Rfam: annotating families of non-coding RNA sequences.

    Science.gov (United States)

    Daub, Jennifer; Eberhardt, Ruth Y; Tate, John G; Burge, Sarah W

    2015-01-01

    The primary task of the Rfam database is to collate experimentally validated noncoding RNA (ncRNA) sequences from the published literature and facilitate the prediction and annotation of new homologues in novel nucleotide sequences. We group homologous ncRNA sequences into "families" and related families are further grouped into "clans." We collate and manually curate data cross-references for these families from other databases and external resources. Our Web site offers researchers a simple interface to Rfam and provides tools with which to annotate their own sequences using our covariance models (CMs), through our tools for searching, browsing, and downloading information on Rfam families. In this chapter, we will work through examples of annotating a query sequence, collating family information, and searching for data.

  2. Coding visual features extracted from video sequences.

    Science.gov (United States)

    Baroffio, Luca; Cesana, Matteo; Redondi, Alessandro; Tagliasacchi, Marco; Tubaro, Stefano

    2014-05-01

    Visual features are successfully exploited in several applications (e.g., visual search, object recognition and tracking, etc.) due to their ability to efficiently represent image content. Several visual analysis tasks require features to be transmitted over a bandwidth-limited network, thus calling for coding techniques to reduce the required bit budget, while attaining a target level of efficiency. In this paper, we propose, for the first time, a coding architecture designed for local features (e.g., SIFT, SURF) extracted from video sequences. To achieve high coding efficiency, we exploit both spatial and temporal redundancy by means of intraframe and interframe coding modes. In addition, we propose a coding mode decision based on rate-distortion optimization. The proposed coding scheme can be conveniently adopted to implement the analyze-then-compress (ATC) paradigm in the context of visual sensor networks. That is, sets of visual features are extracted from video frames, encoded at remote nodes, and finally transmitted to a central controller that performs visual analysis. This is in contrast to the traditional compress-then-analyze (CTA) paradigm, in which video sequences acquired at a node are compressed and then sent to a central unit for further processing. In this paper, we compare these coding paradigms using metrics that are routinely adopted to evaluate the suitability of visual features in the context of content-based retrieval, object recognition, and tracking. Experimental results demonstrate that, thanks to the significant coding gains achieved by the proposed coding scheme, ATC outperforms CTA with respect to all evaluation metrics.

  3. Sequence-specific DNA binding by MYC/MAX to low-affinity non-E-box motifs.

    Directory of Open Access Journals (Sweden)

    Michael Allevato

    Full Text Available The MYC oncoprotein regulates transcription of a large fraction of the genome as an obligatory heterodimer with the transcription factor MAX. The MYC:MAX heterodimer and MAX:MAX homodimer (hereafter MYC/MAX bind Enhancer box (E-box DNA elements (CANNTG and have the greatest affinity for the canonical MYC E-box (CME CACGTG. However, MYC:MAX also recognizes E-box variants and was reported to bind DNA in a "non-specific" fashion in vitro and in vivo. Here, in order to identify potential additional non-canonical binding sites for MYC/MAX, we employed high throughput in vitro protein-binding microarrays, along with electrophoretic mobility-shift assays and bioinformatic analyses of MYC-bound genomic loci in vivo. We identified all hexameric motifs preferentially bound by MYC/MAX in vitro, which include the low-affinity non-E-box sequence AACGTT, and found that the vast majority (87% of MYC-bound genomic sites in a human B cell line contain at least one of the top 21 motifs bound by MYC:MAX in vitro. We further show that high MYC/MAX concentrations are needed for specific binding to the low-affinity sequence AACGTT in vitro and that elevated MYC levels in vivo more markedly increase the occupancy of AACGTT sites relative to CME sites, especially at distal intergenic and intragenic loci. Hence, MYC binds diverse DNA motifs with a broad range of affinities in a sequence-specific and dose-dependent manner, suggesting that MYC overexpression has more selective effects on the tumor transcriptome than previously thought.

  4. Orion: Detecting regions of the human non-coding genome that are intolerant to variation using population genetics.

    Science.gov (United States)

    Gussow, Ayal B; Copeland, Brett R; Dhindsa, Ryan S; Wang, Quanli; Petrovski, Slavé; Majoros, William H; Allen, Andrew S; Goldstein, David B

    2017-01-01

    There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage. We show that Orion is highly correlated with known intolerant regions as well as regions that harbor putatively pathogenic variation. This approach provides a mechanism to identify pathogenic variation in the human non-coding genome and will have immediate utility in the diagnostic interpretation of patient genomes and in large case control studies using whole-genome sequences.

  5. Screening and Identification of putative long non coding RNAs from transcriptome data of a high yielding blackgram (Vigna mungo, Cv. T9

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar Singh

    2018-04-01

    Full Text Available Blackgram (Vigna mungo is one of primary legumes cultivated throughout India, Cv.T9 being one of its common high yielding cultivar. This article reports RNA sequencing data and a pipeline for prediction of novel long non-coding RNAs from the sequenced data. The raw data generated during sequencing are available at Sequence Read Archive (SRA of NCBI with accession number- SRX1558530 Keywords: Blackgram, Long non-coding RNA, Legumes, RNA sequencing data

  6. Machine-Checked Sequencer for Critical Embedded Code Generator

    Science.gov (United States)

    Izerrouken, Nassima; Pantel, Marc; Thirioux, Xavier

    This paper presents the development of a correct-by-construction block sequencer for GeneAuto a qualifiable (according to DO178B/ED12B recommendation) automatic code generator. It transforms Simulink models to MISRA C code for safety critical systems. Our approach which combines classical development process and formal specification and verification using proof-assistants, led to preliminary fruitful exchanges with certification authorities. We present parts of the classical user and tools requirements and derived formal specifications, implementation and verification for the correctness and termination of the block sequencer. This sequencer has been successfully applied to real-size industrial use cases from various transportation domain partners and led to requirement errors detection and a correct-by-construction implementation.

  7. InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.

    Directory of Open Access Journals (Sweden)

    Konstantin Okonechnikov

    Full Text Available Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome and biases and artefacts introduced in experiments and data analysis. There are a number of tools available for the detection of fusions from RNA-seq data; however, certain differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not been thoroughly studied yet. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions, and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http:/bitbucket.org/kokonech/infusion.

  8. Overexpression of long non-coding RNAs following exposure to xenobiotics in the aquatic midge Chironomus riparius

    International Nuclear Information System (INIS)

    Martínez-Guitarte, José-Luis; Planelló, Rosario; Morcillo, Gloria

    2012-01-01

    Non-coding RNAs (ncRNAs) represent an important transcriptional output of eukaryotic genomes. In addition to their functional relevance as housekeeping and regulatory elements, recent studies have suggested their involvement in rather unexpected cellular functions. The aim of this work was to analyse the transcriptional behaviour of non-coding RNAs in the toxic response to pollutants in Chironomus riparius, a reference organism in aquatic toxicology. Three well-characterized long non-coding sequences were studied: telomeric repeats, Cla repetitive elements and the SINE CTRT1. Transcription levels were evaluated by RT-PCR after 24-h exposures to three current aquatic contaminants: bisphenol A (BPA), benzyl butyl phthalate (BBP) and the heavy metal cadmium (Cd). Upregulation of telomeric transcripts was found after BPA treatments. Moreover, BPA significantly activated Cla transcription, which also appeared to be increased by cadmium, whereas BBP did not affect the transcription levels of these sequences. Transcription of SINE CTRT1 was not altered by any of the chemicals tested. These data are discussed in the light of previous studies that have shown a response by long ncRNAS (lncRNAs) to cellular stressors, indicating a relationship with environmental stimuli. Our results demonstrated for the first time the ability of bisphenol A to activate non-coding sequences mainly located at telomeres and centromeres. Overall, this study provides evidence that xenobiotics can induce specific responses in ncRNAs derived from repetitive sequences that could be relevant in the toxic response, and also suggests that ncRNAs could represent a novel class of potential biomarkers in toxicological assessment.

  9. Overexpression of long non-coding RNAs following exposure to xenobiotics in the aquatic midge Chironomus riparius

    Energy Technology Data Exchange (ETDEWEB)

    Martinez-Guitarte, Jose-Luis, E-mail: jlmartinez@ccia.uned.es [Grupo de Biologia y Toxicologia Ambiental, Facultad de Ciencias, Universidad Nacional de Educacion a Distancia, UNED, Senda del Rey 9, 28040 Madrid (Spain); Planello, Rosario; Morcillo, Gloria [Grupo de Biologia y Toxicologia Ambiental, Facultad de Ciencias, Universidad Nacional de Educacion a Distancia, UNED, Senda del Rey 9, 28040 Madrid (Spain)

    2012-04-15

    Non-coding RNAs (ncRNAs) represent an important transcriptional output of eukaryotic genomes. In addition to their functional relevance as housekeeping and regulatory elements, recent studies have suggested their involvement in rather unexpected cellular functions. The aim of this work was to analyse the transcriptional behaviour of non-coding RNAs in the toxic response to pollutants in Chironomus riparius, a reference organism in aquatic toxicology. Three well-characterized long non-coding sequences were studied: telomeric repeats, Cla repetitive elements and the SINE CTRT1. Transcription levels were evaluated by RT-PCR after 24-h exposures to three current aquatic contaminants: bisphenol A (BPA), benzyl butyl phthalate (BBP) and the heavy metal cadmium (Cd). Upregulation of telomeric transcripts was found after BPA treatments. Moreover, BPA significantly activated Cla transcription, which also appeared to be increased by cadmium, whereas BBP did not affect the transcription levels of these sequences. Transcription of SINE CTRT1 was not altered by any of the chemicals tested. These data are discussed in the light of previous studies that have shown a response by long ncRNAS (lncRNAs) to cellular stressors, indicating a relationship with environmental stimuli. Our results demonstrated for the first time the ability of bisphenol A to activate non-coding sequences mainly located at telomeres and centromeres. Overall, this study provides evidence that xenobiotics can induce specific responses in ncRNAs derived from repetitive sequences that could be relevant in the toxic response, and also suggests that ncRNAs could represent a novel class of potential biomarkers in toxicological assessment.

  10. A two-locus DNA sequence database for typing plant and human pathogens within the Fusarium oxysporum species complex

    DEFF Research Database (Denmark)

    O'Donnell, Kerry; Gueidan, C; Sink, S

    2009-01-01

    We constructed a two-locus database, comprising partial translation elongation factor (EF-1alpha) gene sequences and nearly full-length sequences of the nuclear ribosomal intergenic spacer region (IGS rDNA) for 850 isolates spanning the phylogenetic breadth of the Fusarium oxysporum species compl...... of the IGS rDNA sequences may be non-orthologous. We also evaluated enniatin, fumonisin and moniliformin mycotoxin production in vitro within a phylogenetic framework....

  11. Genome-wide identification and characterization of long non-coding RNAs in developmental skeletal muscle of fetal goat.

    Science.gov (United States)

    Zhan, Siyuan; Dong, Yao; Zhao, Wei; Guo, Jiazhong; Zhong, Tao; Wang, Linjie; Li, Li; Zhang, Hongping

    2016-08-22

    Long non-coding RNAs (lncRNAs) have been studied extensively over the past few years. Large numbers of lncRNAs have been identified in mouse, rat, and human, and some of them have been shown to play important roles in muscle development and myogenesis. However, there are few reports on the characterization of lncRNAs covering all the development stages of skeletal muscle in livestock. RNA libraries constructed from developing longissimus dorsi muscle of fetal (45, 60, and 105 days of gestation) and postnatal (3 days after birth) goat (Capra hircus) were sequenced. A total of 1,034,049,894 clean reads were generated. Among them, 3981 lncRNA transcripts corresponding to 2739 lncRNA genes were identified, including 3515 intergenic lncRNAs and 466 anti-sense lncRNAs. Notably, in pairwise comparisons between the libraries of skeletal muscle at the different development stages, a total of 577 transcripts were differentially expressed (P development-related processes, indicating they may be in cis-regulatory relationships. Additionally, Pearson's correlation coefficients of co-expression levels suggested 1737 lncRNAs and 19,422 mRNAs were possibly in trans-regulatory relationships (r > 0.95 or r development-related biological processes such as muscle system processes, regulation of cell growth, muscle cell development, regulation of transcription, and embryonic morphogenesis. This study provides a catalog of goat muscle-related lncRNAs, and will contribute to a fuller understanding of the molecular mechanism underpinning muscle development in mammals.

  12. The High Degree of Sequence Plasticity of the Arenavirus Noncoding Intergenic Region (IGR) Enables the Use of a Nonviral Universal Synthetic IGR To Attenuate Arenaviruses.

    Science.gov (United States)

    Iwasaki, Masaharu; Cubitt, Beatrice; Sullivan, Brian M; de la Torre, Juan C

    2016-01-06

    Hemorrhagic fever arenaviruses (HFAs) pose important public health problems in regions where they are endemic. Concerns about human-pathogenic arenaviruses are exacerbated because of the lack of FDA-licensed arenavirus vaccines and because current antiarenaviral therapy is limited to an off-label use of ribavirin that is only partially effective. We have recently shown that the noncoding intergenic region (IGR) present in each arenavirus genome segment, the S and L segments (S-IGR and L-IGR, respectively), plays important roles in the control of virus protein expression and that this knowledge could be harnessed for the development of live-attenuated vaccine strains to combat HFAs. In this study, we further investigated the sequence plasticity of the arenavirus IGR. We demonstrate that recombinants of the prototypic arenavirus lymphocytic choriomeningitis virus (rLCMVs), whose S-IGRs were replaced by the S-IGR of Lassa virus (LASV) or an entirely nonviral S-IGR-like sequence (Ssyn), are viable, indicating that the function of S-IGR tolerates a high degree of sequence plasticity. In addition, rLCMVs whose L-IGRs were replaced by Ssyn or S-IGRs of the very distantly related reptarenavirus Golden Gate virus (GGV) were viable and severely attenuated in vivo but able to elicit protective immunity against a lethal challenge with wild-type LCMV. Our findings indicate that replacement of L-IGR by a nonviral Ssyn could serve as a universal molecular determinant of arenavirus attenuation. Hemorrhagic fever arenaviruses (HFAs) cause high rates of morbidity and mortality and pose important public health problems in regions where they are endemic. Implementation of live-attenuated vaccines (LAVs) will represent a major step to combat HFAs. Here we document that the arenavirus noncoding intergenic region (IGR) has a high degree of plasticity compatible with virus viability. This observation led us to generate recombinant LCMVs containing nonviral synthetic IGRs. These r

  13. An atlas of human long non-coding RNAs with accurate 5′ ends

    KAUST Repository

    Hon, Chung-Chau

    2017-02-28

    Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5′ ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.

  14. Local repeat sequence organization of an intergenic spacer

    Indian Academy of Sciences (India)

    The amplification yielded the same uniquely ``sequence-scrambled” product, whether the template used for PCR was total cellular DNA, chloroplast DNA or a plasmid clone DNA corresponding to that region. The PCR product, a ``unique” new sequence, had lost the repetitive organization of the template genome where it ...

  15. RNA sequencing of the exercise transcriptome in equine athletes.

    Directory of Open Access Journals (Sweden)

    Stefano Capomaccio

    Full Text Available The horse is an optimal model organism for studying the genomic response to exercise-induced stress, due to its natural aptitude for athletic performance and the relative homogeneity of its genetic and environmental backgrounds. Here, we applied RNA-sequencing analysis through the use of SOLiD technology in an experimental framework centered on exercise-induced stress during endurance races in equine athletes. We monitored the transcriptional landscape by comparing gene expression levels between animals at rest and after competition. Overall, we observed a shift from coding to non-coding regions, suggesting that the stress response involves the differential expression of not annotated regions. Notably, we observed significant post-race increases of reads that correspond to repeats, especially the intergenic and intronic L1 and L2 transposable elements. We also observed increased expression of the antisense strands compared to the sense strands in intronic and regulatory regions (1 kb up- and downstream of the genes, suggesting that antisense transcription could be one of the main mechanisms for transposon regulation in the horse under stress conditions. We identified a large number of transcripts corresponding to intergenic and intronic regions putatively associated with new transcriptional elements. Gene expression and pathway analysis allowed us to identify several biological processes and molecular functions that may be involved with exercise-induced stress. Ontology clustering reflected mechanisms that are already known to be stress activated (e.g., chemokine-type cytokines, Toll-like receptors, and kinases, as well as "nucleic acid binding" and "signal transduction activity" functions. There was also a general and transient decrease in the global rates of protein synthesis, which would be expected after strenuous global stress. In sum, our network analysis points toward the involvement of specific gene clusters in equine exercise

  16. Algebraic solution of the synthesis problem for coded sequences

    International Nuclear Information System (INIS)

    Leukhin, Anatolii N

    2005-01-01

    The algebraic solution of a 'complex' problem of synthesis of phase-coded (PC) sequences with the zero level of side lobes of the cyclic autocorrelation function (ACF) is proposed. It is shown that the solution of the synthesis problem is connected with the existence of difference sets for a given code dimension. The problem of estimating the number of possible code combinations for a given code dimension is solved. It is pointed out that the problem of synthesis of PC sequences is related to the fundamental problems of discrete mathematics and, first of all, to a number of combinatorial problems, which can be solved, as the number factorisation problem, by algebraic methods by using the theory of Galois fields and groups. (fourth seminar to the memory of d.n. klyshko)

  17. Continuous Non-malleable Codes

    DEFF Research Database (Denmark)

    Faust, Sebastian; Mukherjee, Pratyay; Nielsen, Jesper Buus

    2014-01-01

    or modify it to the encoding of a completely unrelated value. This paper introduces an extension of the standard non-malleability security notion - so-called continuous non-malleability - where we allow the adversary to tamper continuously with an encoding. This is in contrast to the standard notion of non...... is necessary to achieve continuous non-malleability in the split-state model. Moreover, we illustrate that none of the existing constructions satisfies our uniqueness property and hence is not secure in the continuous setting. We construct a split-state code satisfying continuous non-malleability. Our scheme...... is based on the inner product function, collision-resistant hashing and non-interactive zero-knowledge proofs of knowledge and requires an untamperable common reference string. We apply continuous non-malleable codes to protect arbitrary cryptographic primitives against tampering attacks. Previous...

  18. Non-binary Entanglement-assisted Stabilizer Quantum Codes

    OpenAIRE

    Riguang, Leng; Zhi, Ma

    2011-01-01

    In this paper, we show how to construct non-binary entanglement-assisted stabilizer quantum codes by using pre-shared entanglement between the sender and receiver. We also give an algorithm to determine the circuit for non-binary entanglement-assisted stabilizer quantum codes and some illustrated examples. The codes we constructed do not require the dual-containing constraint, and many non-binary classical codes, like non-binary LDPC codes, which do not satisfy the condition, can be used to c...

  19. Evolutionary modeling and prediction of non-coding RNAs in Drosophila.

    Directory of Open Access Journals (Sweden)

    Robert K Bradley

    2009-08-01

    Full Text Available We performed benchmarks of phylogenetic grammar-based ncRNA gene prediction, experimenting with eight different models of structural evolution and two different programs for genome alignment. We evaluated our models using alignments of twelve Drosophila genomes. We find that ncRNA prediction performance can vary greatly between different gene predictors and subfamilies of ncRNA gene. Our estimates for false positive rates are based on simulations which preserve local islands of conservation; using these simulations, we predict a higher rate of false positives than previous computational ncRNA screens have reported. Using one of the tested prediction grammars, we provide an updated set of ncRNA predictions for D. melanogaster and compare them to previously-published predictions and experimental data. Many of our predictions show correlations with protein-coding genes. We found significant depletion of intergenic predictions near the 3' end of coding regions and furthermore depletion of predictions in the first intron of protein-coding genes. Some of our predictions are colocated with larger putative unannotated genes: for example, 17 of our predictions showing homology to the RFAM family snoR28 appear in a tandem array on the X chromosome; the 4.5 Kbp spanned by the predicted tandem array is contained within a FlyBase-annotated cDNA.

  20. MASTR: multiple alignment and structure prediction of non-coding RNAs using simulated annealing

    DEFF Research Database (Denmark)

    Lindgreen, Stinus; Gardner, Paul P; Krogh, Anders

    2007-01-01

    function that considers sequence conservation, covariation and basepairing probabilities. The results show that the method is very competitive to similar programs available today, both in terms of accuracy and computational efficiency. AVAILABILITY: Source code available from http://mastr.binf.ku.dk/......MOTIVATION: As more non-coding RNAs are discovered, the importance of methods for RNA analysis increases. Since the structure of ncRNA is intimately tied to the function of the molecule, programs for RNA structure prediction are necessary tools in this growing field of research. Furthermore......, it is known that RNA structure is often evolutionarily more conserved than sequence. However, few existing methods are capable of simultaneously considering multiple sequence alignment and structure prediction. RESULT: We present a novel solution to the problem of simultaneous structure prediction...

  1. Inferring a role for methylation of intergenic DNA in the regulation of genes aberrantly expressed in precursor B-cell acute lymphoblastic leukemia.

    Science.gov (United States)

    Almamun, Md; Kholod, Olha; Stuckel, Alexei J; Levinson, Benjamin T; Johnson, Nathan T; Arthur, Gerald L; Davis, J Wade; Taylor, Kristen H

    2017-09-01

    A complete understanding of the mechanisms involved in the development of pre-B ALL is lacking. In this study, we integrated DNA methylation data and gene expression data to elucidate the impact of aberrant intergenic DNA methylation on gene expression in pre-B ALL. We found a subset of differentially methylated intergenic loci that were associated with altered gene expression in pre-B ALL patients. Notably, 84% of these regions were also bound by transcription factors (TF) known to play roles in differentiation and B-cell development in a lymphoblastoid cell line. Further, an overall downregulation of eRNA transcripts was observed in pre-B ALL patients and these transcripts were associated with the downregulation of putative target genes involved in B-cell migration, proliferation, and apoptosis. The identification of novel putative regulatory regions highlights the significance of intergenic DNA sequences and may contribute to the identification of new therapeutic targets for the treatment of pre-B ALL.

  2. Fast comparison of IS radar code sequences for lag profile inversion

    Directory of Open Access Journals (Sweden)

    M. S. Lehtinen

    2008-08-01

    Full Text Available A fast method for theoretically comparing the posteriori variances produced by different phase code sequences in incoherent scatter radar (ISR experiments is introduced. Alternating codes of types 1 and 2 are known to be optimal for selected range resolutions, but the code sets are inconveniently long for many purposes like ground clutter estimation and in cases where coherent echoes from lower ionospheric layers are to be analyzed in addition to standard F-layer spectra.

    The method is used in practice for searching binary code quads that have estimation accuracy almost equal to that of much longer alternating code sets. Though the code sequences can consist of as few as four different transmission envelopes, the lag profile estimation variances are near to the theoretical minimum. Thus the short code sequence is equally good as a full cycle of alternating codes with the same pulse length and bit length. The short code groups cannot be directly decoded, but the decoding is done in connection with more computationally expensive lag profile inversion in data analysis.

    The actual code searches as well as the analysis and real data results from the found short code searches are explained in other papers sent to the same issue of this journal. We also discuss interesting subtle differences found between the different alternating codes by this method. We assume that thermal noise dominates the incoherent scatter signal.

  3. Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Murat, Claude [INRA, Nancy, France; Morin, Emmanuelle [INRA, Nancy, France; Le Tacon, F [UMR, France; Martin, Francis [INRA, Nancy, France

    2011-01-01

    It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.

  4. Sequence Coding and Search System for licensee event reports: code listings. Volume 2

    International Nuclear Information System (INIS)

    Gallaher, R.B.; Guymon, R.H.; Mays, G.T.; Poore, W.P.; Cagle, R.J.; Harrington, K.H.; Johnson, M.P.

    1985-04-01

    Operating experience data from nuclear power plants are essential for safety and reliability analyses, especially analyses of trends and patterns. The licensee event reports (LERs) that are submitted to the Nuclear Regulatory Commission (NRC) by the nuclear power plant utilities contain much of this data. The NRC's Office for Analysis and Evaluation of Operational Data (AEOD) has developed, under contract with NSIC, a system for codifying the events reported in the LERs. The primary objective of the Sequence Coding and Search System (SCSS) is to reduce the descriptive text of the LERs to coded sequences that are both computer-readable and computer-searchable. This system provides a structured format for detailed coding of component, system, and unit effects as well as personnel errors. The database contains all current LERs submitted by nuclear power plant utilities for events occurring since 1981 and is updated on a continual basis. Volume 2 contains all valid and acceptable codes used for searching and encoding the LER data. This volume contains updated material through amendment 1 to revision 1 of the working version of ORNL/NSIC-223, Vol. 2

  5. Sequence Classification: 13038 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|15669589|ref|NP_248402.1| alignment in /usr/local/projec...ts/ARG/Intergenic/ARG_R584_orf2.nr || http://www.ncbi.nlm.nih.gov/protein/15669589 ...

  6. Sequence Classification: 12660 [

    Lifescience Database Archive (English)

    Full Text Available Non-TMB Non-TMH Non-TMB Non-TMB Non-TMB Non-TMB >gi|15669211|ref|NP_248016.1| alignment in /usr/local/projec...ts/ARG/Intergenic/ARG_R428_orf1.nr || http://www.ncbi.nlm.nih.gov/protein/15669211 ...

  7. Non-coding RNA networks in cancer.

    Science.gov (United States)

    Anastasiadou, Eleni; Jacob, Leni S; Slack, Frank J

    2018-01-01

    Thousands of unique non-coding RNA (ncRNA) sequences exist within cells. Work from the past decade has altered our perception of ncRNAs from 'junk' transcriptional products to functional regulatory molecules that mediate cellular processes including chromatin remodelling, transcription, post-transcriptional modifications and signal transduction. The networks in which ncRNAs engage can influence numerous molecular targets to drive specific cell biological responses and fates. Consequently, ncRNAs act as key regulators of physiological programmes in developmental and disease contexts. Particularly relevant in cancer, ncRNAs have been identified as oncogenic drivers and tumour suppressors in every major cancer type. Thus, a deeper understanding of the complex networks of interactions that ncRNAs coordinate would provide a unique opportunity to design better therapeutic interventions.

  8. 16S-23S rDNA intergenic spacer region polymorphism of Lactococcus garvieae, Lactococcus raffinolactis and Lactococcus lactis as revealed by PCR and nucleotide sequence analysis.

    Science.gov (United States)

    Blaiotta, Giuseppe; Pepe, Olimpia; Mauriello, Gianluigi; Villani, Francesco; Andolfi, Rosamaria; Moschetti, Giancarlo

    2002-12-01

    The intergenic spacer region (ISR) between the 16S and 23S rRNA genes was tested as a tool for differentiating lactococci commonly isolated in a dairy environment. 17 reference strains, representing 11 different species belonging to the genera Lactococcus, Streptococcus, Lactobacillus, Enterococcus and Leuconostoc, and 127 wild streptococcal strains isolated during the whole fermentation process of "Fior di Latte" cheese were analyzed. After 16S-23S rDNA ISR amplification by PCR, species or genus-specific patterns were obtained for most of the reference strains tested. Moreover, results obtained after nucleotide analysis show that the 16S-23S rDNA ISR sequences vary greatly, in size and sequence, among Lactococcus garvieae, Lactococcus raffinolactis, Lactococcus lactis as well as other streptococci from dairy environments. Because of the high degree of inter-specific polymorphism observed, 16S-23S rDNA ISR can be considered a good potential target for selecting species-specific molecular assays, such as PCR primer or probes, for a rapid and extremely reliable differentiation of dairy lactococcal isolates.

  9. Phylogenetic analyses of the polyprotein coding sequences of serotype O foot-and-mouth disease viruses in East Africa: evidence for interserotypic recombination

    DEFF Research Database (Denmark)

    Balinda, Sheila; Siegismund, Hans; Muwanika, Vincent

    2010-01-01

    from both serotypes A and O. Conclusions Sequences of the VP1 coding region from recent serotype O FMDVs from Kenya and Uganda are all representatives of a specific East African lineage (topotype EA-2), a probable indication that hardly any FMD introductions of this serotype have occurred from outside...... the region in the recent past. Furthermore, evidence for interserotypic recombination, within the non-structural protein coding regions, between FMDVs of serotypes A and O has been obtained. In addition to characterization using the VP1 coding region, analyses involving the non-structural protein coding...

  10. nRC: non-coding RNA Classifier based on structural features.

    Science.gov (United States)

    Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso

    2017-01-01

    Non-coding RNA (ncRNA) are small non-coding sequences involved in gene expression regulation of many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to develop methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms in regulative processes, together with the development of high-throughput technologies, has required the help of bioinformatics tools in addressing biologists and clinicians with a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on features extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach for the classification of 13 different ncRNA classes. We obtained classification scores, using the most common statistical measures. In particular, we reach an accuracy and sensitivity score of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.

  11. Phylogenetic relationships in the genus Leonardoxa (Leguminosae: Caesalpinioideae) inferred from chloroplast trnL intron and trnL-trnF intergenic spacer sequences.

    Science.gov (United States)

    Brouat, Carine; Gielly, Ludovic; McKey, Doyle

    2001-01-01

    The African genus LEONARDOXA: (Leguminosae: Caesalpinioideae) comprises two Congolean species and a group of four mostly allopatric subspecies principally located in Cameroon and clustered together in the L. africana complex. LEONARDOXA: provides a good opportunity to investigate the evolutionary history of ant-plant mutualisms, as it exhibits various grades of ant-plant interactions from diffuse to obligate and symbiotic associations. We present in this paper the first molecular phylogenetic study of this genus. We sequenced both the chloroplast DNA trnL intron (677 aligned base pairs [bp]) and trnL-trnF intergene spacer (598 aligned bp). Inferred phylogenetic relationships suggested first that the genus is paraphyletic. The L. africana complex is clearly separated from the two Congolean species, and the integrity of the genus is thus in question. In the L. africana complex, our data showed a lack of congruence between clades suggested by morphological and chloroplast characters. This, and the low level of molecular divergence found between subspecies, suggests gene flow and introgressive events in the L. africana complex.

  12. Analysis of Non-binary Hybrid LDPC Codes

    OpenAIRE

    Sassatelli, Lucile; Declercq, David

    2008-01-01

    In this paper, we analyse asymptotically a new class of LDPC codes called Non-binary Hybrid LDPC codes, which has been recently introduced. We use density evolution techniques to derive a stability condition for hybrid LDPC codes, and prove their threshold behavior. We study this stability condition to conclude on asymptotic advantages of hybrid LDPC codes compared to their non-hybrid counterparts.

  13. Linking maternal and somatic 5S rRNA types with different sequence-specific non-LTR retrotransposons.

    Science.gov (United States)

    Locati, Mauro D; Pagano, Johanna F B; Ensink, Wim A; van Olst, Marina; van Leeuwen, Selina; Nehrdich, Ulrike; Zhu, Kongju; Spaink, Herman P; Girard, Geneviève; Rauwerda, Han; Jonker, Martijs J; Dekker, Rob J; Breit, Timo M

    2017-04-01

    5S rRNA is a ribosomal core component, transcribed from many gene copies organized in genomic repeats. Some eukaryotic species have two 5S rRNA types defined by their predominant expression in oogenesis or adult tissue. Our next-generation sequencing study on zebrafish egg, embryo, and adult tissue identified maternal-type 5S rRNA that is exclusively accumulated during oogenesis, replaced throughout the embryogenesis by a somatic-type, and thus virtually absent in adult somatic tissue. The maternal-type 5S rDNA contains several thousands of gene copies on chromosome 4 in tandem repeats with small intergenic regions, whereas the somatic-type is present in only 12 gene copies on chromosome 18 with large intergenic regions. The nine-nucleotide variation between the two 5S rRNA types likely affects TFIII binding and riboprotein L5 binding, probably leading to storage of maternal-type rRNA. Remarkably, these sequence differences are located exactly at the sequence-specific target site for genome integration by the 5S rRNA-specific Mutsu retrotransposon family. Thus, we could define maternal- and somatic-type MutsuDr subfamilies. Furthermore, we identified four additional maternal-type and two new somatic-type MutsuDr subfamilies, each with their own target sequence. This target-site specificity, frequently intact maternal-type retrotransposon elements, plus specific presence of Mutsu retrotransposon RNA and piRNA in egg and adult tissue, suggest an involvement of retrotransposons in achieving the differential copy number of the two types of 5S rDNA loci. © 2017 Locati et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  14. Novel classes of non-coding RNAs and cancer

    Directory of Open Access Journals (Sweden)

    Sana Jiri

    2012-05-01

    Full Text Available Abstract For the many years, the central dogma of molecular biology has been that RNA functions mainly as an informational intermediate between a DNA sequence and its encoded protein. But one of the great surprises of modern biology was the discovery that protein-coding genes represent less than 2% of the total genome sequence, and subsequently the fact that at least 90% of the human genome is actively transcribed. Thus, the human transcriptome was found to be more complex than a collection of protein-coding genes and their splice variants. Although initially argued to be spurious transcriptional noise or accumulated evolutionary debris arising from the early assembly of genes and/or the insertion of mobile genetic elements, recent evidence suggests that the non-coding RNAs (ncRNAs may play major biological roles in cellular development, physiology and pathologies. NcRNAs could be grouped into two major classes based on the transcript size; small ncRNAs and long ncRNAs. Each of these classes can be further divided, whereas novel subclasses are still being discovered and characterized. Although, in the last years, small ncRNAs called microRNAs were studied most frequently with more than ten thousand hits at PubMed database, recently, evidence has begun to accumulate describing the molecular mechanisms by which a wide range of novel RNA species function, providing insight into their functional roles in cellular biology and in human disease. In this review, we summarize newly discovered classes of ncRNAs, and highlight their functioning in cancer biology and potential usage as biomarkers or therapeutic targets.

  15. Non-Coding RNAs in Hodgkin Lymphoma

    Directory of Open Access Journals (Sweden)

    Anna Cordeiro

    2017-05-01

    Full Text Available MicroRNAs (miRNAs, small non-coding RNAs that regulate gene expression by binding to the 3’-UTR of their target genes, can act as oncogenes or tumor suppressors. Recently, other types of non-coding RNAs—piwiRNAs and long non-coding RNAs—have also been identified. Hodgkin lymphoma (HL is a B cell origin disease characterized by the presence of only 1% of tumor cells, known as Hodgkin and Reed-Stenberg (HRS cells, which interact with the microenvironment to evade apoptosis. Several studies have reported specific miRNA signatures that can differentiate HL lymph nodes from reactive lymph nodes, identify histologic groups within classical HL, and distinguish HRS cells from germinal center B cells. Moreover, some signatures are associated with survival or response to chemotherapy. Most of the miRNAs in the signatures regulate genes related to apoptosis, cell cycle arrest, or signaling pathways. Here we review findings on miRNAs in HL, as well as on other non-coding RNAs.

  16. Whole genome sequencing and evolutionary analysis of human respiratory syncytial virus A and B from Milwaukee, WI 1998-2010.

    Directory of Open Access Journals (Sweden)

    Cecilia Rebuffo-Scheer

    Full Text Available BACKGROUND: Respiratory Syncytial Virus (RSV is the leading cause of lower respiratory-tract infections in infants and young children worldwide. Despite this, only six complete genome sequences of original strains have been previously published, the most recent of which dates back 35 and 26 years for RSV group A and group B respectively. METHODOLOGY/PRINCIPAL FINDINGS: We present a semi-automated sequencing method allowing for the sequencing of four RSV whole genomes simultaneously. We were able to sequence the complete coding sequences of 13 RSV A and 4 RSV B strains from Milwaukee collected from 1998-2010. Another 12 RSV A and 5 RSV B strains sequenced in this study cover the majority of the genome. All RSV A and RSV B sequences were analyzed by neighbor-joining, maximum parsimony and Bayesian phylogeny methods. Genetic diversity was high among RSV A viruses in Milwaukee including the circulation of multiple genotypes (GA1, GA2, GA5, GA7 with GA2 persisting throughout the 13 years of the study. However, RSV B genomes showed little variation with all belonging to the BA genotype. For RSV A, the same evolutionary patterns and clades were seen consistently across the whole genome including all intergenic, coding, and non-coding regions sequences. CONCLUSIONS/SIGNIFICANCE: The sequencing strategy presented in this work allows for RSV A and B genomes to be sequenced simultaneously in two working days and with a low cost. We have significantly increased the amount of genomic data that is available for both RSV A and B, providing the basic molecular characteristics of RSV strains circulating in Milwaukee over the last 13 years. This information can be used for comparative analysis with strains circulating in other communities around the world which should also help with the development of new strategies for control of RSV, specifically vaccine development and improvement of RSV diagnostics.

  17. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    Science.gov (United States)

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A

  18. Decoding the non-coding RNAs in Alzheimer's disease.

    Science.gov (United States)

    Schonrock, Nicole; Götz, Jürgen

    2012-11-01

    Non-coding RNAs (ncRNAs) are integral components of biological networks with fundamental roles in regulating gene expression. They can integrate sequence information from the DNA code, epigenetic regulation and functions of multimeric protein complexes to potentially determine the epigenetic status and transcriptional network in any given cell. Humans potentially contain more ncRNAs than any other species, especially in the brain, where they may well play a significant role in human development and cognitive ability. This review discusses their emerging role in Alzheimer's disease (AD), a human pathological condition characterized by the progressive impairment of cognitive functions. We discuss the complexity of the ncRNA world and how this is reflected in the regulation of the amyloid precursor protein and Tau, two proteins with central functions in AD. By understanding this intricate regulatory network, there is hope for a better understanding of disease mechanisms and ultimately developing diagnostic and therapeutic tools.

  19. The origins and evolutionary history of human non-coding RNA regulatory networks.

    Science.gov (United States)

    Sherafatian, Masih; Mowla, Seyed Javad

    2017-04-01

    The evolutionary history and origin of the regulatory function of animal non-coding RNAs are not well understood. Lack of conservation of long non-coding RNAs and small sizes of microRNAs has been major obstacles in their phylogenetic analysis. In this study, we tried to shed more light on the evolution of ncRNA regulatory networks by changing our phylogenetic strategy to focus on the evolutionary pattern of their protein coding targets. We used available target databases of miRNAs and lncRNAs to find their protein coding targets in human. We were able to recognize evolutionary hallmarks of ncRNA targets by phylostratigraphic analysis. We found the conventional 3'-UTR and lesser known 5'-UTR targets of miRNAs to be enriched at three consecutive phylostrata. Firstly, in eukaryata phylostratum corresponding to the emergence of miRNAs, our study revealed that miRNA targets function primarily in cell cycle processes. Moreover, the same overrepresentation of the targets observed in the next two consecutive phylostrata, opisthokonta and eumetazoa, corresponded to the expansion periods of miRNAs in animals evolution. Coding sequence targets of miRNAs showed a delayed rise at opisthokonta phylostratum, compared to the 3' and 5' UTR targets of miRNAs. LncRNA regulatory network was the latest to evolve at eumetazoa.

  20. Emerging putative associations between non-coding RNAs and protein-coding genes in Neuropathic Pain. Added value from re-using microarray data.

    Directory of Open Access Journals (Sweden)

    Enrico Capobianco

    2016-10-01

    Full Text Available Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs. This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve injury, and studied in a rat model, using two neuronal tissues, namely dorsal root ganglion (DRG and sciatic nerve (SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes, and re-purposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parent genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to neuropathic pain. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN, and 8 in DRG, antisense RNA (31 asRNA in SN, and 12 in DRG and pseudogenes (456 in SN, 56 in DRG. In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly

  1. The Non-Coding RNA Ontology (NCRO): a comprehensive resource for the unification of non-coding RNA biology.

    Science.gov (United States)

    Huang, Jingshan; Eilbeck, Karen; Smith, Barry; Blake, Judith A; Dou, Dejing; Huang, Weili; Natale, Darren A; Ruttenberg, Alan; Huan, Jun; Zimmermann, Michael T; Jiang, Guoqian; Lin, Yu; Wu, Bin; Strachan, Harrison J; He, Yongqun; Zhang, Shaojie; Wang, Xiaowei; Liu, Zixing; Borchert, Glen M; Tan, Ming

    2016-01-01

    In recent years, sequencing technologies have enabled the identification of a wide range of non-coding RNAs (ncRNAs). Unfortunately, annotation and integration of ncRNA data has lagged behind their identification. Given the large quantity of information being obtained in this area, there emerges an urgent need to integrate what is being discovered by a broad range of relevant communities. To this end, the Non-Coding RNA Ontology (NCRO) is being developed to provide a systematically structured and precisely defined controlled vocabulary for the domain of ncRNAs, thereby facilitating the discovery, curation, analysis, exchange, and reasoning of data about structures of ncRNAs, their molecular and cellular functions, and their impacts upon phenotypes. The goal of NCRO is to serve as a common resource for annotations of diverse research in a way that will significantly enhance integrative and comparative analysis of the myriad resources currently housed in disparate sources. It is our belief that the NCRO ontology can perform an important role in the comprehensive unification of ncRNA biology and, indeed, fill a critical gap in both the Open Biological and Biomedical Ontologies (OBO) Library and the National Center for Biomedical Ontology (NCBO) BioPortal. Our initial focus is on the ontological representation of small regulatory ncRNAs, which we see as the first step in providing a resource for the annotation of data about all forms of ncRNAs. The NCRO ontology is free and open to all users, accessible at: http://purl.obolibrary.org/obo/ncro.owl.

  2. Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

    Science.gov (United States)

    Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

    2016-11-01

    Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability which allows to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expands the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.

  3. Long Non-Coding RNAs in Metabolic Organs and Energy Homeostasis

    Directory of Open Access Journals (Sweden)

    Maude Giroud

    2017-11-01

    Full Text Available Single cell organisms can surprisingly exceed the number of human protein-coding genes, which are thus not at the origin of the complexity of an organism. In contrast, the relative amount of non-protein-coding sequences increases consistently with organismal complexity. Moreover, the mammalian transcriptome predominantly comprises non-(protein-coding RNAs (ncRNA, of which the long ncRNAs (lncRNAs constitute the most abundant part. lncRNAs are highly species- and tissue-specific with very versatile modes of action in accordance with their binding to a large spectrum of molecules and their diverse localization. lncRNAs are transcriptional regulators adding an additional regulatory layer in biological processes and pathophysiological conditions. Here, we review lncRNAs affecting metabolic organs with a focus on the liver, pancreas, skeletal muscle, cardiac muscle, brain, and adipose organ. In addition, we will discuss the impact of lncRNAs on metabolic diseases such as obesity and diabetes. In contrast to the substantial number of lncRNA loci in the human genome, the functionally characterized lncRNAs are just the tip of the iceberg. So far, our knowledge concerning lncRNAs in energy homeostasis is still in its infancy, meaning that the rest of the iceberg is a treasure chest yet to be discovered.

  4. Identification of Novel Long Non-coding and Circular RNAs in Human Papillomavirus-Mediated Cervical Cancer

    Directory of Open Access Journals (Sweden)

    Hongbo Wang

    2017-09-01

    Full Text Available Cervical cancer is the third most common cancer worldwide and the fourth leading cause of cancer-associated mortality in women. Accumulating evidence indicates that long non-coding RNAs (lncRNAs and circular RNAs (circRNAs may play key roles in the carcinogenesis of different cancers; however, little is known about the mechanisms of lncRNAs and circRNAs in the progression and metastasis of cervical cancer. In this study, we explored the expression profiles of lncRNAs, circRNAs, miRNAs, and mRNAs in HPV16 (human papillomavirus genotype 16 mediated cervical squamous cell carcinoma and matched adjacent non-tumor (ATN tissues from three patients with high-throughput RNA sequencing (RNA-seq. In total, we identified 19 lncRNAs, 99 circRNAs, 28 miRNAs, and 304 mRNAs that were commonly differentially expressed (DE in different patients. Among the non-coding RNAs, 3 lncRNAs and 44 circRNAs are novel to our knowledge. Functional enrichment analysis showed that DE lncRNAs, miRNAs, and mRNAs were enriched in pathways crucial to cancer as well as other gene ontology (GO terms. Furthermore, the co-expression network and function prediction suggested that all 19 DE lncRNAs could play different roles in the carcinogenesis and development of cervical cancer. The competing endogenous RNA (ceRNA network based on DE coding and non-coding RNAs showed that each miRNA targeted a number of lncRNAs and circRNAs. The link between part of the miRNAs in the network and cervical cancer has been validated in previous studies, and these miRNAs targeted the majority of the novel non-coding RNAs, thus suggesting that these novel non-coding RNAs may be involved in cervical cancer. Taken together, our study shows that DE non-coding RNAs could be further developed as diagnostic and therapeutic biomarkers of cervical cancer. The complex ceRNA network also lays the foundation for future research of the roles of coding and non-coding RNAs in cervical cancer.

  5. Identification of Novel Long Non-coding and Circular RNAs in Human Papillomavirus-Mediated Cervical Cancer

    Science.gov (United States)

    Wang, Hongbo; Zhao, Yingchao; Chen, Mingyue; Cui, Jie

    2017-01-01

    Cervical cancer is the third most common cancer worldwide and the fourth leading cause of cancer-associated mortality in women. Accumulating evidence indicates that long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs) may play key roles in the carcinogenesis of different cancers; however, little is known about the mechanisms of lncRNAs and circRNAs in the progression and metastasis of cervical cancer. In this study, we explored the expression profiles of lncRNAs, circRNAs, miRNAs, and mRNAs in HPV16 (human papillomavirus genotype 16) mediated cervical squamous cell carcinoma and matched adjacent non-tumor (ATN) tissues from three patients with high-throughput RNA sequencing (RNA-seq). In total, we identified 19 lncRNAs, 99 circRNAs, 28 miRNAs, and 304 mRNAs that were commonly differentially expressed (DE) in different patients. Among the non-coding RNAs, 3 lncRNAs and 44 circRNAs are novel to our knowledge. Functional enrichment analysis showed that DE lncRNAs, miRNAs, and mRNAs were enriched in pathways crucial to cancer as well as other gene ontology (GO) terms. Furthermore, the co-expression network and function prediction suggested that all 19 DE lncRNAs could play different roles in the carcinogenesis and development of cervical cancer. The competing endogenous RNA (ceRNA) network based on DE coding and non-coding RNAs showed that each miRNA targeted a number of lncRNAs and circRNAs. The link between part of the miRNAs in the network and cervical cancer has been validated in previous studies, and these miRNAs targeted the majority of the novel non-coding RNAs, thus suggesting that these novel non-coding RNAs may be involved in cervical cancer. Taken together, our study shows that DE non-coding RNAs could be further developed as diagnostic and therapeutic biomarkers of cervical cancer. The complex ceRNA network also lays the foundation for future research of the roles of coding and non-coding RNAs in cervical cancer. PMID:28970820

  6. Blackout sequence modeling for Atucha-I with MARCH3 code

    International Nuclear Information System (INIS)

    Baron, J.; Bastianelli, B.

    1997-01-01

    The modeling of a blackout sequence in Atucha I nuclear power plant is presented in this paper, as a preliminary phase for a level II probabilistic safety assessment. Such sequence is analyzed with the code MARCH3 from STCP (Source Term Code Package), based on a specific model developed for Atucha, that takes into accounts it peculiarities. The analysis includes all the severe accident phases, from the initial transient (loss of heat sink), loss of coolant through the safety valves, core uncovered, heatup, metal-water reaction, melting and relocation, heatup and failure of the pressure vessel, core-concrete interaction in the reactor cavity, heatup and failure of the containment building (multi-compartmented) due to quasi-static overpressurization. The results obtained permit to visualize the time sequence of these events, as well as provide the basis for source term studies. (author) [es

  7. Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Chen Xie

    2012-09-01

    Full Text Available Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA-Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis, which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.

  8. Function and Application Areas in Medicine of Non-Coding RNA

    Directory of Open Access Journals (Sweden)

    Figen Guzelgul

    2009-06-01

    Full Text Available RNA is the genetic material converting the genetic code that it gets from DNA into protein. While less than 2 % of RNA is converted into protein , more than 98 % of it can not be converted into protein and named as non-coding RNAs. 70 % of noncoding RNAs consists of introns , however, the rest part of them consists of exons. Non-coding RNAs are examined in two classes according to their size and functions. Whereas they are classified as long non-coding and small non-coding RNAs according to their size , they are grouped as housekeeping non-coding RNAs and regulating non-coding RNAs according to their function. For long years ,these non-coding RNAs have been considered as non-functional. However, today, it has been proved that these non-coding RNAs play role in regulating genes and in structural, functional and catalitic roles of RNAs converted into protein. Due to its taking a role in gene silencing mechanism, particularly in medical world , non-coding RNAs have led to significant developments. RNAi technolgy , which is used in designing drugs to be used in treatment of various diseases , is a ray of hope for medical world. [Archives Medical Review Journal 2009; 18(3.000: 141-155

  9. Prunus necrotic ringspot ilarvirus: nucleotide sequence of RNA3 and the relationship to other ilarviruses based on coat protein comparison.

    Science.gov (United States)

    Guo, D; Maiss, E; Adam, G; Casper, R

    1995-05-01

    The RNA3 of prunus necrotic ringspot ilarvirus (PNRSV) has been cloned and its entire sequence determined. The RNA3 consists of 1943 nucleotides (nt) and possesses two large open reading frames (ORFs) separated by an intergenic region of 74 nt. The 5' proximal ORF is 855 nt in length and codes for a protein of molecular mass 31.4 kDa which has homologies with the putative movement protein of other members of the Bromoviridae. The 3' proximal ORF of 675 nt is the cistron for the coat protein (CP) and has a predicted molecular mass of 24.9 kDa. The sequence of the 3' non-coding region (NCR) of PNRSV RNA3 showed a high degree of similarity with those of tobacco streak virus (TSV), prune dwarf virus (PDV), apple mosaic virus (ApMV) and also alfalfa mosaic virus (AIMV). In addition it contained potential stem-loop structures with interspersed AUGC motifs characteristic for ilar- and alfamoviruses. This conserved primary and secondary structure in all 3' NCRs may be responsible for the interaction with homologous and heterologous CPs and subsequent activation of genome replication. The CP gene of an ApMV isolate (ApMV-G) of 657 nt has also been cloned and sequenced. Although ApMV and PNRSV have a distant serological relationship, the deduced amino acid sequences of their CPs have an identity of only 51.8%. The N termini of PNRSV and ApMV CPs have in common a zinc-finger motif and the potential to form an amphipathic helix.

  10. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

    Science.gov (United States)

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-10-03

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.

  11. Characterization of the Complete Mitochondrial Genome Sequence of the Globose Head Whiptail Cetonurus globiceps (Gadiformes: Macrouridae and Its Phylogenetic Analysis.

    Directory of Open Access Journals (Sweden)

    Xiaofeng Shi

    Full Text Available The particular environmental characteristics of deep water such as its immense scale and high pressure systems, presents technological problems that have prevented research to broaden our knowledge of deep-sea fish. Here, we described the mitogenome sequence of a deep-sea fish, Cetonurus globiceps. The genome is 17,137 bp in length, with a standard set of 22 transfer RNA genes (tRNAs, two ribosomal RNA genes, 13 protein-coding genes, and two typical non-coding control regions. Additionally, a 70 bp tRNA(Thr-tRNA(Pro intergenic spacer is present. The C. globiceps mitogenome exhibited strand-specific asymmetry in nucleotide composition. The AT-skew and GC-skew values in the whole genome of C. globiceps were 0 and -0.2877, respectively, revealing that the H-strand had equal amounts of A and T and that the overall nucleotide composition was C skewed. All of the tRNA genes could be folded into cloverleaf secondary structures, while the secondary structure of tRNA(Ser(AGY lacked a discernible dihydrouridine stem. By comparing this genome sequence with the recognition sites in teleost species, several conserved sequence blocks were identified in the control region. However, the GTGGG-box, the typical characteristic of conserved sequence block E (CSB-E, was absent. Notably, tandem repeats were identified in the 3' portion of the control region. No similar repetitive motifs are present in most of other gadiform species. Phylogenetic analysis based on 12 protein coding genes provided strong support that C. globiceps was the most derived in the clade. Some relationships however, are in contrast with those presented in previous studies. This study enriches our knowledge of mitogenomes of the genus Cetonurus and provides valuable information on the evolution of Macrouridae mtDNA and deep-sea fish.

  12. Developmental programming of long non-coding RNAs during postnatal liver maturation in mice.

    Directory of Open Access Journals (Sweden)

    Lai Peng

    Full Text Available The liver is a vital organ with critical functions in metabolism, protein synthesis, and immune defense. Most of the liver functions are not mature at birth and many changes happen during postnatal liver development. However, it is unclear what changes occur in liver after birth, at what developmental stages they occur, and how the developmental processes are regulated. Long non-coding RNAs (lncRNAs are involved in organ development and cell differentiation. Here, we analyzed the transcriptome of lncRNAs in mouse liver from perinatal (day -2 to adult (day 60 by RNA-Sequencing, with an attempt to understand the role of lncRNAs in liver maturation. We found around 15,000 genes expressed, including about 2,000 lncRNAs. Most lncRNAs were expressed at a lower level than coding RNAs. Both coding RNAs and lncRNAs displayed three major ontogenic patterns: enriched at neonatal, adolescent, or adult stages. Neighboring coding and non-coding RNAs showed the trend to exhibit highly correlated ontogenic expression patterns. Gene ontology (GO analysis revealed that some lncRNAs enriched at neonatal ages have their neighbor protein coding genes also enriched at neonatal ages and associated with cell proliferation, immune activation related processes, tissue organization pathways, and hematopoiesis; other lncRNAs enriched at adolescent ages have their neighbor protein coding genes associated with different metabolic processes. These data reveal significant functional transition during postnatal liver development and imply the potential importance of lncRNAs in liver maturation.

  13. On the classification of long non-coding RNAs

    KAUST Repository

    Ma, Lina

    2013-06-01

    Long non-coding RNAs (lncRNAs) have been found to perform various functions in a wide variety of important biological processes. To make easier interpretation of lncRNA functionality and conduct deep mining on these transcribed sequences, it is convenient to classify lncRNAs into different groups. Here, we summarize classification methods of lncRNAs according to their four major features, namely, genomic location and context, effect exerted on DNA sequences, mechanism of functioning and their targeting mechanism. In combination with the presently available function annotations, we explore potential relationships between different classification categories, and generalize and compare biological features of different lncRNAs within each category. Finally, we present our view on potential further studies. We believe that the classifications of lncRNAs as indicated above are of fundamental importance for lncRNA studies, helpful for further investigation of specific lncRNAs, for formulation of new hypothesis based on different features of lncRNA and for exploration of the underlying lncRNA functional mechanisms. © 2013 Landes Bioscience.

  14. Genome-wide identification of potato long intergenic noncoding RNAs responsive to Pectobacterium carotovorum subspecies brasiliense infection.

    Science.gov (United States)

    Kwenda, Stanford; Birch, Paul R J; Moleleki, Lucy N

    2016-08-11

    Long noncoding RNAs (lncRNAs) represent a class of RNA molecules that are implicated in regulation of gene expression in both mammals and plants. While much progress has been made in determining the biological functions of lncRNAs in mammals, the functional roles of lncRNAs in plants are still poorly understood. Specifically, the roles of long intergenic nocoding RNAs (lincRNAs) in plant defence responses are yet to be fully explored. In this study, we used strand-specific RNA sequencing to identify 1113 lincRNAs in potato (Solanum tuberosum) from stem tissues. The lincRNAs are expressed from all 12 potato chromosomes and generally smaller in size compared to protein-coding genes. Like in other plants, most potato lincRNAs possess single exons. A time-course RNA-seq analysis between a tolerant and a susceptible potato cultivar showed that 559 lincRNAs are responsive to Pectobacterium carotovorum subsp. brasiliense challenge compared to mock-inoculated controls. Moreover, coexpression analysis revealed that 17 of these lincRNAs are highly associated with 12 potato defence-related genes. Together, these results suggest that lincRNAs have potential functional roles in potato defence responses. Furthermore, this work provides the first library of potato lincRNAs and a set of novel lincRNAs implicated in potato defences against P. carotovorum subsp. brasiliense, a member of the soft rot Enterobacteriaceae phytopathogens.

  15. Genome-wide identification, characterization and evolutionary analysis of long intergenic noncoding RNAs in cucumber.

    Directory of Open Access Journals (Sweden)

    Zhiqiang Hao

    Full Text Available Long intergenic noncoding RNAs (lincRNAs are intergenic transcripts with a length of at least 200 nt that lack coding potential. Emerging evidence suggests that lincRNAs from animals participate in many fundamental biological processes. However, the systemic identification of lincRNAs has been undertaken in only a few plants. We chose to use cucumber (Cucumis sativus as a model to analyze lincRNAs due to its importance as a model plant for studying sex differentiation and fruit development and the rich genomic and transcriptome data available. The application of a bioinformatics pipeline to multiple types of gene expression data resulted in the identification and characterization of 3,274 lincRNAs. Next, 10 lincRNAs targeted by 17 miRNAs were also explored. Based on co-expression analysis between lincRNAs and mRNAs, 94 lincRNAs were annotated, which may be involved in response to stimuli, multi-organism processes, reproduction, reproductive processes, and growth. Finally, examination of the evolution of lincRNAs showed that most lincRNAs are under purifying selection, while 16 lincRNAs are under natural selection. Our results provide a rich resource for further validation of cucumber lincRNAs and their function. The identification of lincRNAs targeted by miRNAs offers new clues for investigations into the role of lincRNAs in regulating gene expression. Finally, evaluation of the lincRNAs suggested that some lincRNAs are under positive and balancing selection.

  16. The complete genome sequence of the Atlantic salmon paramyxovirus (ASPV)

    International Nuclear Information System (INIS)

    Nylund, Stian; Karlsen, Marius; Nylund, Are

    2008-01-01

    The complete RNA genome of the Atlantic salmon paramyxovirus (ASPV), isolated from Atlantic salmon suffering from proliferative gill inflammation (PGI), has been determined. The genome is 16,965 nucleotides in length and consists of six nonoverlapping genes in the order 3'- N - P/C/V - M - F - HN - L -5', coding for the nucleocapsid, phospho-, matrix, fusion, hemagglutinin-neuraminidase and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and trinucleotide intergenic regions similar to those of other Paramyxoviridae. The ASPV P-gene expression strategy is like that of the respiro- and morbilliviruses, which express the phosphoprotein from the primary transcript, and edit a portion of the mRNA to encode the accessory proteins V and W. It also encodes the C-protein by ribosomal choice of translation initiation. Pairwise comparisons of amino acid identities, and phylogenetic analysis of deduced ASPV protein sequences with homologous sequences from other Paramyxoviridae, show that ASPV has an affinity for the genus Respirovirus, but may represent a new genus within the subfamily Paramyxovirinae

  17. Molecular Evolution of the non-coding Eosinophil Granule Ontogeny Transcript EGOT

    Directory of Open Access Journals (Sweden)

    Dominic eRose

    2011-10-01

    Full Text Available Eukaryotic genomes are pervasively transcribed. A large fraction of the transcriptional output consists of long, mRNA-like, non-protein-coding transcripts (mlncRNAs. The evolutionary history of mlncRNAs is still largely uncharted territory.In this contribution, we explore in detail the evolutionary traces of the eosinophil granule ontogeny transcript (EGOT, an experimentally confirmed representative of an abundant class of totally intronic non-coding transcripts (TINs. EGOT is located antisense to an intron of the ITPR1 gene. We computationally identify putative EGOT orthologs in the genomes of 32 different amniotes, including orthologs from primates, rodents, ungulates, carnivores, afrotherians, and xenarthrans, as well as putative candidates from basal amniotes, such as opossum or platypus. We investigate the EGOT gene phylogeny, analyse patterns of sequence conservation, and the evolutionary conservation of the EGOT gene structure. We show that EGO-B, the spliced isoform, may be present throughout the placental mammals, but most likely dates back even further. We demonstrat here for the first time that the whole EGOT locus is highly structured, containing several evolutionary conserved and thermodynamic stable secondary structures.Our analyses allow us to postulate novel functional roles of a hitherto poorly understood region at the intron of EGO-B which is highly conserved at the sequence level. The region contains a novel ITPR1 exon and also conserved RNA secondary structures together with a conserved TATA-like element, which putatively acts as a promoter of an independent regulatory element.

  18. Genic and Intergenic SSR Database Generation, SNPs Determination and Pathway Annotations, in Date Palm (Phoenix dactylifera L.).

    Science.gov (United States)

    Mokhtar, Morad M; Adawy, Sami S; El-Assal, Salah El-Din S; Hussein, Ebtissam H A

    2016-01-01

    The present investigation was carried out aiming to use the bioinformatics tools in order to identify and characterize, simple sequence repeats within the third Version of the date palm genome and develop a new SSR primers database. In addition single nucleotide polymorphisms (SNPs) that are located within the SSR flanking regions were recognized. Moreover, the pathways for the sequences assigned by SSR primers, the biological functions and gene interaction were determined. A total of 172,075 SSR motifs was identified on date palm genome sequence with a frequency of 450.97 SSRs per Mb. Out of these, 130,014 SSRs (75.6%) were located within the intergenic regions with a frequency of 499 SSRs per Mb. While, only 42,061 SSRs (24.4%) were located within the genic regions with a frequency of 347.5 SSRs per Mb. A total of 111,403 of SSR primer pairs were designed, that represents 291.9 SSR primers per Mb. Out of the 111,403, only 31,380 SSR primers were in the genic regions, while 80,023 primers were in the intergenic regions. A number of 250,507 SNPs were recognized in 84,172 SSR flanking regions, which represents 75.55% of the total SSR flanking regions. Out of 12,274 genes only 463 genes comprising 896 SSR primers were mapped onto 111 pathways using KEGG data base. The most abundant enzymes were identified in the pathway related to the biosynthesis of antibiotics. We tested 1031 SSR primers using both publicly available date palm genome sequences as templates in the in silico PCR reactions. Concerning in vitro validation, 31 SSR primers among those used in the in silico PCR were synthesized and tested for their ability to detect polymorphism among six Egyptian date palm cultivars. All tested primers have successfully amplified products, but only 18 primers detected polymorphic amplicons among the studied date palm cultivars.

  19. Non-coding RNAs and epigenome: de novo DNA methylation, allelic exclusion and X-inactivation

    Directory of Open Access Journals (Sweden)

    V. A. Halytskiy

    2013-12-01

    Full Text Available Non-coding RNAs are widespread class of cell RNAs. They participate in many important processes in cells – signaling, posttranscriptional silencing, protein biosynthesis, splicing, maintenance of genome stability, telomere lengthening, X-inactivation. Nevertheless, activity of these RNAs is not restricted to posttranscriptional sphere, but cover also processes that change or maintain the epigenetic information. Non-coding RNAs can directly bind to the DNA targets and cause their repression through recruitment of DNA methyltransferases as well as chromatin modifying enzymes. Such events constitute molecular mechanism of the RNA-dependent DNA methylation. It is possible, that the RNA-DNA interaction is universal mechanism triggering DNA methylation de novo. Allelic exclusion can be also based on described mechanism. This phenomenon takes place, when non-coding RNA, which precursor is transcribed from one allele, triggers DNA methylation in all other alleles present in the cell. Note, that miRNA-mediated transcriptional silencing resembles allelic exclusion, because both miRNA gene and genes, which can be targeted by this miRNA, contain elements with the same sequences. It can be assumed that RNA-dependent DNA methylation and allelic exclusion originated with the purpose of counteracting the activity of mobile genetic elements. Probably, thinning and deregulation of the cellular non-coding RNA pattern allows reactivation of silent mobile genetic elements resulting in genome instability that leads to ageing and carcinogenesis. In the course of X-inactivation, DNA methylation and subsequent hete­rochromatinization of X chromosome can be triggered by direct hybridization of 5′-end of large non-coding RNA Xist with DNA targets in remote regions of the X chromosome.

  20. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    Science.gov (United States)

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  1. Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics

    Science.gov (United States)

    Mantegna, R. N.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1995-01-01

    We compare the statistical properties of coding and noncoding regions in eukaryotic and viral DNA sequences by adapting two tests developed for the analysis of natural languages and symbolic sequences. The data set comprises all 30 sequences of length above 50 000 base pairs in GenBank Release No. 81.0, as well as the recently published sequences of C. elegans chromosome III (2.2 Mbp) and yeast chromosome XI (661 Kbp). We find that for the three chromosomes we studied the statistical properties of noncoding regions appear to be closer to those observed in natural languages than those of coding regions. In particular, (i) a n-tuple Zipf analysis of noncoding regions reveals a regime close to power-law behavior while the coding regions show logarithmic behavior over a wide interval, while (ii) an n-gram entropy measurement shows that the noncoding regions have a lower n-gram entropy (and hence a larger "n-gram redundancy") than the coding regions. In contrast to the three chromosomes, we find that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive. After noting the intrinsic limitations of the n-gram redundancy analysis, we also briefly discuss the failure of the zeroth- and first-order Markovian models or simple nucleotide repeats to account fully for these "linguistic" features of DNA. Finally, we emphasize that our results by no means prove the existence of a "language" in noncoding DNA.

  2. Coding patient emotional cues and concerns in medical consultations: the Verona coding definitions of emotional sequences (VR-CoDES).

    NARCIS (Netherlands)

    Zimmermann, C.; Piccolo, L. del; Bensing, J.; Bergvik, S.; Haes, H. de; Eide, H.; Fletcher, I.; Goss, C.; Heaven, C.; Humphris, G.; Young-Mi, K.; Langewitz, W.; Meeuwesen, L.; Nuebling, M.; Rimondini, M.; Salmon, P.; Dulmen, S. van; Wissow, L.; Zandbelt, L.; Finset, A.

    2011-01-01

    Objective: To present the Verona Coding Definitions of Emotional Sequences (VR-CoDES CC), a consensus based system for coding patient expressions of emotional distress in medical consultations, defined as Cues or Concerns. Methods: The system was developed by an international group of communication

  3. Non-binary Hybrid LDPC Codes: Structure, Decoding and Optimization

    OpenAIRE

    Sassatelli, Lucile; Declercq, David

    2007-01-01

    In this paper, we propose to study and optimize a very general class of LDPC codes whose variable nodes belong to finite sets with different orders. We named this class of codes Hybrid LDPC codes. Although efficient optimization techniques exist for binary LDPC codes and more recently for non-binary LDPC codes, they both exhibit drawbacks due to different reasons. Our goal is to capitalize on the advantages of both families by building codes with binary (or small finite set order) and non-bin...

  4. The complete mitochondrial genome of the land snail Cornu aspersum (Helicidae: Mollusca: intra-specific divergence of protein-coding genes and phylogenetic considerations within Euthyneura.

    Directory of Open Access Journals (Sweden)

    Juan Diego Gaitán-Espitia

    Full Text Available The complete sequences of three mitochondrial genomes from the land snail Cornu aspersum were determined. The mitogenome has a length of 14050 bp, and it encodes 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes. It also includes nine small intergene spacers, and a large AT-rich intergenic spacer. The intra-specific divergence analysis revealed that COX1 has the lower genetic differentiation, while the most divergent genes were NADH1, NADH3 and NADH4. With the exception of Euhadra herklotsi, the structural comparisons showed the same gene order within the family Helicidae, and nearly identical gene organization to that found in order Pulmonata. Phylogenetic reconstruction recovered Basommatophora as polyphyletic group, whereas Eupulmonata and Pulmonata as paraphyletic groups. Bayesian and Maximum Likelihood analyses showed that C. aspersum is a close relative of Cepaea nemoralis, and with the other Helicidae species form a sister group of Albinaria caerulea, supporting the monophyly of the Stylommatophora clade.

  5. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta.

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    Full Text Available The complete mitochondrial DNA (mtDNA of Gracilariopsis lemaneiformis was sequenced (25883 bp and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142. There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2" were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  6. Complete sequences of the mitochondrial DNA of the wild Gracilariopsis lemaneiformis and two mutagenic cultivated breeds (Gracilariaceae, Rhodophyta).

    Science.gov (United States)

    Zhang, Lei; Wang, Xumin; Qian, Hao; Chi, Shan; Liu, Cui; Liu, Tao

    2012-01-01

    The complete mitochondrial DNA (mtDNA) of Gracilariopsis lemaneiformis was sequenced (25883 bp) and mapped to a circular model. The A+T composition was 72.5%. Forty six genes and two potentially functional open reading frames were identified. They include 24 protein-coding genes, 2 rRNA genes, 20 tRNA genes and 2 ORFs (orf60, orf142). There is considerable sequence synteny across the five red algal mtDNAs falling into Florideophyceae including Gr. lemaneiformis in this study and previously sequenced species. A long stem-loop and a hairpin structure were identified in intergenic regions of mt genome of Gr. lemaneiformis, which are believed to be involved with transcription and replication. In addition, the mtDNAs of two mutagenic cultivated breeds ("981" and "07-2") were also sequenced. Compared with the mtDNA of wild Gr. lemaneiformis, the genome size and gene length and order of three strains were completely identical except nine base mutations including eight in the protein-coding genes and one in the tRNA gene. None of the base mutations caused frameshift or a premature stop codon in the mtDNA genes. Phylogenetic analyses based on mitochondrial protein-coding genes and rRNA genes demonstrated Gracilariopsis andersonii had closer phylogenetic relationship with its parasite Gracilariophila oryzoides than Gracilariopsis lemaneiformis which was from the same genus of Gracilariopsis.

  7. Metformin-Induced Changes of the Coding Transcriptome and Non-Coding RNAs in the Livers of Non-Alcoholic Fatty Liver Disease Mice.

    Science.gov (United States)

    Guo, Jun; Zhou, Yuan; Cheng, Yafen; Fang, Weiwei; Hu, Gang; Wei, Jie; Lin, Yajun; Man, Yong; Guo, Lixin; Sun, Mingxiao; Cui, Qinghua; Li, Jian

    2018-01-01

    Recent studies have suggested that changes in non-coding mRNA play a key role in the progression of non-alcoholic fatty liver disease (NAFLD). Metformin is now recommended and effective for the treatment of NAFLD. We hope the current analyses of the non-coding mRNA transcriptome will provide a better presentation of the potential roles of mRNAs and long non-coding RNAs (lncRNAs) that underlie NAFLD and metformin intervention. The present study mainly analysed changes in the coding transcriptome and non-coding RNAs after the application of a five-week metformin intervention. Liver samples from three groups of mice were harvested for transcriptome profiling, which covered mRNA, lncRNA, microRNA (miRNA) and circular RNA (circRNA), using a microarray technique. A systematic alleviation of high-fat diet (HFD)-induced transcriptome alterations by metformin was observed. The metformin treatment largely reversed the correlations with diabetes-related pathways. Our analysis also suggested interaction networks between differentially expressed lncRNAs and known hepatic disease genes and interactions between circRNA and their disease-related miRNA partners. Eight HFD-responsive lncRNAs and three metformin-responsive lncRNAs were noted due to their widespread associations with disease genes. Moreover, seven miRNAs that interacted with multiple differentially expressed circRNAs were highlighted because they were likely to be associated with metabolic or liver diseases. The present study identified novel changes in the coding transcriptome and non-coding RNAs in the livers of NAFLD mice after metformin treatment that might shed light on the underlying mechanism by which metformin impedes the progression of NAFLD. © 2018 The Author(s). Published by S. Karger AG, Basel.

  8. Non-Binary Protograph-Based LDPC Codes: Analysis,Enumerators and Designs

    OpenAIRE

    Sun, Yizeng

    2013-01-01

    Non-binary LDPC codes can outperform binary LDPC codes using sum-product algorithm with higher computation complexity. Non-binary LDPC codes based on protographs have the advantage of simple hardware architecture. In the first part of this thesis, we will use EXIT chart analysis to compute the thresholds of different protographs over GF(q). Based on threshold computation, some non-binary protograph-based LDPC codes are designed and their frame error rates are compared with binary LDPC codes. ...

  9. Non coding RNA: sequence-specific guide for chromatin modification and DNA damage signaling

    Directory of Open Access Journals (Sweden)

    Sofia eFrancia

    2015-11-01

    Full Text Available Chromatin conformation shapes the environment in which our genome is transcribed into RNA. Transcription is a source of DNA damage, thus it often occurs concomitantly to DNA damage signaling. Growing amounts of evidence suggest that different types of RNAs can, independently from their protein-coding properties, directly affect chromatin conformation, transcription and splicing, as well as promote the activation of the DNA damage response (DDR and DNA repair. Therefore, transcription paradoxically functions to both threaten and safeguard genome integrity. On the other hand, DNA damage signaling is known to modulate chromatin to suppress transcription of the surrounding genetic unit. It is thus intriguing to understand how transcription can modulate DDR signaling while, in turn, DDR signaling represses transcription of chromatin around the DNA lesion. An unexpected player in this field is the RNA interference (RNAi machinery, which play roles in transcription, splicing and chromatin modulation in several organisms. Non-coding RNAs (ncRNAs and several protein factors involved in the RNAi pathway are well known master regulators of chromatin while only recent reports suggest that ncRNAs are involved in DDR signaling and homology-mediated DNA repair. Here, we discuss the experimental evidence supporting the idea that ncRNAs act at the genomic loci from which they are transcribed to modulate chromatin, DDR signaling and DNA repair.

  10. PATACSDB—the database of polyA translational attenuators in coding sequences

    Directory of Open Access Journals (Sweden)

    Malgorzata Habich

    2016-02-01

    Full Text Available Recent additions to the repertoire of gene expression regulatory mechanisms are polyadenylate (polyA tracks encoding for poly-lysine runs in protein sequences. Such tracks stall the translation apparatus and induce frameshifting independently of the effects of charged nascent poly-lysine sequence on the ribosome exit channel. As such, they substantially influence the stability of mRNA and the amount of protein produced from a given transcript. Single base changes in these regions are enough to exert a measurable response on both protein and mRNA abundance; this makes each of these sequences a potentially interesting case study for the effects of synonymous mutation, gene dosage balance and natural frameshifting. Here we present PATACSDB, a resource that contain a comprehensive list of polyA tracks from over 250 eukaryotic genomes. Our data is based on the Ensembl genomic database of coding sequences and filtered with algorithm of 12A-1 which selects sequences of polyA tracks with a minimal length of 12 A’s allowing for one mismatched base. The PATACSDB database is accessible at: http://sysbio.ibb.waw.pl/patacsdb. The source code is available at http://github.com/habich/PATACSDB, and it includes the scripts with which the database can be recreated.

  11. The long non-coding RNA TUG1 indicates a poor prognosis for colorectal cancer and promotes metastasis by affecting epithelial-mesenchymal transition.

    Science.gov (United States)

    Sun, Junfeng; Ding, Chaohui; Yang, Zhen; Liu, Tao; Zhang, Xiefu; Zhao, Chunlin; Wang, Jiaxiang

    2016-02-08

    Long intergenic non-coding RNAs (lncRNAs) are a class of non-coding RNAs that are involved in gene expression regulation. Taurine up-regulated gene 1 (TUG1) is a cancer progression related lncRNA in some tumor oncogenesis; however, its role in colorectal cancer (CRC) remains unclear. In this study, we determined the expression patterns of TUG1 in CRC patients and explored its effect on CRC cell metastasis using cultured representative CRC cell lines. The expression levels of TUG1 in 120 CRC patients and CRC cells were determined using quantitative real-time PCR. HDACs and epithelial-mesenchymal transition (EMT)-related gene expression were determined using western blot. CRC cell metastasis was assessed by colony formation, migration assay and invasion assay. Our data showed that the levels of TUG1 were upregulated in both CRC cell lines and primary CRC clinical samples. TUG1 upregulation was closely correlated with the survival time of CRC patients. Overexpression of TUG1 in CRC cells increased their colony formation, migration, and invasion in vitro and promoted their metastatic potential in vivo, whereas knockdown of TUG1 inhibited the colony formation, migration, and invasion of CRC cells in vitro. It is also worth pointing out that TUG1 activated EMT-related gene expression. Our data suggest that tumor expression of lncRNA TUG1 plays a critical role in CRC metastasis. TUG1 may have potential roles as a biomarker and/or a therapeutic target in colorectal cancer.

  12. Application of the verona coding definitions of emotional sequences (VR-CoDES) on a pediatric data set.

    Science.gov (United States)

    Vatne, Torun M; Finset, Arnstein; Ørnes, Knut; Ruland, Cornelia M

    2010-09-01

    Adult patients present concerns as defined in the Verona Coding Definitions of Emotional Sequences (VR-CoDES), but we do not know how children express their concerns during medical consultations. This study aimed to evaluate the applicability of VR-CoDES to pediatric oncology consultations. Twenty-eight pediatric consultations were coded with the Verona Coding Definitions of Emotional Sequences (VR-CoDES), and the material was also qualitatively analyzed for descriptive purposes. Five consultations were randomly selected for reliability testing and descriptive statistics were computed. Perfect inter-rater reliability for concerns and moderate reliability for cues were obtained. Cues and/or concerns were present in over half of the consultations. Cues were more frequent than concerns, with the majority of cues being verbal hints to hidden concerns or non-verbal cues. Intensity of expressions, limitations in vocabulary, commonality of statements, and complexity of the setting complicated the use of VR-CoDES. Child-specific cues; use of the imperative, cues about past experiences, and use of onomatopoeia were observed. Children with cancer express concerns during medical consultations. VR-CoDES is a reliable tool for coding concerns in pediatric data sets. For future applications in pediatric settings an appendix should be developed to incorporate the child-specific traits. Copyright (c) 2010 Elsevier Ireland Ltd. All rights reserved.

  13. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang

    2010-11-08

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution and a large number of publically-available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge.Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences.Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms.Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.

  14. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang; Yu, Jun

    2010-01-01

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution and a large number of publically-available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge.Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences.Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms.Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.

  15. Optical orthogonal code-division multiple-access system - Part 2: Multibits/sequence-period OOCDMA

    Science.gov (United States)

    Kwon, Hyuck M.

    1994-08-01

    In a recently proposed optical orthogonal code division multiple-access (OOCDMA) system, one bit of user's data is transmitted per sequence-period, and a threshold is employed for the final bit decision. In this paper, a system that can transmit multibits per sequence-period is introduced, and avalanche photodiode (APD) noise, thermal noise, and interference, are included. This system, derived by exploiting orthogonal properties of the OOCDMA code sequence and using a maximum search (instead of a threshold) in the final decision, is log(sub 2) F times higher in throughput, where F is sequence-period. For example, four orders of magnitude are better in bit error probability at - 56 dBW received laser power, with F = 1000 chips, 10 'marks' in a sequence, and 10 users of 30 Mb/s data rate for one-bit/sequence-period and 270 Mb/s data rate for multibits/sequence-period system. Furthermore, an exact analysis is performed for the log(sub 2)F bits/sequence-period system with a hard-limiter placed before the receiver, and its performance is compared to the performance without hard-limiter, for the chip-synchronous case. The improvement from using a hard-limiter is significant in the log(sub 2)F bits/sequence-period OCCDMA system.

  16. Pathoadaptation of a Human Pathogen Through Non-Coding Intergenic Mutations

    DEFF Research Database (Denmark)

    Khademi, Seyed Mohammad Hossein

    in CF adaptation of P. aeruginosa and their expressions are altered by the mutation. It was established that mutations upstream ampR increased tolerance of P. aeruginosa to some β-lactam antibiotics. Mutations in promoter of phuR, encoding receptor of pseudomonas heme uptake system, conferred growth...... advantage in the presence of hemoglobin demonstrating that P. aeruginosa has adapted towards utilization of iron from hemoglobin. Further investigation of phuR promoter mutation revealed pleiotropic effects on expression of many other genes. The pleiotropic effect by this mutation was contingent...

  17. Bistability in self-activating genes regulated by non-coding RNAs

    International Nuclear Information System (INIS)

    Miro-Bueno, Jesus

    2015-01-01

    Non-coding RNA molecules are able to regulate gene expression and play an essential role in cells. On the other hand, bistability is an important behaviour of genetic networks. Here, we propose and study an ODE model in order to show how non-coding RNA can produce bistability in a simple way. The model comprises a single gene with positive feedback that is repressed by non-coding RNA molecules. We show how the values of all the reaction rates involved in the model are able to control the transitions between the high and low states. This new model can be interesting to clarify the role of non-coding RNA molecules in genetic networks. As well, these results can be interesting in synthetic biology for developing new genetic memories and biomolecular devices based on non-coding RNAs

  18. SEAPATH: A microcomputer code for evaluating physical security effectiveness using adversary sequence diagrams

    International Nuclear Information System (INIS)

    Darby, J.L.

    1986-01-01

    The Adversary Sequence Diagram (ASD) concept was developed by Sandia National Laboratories (SNL) to examine physical security system effectiveness. Sandia also developed a mainframe computer code, PANL, to analyze the ASD. The authors have developed a microcomputer code, SEAPATH, which also analyzes ASD's. The Authors are supporting SNL in software development of the SAVI code; SAVI utilizes the SEAPATH algorithm to identify and quantify paths

  19. Kinetic models of gene expression including non-coding RNAs

    Energy Technology Data Exchange (ETDEWEB)

    Zhdanov, Vladimir P., E-mail: zhdanov@catalysis.r

    2011-03-15

    In cells, genes are transcribed into mRNAs, and the latter are translated into proteins. Due to the feedbacks between these processes, the kinetics of gene expression may be complex even in the simplest genetic networks. The corresponding models have already been reviewed in the literature. A new avenue in this field is related to the recognition that the conventional scenario of gene expression is fully applicable only to prokaryotes whose genomes consist of tightly packed protein-coding sequences. In eukaryotic cells, in contrast, such sequences are relatively rare, and the rest of the genome includes numerous transcript units representing non-coding RNAs (ncRNAs). During the past decade, it has become clear that such RNAs play a crucial role in gene expression and accordingly influence a multitude of cellular processes both in the normal state and during diseases. The numerous biological functions of ncRNAs are based primarily on their abilities to silence genes via pairing with a target mRNA and subsequently preventing its translation or facilitating degradation of the mRNA-ncRNA complex. Many other abilities of ncRNAs have been discovered as well. Our review is focused on the available kinetic models describing the mRNA, ncRNA and protein interplay. In particular, we systematically present the simplest models without kinetic feedbacks, models containing feedbacks and predicting bistability and oscillations in simple genetic networks, and models describing the effect of ncRNAs on complex genetic networks. Mathematically, the presentation is based primarily on temporal mean-field kinetic equations. The stochastic and spatio-temporal effects are also briefly discussed.

  20. Spliced DNA Sequences in the Paramecium Germline: Their Properties and Evolutionary Potential

    Science.gov (United States)

    Catania, Francesco; McGrath, Casey L.; Doak, Thomas G.; Lynch, Michael

    2013-01-01

    Despite playing a crucial role in germline-soma differentiation, the evolutionary significance of developmentally regulated genome rearrangements (DRGRs) has received scant attention. An example of DRGR is DNA splicing, a process that removes segments of DNA interrupting genic and/or intergenic sequences. Perhaps, best known for shaping immune-system genes in vertebrates, DNA splicing plays a central role in the life of ciliated protozoa, where thousands of germline DNA segments are eliminated after sexual reproduction to regenerate a functional somatic genome. Here, we identify and chronicle the properties of 5,286 sequences that putatively undergo DNA splicing (i.e., internal eliminated sequences [IESs]) across the genomes of three closely related species of the ciliate Paramecium (P. tetraurelia, P. biaurelia, and P. sexaurelia). The study reveals that these putative IESs share several physical characteristics. Although our results are consistent with excision events being largely conserved between species, episodes of differential IES retention/excision occur, may have a recent origin, and frequently involve coding regions. Our findings indicate interconversion between somatic—often coding—DNA sequences and noncoding IESs, and provide insights into the role of DNA splicing in creating potentially functional genetic innovation. PMID:23737328

  1. The Evolution of Bony Vertebrate Enhancers at Odds with Their Coding Sequence Landscape.

    Science.gov (United States)

    Yousaf, Aisha; Sohail Raza, Muhammad; Ali Abbasi, Amir

    2015-08-06

    Enhancers lie at the heart of transcriptional and developmental gene regulation. Therefore, changes in enhancer sequences usually disrupt the target gene expression and result in disease phenotypes. Despite the well-established role of enhancers in development and disease, evolutionary sequence studies are lacking. The current study attempts to unravel the puzzle of bony vertebrates' conserved noncoding elements (CNE) enhancer evolution. Bayesian phylogenetics of enhancer sequences spotlights promising interordinal relationships among placental mammals, proposing a closer relationship between humans and laurasiatherians while placing rodents at the basal position. Clock-based estimates of enhancer evolution provided a dynamic picture of interspecific rate changes across the bony vertebrate lineage. Moreover, coelacanth in the study augmented our appreciation of the vertebrate cis-regulatory evolution during water-land transition. Intriguingly, we observed a pronounced upsurge in enhancer evolution in land-dwelling vertebrates. These novel findings triggered us to further investigate the evolutionary trend of coding as well as CNE nonenhancer repertoires, to highlight the relative evolutionary dynamics of diverse genomic landscapes. Surprisingly, the evolutionary rates of enhancer sequences were clearly at odds with those of the coding and the CNE nonenhancer sequences during vertebrate adaptation to land, with land vertebrates exhibiting significantly reduced rates of coding sequence evolution in comparison to their fast evolving regulatory landscape. The observed variation in tetrapod cis-regulatory elements caused the fine-tuning of associated gene regulatory networks. Therefore, the increased evolutionary rate of tetrapods' enhancer sequences might be responsible for the variation in developmental regulatory circuits during the process of vertebrate adaptation to land. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for

  2. Weight Distribution for Non-binary Cluster LDPC Code Ensemble

    Science.gov (United States)

    Nozaki, Takayuki; Maehara, Masaki; Kasai, Kenta; Sakaniwa, Kohichi

    In this paper, we derive the average weight distributions for the irregular non-binary cluster low-density parity-check (LDPC) code ensembles. Moreover, we give the exponential growth rate of the average weight distribution in the limit of large code length. We show that there exist $(2,d_c)$-regular non-binary cluster LDPC code ensembles whose normalized typical minimum distances are strictly positive.

  3. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    Abstract: Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction....... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early...... upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other....

  4. Short-lived non-coding transcripts (SLiTs): Clues to regulatory long non-coding RNA.

    Science.gov (United States)

    Tani, Hidenori

    2017-03-22

    Whole transcriptome analyses have revealed a large number of novel long non-coding RNAs (lncRNAs). Although the importance of lncRNAs has been documented in previous reports, the biological and physiological functions of lncRNAs remain largely unknown. The role of lncRNAs seems an elusive problem. Here, I propose a clue to the identification of regulatory lncRNAs. The key point is RNA half-life. RNAs with a long half-life (t 1/2 > 4 h) contain a significant proportion of ncRNAs, as well as mRNAs involved in housekeeping functions, whereas RNAs with a short half-life (t 1/2 regulatory ncRNAs and regulatory mRNAs. This novel class of ncRNAs with a short half-life can be categorized as Short-Lived non-coding Transcripts (SLiTs). I consider that SLiTs are likely to be rich in functionally uncharacterized regulatory RNAs. This review describes recent progress in research into SLiTs.

  5. Statistical properties and fractals of nucleotide clusters in DNA sequences

    International Nuclear Information System (INIS)

    Sun Tingting; Zhang Linxi; Chen Jin; Jiang Zhouting

    2004-01-01

    Statistical properties of nucleotide clusters in DNA sequences and their fractals are investigated in this paper. The average size of nucleotide clusters in non-coding sequence is larger than that in coding sequence. We investigate the cluster-size distribution P(S) for human chromosomes 21 and 22, and the results are different from previous works. The cluster-size distribution P(S 1 +S 2 ) with the total size of sequential Pu-cluster and Py-cluster S 1 +S 2 is studied. We observe that P(S 1 +S 2 ) follows an exponential decay both in coding and non-coding sequences. However, we get different results for human chromosomes 21 and 22. The probability distribution P(S 1 ,S 2 ) of nucleotide clusters with the size of sequential Pu-cluster and Py-cluster S 1 and S 2 respectively, is also examined. In the meantime, some of the linear correlations are obtained in the double logarithmic plots of the fluctuation F(l) versus nucleotide cluster distance l along the DNA chain. The power spectrums of nucleotide clusters are also discussed, and it is concluded that the curves are flat and hardly changed and the 1/3 frequency is neither observed in coding sequence nor in non-coding sequence. These investigations can provide some insights into the nucleotide clusters of DNA sequences

  6. Effects of near-ultraviolet light on mutations, intragenic and intergenic recombinations in Saccharomyces cerevisiae

    International Nuclear Information System (INIS)

    Machida, Isamu; Saeki, Tetsuya; Nakai, Sayaka

    1986-01-01

    The effects of far and near ultraviolet light on mutations, intragenic and intergenic recombinations were compared in diploid strains of Saccharomyces cerevisiae. At equivalent survival levels there was not much difference in the induction of nonsense and missense mutations between far- and near-UV radiations. However, frameshift mutations were induced more frequently by near-UV than by far-UV radiation. Near-UV radiation induced intragenic recombination as efficiently as far-UV radiation. A strikingly higher frequency was observed for the intergenic recombination induced by near-UV radiation than by far-UV radiation when compared at equivalent survival levels. Photoreactivation reduced the frequency only slightly in far-UV induced intergenic recombination and not at all in near-UV induction. These results indicate that near-UV damage involves strand breakage in addition to pyrimidine dimers and other lesions induced, whereas far-UV damage consists largely of photoreactivable lesions, pyrimidine dimers, and near-UV induced damage is more efficient for the induction of crossing-over. (Auth.)

  7. The long non-coding RNA HOTAIR indicates a poor prognosis and promotes metastasis in non-small cell lung cancer

    International Nuclear Information System (INIS)

    Liu, Xiang-hua; Liu, Zhi-li; Sun, Ming; Liu, Jing; Wang, Zhao-xia; De, Wei

    2013-01-01

    The identification of cancer-associated long non-coding RNAs and the investigation of their molecular and biological functions are important for understanding the molecular biology and progression of cancer. HOTAIR (HOX transcript antisense intergenic RNA) has been implicated in several cancers; however, its role in non-small cell lung cancer (NSCLC) is unknown. The aim of the present study was to examine the expression pattern of HOTAIR in NSCLC and to evaluate its biological role and clinical significance in tumor progression. Expression of HOTAIR was analyzed in 42 NSCLC tissues and four NSCLC cell lines by quantitative reverse-transcription polymerase chain reaction (qRT-PCR). Over-expression and RNA interference (RNAi) approaches were used to investigate the biological functions of HOTAIR. The effect of HOTAIR on proliferation was evaluated by MTT and colony formation assays, and cell migration and invasion were evaluated by transwell assays. Tail vein injection of cells was used to study metastasis in nude mice. Protein levels of HOTAIR targets were determined by western blot analysis. Differences between groups were tested for significance using Student’s t-test (two-tailed). HOTAIR was highly expressed both in NSCLC samples and cell lines compared with corresponding normal counterparts. HOTAIR upregulation was correlated with NSCLC advanced pathological stage and lymph-node metastasis. Moreover, patients with high levels of HOTAIR expression had a relatively poor prognosis. Inhibition of HOTAIR by RNAi decreased the migration and invasion of NSCLC cells in vitro and impeded cell metastasis in vivo. HOXA5 levels were affected by HOTAIR knockdown or over-expression in vitro. Our findings indicate that HOTAIR is significantly up-regulated in NSCLC tissues, and regulates NSCLC cell invasion and metastasis, partially via the down-regulation of HOXA5. Thus, HOTAIR may represent a new marker of poor prognosis and is a potential therapeutic target for NSCLC

  8. Binary Linear-Time Erasure Decoding for Non-Binary LDPC codes

    OpenAIRE

    Savin, Valentin

    2009-01-01

    In this paper, we first introduce the extended binary representation of non-binary codes, which corresponds to a covering graph of the bipartite graph associated with the non-binary code. Then we show that non-binary codewords correspond to binary codewords of the extended representation that further satisfy some simplex-constraint: that is, bits lying over the same symbol-node of the non-binary graph must form a codeword of a simplex code. Applied to the binary erasure channel, this descript...

  9. Linear-Time Non-Malleable Codes in the Bit-Wise Independent Tampering Model

    DEFF Research Database (Denmark)

    Cramer, Ronald; Damgård, Ivan Bjerre; Döttling, Nico

    Non-malleable codes were introduced by Dziembowski et al. (ICS 2010) as coding schemes that protect a message against tampering attacks. Roughly speaking, a code is non-malleable if decoding an adversarially tampered encoding of a message m produces the original message m or a value m' (eventuall...... non-malleable codes of Agrawal et al. (TCC 2015) and of Cher- aghchi and Guruswami (TCC 2014) and improves the previous result in the bit-wise tampering model: it builds the first non-malleable codes with linear-time complexity and optimal-rate (i.e. rate 1 - o(1)).......Non-malleable codes were introduced by Dziembowski et al. (ICS 2010) as coding schemes that protect a message against tampering attacks. Roughly speaking, a code is non-malleable if decoding an adversarially tampered encoding of a message m produces the original message m or a value m' (eventually...... abort) completely unrelated with m. It is known that non-malleability is possible only for restricted classes of tampering functions. Since their introduction, a long line of works has established feasibility results of non-malleable codes against different families of tampering functions. However...

  10. Non-Coding RNAs in Arabidopsis

    DEFF Research Database (Denmark)

    van Wonterghem, Miranda

    This work evolves around elucidating the mechanisms of micro RNAs (miRNAs) in Arabidopsis thaliana. I identified a new class of nuclear non-coding RNAs derived from protein coding genes. The genes are miRNA targets with extensive gene body methylation. The RNA species are nuclear localized and de...

  11. Non-contiguous finished genome sequence and description of Streptococcus varani sp. nov.

    Directory of Open Access Journals (Sweden)

    S. Bakour

    2016-05-01

    Full Text Available Strain FF10T (= CSUR P1489 = DSM 100884 was isolated from the oral cavity of a lizard (Varanus niloticus in Dakar, Senegal. Here we used a polyphasic study including phenotypic and genomic analyses to describe the strain FF10T. Results support strain FF10T being a Gram-positive coccus, facultative anaerobic bacterium, catalase-negative, non-motile and non-spore forming. The sequenced genome counts 2.46 Mb with one chromosome but no plasmid. It exhibits a G+C content of 40.4% and contains 2471 protein-coding and 45 RNA genes. On the basis of these data, we propose the creation of Streptococcus varani sp. nov.

  12. New technologies accelerate the exploration of non-coding RNAs in horticultural plants

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Degao; Mewalal, Ritesh; Hu, Rongbin; Tuskan, Gerald A.; Yang, Xiaohan

    2017-07-05

    Non-coding RNAs (ncRNAs), that is, RNAs not translated into proteins, are crucial regulators of a variety of biological processes in plants. While protein-encoding genes have been relatively well-annotated in sequenced genomes, accounting for a small portion of the genome space in plants, the universe of plant ncRNAs is rapidly expanding. Recent advances in experimental and computational technologies have generated a great momentum for discovery and functional characterization of ncRNAs. Here we summarize the classification and known biological functions of plant ncRNAs, review the application of next-generation sequencing (NGS) technology and ribosome profiling technology to ncRNA discovery in horticultural plants and discuss the application of new technologies, especially the new genome-editing tool clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated protein 9 (Cas9) systems, to functional characterization of plant ncRNAs.

  13. The identification of two Trypanosoma cruzi I genotypes from domestic and sylvatic transmission cycles in Colombia based on a single polymerase chain reaction amplification of the spliced-leader intergenic region

    Directory of Open Access Journals (Sweden)

    Lina Marcela Villa

    2013-11-01

    Full Text Available A single polymerase chain reaction (PCR reaction targeting the spliced-leader intergenic region of Trypanosoma cruzi I was standardised by amplifying a 231 bp fragment in domestic (TcIDOM strains or clones and 450 and 550 bp fragments in sylvatic strains or clones. This reaction was validated using 44 blind coded samples and 184 non-coded T. cruzi I clones isolated from sylvatic triatomines and the correspondence between the amplified fragments and their domestic or sylvatic origin was determined. Six of the nine strains isolated from acute cases suspected of oral infection had the sylvatic T. cruzi I profile. These results confirmed that the sylvatic T. cruzi I genotype is linked to cases of oral Chagas disease in Colombia. We therefore propose the use of this novel PCR reaction in strains or clones previously characterised as T. cruzi I to distinguish TcIDOMfrom sylvatic genotypes in studies of transmission dynamics, including the verification of population selection within hosts or detection of the frequency of mixed infections by both T. cruzi I genotypes in Colombia.

  14. Sequencing, Characterization, and Comparative Analyses of the Plastome of Caragana rosea var. rosea

    Directory of Open Access Journals (Sweden)

    Mei Jiang

    2018-05-01

    Full Text Available To exploit the drought-resistant Caragana species, we performed a comparative study of the plastomes from four species: Caragana rosea, C. microphylla, C. kozlowii, and C. Korshinskii. The complete plastome sequence of the C. rosea was obtained using the next generation DNA sequencing technology. The genome is a circular structure of 133,122 bases and it lacks inverted repeat. It contains 111 unique genes, including 76 protein-coding, 30 tRNA, and four rRNA genes. Repeat analyses obtained 239, 244, 258, and 246 simple sequence repeats in C. rosea, C. microphylla, C. kozlowii, and C. korshinskii, respectively. Analyses of sequence divergence found two intergenic regions: trnI-CAU-ycf2 and trnN-GUU-ycf1, exhibiting a high degree of variations. Phylogenetic analyses showed that the four Caragana species belong to a monophyletic clade. Analyses of Ka/Ks ratios revealed that five genes: rpl16, rpl20, rps11, rps7, and ycf1 and several sites having undergone strong positive selection in the Caragana branch. The results lay the foundation for the development of molecular markers and the understanding of the evolutionary process for drought-resistant characteristics.

  15. Non-coding genomic regions possessing enhancer and silencer potential are associated with healthy aging and exceptional survival.

    Science.gov (United States)

    Kim, Sangkyu; Welsh, David A; Myers, Leann; Cherry, Katie E; Wyckoff, Jennifer; Jazwinski, S Michal

    2015-02-28

    We have completed a genome-wide linkage scan for healthy aging using data collected from a family study, followed by fine-mapping by association in a separate population, the first such attempt reported. The family cohort consisted of parents of age 90 or above and their children ranging in age from 50 to 80. As a quantitative measure of healthy aging, we used a frailty index, called FI34, based on 34 health and function variables. The linkage scan found a single significant linkage peak on chromosome 12. Using an independent cohort of unrelated nonagenarians, we carried out a fine-scale association mapping of the region suggestive of linkage and identified three sites associated with healthy aging. These healthy-aging sites (HASs) are located in intergenic regions at 12q13-14. HAS-1 has been previously associated with multiple diseases, and an enhancer was recently mapped and experimentally validated within the site. HAS-2 is a previously uncharacterized site possessing genomic features suggestive of enhancer activity. HAS-3 contains features associated with Polycomb repression. The HASs also contain variants associated with exceptional longevity, based on a separate analysis. Our results provide insight into functional genomic networks involving non-coding regulatory elements that are involved in healthy aging and longevity.

  16. Local repeat sequence organization of an intergenic spacer in the ...

    Indian Academy of Sciences (India)

    Unknown

    chloroplast genome of Chlamydomonas reinhardtii leads to DNA expansion and sequence ... The discovery of uniparentally inherited streptomycin resistant mutants ... resembles yeast, mitochondrial and phage recombination in that it is typically ...... Sager R and Lane D 1972 Molecular basis of maternal inheritance; Proc.

  17. Multiple Access Interference Reduction Using Received Response Code Sequence for DS-CDMA UWB System

    Science.gov (United States)

    Toh, Keat Beng; Tachikawa, Shin'ichi

    This paper proposes a combination of novel Received Response (RR) sequence at the transmitter and a Matched Filter-RAKE (MF-RAKE) combining scheme receiver system for the Direct Sequence-Code Division Multiple Access Ultra Wideband (DS-CDMA UWB) multipath channel model. This paper also demonstrates the effectiveness of the RR sequence in Multiple Access Interference (MAI) reduction for the DS-CDMA UWB system. It suggests that by using conventional binary code sequence such as the M sequence or the Gold sequence, there is a possibility of generating extra MAI in the UWB system. Therefore, it is quite difficult to collect the energy efficiently although the RAKE reception method is applied at the receiver. The main purpose of the proposed system is to overcome the performance degradation for UWB transmission due to the occurrence of MAI during multiple accessing in the DS-CDMA UWB system. The proposed system improves the system performance by improving the RAKE reception performance using the RR sequence which can reduce the MAI effect significantly. Simulation results verify that significant improvement can be obtained by the proposed system in the UWB multipath channel models.

  18. MicroRNA-encoding long non-coding RNAs

    Directory of Open Access Journals (Sweden)

    Zhu Xiaopeng

    2008-05-01

    Full Text Available Abstract Background Recent analysis of the mouse transcriptional data has revealed the existence of ~34,000 messenger-like non-coding RNAs (ml-ncRNAs. Whereas the functional properties of these ml-ncRNAs are beginning to be unravelled, no functional information is available for the large majority of these transcripts. Results A few ml-ncRNA have been shown to have genomic loci that overlap with microRNA loci, leading us to suspect that a fraction of ml-ncRNA may encode microRNAs. We therefore developed an algorithm (PriMir for specifically detecting potential microRNA-encoding transcripts in the entire set of 34,030 mouse full-length ml-ncRNAs. In combination with mouse-rat sequence conservation, this algorithm detected 97 (80 of them were novel strong miRNA-encoding candidates, and for 52 of these we obtained experimental evidence for the existence of their corresponding mature microRNA by microarray and stem-loop RT-PCR. Sequence analysis of the microRNA-encoding RNAs revealed an internal motif, whose presence correlates strongly (R2 = 0.9, P-value = 2.2 × 10-16 with the occurrence of stem-loops with characteristics of known pre-miRNAs, indicating the presence of a larger number microRNA-encoding RNAs (from 300 up to 800 in the ml-ncRNAs population. Conclusion Our work highlights a unique group of ml-ncRNAs and offers clues to their functions.

  19. Comparative genomics beyond sequence-based alignments

    DEFF Research Database (Denmark)

    Þórarinsson, Elfar; Yao, Zizhen; Wiklund, Eric D.

    2008-01-01

    Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment me...

  20. Code of practice for safety in laboratory - non ionising radiation

    International Nuclear Information System (INIS)

    Ramli Jaya; Mohd Yusof Mohd Ali; Khoo Boo Huat; Khatijah Hashim

    1995-01-01

    The code identifies the non-ionizing radiation encountered in laboratories and the associated hazards. The code is intended as a laboratory standard reference document for general information on safety requirements relating to the usage of non-ionizing radiations in laboratories. The nonionizing radiations cover in this code, namely, are ultraviolet radiation, visible light, radio-frequency radiation, lasers, sound waves and ultrasonic radiation. (author)

  1. Non-contiguous finished genome sequence and description of Collinsella massiliensis sp. nov.

    Science.gov (United States)

    Padmanabhan, Roshan; Dubourg, Gregory; Nguyen, Thi-Thien; Couderc, Carine; Rossi-Tamisier, Morgane; Caputo, Aurelia; Raoult, Didier; Fournier, Pierre-Edouard

    2014-06-15

    Collinsella massiliensis strain GD3(T) is the type strain of Collinsella massiliensis sp. nov., a new species within the genus Collinsella. This strain, whose genome is described here, was isolated from the fecal flora of a 53-year-old French Caucasoid woman who had been admitted to intensive care unit for Guillain-Barré syndrome. Collinsella massiliensis is a Gram-positive, obligate anaerobic, non motile and non sporulating bacillus. Here, we describe the features of this organism, together with the complete genome sequence and annotation. The genome is 2,319,586 bp long (1 chromosome, no plasmid), exhibits a G+C content of 65.8% and contains 2,003 protein-coding and 54 RNA genes, including 1 rRNA operon.

  2. Development of non-linear vibration analysis code for CANDU fuelling machine

    International Nuclear Information System (INIS)

    Murakami, Hajime; Hirai, Takeshi; Horikoshi, Kiyomi; Mizukoshi, Kaoru; Takenaka, Yasuo; Suzuki, Norio.

    1988-01-01

    This paper describes the development of a non-linear, dynamic analysis code for the CANDU 600 fuelling machine (F-M), which includes a number of non-linearities such as gap with or without Coulomb friction, special multi-linear spring connections, etc. The capabilities and features of the code and the mathematical treatment for the non-linearities are explained. The modeling and numerical methodology for the non-linearities employed in the code are verified experimentally. Finally, the simulation analyses for the full-scale F-M vibration testing are carried out, and the applicability of the code to such multi-degree of freedom systems as F-M is demonstrated. (author)

  3. Non-Coding Transcript Heterogeneity in Mesothelioma: Insights from Asbestos-Exposed Mice.

    Science.gov (United States)

    Felley-Bosco, Emanuela; Rehrauer, Hubert

    2018-04-11

    Mesothelioma is an aggressive, rapidly fatal cancer and a better understanding of its molecular heterogeneity may help with making more efficient therapeutic strategies. Non-coding RNAs represent a larger part of the transcriptome but their contribution to diseases is not fully understood yet. We used recently obtained RNA-seq data from asbestos-exposed mice and performed data mining of publicly available datasets in order to evaluate how non-coding RNA contribute to mesothelioma heterogeneity. Nine non-coding RNAs are specifically elevated in mesothelioma tumors and contribute to human mesothelioma heterogeneity. Because some of them have known oncogenic properties, this study supports the concept of non-coding RNAs as cancer progenitor genes.

  4. The evolutionary landscape of intergenic trans-splicing events in insects

    Science.gov (United States)

    Kong, Yimeng; Zhou, Hongxia; Yu, Yao; Chen, Longxian; Hao, Pei; Li, Xuan

    2015-01-01

    To explore the landscape of intergenic trans-splicing events and characterize their functions and evolutionary dynamics, we conduct a mega-data study of a phylogeny containing eight species across five orders of class Insecta, a model system spanning 400 million years of evolution. A total of 1,627 trans-splicing events involving 2,199 genes are identified, accounting for 1.58% of the total genes. Homology analysis reveals that mod(mdg4)-like trans-splicing is the only conserved event that is consistently observed in multiple species across two orders, which represents a unique case of functional diversification involving trans-splicing. Thus, evolutionarily its potential for generating proteins with novel function is not broadly utilized by insects. Furthermore, 146 non-mod trans-spliced transcripts are found to resemble canonical genes from different species. Trans-splicing preserving the function of ‘breakup' genes may serve as a general mechanism for relaxing the constraints on gene structure, with profound implications for the evolution of genes and genomes. PMID:26521696

  5. [Transposition errors during learning to reproduce a sequence by the right- and the left-hand movements: simulation of positional and movement coding].

    Science.gov (United States)

    Liakhovetskiĭ, V A; Bobrova, E V; Skopin, G N

    2012-01-01

    Transposition errors during the reproduction of a hand movement sequence make it possible to receive important information on the internal representation of this sequence in the motor working memory. Analysis of such errors showed that learning to reproduce sequences of the left-hand movements improves the system of positional coding (coding ofpositions), while learning of the right-hand movements improves the system of vector coding (coding of movements). Learning of the right-hand movements after the left-hand performance involved the system of positional coding "imposed" by the left hand. Learning of the left-hand movements after the right-hand performance activated the system of vector coding. Transposition errors during learning to reproduce movement sequences can be explained by neural network using either vector coding or both vector and positional coding.

  6. Verona Coding Definitions of Emotional Sequences (VR-CoDES): Conceptual framework and future directions.

    Science.gov (United States)

    Piccolo, Lidia Del; Finset, Arnstein; Mellblom, Anneli V; Figueiredo-Braga, Margarida; Korsvold, Live; Zhou, Yuefang; Zimmermann, Christa; Humphris, Gerald

    2017-12-01

    To discuss the theoretical and empirical framework of VR-CoDES and potential future direction in research based on the coding system. The paper is based on selective review of papers relevant to the construction and application of VR-CoDES. VR-CoDES system is rooted in patient-centered and biopsychosocial model of healthcare consultations and on a functional approach to emotion theory. According to the VR-CoDES, emotional interaction is studied in terms of sequences consisting of an eliciting event, an emotional expression by the patient and the immediate response by the clinician. The rationale for the emphasis on sequences, on detailed classification of cues and concerns, and on the choices of explicit vs. non-explicit responses and providing vs. reducing room for further disclosure, as basic categories of the clinician responses, is described. Results from research on VR-CoDES may help raise awareness of emotional sequences. Future directions in applying VR-CoDES in research may include studies on predicting patient and clinician behavior within the consultation, qualitative analyses of longer sequences including several VR-CoDES triads, and studies of effects of emotional communication on health outcomes. VR-CoDES may be applied to develop interventions to promote good handling of patients' emotions in healthcare encounters. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. lncRScan-SVM: A Tool for Predicting Long Non-Coding RNAs Using Support Vector Machine.

    Science.gov (United States)

    Sun, Lei; Liu, Hui; Zhang, Lin; Meng, Jia

    2015-01-01

    Functional long non-coding RNAs (lncRNAs) have been bringing novel insight into biological study, however it is still not trivial to accurately distinguish the lncRNA transcripts (LNCTs) from the protein coding ones (PCTs). As various information and data about lncRNAs are preserved by previous studies, it is appealing to develop novel methods to identify the lncRNAs more accurately. Our method lncRScan-SVM aims at classifying PCTs and LNCTs using support vector machine (SVM). The gold-standard datasets for lncRScan-SVM model training, lncRNA prediction and method comparison were constructed according to the GENCODE gene annotations of human and mouse respectively. By integrating features derived from gene structure, transcript sequence, potential codon sequence and conservation, lncRScan-SVM outperforms other approaches, which is evaluated by several criteria such as sensitivity, specificity, accuracy, Matthews correlation coefficient (MCC) and area under curve (AUC). In addition, several known human lncRNA datasets were assessed using lncRScan-SVM. LncRScan-SVM is an efficient tool for predicting the lncRNAs, and it is quite useful for current lncRNA study.

  8. Coding chaotic billiards. Pt. 3

    International Nuclear Information System (INIS)

    Ullmo, D.; Giannoni, M.J.

    1993-01-01

    Non-tiling compact billiard defined on the pseudosphere is studied 'a la Morse coding'. As for most bounded systems, the coding is non exact. However, two sets of approximate grammar rules can be obtained, one specifying forbidden codes, and the other allowed ones. In-between some sequences remain in the 'unknown' zone, but their relative amount can be reduced to zero if one lets the length of the approximate grammar rules goes to infinity. The relationship between these approximate grammar rules and the 'pruning front' introduced by Cvitanovic et al. is discussed. (authors). 13 refs., 10 figs., 1 tab

  9. Analysis of the AD sequence in Zion plant using the March 1.1 code

    International Nuclear Information System (INIS)

    Oriolo, F.; Paci, S.

    1985-01-01

    The analyses of the AD sequences for the Zion power plant, made at the Pisa University, in the framework of the participation in the Source Tern Working Group. After a short description of the plant and the sequence under analysis, the model used for the reference computation and the results obtained using the March 1.1 code are shown. Together with the reference computation a series of parametric tests have been also made, concerning some input code variables, in order to ascertain their influence on the transient trend. The results of these analyses are shown in Appendix

  10. Min-Max decoding for non binary LDPC codes

    OpenAIRE

    Savin, Valentin

    2008-01-01

    Iterative decoding of non-binary LDPC codes is currently performed using either the Sum-Product or the Min-Sum algorithms or slightly different versions of them. In this paper, several low-complexity quasi-optimal iterative algorithms are proposed for decoding non-binary codes. The Min-Max algorithm is one of them and it has the benefit of two possible LLR domain implementations: a standard implementation, whose complexity scales as the square of the Galois field's cardinality and a reduced c...

  11. Sequence Coding and Search System for licensee event reports: coder's manual. Volume 4

    International Nuclear Information System (INIS)

    Gallaher, R.B.; Guymon, R.H.; Mays, G.T.; Poore, W.P.; Cagle, R.J.; Harrington, K.H.; Johnson, M.P.

    1985-04-01

    Operating experience data from nuclear power plants are essential for safety and reliability analyses, especially analyses of trends and patterns. The licensee event reports (LERs) that are submitted to the Nuclear Regulatory Commission (NRC) by the nuclear power plant utilities contain much of this data. The NRC's Office for Analysis and Evaluation of Operational Data (AEOD) has developed, under contract with NSIC, a system for codifying the events reported in the LERs. The primary objective of the Sequence Coding and Search System (SCSS) is to reduce the descriptive text of the LERs to coded sequences that are both computer-readable and computer-searchable. This four volume report documents and describes SCSS in detail. Volume 3 and 4 provide a technical processor, new to SCSS, the information and methodology necessary to capture descriptive data from the LER and to codify that data into a structured format and serve as reference material for the more experienced technical processor, and contains information that is essential for the more advanced user who needs to be familiar with the intricate coding techniques in order to retrieve specific details in a sequence. This volume contains updated material through amendment 1 to revision 1 of the working version of ORNL/NSIC-223, Vol. 4

  12. Sequence Coding and Search System for licensee event reports: coder's manual. Volume 3

    International Nuclear Information System (INIS)

    Gallaher, R.B.; Guymon, R.H.; Mays, G.T.; Poore, W.P.; Cagle, R.J.; Harrington, K.H.; Johnson, M.P.

    1985-04-01

    Operating experience data from nuclear power plants are essential for safety and reliability analyses, especially analyses of trends and patterns. The licensee event reports (LERs) that are submitted to the Nuclear Regulatory Commission (NRC) by the nuclear power plant utilities contain much of this data. The NRC's Office for Analysis and Evaluation of Operational Data (AEOD) has developed, under contract with NSIC, a system for codifying the events reported in the LERs. The primary objective of the Sequence Coding and Search System (SCSS) is to reduce the descriptive text of the LERs to coded sequences that are both computer-readable and computer-searchable. This four volume report documents and describes SCSS in detail. Volumes 3 and 4 provide a technical processor, new to SCSS, the information and methodology necessary to capture descriptive data from the LER and to codify that data into a structured format and serve as reference material for the more experienced technical processor, and contains information is essential for the more advanced user who needs to be familiar with the intricate coding techniques in order to retrieve specific details in a sequence. This volume contains updated material through amendment 1 to revision 1 of the working version of ORNL/NSIC-223, Vol. 3

  13. Sequence Coding and Search System Backfit Quality Assurance Program Plan

    International Nuclear Information System (INIS)

    Lovell, C.J.; Stepina, P.L.

    1985-03-01

    The Sequence Coding and Search System is a computer-based encoding system for events described in Licensee Event Reports. This data system contains LERs from 1981 to present. Backfit of the data system to include LERs prior to 1981 is required. This report documents the Quality Assurance Program Plan that EG and G Idaho, Inc. will follow while encoding 1980 LERs

  14. Mitochondrial genome evolution in the Saccharomyces sensu stricto complex.

    Science.gov (United States)

    Ruan, Jiangxing; Cheng, Jian; Zhang, Tongcun; Jiang, Huifeng

    2017-01-01

    Exploring the evolutionary patterns of mitochondrial genomes is important for our understanding of the Saccharomyces sensu stricto (SSS) group, which is a model system for genomic evolution and ecological analysis. In this study, we first obtained the complete mitochondrial sequences of two important species, Saccharomyces mikatae and Saccharomyces kudriavzevii. We then compared the mitochondrial genomes in the SSS group with those of close relatives, and found that the non-coding regions evolved rapidly, including dramatic expansion of intergenic regions, fast evolution of introns and almost 20-fold higher rearrangement rates than those of the nuclear genomes. However, the coding regions, and especially the protein-coding genes, are more conserved than those in the nuclear genomes of the SSS group. The different evolutionary patterns of coding and non-coding regions in the mitochondrial and nuclear genomes may be related to the origin of the aerobic fermentation lifestyle in this group. Our analysis thus provides novel insights into the evolution of mitochondrial genomes.

  15. ASAP: Amplification, sequencing & annotation of plastomes

    Directory of Open Access Journals (Sweden)

    Folta Kevin M

    2005-12-01

    Full Text Available Abstract Background Availability of DNA sequence information is vital for pursuing structural, functional and comparative genomics studies in plastids. Traditionally, the first step in mining the valuable information within a chloroplast genome requires sequencing a chloroplast plasmid library or BAC clones. These activities involve complicated preparatory procedures like chloroplast DNA isolation or identification of the appropriate BAC clones to be sequenced. Rolling circle amplification (RCA is being used currently to amplify the chloroplast genome from purified chloroplast DNA and the resulting products are sheared and cloned prior to sequencing. Herein we present a universal high-throughput, rapid PCR-based technique to amplify, sequence and assemble plastid genome sequence from diverse species in a short time and at reasonable cost from total plant DNA, using the large inverted repeat region from strawberry and peach as proof of concept. The method exploits the highly conserved coding regions or intergenic regions of plastid genes. Using an informatics approach, chloroplast DNA sequence information from 5 available eudicot plastomes was aligned to identify the most conserved regions. Cognate primer pairs were then designed to generate ~1 – 1.2 kb overlapping amplicons from the inverted repeat region in 14 diverse genera. Results 100% coverage of the inverted repeat region was obtained from Arabidopsis, tobacco, orange, strawberry, peach, lettuce, tomato and Amaranthus. Over 80% coverage was obtained from distant species, including Ginkgo, loblolly pine and Equisetum. Sequence from the inverted repeat region of strawberry and peach plastome was obtained, annotated and analyzed. Additionally, a polymorphic region identified from gel electrophoresis was sequenced from tomato and Amaranthus. Sequence analysis revealed large deletions in these species relative to tobacco plastome thus exhibiting the utility of this method for structural and

  16. Modelling of blackout sequence at Atucha-1 using the MARCH3 code

    International Nuclear Information System (INIS)

    Baron, J.; Bastianelli, B.

    1997-01-01

    This paper presents the modelling of a complete blackout at the Atucha-1 NPP as preliminary phase for a Level II safety probabilistic analysis. The MARCH3 code of the STCP (Source Term Code Package) is used, based on a plant model made in accordance with particularities of the plant design. The analysis covers all the severe accident phases. The results allow to view the time sequence of the events, and provide the basis for source term studies. (author). 6 refs., 2 figs

  17. Functional long non-coding RNA transcription in Schizosaccharomyces pombe

    OpenAIRE

    Ard, Ryan Anthony

    2016-01-01

    Eukaryotic genomes are pervasively transcribed and frequently generate long noncoding RNAs (lncRNAs). However, most lncRNAs remain uncharacterized. In this work, a set of positionally conserved intergenic lncRNAs in the fission yeast Schizosaccharomyces pombe genome are selected for further analysis. Deleting one of these lncRNA genes (ncRNA.1343) exhibited a clear phenotype: increased drug sensitivity. Further analyses revealed that deleting ncRNA.1343 also disrupted a prev...

  18. Non-contiguous finished genome sequence of the opportunistic oral pathogen Prevotella multisaccharivorax type strain (PPPA20T)

    Energy Technology Data Exchange (ETDEWEB)

    Pati, Amrita [U.S. Department of Energy, Joint Genome Institute; Gronow, Sabine [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Lu, Megan [Los Alamos National Laboratory (LANL); Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Nolan, Matt [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Hammon, Nancy [U.S. Department of Energy, Joint Genome Institute; Deshpande, Shweta [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Tapia, Roxanne [Los Alamos National Laboratory (LANL); Han, Cliff [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Liolios, Konstantinos [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Mavromatis, K [U.S. Department of Energy, Joint Genome Institute; Mikhailova, Natalia [U.S. Department of Energy, Joint Genome Institute; Huntemann, Marcel [U.S. Department of Energy, Joint Genome Institute; Chen, Amy [U.S. Department of Energy, Joint Genome Institute; Palaniappan, Krishna [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Brambilla, Evelyne-Marie [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Rohde, Manfred [HZI - Helmholtz Centre for Infection Research, Braunschweig, Germany; Goker, Markus [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Bristow, James [U.S. Department of Energy, Joint Genome Institute; Eisen, Jonathan [U.S. Department of Energy, Joint Genome Institute; Markowitz, Victor [U.S. Department of Energy, Joint Genome Institute; Hugenholtz, Philip [U.S. Department of Energy, Joint Genome Institute; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Prevotella multisaccharivorax Sakamoto et al. 2005 is a species of the large genus Prevotella, which belongs to the family Prevotellaceae. The species is of medical interest because its members are able to cause diseases in the human oral cavity such as periodontitis, root caries and others. Although 77 Prevotella genomes have already been sequenced or are targeted for sequencing, this is only the second completed genome sequence of a type strain of a species within the genus Prevotella to be published. The 3,388,644 bp long genome is assembled in three non-contiguous contigs, harbors 2,876 protein-coding and 75 RNA genes and is a part of the Genomic Encyclopedia of Bacteria and Archaea project.

  19. Complete coding sequence of Zika virus from Martinique outbreak in 2015

    Directory of Open Access Journals (Sweden)

    G. Piorkowski

    2016-05-01

    Full Text Available Zika virus is an Aedes-borne Flavivirus causing fever, arthralgia, myalgia rash, associated with Guillain–Barré syndrome and suspected to induce microcephaly in the fetus. We report here the complete coding sequence of the first characterized Caribbean Zika virus strain, isolated from a patient from Martinique in December, 2015.

  20. Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

    Science.gov (United States)

    Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

    2009-01-01

    Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593

  1. Classifying Coding DNA with Nucleotide Statistics

    Directory of Open Access Journals (Sweden)

    Nicolas Carels

    2009-10-01

    Full Text Available In this report, we compared the success rate of classification of coding sequences (CDS vs. introns by Codon Structure Factor (CSF and by a method that we called Universal Feature Method (UFM. UFM is based on the scoring of purine bias (Rrr and stop codon frequency. We show that the success rate of CDS/intron classification by UFM is higher than by CSF. UFM classifies ORFs as coding or non-coding through a score based on (i the stop codon distribution, (ii the product of purine probabilities in the three positions of nucleotide triplets, (iii the product of Cytosine (C, Guanine (G, and Adenine (A probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv the probabilities of G in 1st and 2nd position of triplets and (v the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of CDSs (true positives of Homo sapiens (>250 bp, Drosophila melanogaster (>250 bp and Arabidopsis thaliana (>200 bp are successfully classified with a false positive rate lower or equal to 5%. The method releases coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.

  2. Complete sequence and analysis of plastid genomes of two economically important red algae: Pyropia haitanensis and Pyropia yezoensis.

    Directory of Open Access Journals (Sweden)

    Li Wang

    Full Text Available Pyropia haitanensis and P. yezoensis are two economically important marine crops that are also considered to be research models to study the physiological ecology of intertidal seaweed communities, evolutionary biology of plastids, and the origins of sexual reproduction. This plastid genome information will facilitate study of breeding, population genetics and phylogenetics.We have fully sequenced using next-generation sequencing the circular plastid genomes of P. hatanensis (195,597 bp and P. yezoensis (191,975 bp, the largest of all the plastid genomes of the red lineage sequenced to date. Organization and gene contents of the two plastids were similar, with 211-213 protein-coding genes (including 29-31 unknown-function ORFs, 37 tRNA genes, and 6 ribosomal RNA genes, suggesting a largest coding capacity in the red lineage. In each genome, 14 protein genes overlapped and no interrupted genes were found, indicating a high degree of genomic condensation. Pyropia maintain an ancient gene content and conserved gene clusters in their plastid genomes, containing nearly complete repertoires of the plastid genes known in photosynthetic eukaryotes. Similarity analysis based on the whole plastid genome sequences showed the distance between P. haitanensis and P. yezoensis (0.146 was much smaller than that of Porphyra purpurea and P. haitanensis (0.250, and P. yezoensis (0.251; this supports re-grouping the two species in a resurrected genus Pyropia while maintaining P. purpurea in genus Porphyra. Phylogenetic analysis supports a sister relationship between Bangiophyceae and Florideophyceae, though precise phylogenetic relationships between multicellular red alage and chromists were not fully resolved.These results indicate that Pyropia have compact plastid genomes. Large coding capacity and long intergenic regions contribute to the size of the largest plastid genomes reported for the red lineage. Possessing the largest coding capacity and ancient gene

  3. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution.

    Science.gov (United States)

    2004-12-09

    We present here a draft genome sequence of the red jungle fowl, Gallus gallus. Because the chicken is a modern descendant of the dinosaurs and the first non-mammalian amniote to have its genome sequenced, the draft sequence of its genome--composed of approximately one billion base pairs of sequence and an estimated 20,000-23,000 genes--provides a new perspective on vertebrate genome evolution, while also improving the annotation of mammalian genomes. For example, the evolutionary distance between chicken and human provides high specificity in detecting functional elements, both non-coding and coding. Notably, many conserved non-coding sequences are far from genes and cannot be assigned to defined functional classes. In coding regions the evolutionary dynamics of protein domains and orthologous groups illustrate processes that distinguish the lineages leading to birds and mammals. The distinctive properties of avian microchromosomes, together with the inferred patterns of conserved synteny, provide additional insights into vertebrate chromosome architecture.

  4. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister-group to protostomes

    Energy Technology Data Exchange (ETDEWEB)

    Helfenbein, Kevin G.; Fourcade, H. Matthew; Vanjani, Rohit G.; Boore, Jeffrey L.

    2004-05-01

    We report the first complete mitochondrial (mt) DNA sequence from a member of the phylum Chaetognatha (arrow worms). The Paraspadella gotoi mtDNA is highly unusual, missing 23 of the genes commonly found in animal mtDNAs, including atp6, which has otherwise been found universally to be present. Its 14 genes are unusually arranged into two groups, one on each strand. One group is punctuated by numerous non-coding intergenic nucleotides, while the other group is tightly packed, having no non-coding nucleotides, leading to speculation that there are two transcription units with differing modes of expression. The phylogenetic position of the Chaetognatha within the Metazoa has long been uncertain, with conflicting or equivocal results from various morphological analyses and rRNA sequence comparisons. Comparisons here of amino acid sequences from mitochondrially encoded proteins gives a single most parsimonious tree that supports a position of Chaetognatha as sister to the protostomes studied here. From this, one can more clearly interpret the patterns of evolution of various developmental features, especially regarding the embryological fate of the blastopore.

  5. Source coherence impairments in a direct detection direct sequence optical code-division multiple-access system.

    Science.gov (United States)

    Fsaifes, Ihsan; Lepers, Catherine; Lourdiane, Mounia; Gallion, Philippe; Beugin, Vincent; Guignard, Philippe

    2007-02-01

    We demonstrate that direct sequence optical code- division multiple-access (DS-OCDMA) encoders and decoders using sampled fiber Bragg gratings (S-FBGs) behave as multipath interferometers. In that case, chip pulses of the prime sequence codes generated by spreading in time-coherent data pulses can result from multiple reflections in the interferometers that can superimpose within a chip time duration. We show that the autocorrelation function has to be considered as the sum of complex amplitudes of the combined chip as the laser source coherence time is much greater than the integration time of the photodetector. To reduce the sensitivity of the DS-OCDMA system to the coherence time of the laser source, we analyze the use of sparse and nonperiodic quadratic congruence and extended quadratic congruence codes.

  6. Source coherence impairments in a direct detection direct sequence optical code-division multiple-access system

    Science.gov (United States)

    Fsaifes, Ihsan; Lepers, Catherine; Lourdiane, Mounia; Gallion, Philippe; Beugin, Vincent; Guignard, Philippe

    2007-02-01

    We demonstrate that direct sequence optical code- division multiple-access (DS-OCDMA) encoders and decoders using sampled fiber Bragg gratings (S-FBGs) behave as multipath interferometers. In that case, chip pulses of the prime sequence codes generated by spreading in time-coherent data pulses can result from multiple reflections in the interferometers that can superimpose within a chip time duration. We show that the autocorrelation function has to be considered as the sum of complex amplitudes of the combined chip as the laser source coherence time is much greater than the integration time of the photodetector. To reduce the sensitivity of the DS-OCDMA system to the coherence time of the laser source, we analyze the use of sparse and nonperiodic quadratic congruence and extended quadratic congruence codes.

  7. IdentiCS – Identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence

    Directory of Open Access Journals (Sweden)

    Zeng An-Ping

    2004-08-01

    Full Text Available Abstract Background A necessary step for a genome level analysis of the cellular metabolism is the in silico reconstruction of the metabolic network from genome sequences. The available methods are mainly based on the annotation of genome sequences including two successive steps, the prediction of coding sequences (CDS and their function assignment. The annotation process takes time. The available methods often encounter difficulties when dealing with unfinished error-containing genomic sequence. Results In this work a fast method is proposed to use unannotated genome sequence for predicting CDSs and for an in silico reconstruction of metabolic networks. Instead of using predicted genes or CDSs to query public databases, entries from public DNA or protein databases are used as queries to search a local database of the unannotated genome sequence to predict CDSs. Functions are assigned to the predicted CDSs simultaneously. The well-annotated genome of Salmonella typhimurium LT2 is used as an example to demonstrate the applicability of the method. 97.7% of the CDSs in the original annotation are correctly identified. The use of SWISS-PROT-TrEMBL databases resulted in an identification of 98.9% of CDSs that have EC-numbers in the published annotation. Furthermore, two versions of sequences of the bacterium Klebsiella pneumoniae with different genome coverage (3.9 and 7.9 fold, respectively are examined. The results suggest that a 3.9-fold coverage of the bacterial genome could be sufficiently used for the in silico reconstruction of the metabolic network. Compared to other gene finding methods such as CRITICA our method is more suitable for exploiting sequences of low genome coverage. Based on the new method, a program called IdentiCS (Identification of Coding Sequences from Unfinished Genome Sequences is delivered that combines the identification of CDSs with the reconstruction, comparison and visualization of metabolic networks (free to download

  8. Small non coding RNAs in adipocyte biology and obesity.

    Science.gov (United States)

    Amri, Ez-Zoubir; Scheideler, Marcel

    2017-11-15

    Obesity has reached epidemic proportions world-wide and constitutes a substantial risk factor for hypertension, type 2 diabetes, cardiovascular diseases and certain cancers. So far, regulation of energy intake by dietary and pharmacological treatments has met limited success. The main interest of current research is focused on understanding the role of different pathways involved in adipose tissue function and modulation of its mass. Whole-genome sequencing studies revealed that the majority of the human genome is transcribed, with thousands of non-protein-coding RNAs (ncRNA), which comprise small and long ncRNAs. ncRNAs regulate gene expression at the transcriptional and post-transcriptional level. Numerous studies described the involvement of ncRNAs in the pathogenesis of many diseases including obesity and associated metabolic disorders. ncRNAs represent potential diagnostic biomarkers and promising therapeutic targets. In this review, we focused on small ncRNAs involved in the formation and function of adipocytes and obesity. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Toward understanding non-coding RNA roles in intracranial aneurysms and subarachnoid hemorrhage

    Directory of Open Access Journals (Sweden)

    Huang Fengzhen

    2017-05-01

    Full Text Available Subarachnoid hemorrhage (SAH is a common and frequently life-threatening cerebrovascular disease, which is mostly related with a ruptured intracranial aneurysm. Its complications include rebleeding, early brain injury, cerebral vasospasm, delayed cerebral ischemia, chronic hydrocephalus, and also non neurological problems. Non-coding RNAs (ncRNAs, comprising of microRNAs (miRNAs, small interfering RNAs (siRNAs and long non-coding RNAs (lncRNAs, play an important role in intracranial aneurysms and SAH. Here, we review the non-coding RNAs expression profile and their related mechanisms in intracranial aneurysms and SAH. Moreover, we suggest that these non-coding RNAs function as novel molecular biomarkers to predict intracranial aneurysms and SAH, and may yield new therapies after SAH in the future.

  10. In silico analysis of Simple Sequence Repeats from chloroplast genomes of Solanaceae species

    Directory of Open Access Journals (Sweden)

    Evandro Vagner Tambarussi

    2009-01-01

    Full Text Available The availability of chloroplast genome (cpDNA sequences of Atropa belladonna, Nicotiana sylvestris, N.tabacum, N. tomentosiformis, Solanum bulbocastanum, S. lycopersicum and S. tuberosum, which are Solanaceae species,allowed us to analyze the organization of cpSSRs in their genic and intergenic regions. In general, the number of cpSSRs incpDNA ranged from 161 in S. tuberosum to 226 in N. tabacum, and the number of intergenic cpSSRs was higher than geniccpSSRs. The mononucleotide repeats were the most frequent in studied species, but we also identified di-, tri-, tetra-, pentaandhexanucleotide repeats. Multiple alignments of all cpSSRs sequences from Solanaceae species made the identification ofnucleotide variability possible and the phylogeny was estimated by maximum parsimony. Our study showed that the plastomedatabase can be exploited for phylogenetic analysis and biotechnological approaches.

  11. Non-coding RNAs in the Ovarian Follicle

    Directory of Open Access Journals (Sweden)

    Rosalia Battaglia

    2017-05-01

    Full Text Available The mammalian ovarian follicle is the complex reproductive unit comprising germ cell, somatic cells (Cumulus and Granulosa cells, and follicular fluid (FF: paracrine communication among the different cell types through FF ensures the development of a mature oocyte ready for fertilization. This paper is focused on non-coding RNAs in ovarian follicles and their predicted role in the pathways involved in oocyte growth and maturation. We determined the expression profiles of microRNAs in human oocytes and FF by high-throughput analysis and identified 267 microRNAs in FF and 176 in oocytes. Most of these were FF microRNAs, while 9 were oocyte specific. By bioinformatic analysis, independently performed on FF and oocyte microRNAs, we identified the most significant Biological Processes and the pathways regulated by their validated targets. We found many pathways shared between the two compartments and some specific for oocyte microRNAs. Moreover, we found 41 long non-coding RNAs able to interact with oocyte microRNAs and potentially involved in the regulation of folliculogenesis. These data are important in basic reproductive research and could also be useful for clinical applications. In fact, the characterization of non-coding RNAs in ovarian follicles could improve reproductive disease diagnosis, provide biomarkers of oocyte quality in Assisted Reproductive Treatment, and allow the development of therapies for infertility disorders.

  12. Decoding the function of nuclear long non-coding RNAs.

    Science.gov (United States)

    Chen, Ling-Ling; Carmichael, Gordon G

    2010-06-01

    Long non-coding RNAs (lncRNAs) are mRNA-like, non-protein-coding RNAs that are pervasively transcribed throughout eukaryotic genomes. Rather than silently accumulating in the nucleus, many of these are now known or suspected to play important roles in nuclear architecture or in the regulation of gene expression. In this review, we highlight some recent progress in how lncRNAs regulate these important nuclear processes at the molecular level. Copyright 2010 Elsevier Ltd. All rights reserved.

  13. A long and abundant non-coding RNA in Lactobacillus salivarius.

    Science.gov (United States)

    Cousin, Fabien J; Lynch, Denise B; Chuat, Victoria; Bourin, Maxence J B; Casey, Pat G; Dalmasso, Marion; Harris, Hugh M B; McCann, Angela; O'Toole, Paul W

    2017-09-01

    Lactobacillus salivarius , found in the intestinal microbiota of humans and animals, is studied as an example of the sub-dominant intestinal commensals that may impart benefits upon their host. Strains typically harbour at least one megaplasmid that encodes functions contributing to contingency metabolism and environmental adaptation. RNA sequencing (RNA-seq)transcriptomic analysis of L. salivarius strain UCC118 identified the presence of a novel unusually abundant long non-coding RNA (lncRNA) encoded by the megaplasmid, and which represented more than 75 % of the total RNA-seq reads after depletion of rRNA species. The expression level of this 520 nt lncRNA in L. salivarius UCC118 exceeded that of the 16S rRNA, it accumulated during growth, was very stable over time and was also expressed during intestinal transit in a mouse. This lncRNA sequence is specific to the L. salivarius species; however, among 45 L . salivarius genomes analysed, not all (only 34) harboured the sequence for the lncRNA. This lncRNA was produced in 27 tested L. salivarius strains, but at strain-specific expression levels. High-level lncRNA expression correlated with high megaplasmid copy number. Transcriptome analysis of a deletion mutant lacking this lncRNA identified altered expression levels of genes in a number of pathways, but a definitive function of this new lncRNA was not identified. This lncRNA presents distinctive and unique properties, and suggests potential basic and applied scientific developments of this phenomenon.

  14. Linear-time non-malleable codes in the bit-wise independent tampering model

    NARCIS (Netherlands)

    R.J.F. Cramer (Ronald); I.B. Damgård (Ivan); N.M. Döttling (Nico); I. Giacomelli (Irene); C. Xing (Chaoping)

    2017-01-01

    textabstractNon-malleable codes were introduced by Dziembowski et al. (ICS 2010) as coding schemes that protect a message against tampering attacks. Roughly speaking, a code is non-malleable if decoding an adversarially tampered encoding of a message m produces the original message m or a value m′

  15. CONDOR: a database resource of developmentally associated conserved non-coding elements

    Directory of Open Access Journals (Sweden)

    Smith Sarah

    2007-08-01

    Full Text Available Abstract Background Comparative genomics is currently one of the most popular approaches to study the regulatory architecture of vertebrate genomes. Fish-mammal genomic comparisons have proved powerful in identifying conserved non-coding elements likely to be distal cis-regulatory modules such as enhancers, silencers or insulators that control the expression of genes involved in the regulation of early development. The scientific community is showing increasing interest in characterizing the function, evolution and language of these sequences. Despite this, there remains little in the way of user-friendly access to a large dataset of such elements in conjunction with the analysis and the visualization tools needed to study them. Description Here we present CONDOR (COnserved Non-coDing Orthologous Regions available at: http://condor.fugu.biology.qmul.ac.uk. In an interactive and intuitive way the website displays data on > 6800 non-coding elements associated with over 120 early developmental genes and conserved across vertebrates. The database regularly incorporates results of ongoing in vivo zebrafish enhancer assays of the CNEs carried out in-house, which currently number ~100. Included and highlighted within this set are elements derived from duplication events both at the origin of vertebrates and more recently in the teleost lineage, thus providing valuable data for studying the divergence of regulatory roles between paralogs. CONDOR therefore provides a number of tools and facilities to allow scientists to progress in their own studies on the function and evolution of developmental cis-regulation. Conclusion By providing access to data with an approachable graphics interface, the CONDOR database presents a rich resource for further studies into the regulation and evolution of genes involved in early development.

  16. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  17. LeARN: a platform for detecting, clustering and annotating non-coding RNAs

    Directory of Open Access Journals (Sweden)

    Schiex Thomas

    2008-01-01

    Full Text Available Abstract Background In the last decade, sequencing projects have led to the development of a number of annotation systems dedicated to the structural and functional annotation of protein-coding genes. These annotation systems manage the annotation of the non-protein coding genes (ncRNAs in a very crude way, allowing neither the edition of the secondary structures nor the clustering of ncRNA genes into families which are crucial for appropriate annotation of these molecules. Results LeARN is a flexible software package which handles the complete process of ncRNA annotation by integrating the layers of automatic detection and human curation. Conclusion This software provides the infrastructure to deal properly with ncRNAs in the framework of any annotation project. It fills the gap between existing prediction software, that detect independent ncRNA occurrences, and public ncRNA repositories, that do not offer the flexibility and interactivity required for annotation projects. The software is freely available from the download section of the website http://bioinfo.genopole-toulouse.prd.fr/LeARN

  18. Cyclic and constacyclic codes over a non-chain ring

    Directory of Open Access Journals (Sweden)

    Ayşegül Bayram

    2014-09-01

    Full Text Available In this study, we consider linear and especially cyclic codes over the non-chain ring $Z_{p}[v]/\\langle v^{p}-v\\rangle$ where $p$ is a prime. This is a generalization of the case $p=3.$ Further, in this work the structure of constacyclic codes are studied as well. This study takes advantage mainly from a Gray map which preserves the distance between codes over this ring and $p$-ary codes and moreover this map enlightens the structure of these codes. Furthermore, a MacWilliams type identity is presented together with some illustrative examples.

  19. Phylogenetic analyses of the polyprotein coding sequences of serotype O foot-and-mouth disease viruses in East Africa: evidence for interserotypic recombination

    Directory of Open Access Journals (Sweden)

    Balinda Sheila N

    2010-08-01

    Full Text Available Abstract Background Foot-and-mouth disease (FMD is endemic in East Africa with the majority of the reported outbreaks attributed to serotype O virus. In this study, phylogenetic analyses of the polyprotein coding region of serotype O FMD viruses from Kenya and Uganda has been undertaken to infer evolutionary relationships and processes responsible for the generation and maintenance of diversity within this serotype. FMD virus RNA was obtained from six samples following virus isolation in cell culture and in one case by direct extraction from an oropharyngeal sample. Following RT-PCR, the single long open reading frame, encoding the polyprotein, was sequenced. Results Phylogenetic comparisons of the VP1 coding region showed that the recent East African viruses belong to one lineage within the EA-2 topotype while an older Kenyan strain, K/52/1992 is a representative of the topotype EA-1. Evolutionary relationships between the coding regions for the leader protease (L, the capsid region and almost the entire coding region are monophyletic except for the K/52/1992 which is distinct. Furthermore, phylogenetic relationships for the P2 and P3 regions suggest that the K/52/1992 is a probable recombinant between serotypes A and O. A bootscan analysis of K/52/1992 with East African FMD serotype A viruses (A21/KEN/1964 and A23/KEN/1965 and serotype O viral isolate (K/117/1999 revealed that the P2 region is probably derived from a serotype A strain while the P3 region appears to be a mosaic derived from both serotypes A and O. Conclusions Sequences of the VP1 coding region from recent serotype O FMDVs from Kenya and Uganda are all representatives of a specific East African lineage (topotype EA-2, a probable indication that hardly any FMD introductions of this serotype have occurred from outside the region in the recent past. Furthermore, evidence for interserotypic recombination, within the non-structural protein coding regions, between FMDVs of serotypes A

  20. A one-step reaction for the rapid identification of Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti using oligonucleotide primers designed from the 16S-23S rRNA intergenic sequences.

    Science.gov (United States)

    Ferchichi, M; Valcheva, R; Prévost, H; Onno, B; Dousset, X

    2008-06-01

    Species-specific primers targeting the 16S-23S ribosomal DNA (rDNA) intergenic spacer region (ISR) were designed to rapidly discriminate between Lactobacillus mindensis, Lactobacillus panis, Lactobacillus paralimentarius, Lactobacillus pontis and Lactobacillus frumenti species recently isolated from French sourdough. The 16S-23S ISRs were amplified using primers 16S/p2 and 23S/p7, which anneal to positions 1388-1406 of the 16S rRNA gene and to positions 207-189 of the 23S rRNA gene respectively, Escherichia coli numbering (GenBank accession number V00331). Clone libraries of the resulting amplicons were constructed using a pCR2.1 TA cloning kit and sequenced. Species-specific primers were designed based on the sequences obtained and were used to amplify the 16S-23S ISR in the Lactobacillus species considered. For all of them, two PCR amplicons, designated as small ISR (S-ISR) and large ISR (L-ISR), were obtained. The L-ISR is composed of the corresponding S-ISR, interrupted by a sequence containing tRNA(Ile) and tRNA(Ala) genes. Based on these sequences, species-specific primers were designed and proved to identify accurately the species considered among 30 reference Lactobacillus species tested. Designed species-specific primers enable a rapid and accurate identification of L. mindensis, L. paralimentarius, L. panis, L. pontis and L. frumenti species among other lactobacilli. The proposed method provides a powerful and convenient means of rapidly identifying some sourdough lactobacilli, which could be of help in large starter culture surveys.

  1. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Directory of Open Access Journals (Sweden)

    Graner Andreas

    2008-10-01

    Full Text Available Abstract Background Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads a Mathematically Defined Repeat (MDR index can be generated to map repetitive regions in genomic sequences. Results We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation drawing on the information already available for known repetitive elements revealed a significant correspondence between the two methods. MDR-based annotation allowed for the identification of dozens of novel repeat sequences, though, which were not recognised by hand-annotation. The MDR data was also used to identify gene-containing regions by masking of repetitive sequences in eight de-novo sequenced bacterial artificial chromosome (BAC clones. For half of the identified candidate gene islands indeed gene sequences could be identified. MDR data were only of limited use, when mapped on genomic sequences from the closely related species Triticum monococcum as only a fraction of the repetitive sequences was recognised. Conclusion An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing sequences regions in uncharacterised genomic sequences. The restriction that a particular

  2. Non-coding, mRNA-like RNAs database Y2K.

    Science.gov (United States)

    Erdmann, V A; Szymanski, M; Hochberg, A; Groot, N; Barciszewski, J

    2000-01-01

    In last few years much data has accumulated on various non-translatable RNA transcripts that are synthesised in different cells. They are lacking in protein coding capacity and it seems that they work mainly or exclusively at the RNA level. All known non-coding RNA transcripts are collected in the database: http://www. man.poznan.pl/5SData/ncRNA/index.html

  3. Licensee Event Report sequence coding and search procedure workshop

    International Nuclear Information System (INIS)

    Cottrell, W.B.; Gallaher, R.B.

    1981-01-01

    Since mid-1980, the Office for Analysis and Evaluation of Operational Data (AEOD) of the Nuclear Regulatory Commission (NRC) has been developing procedures for the systematic review and analysis of Licensee Event Reports (LERs). These procedures generally address several areas of concern, including identification of significant trends and patterns, event sequence of occurrences, component failures, and system and plant effects. The AEOD and NSIC conducted a workshop on the new coding procedure at the American Museum of Science and Energy in Oak Ridge, TN, on November 24, 1980

  4. [Influence of "prehistory" of sequential movements of the right and the left hand on reproduction: coding of positions, movements and sequence structure].

    Science.gov (United States)

    Bobrova, E V; Liakhovetskiĭ, V A; Borshchevskaia, E R

    2011-01-01

    The dependence of errors during reproduction of a sequence of hand movements without visual feedback on the previous right- and left-hand performance ("prehistory") and on positions in space of sequence elements (random or ordered by the explicit rule) was analyzed. It was shown that the preceding information about the ordered positions of the sequence elements was used during right-hand movements, whereas left-hand movements were performed with involvement of the information about the random sequence. The data testify to a central mechanism of the analysis of spatial structure of sequence elements. This mechanism activates movement coding specific for the left hemisphere (vector coding) in case of an ordered sequence structure and positional coding specific for the right hemisphere in case of a random sequence structure.

  5. Two-terminal video coding.

    Science.gov (United States)

    Yang, Yang; Stanković, Vladimir; Xiong, Zixiang; Zhao, Wei

    2009-03-01

    Following recent works on the rate region of the quadratic Gaussian two-terminal source coding problem and limit-approaching code designs, this paper examines multiterminal source coding of two correlated, i.e., stereo, video sequences to save the sum rate over independent coding of both sequences. Two multiterminal video coding schemes are proposed. In the first scheme, the left sequence of the stereo pair is coded by H.264/AVC and used at the joint decoder to facilitate Wyner-Ziv coding of the right video sequence. The first I-frame of the right sequence is successively coded by H.264/AVC Intracoding and Wyner-Ziv coding. An efficient stereo matching algorithm based on loopy belief propagation is then adopted at the decoder to produce pixel-level disparity maps between the corresponding frames of the two decoded video sequences on the fly. Based on the disparity maps, side information for both motion vectors and motion-compensated residual frames of the right sequence are generated at the decoder before Wyner-Ziv encoding. In the second scheme, source splitting is employed on top of classic and Wyner-Ziv coding for compression of both I-frames to allow flexible rate allocation between the two sequences. Experiments with both schemes on stereo video sequences using H.264/AVC, LDPC codes for Slepian-Wolf coding of the motion vectors, and scalar quantization in conjunction with LDPC codes for Wyner-Ziv coding of the residual coefficients give a slightly lower sum rate than separate H.264/AVC coding of both sequences at the same video quality.

  6. Comparative genomics of chondrichthyan Hoxa clusters

    Directory of Open Access Journals (Sweden)

    Zhong Ying-Fu

    2009-09-01

    Full Text Available Abstract Background The chondrichthyan or cartilaginous fish (chimeras, sharks, skates and rays occupy an important phylogenetic position as the sister group to all other jawed vertebrates and as an early lineage to diverge from the vertebrate lineage following two whole genome duplication events in vertebrate evolution. There have been few comparative genomic analyses incorporating data from chondrichthyan fish and none comparing genomic information from within the group. We have sequenced the complete Hoxa cluster of the Little Skate (Leucoraja erinacea and compared to the published Hoxa cluster of the Horn Shark (Heterodontus francisci and to available data from the Elephant Shark (Callorhinchus milii genome project. Results A BAC clone containing the full Little Skate Hoxa cluster was fully sequenced and assembled. Analyses of coding sequences and conserved non-coding elements reveal a strikingly high level of conservation across the cartilaginous fish, with twenty ultraconserved elements (100%,100 bp found between Skate and Horn Shark, compared to three between human and marsupials. We have also identified novel potential non-coding RNAs in the Skate BAC clone, some of which are conserved to other species. Conclusion We find that the Little Skate Hoxa cluster is remarkably similar to the previously published Horn Shark Hoxa cluster with respect to sequence identity, gene size and intergenic distance despite over 180 million years of separation between the two lineages. We suggest that the genomes of cartilaginous fish are more highly conserved than those of tetrapods or teleost fish and so are more likely to have retained ancestral non-coding elements. While useful for isolating homologous DNA, this complicates bioinformatic approaches to identify chondrichthyan-specific non-coding DNA elements

  7. Instruction sequence based non-uniform complexity classes

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2013-01-01

    We present an approach to non-uniform complexity in which single-pass instruction sequences play a key part, and answer various questions that arise from this approach. We introduce several kinds of non-uniform complexity classes. One kind includes a counterpart of the well-known non-uniform

  8. An improved PCR method for direct identification of Porphyra (Bangiales, Rhodophyta) using conchocelis based on a RUBISCO intergenic spacer

    Science.gov (United States)

    Wang, Chao; Dong, Dong; Wang, Guangce; Zhang, Baoyu; Peng, Guang; Xu, Pu; Tang, Xiaorong

    2009-09-01

    An improved method of PCR in which the small segment of conchocelis is amplified directly without DNA extraction was used to amplify a RUBISCO intergenic spacer DNA fragment from nine species of red algal genus Porphyra (Bangiales, Rhodophyta), including Porphyra yezoensis (Jiangsu, China), P. haitanensis (Fujian, China), P. oligospermatangia (Qingdao, China), P. katadai (Qingdao, China), P. tenera (Qingdao, China), P. suborboculata (Fujian, China), P. pseudolinearis (Kogendo, Korea), P. linearis (Devon, England), and P. fallax (Seattle, USA). Standard PCR and the method developed here were both conducted using primers specific for the RUBISCO spacer region, after which the two PCR products were sequenced. The sequencing data of the amplicons obtained using both methods were identical, suggesting that the improved PCR method was functional. These findings indicate that the method developed here may be useful for the rapid identification of species of Porphyra in a germplasm bank. In addition, a phylogenetic tree was constructed using the RUBISCO spacer and partial rbcS sequence, and the results were in concordant with possible alternative phylogenies based on traditional morphological taxonomic characteristics, indicating that the RUBISCO spacer is a useful region for phylogenetic studies.

  9. Enhanced Protein Production in Escherichia coli by Optimization of Cloning Scars at the Vector-Coding Sequence Junction

    DEFF Research Database (Denmark)

    Mirzadeh, Kiavash; Martinez, Virginia; Toddo, Stephen

    2015-01-01

    are poorly expressed even when they are codon-optimized and expressed from vectors with powerful genetic elements. In this study, we show that poor expression can be caused by certain nucleotide sequences (e.g., cloning scars) at the junction between the vector and the coding sequence. Since these sequences...

  10. EG-10LONG NON-CODING RNAs IN GLIOBLASTOMA

    Science.gov (United States)

    Pastori, Chiara; Kapranov, Philipp; Penas, Clara; Laurent, Georges St.; Ayad, Nagi; Wahlestedt, Claes

    2014-01-01

    Glioblastoma (GBM) is the most common, aggressive and incurable primary brain tumor in adults. Genome studies have confirmed that GBM is extremely heterogeneous with many genetically different subgroups. Consequently, there is much current interest in epigenetic drugs that may be active across genetically distinct tumors. In support of this, some epigenetic drugs has recently shown efficacy against several cancers including glioblastoma. Much recent interest has also been devoted to long non-coding RNAs (lncRNAs), which can modulate gene expression regulating chromatin architecture, in part through the interaction with epigenetic protein machineries. To date, however, only a few lncRNAs have been studied in human cancer. We therefore embarked on a comprehensive genomic and functional analysis of lncRNAs in GBM. Using the Helicos Single Molecule Sequencing platform glioblastoma samples were sequenced resulting in the identification of hundreds of dysregulated lncRNAs. Among these the lncRNA HOTAIR was found massively increased in GBM. This observation parallels findings in other cancers where HOTAIR's increased expression has been linked to poor prognosis due to metastatic events. Interestingly, here we show that in glioblastoma HOTAIR does not promote metastasis, but instead sustains the ability of these cells to proliferate. In fact, we demonstrate that HOTAIR knockdown in GBM strongly impairs cell proliferation and induces apoptosis in vitro and in vivo. Further, we implicate HOTAIR in the mechanism of action of certain epigenetic drugs. In summary, long noncoding RNAs (newly discovered epigenomic factors) play a vital role in GBM and deserve attention as entirely novel drug targets as well as biomarkers.

  11. Phytoplasma phylogenetics based on analysis of secA and 23S rRNA gene sequences for improved resolution of candidate species of 'Candidatus Phytoplasma'.

    Science.gov (United States)

    Hodgetts, Jennifer; Boonham, Neil; Mumford, Rick; Harrison, Nigel; Dickinson, Matthew

    2008-08-01

    Phytoplasma phylogenetics has focused primarily on sequences of the non-coding 16S rRNA gene and the 16S-23S rRNA intergenic spacer region (16-23S ISR), and primers that enable amplification of these regions from all phytoplasmas by PCR are well established. In this study, primers based on the secA gene have been developed into a semi-nested PCR assay that results in a sequence of the expected size (about 480 bp) from all 34 phytoplasmas examined, including strains representative of 12 16Sr groups. Phylogenetic analysis of secA gene sequences showed similar clustering of phytoplasmas when compared with clusters resolved by similar sequence analyses of a 16-23S ISR-23S rRNA gene contig or of the 16S rRNA gene alone. The main differences between trees were in the branch lengths, which were elongated in the 16-23S ISR-23S rRNA gene tree when compared with the 16S rRNA gene tree and elongated still further in the secA gene tree, despite this being a shorter sequence. The improved resolution in the secA gene-derived phylogenetic tree resulted in the 16SrII group splitting into two distinct clusters, while phytoplasmas associated with coconut lethal yellowing-type diseases split into three distinct groups, thereby supporting past proposals that they represent different candidate species within 'Candidatus Phytoplasma'. The ability to differentiate 16Sr groups and subgroups by virtual RFLP analysis of secA gene sequences suggests that this gene may provide an informative alternative molecular marker for pathogen identification and diagnosis of phytoplasma diseases.

  12. FOURTH SEMINAR TO THE MEMORY OF D.N. KLYSHKO: Algebraic solution of the synthesis problem for coded sequences

    Science.gov (United States)

    Leukhin, Anatolii N.

    2005-08-01

    The algebraic solution of a 'complex' problem of synthesis of phase-coded (PC) sequences with the zero level of side lobes of the cyclic autocorrelation function (ACF) is proposed. It is shown that the solution of the synthesis problem is connected with the existence of difference sets for a given code dimension. The problem of estimating the number of possible code combinations for a given code dimension is solved. It is pointed out that the problem of synthesis of PC sequences is related to the fundamental problems of discrete mathematics and, first of all, to a number of combinatorial problems, which can be solved, as the number factorisation problem, by algebraic methods by using the theory of Galois fields and groups.

  13. Deletion of a coordinate regulator of type 2 cytokine expression in mice

    Energy Technology Data Exchange (ETDEWEB)

    Mohrs, Markus; Blankespoor, Catherine M.; Wang, Zhi-En; Loots, Gaby G.; Hadeiba, Husein; Shinkai, Kanade; Rubin, Edward M.; Locksley, Richard M.

    2001-07-30

    Mechanisms underlying the differentiation of stable T helper subsets will be important in understanding how discrete types of immunity develop in response to different pathogens. An evolutionarily conserved {approx}400 base pair non-coding sequence in the IL-4/IL-13 intergenic region, designated CNS-1, was deleted in mice. The capacity to develop Th2 cells was compromised in vitro and in vivo in the absence of CNS-1. Despite the profound effect in T cells, mast cells from CNS-1-deleted mice maintained their capacity to produce IL-4. A T cell-specific element critical for optimal expression of type 2 cytokines may represent evolution of a regulatory sequence exploited by adaptive immunity.

  14. Complete Chloroplast Genome Sequence of Aquilaria sinensis (Lour.) Gilg and Evolution Analysis within the Malvales Order.

    Science.gov (United States)

    Wang, Ying; Zhan, Di-Feng; Jia, Xian; Mei, Wen-Li; Dai, Hao-Fu; Chen, Xiong-Ting; Peng, Shi-Qing

    2016-01-01

    Aquilaria sinensis (Lour.) Gilg is an important medicinal woody plant producing agarwood, which is widely used in traditional Chinese medicine. High-throughput sequencing of chloroplast (cp) genomes enhanced the understanding about evolutionary relationships within plant families. In this study, we determined the complete cp genome sequences for A. sinensis. The size of the A. sinensis cp genome was 159,565 bp. This genome included a large single-copy region of 87,482 bp, a small single-copy region of 19,857 bp, and a pair of inverted repeats (IRa and IRb) of 26,113 bp each. The GC content of the genome was 37.11%. The A. sinensis cp genome encoded 113 functional genes, including 82 protein-coding genes, 27 tRNA genes, and 4 rRNA genes. Seven genes were duplicated in the protein-coding genes, whereas 11 genes were duplicated in the RNA genes. A total of 45 polymorphic simple-sequence repeat loci and 60 pairs of large repeats were identified. Most simple-sequence repeats were located in the noncoding sections of the large single-copy/small single-copy region and exhibited high A/T content. Moreover, 33 pairs of large repeat sequences were located in the protein-coding genes, whereas 27 pairs were located in the intergenic regions. Aquilaria sinensis cp genome bias ended with A/T on the basis of codon usage. The distribution of codon usage in A. sinensis cp genome was most similar to that in the Gonystylus bancanus cp genome. Comparative results of 82 protein-coding genes from 29 species of cp genomes demonstrated that A. sinensis was a sister species to G. bancanus within the Malvales order. Aquilaria sinensis cp genome presented the highest sequence similarity of >90% with the G. bancanus cp genome by using CGView Comparison Tool. This finding strongly supports the placement of A. sinensis as a sister to G. bancanus within the Malvales order. The complete A. sinensis cp genome information will be highly beneficial for further studies on this traditional medicinal

  15. SinEx DB: a database for single exon coding sequences in mammalian genomes.

    Science.gov (United States)

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas, many public databases of Eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with My SQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions are available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches to the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs.Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.

  16. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences

    DEFF Research Database (Denmark)

    Wernersson, Rasmus; Pedersen, Anders Gorm

    2003-01-01

    The simple fact that proteins are built from 20 amino acids while DNA only contains four different bases, means that the 'signal-to-noise ratio' in protein sequence alignments is much better than in alignments of DNA. Besides this information-theoretical advantage, protein alignments also benefit...... proteins. It is therefore preferable to align coding DNA at the amino acid level and it is for this purpose we have constructed the program RevTrans. RevTrans constructs a multiple DNA alignment by: (i) translating the DNA; (ii) aligning the resulting peptide sequences; and (iii) building a multiple DNA...

  17. Genic non-coding microsatellites in the rice genome: characterization, marker design and use in assessing genetic and evolutionary relationships among domesticated groups

    Directory of Open Access Journals (Sweden)

    Singh Nagendra

    2009-03-01

    Full Text Available Abstract Background Completely sequenced plant genomes provide scope for designing a large number of microsatellite markers, which are useful in various aspects of crop breeding and genetic analysis. With the objective of developing genic but non-coding microsatellite (GNMS markers for the rice (Oryza sativa L. genome, we characterized the frequency and relative distribution of microsatellite repeat-motifs in 18,935 predicted protein coding genes including 14,308 putative promoter sequences. Results We identified 19,555 perfect GNMS repeats with densities ranging from 306.7/Mb in chromosome 1 to 450/Mb in chromosome 12 with an average of 357.5 GNMS per Mb. The average microsatellite density was maximum in the 5' untranslated regions (UTRs followed by those in introns, promoters, 3'UTRs and minimum in the coding sequences (CDS. Primers were designed for 17,966 (92% GNMS repeats, including 4,288 (94% hypervariable class I types, which were bin-mapped on the rice genome. The GNMS markers were most polymorphic in the intronic region (73.3% followed by markers in the promoter region (53.3% and least in the CDS (26.6%. The robust polymerase chain reaction (PCR amplification efficiency and high polymorphic potential of GNMS markers over genic coding and random genomic microsatellite markers suggest their immediate use in efficient genotyping applications in rice. A set of these markers could assess genetic diversity and establish phylogenetic relationships among domesticated rice cultivar groups. We also demonstrated the usefulness of orthologous and paralogous conserved non-coding microsatellite (CNMS markers, identified in the putative rice promoter sequences, for comparative physical mapping and understanding of evolutionary and gene regulatory complexities among rice and other members of the grass family. The divergence between long-grained aromatics and subspecies japonica was estimated to be more recent (0.004 Mya compared to short

  18. Cloning, nucleotide sequence and transcriptional analysis of the uvrA gene from Neisseria gonorrhoeae

    International Nuclear Information System (INIS)

    Black, C.G.; Fyfe, J.A.M.; Davies, J.K.

    1997-01-01

    A recombinant plasmid capable of restoring UV resistance to an Escherichia coli uvrA mutant was isolated from a genomic library of Neisseria gonorrhoeae. Sequence analysis revealed an open reading frame whose deduced amino acid sequence displayed significant similarity to those of the UvrA proteins of other bacterial species. A second open reading frame (ORF259) was identified upstream from, and in the opposite orientation to the gonococcal uvrA gene. Transcriptional fusions between portions of the gonococcal uvrA upstream region and a reporter gene were used to localise promoter activity in both E. coli and N. gonorrhoeae. The transcriptional starting points of uvrA and ORF259 were mapped in E. coli by primer extension analysis, and corresponding σ 70 promoters were identified. The arrangement of the uvrA-ORF259 intergenic region is similar to that of the gonococcal recA-aroD intergenic region. Both contain inverted copies of the 10 bp neisserial DNA uptake sequence situated between divergently transcribed genes. However, there is no evidence that either the uptake sequence or the proximity of the promoters influences expression of these genes. (author)

  19. Extracellular vesicle associated long non-coding RNAs functionally enhance cell viability

    Directory of Open Access Journals (Sweden)

    Chris Hewson

    2016-10-01

    Full Text Available Cells communicate with one another to create microenvironments and share resources. One avenue by which cells communicate is through the action of exosomes. Exosomes are extracellular vesicles that are released by one cell and taken up by neighbouring cells. But how exosomes instigate communication between cells has remained largely unknown. We present evidence here that particular long non-coding RNA molecules are preferentially packaged into exosomes. We also find that a specific class of these exosome associated non-coding RNAs functionally modulate cell viability by direct interactions with l-lactate dehydrogenase B (LDHB, high-mobility group protein 17 (HMG-17, and CSF2RB, proteins involved in metabolism, nucleosomal architecture and cell signalling respectively. Knowledge of this endogenous cell to cell pathway, those proteins interacting with exosome associated non-coding transcripts and their interacting domains, could lead to a better understanding of not only cell to cell interactions but also the development of exosome targeted approaches in patient specific cell-based therapies. Keywords: Non-coding RNA, Extracellular RNA, Exosomes, Retroelement, Pseudogene

  20. Progressive changes in non-coding RNA profile in leucocytes with age

    Science.gov (United States)

    Muñoz-Culla, Maider; Irizar, Haritz; Gorostidi, Ana; Alberro, Ainhoa; Osorio-Querejeta, Iñaki; Ruiz-Martínez, Javier; Olascoaga, Javier; de Munain, Adolfo López; Otaegui, David

    2017-01-01

    It has been observed that immune cell deterioration occurs in the elderly, as well as a chronic low-grade inflammation called inflammaging. These cellular changes must be driven by numerous changes in gene expression and in fact, both protein-coding and non-coding RNA expression alterations have been observed in peripheral blood mononuclear cells from elder people. In the present work we have studied the expression of small non-coding RNA (microRNA and small nucleolar RNA -snoRNA-) from healthy individuals from 24 to 79 years old. We have observed that the expression of 69 non-coding RNAs (56 microRNAs and 13 snoRNAs) changes progressively with chronological age. According to our results, the age range from 47 to 54 is critical given that it is the period when the expression trend (increasing or decreasing) of age-related small non-coding RNAs is more pronounced. Furthermore, age-related miRNAs regulate genes that are involved in immune, cell cycle and cancer-related processes, which had already been associated to human aging. Therefore, human aging could be studied as a result of progressive molecular changes, and different age ranges should be analysed to cover the whole aging process. PMID:28448962

  1. [Phylogenetic relationships of the species of Oxytropis DC. subg. Oxytropis and Phacoxytropis (Fabaceae) from Asian Russia inferred from the nucleotide sequence analysis of the intergenic spacers of the chloroplast genome].

    Science.gov (United States)

    Kholina, A B; Kozyrenko, M M; Artyukova, E V; Sandanov, D V; Andrianova, E A

    2016-08-01

    The nucleotide sequence analysis of trnH–psbA, trnL–trnF, and trnS–trnG intergenic spacer regions of chloroplast DNA performed in the representatives of the genus Oxytropis from Asian Russia provided clarification of the phylogenetic relationships of some species and sections in the subgenera Oxytropis and Phacoxytropis and in the genus Oxytropis as a whole. Only the section Mesogaea corresponds to the subgenus Phacoxytropis, while the section Janthina of the same subgenus groups together with the sections of the subgenus Oxytropis. The sections Chrysantha and Ortholoma of the subgenus Oxytropis are not only closely related to each other, but together with the section Mesogaea, they are grouped into the subgenus Phacoxytropis. It seems likely that the sections Chrysantha and Ortholoma should be assigned to the subgenus Phacoxytropis, and the section Janthina should be assigned to the subgenus Oxytropis. The molecular differences were identified between O. coerulea and O. mandshurica from the section Janthina that were indicative of considerable divergence of their chloroplast genomes and the species independence of the taxa. The species independence of O. czukotica belonging to the section Arctobia was also confirmed.

  2. Non-coding RNAs and plant male sterility: current knowledge and future prospects.

    Science.gov (United States)

    Mishra, Ankita; Bohra, Abhishek

    2018-02-01

    Latest outcomes assign functional role to non-coding (nc) RNA molecules in regulatory networks that confer male sterility to plants. Male sterility in plants offers great opportunity for improving crop performance through application of hybrid technology. In this respect, cytoplasmic male sterility (CMS) and sterility induced by photoperiod (PGMS)/temperature (TGMS) have greatly facilitated development of high-yielding hybrids in crops. Participation of non-coding (nc) RNA molecules in plant reproductive development is increasingly becoming evident. Recent breakthroughs in rice definitively associate ncRNAs with PGMS and TGMS. In case of CMS, the exact mechanism through which the mitochondrial ORFs exert influence on the development of male gametophyte remains obscure in several crops. High-throughput sequencing has enabled genome-wide discovery and validation of these regulatory molecules and their target genes, describing their potential roles performed in relation to CMS. Discovery of ncRNA localized in plant mtDNA with its possible implication in CMS induction is intriguing in this respect. Still, conclusive evidences linking ncRNA with CMS phenotypes are currently unavailable, demanding complementing genetic approaches like transgenics to substantiate the preliminary findings. Here, we review the recent literature on the contribution of ncRNAs in conferring male sterility to plants, with an emphasis on microRNAs. Also, we present a perspective on improved understanding about ncRNA-mediated regulatory pathways that control male sterility in plants. A refined understanding of plant male sterility would strengthen crop hybrid industry to deliver hybrids with improved performance.

  3. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  4. Coding and decoding libraries of sequence-defined functional copolymers synthesized via photoligation.

    Science.gov (United States)

    Zydziak, Nicolas; Konrad, Waldemar; Feist, Florian; Afonin, Sergii; Weidner, Steffen; Barner-Kowollik, Christopher

    2016-11-30

    Designing artificial macromolecules with absolute sequence order represents a considerable challenge. Here we report an advanced light-induced avenue to monodisperse sequence-defined functional linear macromolecules up to decamers via a unique photochemical approach. The versatility of the synthetic strategy-combining sequential and modular concepts-enables the synthesis of perfect macromolecules varying in chemical constitution and topology. Specific functions are placed at arbitrary positions along the chain via the successive addition of monomer units and blocks, leading to a library of functional homopolymers, alternating copolymers and block copolymers. The in-depth characterization of each sequence-defined chain confirms the precision nature of the macromolecules. Decoding of the functional information contained in the molecular structure is achieved via tandem mass spectrometry without recourse to their synthetic history, showing that the sequence information can be read. We submit that the presented photochemical strategy is a viable and advanced concept for coding individual monomer units along a macromolecular chain.

  5. BIRTH: a beam deposition code for non-circular tokamak plasmas

    International Nuclear Information System (INIS)

    Otsuka, Michio; Nagami, Masayuki; Matsuda, Toshiaki

    1982-09-01

    A new beam deposition code has been developed which is capable of calculating fast ion deposition profiles including the orbit correction. The code incorporates any injection geometry and a non-circular cross section plasma with a variable elongation and an outward shift of the magnetic flux surface. Typical cpu time on a DEC-10 computer is 10 - 20 seconds and 5 - 10 seconds with and without the orbit correction, respectively. This is shorter by an order of magnitude than that of other codes, e.g., Monte Carlo codes. The power deposition profile calculated by this code is in good agreement with that calculated by a Monte Carlo code. (author)

  6. Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples

    Directory of Open Access Journals (Sweden)

    Maley Carlo C

    2008-10-01

    Full Text Available Abstract Background Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. Results We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12 genomes. Virtually all possible (> 98% 12 bp oligomers appear in vertebrate genomes while 98% to D. melanogaster (12–17 bp, C. elegans (11–17 bp, A. thaliana (11–17 bp, S. cerevisiae (10–16 bp and E. coli (9–15 bp. Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. Conclusion Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect

  7. Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples

    Science.gov (United States)

    Liu, Zhandong; Venkatesh, Santosh S; Maley, Carlo C

    2008-01-01

    Background Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. Results We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while 98% to < 2% of possible oligomers in D. melanogaster (12–17 bp), C. elegans (11–17 bp), A. thaliana (11–17 bp), S. cerevisiae (10–16 bp) and E. coli (9–15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. Conclusion Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to

  8. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes. CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: 1 the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and 2 the high

  9. Characterization of non-coding DNA satellites associated with sweepoviruses (genus Begomovirus, Geminiviridae - definition of a distinct class of begomovirus-associated satellites

    Directory of Open Access Journals (Sweden)

    Gloria eLozano

    2016-02-01

    Full Text Available Begomoviruses (family Geminiviridae are whitefly-transmitted, plant-infecting single-stranded DNA viruses that cause crop losses throughout the warmer parts of the World. Sweepoviruses are a phylogenetically distinct group of begomoviruses that infect plants of the family Convolvulaceae, including sweet potato (Ipomoea batatas. Two classes of subviral molecules are often associated with begomoviruses, particularly in the Old World; the betasatellites and the alphasatellites. An analysis of sweet potato and Ipomoea indica samples from Spain and Merremia dissecta samples from Venezuela identified small non-coding subviral molecules in association with several distinct sweepoviruses. The sequences of 18 clones were obtained and found to be structurally similar to tomato leaf curl virus–satellite (ToLCV-sat, the first DNA satellite identified in association with a begomovirus, with a region with significant sequence identity to the conserved region of betasatellites, an A-rich sequence, a predicted stem-loop structure containing the nonanucleotide TAATATTAC, and a second predicted stem-loop. These sweepovirus-associated satellites join an increasing number of ToLCV-sat-like non-coding satellites identified recently. Although sharing some features with betasatellites, evidence is provided to suggest that the ToLCV-sat-like satellites are distinct from betasatellites and should be considered a separate class of satellites, for which the collective name deltasatellites is proposed.

  10. DNA methylation of miRNA coding sequences putatively associated with childhood obesity.

    Science.gov (United States)

    Mansego, M L; Garcia-Lacarte, M; Milagro, F I; Marti, A; Martinez, J A

    2017-02-01

    Epigenetic mechanisms may be involved in obesity onset and its consequences. The aim of the present study was to evaluate whether DNA methylation status in microRNA (miRNA) coding regions is associated with childhood obesity. DNA isolated from white blood cells of 24 children (identification sample: 12 obese and 12 non-obese) from the Grupo Navarro de Obesidad Infantil study was hybridized in a 450 K methylation microarray. Several CpGs whose DNA methylation levels were statistically different between obese and non-obese were validated by MassArray® in 95 children (validation sample) from the same study. Microarray analysis identified 16 differentially methylated CpGs between both groups (6 hypermethylated and 10 hypomethylated). DNA methylation levels in miR-1203, miR-412 and miR-216A coding regions significantly correlated with body mass index standard deviation score (BMI-SDS) and explained up to 40% of the variation of BMI-SDS. The network analysis identified 19 well-defined obesity-relevant biological pathways from the KEGG database. MassArray® validation identified three regions located in or near miR-1203, miR-412 and miR-216A coding regions differentially methylated between obese and non-obese children. The current work identified three CpG sites located in coding regions of three miRNAs (miR-1203, miR-412 and miR-216A) that were differentially methylated between obese and non-obese children, suggesting a role of miRNA epigenetic regulation in childhood obesity. © 2016 World Obesity Federation.

  11. Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

    Directory of Open Access Journals (Sweden)

    Rachel Caldwell

    2015-01-01

    Full Text Available There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length.

  12. An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

    Science.gov (United States)

    Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

    2011-01-01

    cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.

  13. The rhesus macaque is three times as diverse but more closely equivalent in damaging coding variation as compared to the human

    Directory of Open Access Journals (Sweden)

    Yuan Qiaoping

    2012-06-01

    Full Text Available Abstract Background As a model organism in biomedicine, the rhesus macaque (Macaca mulatta is the most widely used nonhuman primate. Although a draft genome sequence was completed in 2007, there has been no systematic genome-wide comparison of genetic variation of this species to humans. Comparative analysis of functional and nonfunctional diversity in this highly abundant and adaptable non-human primate could inform its use as a model for human biology, and could reveal how variation in population history and size alters patterns and levels of sequence variation in primates. Results We sequenced the mRNA transcriptome and H3K4me3-marked DNA regions in hippocampus from 14 humans and 14 rhesus macaques. Using equivalent methodology and sampling spaces, we identified 462,802 macaque SNPs, most of which were novel and disproportionately located in the functionally important genomic regions we had targeted in the sequencing. At least one SNP was identified in each of 16,797 annotated macaque genes. Accuracy of macaque SNP identification was conservatively estimated to be >90%. Comparative analyses using SNPs equivalently identified in the two species revealed that rhesus macaque has approximately three times higher SNP density and average nucleotide diversity as compared to the human. Based on this level of diversity, the effective population size of the rhesus macaque is approximately 80,000 which contrasts with an effective population size of less than 10,000 for humans. Across five categories of genomic regions, intergenic regions had the highest SNP density and average nucleotide diversity and CDS (coding sequences the lowest, in both humans and macaques. Although there are more coding SNPs (cSNPs per individual in macaques than in humans, the ratio of dN/dS is significantly lower in the macaque. Furthermore, the number of damaging nonsynonymous cSNPs (have damaging effects on protein functions from PolyPhen-2 prediction in the macaque is more

  14. An Explicit Construction of a sequence of codes attaining the Tsfasman-Vladut-Zink Bound:The first steps

    DEFF Research Database (Denmark)

    Høholdt, Tom; Voss, Cornelia

    1997-01-01

    We present a sequence of codes attaining the Tsfasman-Vladut-Zink bound. The construction is based on the tower of Artin-Schreier extensions described by Garcia and Stichtenoth (1995). We also determine the dual codes. The first steps of the constructions are explicitly given as generator matrices...

  15. Uncovering layers of human RNA polymerase II transcription

    DEFF Research Database (Denmark)

    Jensen, Torben Heick

    In recent years DNA microarray and high-throughput sequencing technologies have challenged the “gene-centric” view that pre-mRNA is the only RNA species transcribed off protein-coding genes. Instead unorthodox transcription from within genic- and intergenic regions has been demonstrated to occur...

  16. Comparison of the complete mitochondrial genome of the stonefly Sweltsa longistyla (Plecoptera: Chloroperlidae) with mitogenomes of three other stoneflies.

    Science.gov (United States)

    Chen, Zhi-Teng; Du, Yu-Zhou

    2015-03-01

    The complete mitochondrial genome of the stonefly, Sweltsa longistyla Wu (Plecoptera: Chloroperlidae), was sequenced in this study. The mitogenome of S. longistyla is 16,151bp and contains 37 genes including 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes, and a large non-coding region. S. longistyla, Pteronarcys princeps Banks, Kamimuria wangi Du and Cryptoperla stilifera Sivec belong to the Plecoptera, and the gene order and orientation of their mitogenomes were similar. The overall AT content for the four stoneflies was below 72%, and the AT content of tRNA genes was above 69%. The four genomes were compact and contained only 65-127bp of non-coding intergenic DNAs. Overlapping nucleotides existed in all four genomes and ranged from 24 (P. princeps) to 178bp (K. wangi). There was a 7-bp motif ('ATGATAA') of overlapping DNA and an 8-bp motif (AAGCCTTA) conserved in three stonefly species (P. princeps, K. wangi and C. stilifera). The control regions of four stoneflies contained a stem-loop structure. Four conserved sequence blocks (CSBs) were present in the A+T-rich regions of all four stoneflies. Copyright © 2014 Elsevier B.V. All rights reserved.

  17. Complex organisation and structure of the ghrelin antisense strand gene GHRLOS, a candidate non-coding RNA gene

    Directory of Open Access Journals (Sweden)

    Herington Adrian C

    2008-10-01

    Full Text Available Abstract Background The peptide hormone ghrelin has many important physiological and pathophysiological roles, including the stimulation of growth hormone (GH release, appetite regulation, gut motility and proliferation of cancer cells. We previously identified a gene on the opposite strand of the ghrelin gene, ghrelinOS (GHRLOS, which spans the promoter and untranslated regions of the ghrelin gene (GHRL. Here we further characterise GHRLOS. Results We have described GHRLOS mRNA isoforms that extend over 1.4 kb of the promoter region and 106 nucleotides of exon 4 of the ghrelin gene, GHRL. These GHRLOS transcripts initiate 4.8 kb downstream of the terminal exon 4 of GHRL and are present in the 3' untranslated exon of the adjacent gene TATDN2 (TatD DNase domain containing 2. Interestingly, we have also identified a putative non-coding TATDN2-GHRLOS chimaeric transcript, indicating that GHRLOS RNA biogenesis is extremely complex. Moreover, we have discovered that the 3' region of GHRLOS is also antisense, in a tail-to-tail fashion to a novel terminal exon of the neighbouring SEC13 gene, which is important in protein transport. Sequence analyses revealed that GHRLOS is riddled with stop codons, and that there is little nucleotide and amino-acid sequence conservation of the GHRLOS gene between vertebrates. The gene spans 44 kb on 3p25.3, is extensively spliced and harbours multiple variable exons. We have also investigated the expression of GHRLOS and found evidence of differential tissue expression. It is highly expressed in tissues which are emerging as major sites of non-coding RNA expression (the thymus, brain, and testis, as well as in the ovary and uterus. In contrast, very low levels were found in the stomach where sense, GHRL derived RNAs are highly expressed. Conclusion GHRLOS RNA transcripts display several distinctive features of non-coding (ncRNA genes, including 5' capping, polyadenylation, extensive splicing and short open reading

  18. ChIPBase: a database for decoding the transcriptional regulation of long non-coding RNA and microRNA genes from ChIP-Seq data.

    Science.gov (United States)

    Yang, Jian-Hua; Li, Jun-Hao; Jiang, Shan; Zhou, Hui; Qu, Liang-Hu

    2013-01-01

    Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) represent two classes of important non-coding RNAs in eukaryotes. Although these non-coding RNAs have been implicated in organismal development and in various human diseases, surprisingly little is known about their transcriptional regulation. Recent advances in chromatin immunoprecipitation with next-generation DNA sequencing (ChIP-Seq) have provided methods of detecting transcription factor binding sites (TFBSs) with unprecedented sensitivity. In this study, we describe ChIPBase (http://deepbase.sysu.edu.cn/chipbase/), a novel database that we have developed to facilitate the comprehensive annotation and discovery of transcription factor binding maps and transcriptional regulatory relationships of lncRNAs and miRNAs from ChIP-Seq data. The current release of ChIPBase includes high-throughput sequencing data that were generated by 543 ChIP-Seq experiments in diverse tissues and cell lines from six organisms. By analysing millions of TFBSs, we identified tens of thousands of TF-lncRNA and TF-miRNA regulatory relationships. Furthermore, two web-based servers were developed to annotate and discover transcriptional regulatory relationships of lncRNAs and miRNAs from ChIP-Seq data. In addition, we developed two genome browsers, deepView and genomeView, to provide integrated views of multidimensional data. Moreover, our web implementation supports diverse query types and the exploration of TFs, lncRNAs, miRNAs, gene ontologies and pathways.

  19. Coding sequence of human rho cDNAs clone 6 and clone 9

    Energy Technology Data Exchange (ETDEWEB)

    Chardin, P; Madaule, P; Tavitian, A

    1988-03-25

    The authors have isolated human cDNAs including the complete coding sequence for two rho proteins corresponding to the incomplete isolates previously described as clone 6 and clone 9. The deduced a.a. sequences, when compared to the a.a. sequence deduced from clone 12 cDNA, show that there are in human at least three highly homologous rho genes. They suggest that clone 12 be named rhoA, clone 6 : rhoB and clone 9 : rhoC. RhoA, B and C proteins display approx. 30% a.a. identity with ras proteins,. mainly clustered in four highly homologous internal regions corresponding to the GTP binding site; however at least one significant difference is found; the 3 rho proteins have an Alanine in position corresponding to ras Glycine 13, suggesting that rho and ras proteins might have slightly different biochemical properties.

  20. Application of Melcor code for the calculo of TMLB sequence in PWR with natural circulating into the vessel

    International Nuclear Information System (INIS)

    Marten-Fuertes, F.

    1995-01-01

    The use of computer codes to analyze the phenomena of severe accidents is very important to take decisions in Nuclear Safety. This paper presents the MELCOR code used to calculate the TMLB sequence of PWR with natural circulation into the vessels. The main goal of this code is its application for the PSA (probabilistic safety analysis)

  1. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls

    DEFF Research Database (Denmark)

    Jason, Flannick; Fuchsberger, Christian; Mahajan, Anubha

    2017-01-01

    variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced...... individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics...... from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D....

  2. A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

    Energy Technology Data Exchange (ETDEWEB)

    Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.; Jansen, Robert K.

    2006-01-20

    Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnLtrnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Eusterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. Later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. In addition, since

  3. PARN and TOE1 Constitute a 3′ End Maturation Module for Nuclear Non-coding RNAs

    Directory of Open Access Journals (Sweden)

    Ahyeon Son

    2018-04-01

    Full Text Available Summary: Poly(A-specific ribonuclease (PARN and target of EGR1 protein 1 (TOE1 are nuclear granule-associated deadenylases, whose mutations are linked to multiple human diseases. Here, we applied mTAIL-seq and RNA sequencing (RNA-seq to systematically identify the substrates of PARN and TOE1 and elucidate their molecular functions. We found that PARN and TOE1 do not modulate the length of mRNA poly(A tails. Rather, they promote the maturation of nuclear small non-coding RNAs (ncRNAs. PARN and TOE1 act redundantly on some ncRNAs, most prominently small Cajal body-specific RNAs (scaRNAs. scaRNAs are strongly downregulated when PARN and TOE1 are compromised together, leading to defects in small nuclear RNA (snRNA pseudouridylation. They also function redundantly in the biogenesis of telomerase RNA component (TERC, which shares sequence motifs found in H/ACA box scaRNAs. Our findings extend the knowledge of nuclear ncRNA biogenesis, and they provide insights into the pathology of PARN/TOE1-associated genetic disorders whose therapeutic treatments are currently unavailable. : By analyzing the 3′ termini of transcriptome, Son et al. reveal the targets of PARN and TOE1, two nuclear deadenylases with disease associations. Both deadenylases are involved in nuclear small non-coding RNA maturation, but not in mRNA deadenylation. Their combined activity is particularly important for biogenesis of scaRNAs and TERC. Keywords: PARN, TOE1, CAF1Z, deadenylase, 3′ end maturation, adenylation, deadenylation, scaRNA, TERC

  4. Prediction Error During Functional and Non-Functional Action Sequences

    DEFF Research Database (Denmark)

    Nielbo, Kristoffer Laigaard; Sørensen, Jesper

    2013-01-01

    recurrent networks were made and the results are presented in this article. The simulations show that non-functional action sequences do indeed increase prediction error, but that context representations, such as abstract goal information, can modulate the error signal considerably. It is also shown...... that the networks are sensitive to boundaries between sequences in both functional and non-functional actions....

  5. At the intersection of non-coding transcription, DNA repair, chromatin structure, and cellular senescence

    Directory of Open Access Journals (Sweden)

    Ryosuke eOhsawa

    2013-07-01

    Full Text Available It is well accepted that non-coding RNAs play a critical role in regulating gene expression. Recent paradigm-setting studies are now revealing that non-coding RNAs, other than microRNAs, also play intriguing roles in the maintenance of chromatin structure, in the DNA damage response, and in adult human stem cell aging. In this review, we will discuss the complex inter-dependent relationships among non-coding RNA transcription, maintenance of genomic stability, chromatin structure and adult stem cell senescence. DNA damage-induced non-coding RNAs transcribed in the vicinity of the DNA break regulate recruitment of the DNA damage machinery and DNA repair efficiency. We will discuss the correlation between non-coding RNAs and DNA damage repair efficiency and the potential role of changing chromatin structures around double-strand break sites. On the other hand, induction of non-coding RNA transcription from the repetitive Alu elements occurs during human stem cell aging and hinders efficient DNA repair causing entry into senescence. We will discuss how this fine balance between transcription and genomic instability may be regulated by the dramatic changes to chromatin structure that accompany cellular senescence.

  6. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    Science.gov (United States)

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  7. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    Directory of Open Access Journals (Sweden)

    Kacy L Gordon

    2015-05-01

    Full Text Available Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2 from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  8. Junk DNA and the long non-coding RNA twist in cancer genetics

    NARCIS (Netherlands)

    H. Ling (Hui); K. Vincent; M. Pichler; R. Fodde (Riccardo); I. Berindan-Neagoe (Ioana); F.J. Slack (Frank); G.A. Calin (George)

    2015-01-01

    textabstractThe central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs

  9. Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum and Comparative Analysis with Common Buckwheat (F. esculentum.

    Directory of Open Access Journals (Sweden)

    Kwang-Soo Cho

    Full Text Available We report the chloroplast (cp genome sequence of tartary buckwheat (Fagopyrum tataricum obtained by next-generation sequencing technology and compared this with the previously reported common buckwheat (F. esculentum ssp. ancestrale cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation at tandem and palindromic repeat frequencies and junction areas. A total of seven InDels (around 100 bp were found within the intergenic sequences and the ycf1 gene. Copy number variation of the 21-bp tandem repeat varied in F. tataricum (four repeats and F. esculentum (one repeat, and the InDel of the ycf1 gene was 63 bp long. Nucleotide and amino acid have highly conserved coding sequence with about 98% homology and four genes--rpoC2, ycf3, accD, and clpP--have high synonymous (Ks value. PCR based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum.

  10. An expanding universe of the non-coding genome in cancer biology.

    Science.gov (United States)

    Xue, Bin; He, Lin

    2014-06-01

    Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of the cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genomes consist of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  11. Design of ACM system based on non-greedy punctured LDPC codes

    Science.gov (United States)

    Lu, Zijun; Jiang, Zihong; Zhou, Lin; He, Yucheng

    2017-08-01

    In this paper, an adaptive coded modulation (ACM) scheme based on rate-compatible LDPC (RC-LDPC) codes was designed. The RC-LDPC codes were constructed by a non-greedy puncturing method which showed good performance in high code rate region. Moreover, the incremental redundancy scheme of LDPC-based ACM system over AWGN channel was proposed. By this scheme, code rates vary from 2/3 to 5/6 and the complication of the ACM system is lowered. Simulations show that more and more obvious coding gain can be obtained by the proposed ACM system with higher throughput.

  12. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Full Text Available Non-coding RNAs (ncRNAs are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs. A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  13. Low-Bandwidth and Non-Compute Intensive Remote Identification of Microbes from Raw Sequencing Reads

    DEFF Research Database (Denmark)

    Gautier, Laurent; Lund, Ole

    2013-01-01

    , allowing a fully automated processing of sequencing data and routine instant quality check of sequencing runs from desktop sequencers. A web access is available at http://tapir.cbs.dtu.dk. The source code for a python command-line client, a server, and supplementary data are available at http://bit.ly/1aURxkc....

  14. Divergent evolutionary rates in vertebrate and mammalian specific conserved non-coding elements (CNEs) in echolocating mammals.

    Science.gov (United States)

    Davies, Kalina T J; Tsagkogeorga, Georgia; Rossiter, Stephen J

    2014-12-19

    The majority of DNA contained within vertebrate genomes is non-coding, with a certain proportion of this thought to play regulatory roles during development. Conserved Non-coding Elements (CNEs) are an abundant group of putative regulatory sequences that are highly conserved across divergent groups and thus assumed to be under strong selective constraint. Many CNEs may contain regulatory factor binding sites, and their frequent spatial association with key developmental genes - such as those regulating sensory system development - suggests crucial roles in regulating gene expression and cellular patterning. Yet surprisingly little is known about the molecular evolution of CNEs across diverse mammalian taxa or their role in specific phenotypic adaptations. We examined 3,110 vertebrate-specific and ~82,000 mammalian-specific CNEs across 19 and 9 mammalian orders respectively, and tested for changes in the rate of evolution of CNEs located in the proximity of genes underlying the development or functioning of auditory systems. As we focused on CNEs putatively associated with genes underlying the development/functioning of auditory systems, we incorporated echolocating taxa in our dataset because of their highly specialised and derived auditory systems. Phylogenetic reconstructions of concatenated CNEs broadly recovered accepted mammal relationships despite high levels of sequence conservation. We found that CNE substitution rates were highest in rodents and lowest in primates, consistent with previous findings. Comparisons of CNE substitution rates from several genomic regions containing genes linked to auditory system development and hearing revealed differences between echolocating and non-echolocating taxa. Wider taxonomic sampling of four CNEs associated with the homeobox genes Hmx2 and Hmx3 - which are required for inner ear development - revealed family-wise variation across diverse bat species. Specifically within one family of echolocating bats that utilise

  15. Episodic sequence memory is supported by a theta-gamma phase code

    OpenAIRE

    Heusser, Andrew C.; Poeppel, David; Ezzyat, Youssef; Davachi, Lila

    2016-01-01

    The meaning we derive from our experiences is not a simple static extraction of the elements, but is largely based on the order in which those elements occur. Models propose that sequence encoding is supported by interactions between high and low frequency oscillations, such that elements within an experience are represented by neural cell assemblies firing at higher frequencies (i.e. gamma) and sequential order is coded by the specific timing of firing with respect to a lower frequency oscil...

  16. u-Constacyclic codes over F_p+u{F}_p and their applications of constructing new non-binary quantum codes

    Science.gov (United States)

    Gao, Jian; Wang, Yongkang

    2018-01-01

    Structural properties of u-constacyclic codes over the ring F_p+u{F}_p are given, where p is an odd prime and u^2=1. Under a special Gray map from F_p+u{F}_p to F_p^2, some new non-binary quantum codes are obtained by this class of constacyclic codes.

  17. Complete chloroplast DNA sequence from a Korean endemic genus, Megaleranthis saniculifolia, and its evolutionary implications.

    Science.gov (United States)

    Kim, Young-Kyu; Park, Chong-wook; Kim, Ki-Joong

    2009-03-31

    The chloroplast DNA sequences of Megaleranthis saniculifolia, an endemic and monotypic endangered plant species, were completed in this study (GenBank FJ597983). The genome is 159,924 bp in length. It harbors a pair of IR regions consisting of 26,608 bp each. The lengths of the LSC and SSC regions are 88,326 bp and 18,382 bp, respectively. The structural organizations, gene and intron contents, gene orders, AT contents, codon usages, and transcription units of the Megaleranthis chloroplast genome are similar to those of typical land plant cp DNAs. However, the detailed features of Megaleranthis chloroplast genomes are substantially different from that of Ranunculus, which belongs to the same family, the Ranunculaceae. First, the Megaleranthis cp DNA was 4,797 bp longer than that of Ranunculus due to an expanded IR region into the SSC region and duplicated sequence elements in several spacer regions of the Megaleranthis cp genome. Second, the chloroplast genomes of Megaleranthis and Ranunculus evidence 5.6% sequence divergence in the coding regions, 8.9% sequence divergence in the intron regions, and 18.7% sequence divergence in the intergenic spacer regions, respectively. In both the coding and noncoding regions, average nucleotide substitution rates differed markedly, depending on the genome position. Our data strongly implicate the positional effects of the evolutionary modes of chloroplast genes. The genes evidencing higher levels of base substitutions also have higher incidences of indel mutations and low Ka/Ks ratios. A total of 54 simple sequence repeat loci were identified from the Megaleranthis cp genome. The existence of rich cp SSR loci in the Megaleranthis cp genome provides a rare opportunity to study the population genetic structures of this endangered species. Our phylogenetic trees based on the two independent markers, the nuclear ITS and chloroplast matK sequences, strongly support the inclusion of the Megaleranthis to the Trollius. Therefore, our

  18. Long non-coding RNA expression profile in cervical cancer tissues

    Science.gov (United States)

    Zhu, Hua; Chen, Xiangjian; Hu, Yan; Shi, Zhengzheng; Zhou, Qing; Zheng, Jingjie; Wang, Yifeng

    2017-01-01

    Cervical cancer (CC), one of the most common types of cancer of the female population, presents an enormous challenge in diagnosis and treatment. Long non-coding (lnc)RNAs, non-coding (nc)RNAs with length >200 nucleotides, have been identified to be associated with multiple types of cancer, including CC. This class of nc transcripts serves an important role in tumor suppression and oncogenic signaling pathways. In the present study, the microarray method was used to obtain the expression profile of lncRNAs and protein-coding mRNAs and to compare the expression of lncRNAs between CC tissues and corresponding adjacent non-cancerous tissues in order to screen potential lncRNAs for associations with CC. Overall, 3356 lncRNAs with significantly different expression pattern in CC tissues compared with adjacent non-cancerous tissues were identified, while 1,857 of them were upregulated. These differentially expressed lncRNAs were additionally classified into 5 subgroups. Reverse transcription quantitative polymerase chain reactions were performed to validate the expression pattern of 5 random selected lncRNAs, and 2lncRNAs were identified to have significantly different expression in CC samples compared with adjacent non-cancerous tissues. This finding suggests that those lncRNAs with different expression may serve important roles in the development of CC, and the expression data may provide information for additional study on the involvement of lncRNAs in CC. PMID:28789353

  19. The Solanum commersonii Genome Sequence Provides Insights into Adaptation to Stress Conditions and Genome Evolution of Wild Potato Relatives

    Science.gov (United States)

    Aversano, Riccardo; Contaldi, Felice; Ercolano, Maria Raffaella; Grosso, Valentina; Iorizzo, Massimo; Tatino, Filippo; Xumerle, Luciano; Dal Molin, Alessandra; Avanzato, Carla; Ferrarini, Alberto; Delledonne, Massimo; Sanseverino, Walter; Cigliano, Riccardo Aiese; Capella-Gutierrez, Salvador; Gabaldón, Toni; Frusciante, Luigi; Bradeen, James M.; Carputo, Domenico

    2015-01-01

    Here, we report the draft genome sequence of Solanum commersonii, which consists of ∼830 megabases with an N50 of 44,303 bp anchored to 12 chromosomes, using the potato (Solanum tuberosum) genome sequence as a reference. Compared with potato, S. commersonii shows a striking reduction in heterozygosity (1.5% versus 53 to 59%), and differences in genome sizes were mainly due to variations in intergenic sequence length. Gene annotation by ab initio prediction supported by RNA-seq data produced a catalog of 1703 predicted microRNAs, 18,882 long noncoding RNAs of which 20% are shown to target cold-responsive genes, and 39,290 protein-coding genes with a significant repertoire of nonredundant nucleotide binding site-encoding genes and 126 cold-related genes that are lacking in S. tuberosum. Phylogenetic analyses indicate that domesticated potato and S. commersonii lineages diverged ∼2.3 million years ago. Three duplication periods corresponding to genome enrichment for particular gene families related to response to salt stress, water transport, growth, and defense response were discovered. The draft genome sequence of S. commersonii substantially increases our understanding of the domesticated germplasm, facilitating translation of acquired knowledge into advances in crop stability in light of global climate and environmental changes. PMID:25873387

  20. Code-Switching to Know a TL Equivalent of an L1 Word: Request-Provision-Acknowledgement (RPA) Sequence

    Science.gov (United States)

    Lucero, Edgar

    2011-01-01

    This article focuses on the learner's use of Code-switching to learn the TL (Target Language) equivalent of an L1 word. The interactional pattern that this situation creates defines the Request-Provision-Acknowledgement (RPA) sequence. The article explains each of the turns of the sequence under the combination of the Ethnomethodological…

  1. Transduplication resulted in the incorporation of two protein-coding sequences into the Turmoil-1 transposable element of C. elegans

    Directory of Open Access Journals (Sweden)

    Pupko Tal

    2008-10-01

    Full Text Available Abstract Transposable elements may acquire unrelated gene fragments into their sequences in a process called transduplication. Transduplication of protein-coding genes is common in plants, but is unknown of in animals. Here, we report that the Turmoil-1 transposable element in C. elegans has incorporated two protein-coding sequences into its inverted terminal repeat (ITR sequences. The ITRs of Turmoil-1 contain a conserved RNA recognition motif (RRM that originated from the rsp-2 gene and a fragment from the protein-coding region of the cpg-3 gene. We further report that an open reading frame specific to C. elegans may have been created as a result of a Turmoil-1 insertion. Mutations at the 5' splice site of this open reading frame may have reactivated the transduplicated RRM motif. Reviewers This article was reviewed by Dan Graur and William Martin. For the full reviews, please go to the Reviewers' Reports section.

  2. Cryptographic pseudo-random sequences from the chaotic Hénon ...

    Indian Academy of Sciences (India)

    dimensional discrete-time Hénon map is proposed. Properties of the proposed sequences pertaining to linear complexity, linear complexity profile, correlation and auto-correlation are investigated. All these properties of the sequences suggest a ...

  3. Sequence Coding and Search System for licensee event reports: user's guide. Volume 1, Revision 1

    International Nuclear Information System (INIS)

    Greene, N.M.; Mays, G.T.; Johnson, M.P.

    1985-04-01

    Operating experience data from nuclear power plants are essential for safety and reliability analyses, especially analyses of trends and patterns. The licensee event reports (LERs) that are submitted to the Nuclear Regulatory Commission (NRC) by the nuclear power plant utilities contain much of this data. The NRC's Office for Analysis and Evaluation of Operational Data (AEOD) has developed, under contract with NSIC, a system for codifying the events reported in the LERs. The primary objective of the Sequence Coding and Search System (SCSS) is to reduce the descriptive text of the LERs to coded sequences that are both computer-readable and computer-searchable. This system provides a structured format for detailed coding of component, system, and unit effects as well as personnel errors. The database contains all current LERs submitted by nuclear power plant utilities for events occurring since 1981 and is updated on a continual basis. This four volume report documents and describes SCSS in detail. Volume 1 is a User's Guide for searching the SCSS database. This volume contains updated material through February 1985 of the working version of ORNL/NSIC-223, Vol. 1

  4. Nucleotide sequence of the melA gene, coding for alpha-galactosidase in Escherichia coli K-12.

    OpenAIRE

    Liljeström, P L; Liljeström, P

    1987-01-01

    Melibiose uptake and hydrolysis in E.coli is performed by the MelB and MelA proteins, respectively. We report the cloning and sequencing of the melA gene. The nucleotide sequence data showed that melA codes for a 450 amino acid long protein with a molecular weight of 50.6 kd. The sequence data also supported the assumption that the mel locus forms an operon with melA in proximal position. A comparison of MelA with alpha-galactosidase proteins from yeast and human origin showed that these prot...

  5. Long non-coding RNA discovery across the genus anopheles reveals conserved secondary structures within and beyond the Gambiae complex.

    Science.gov (United States)

    Jenkins, Adam M; Waterhouse, Robert M; Muskavitch, Marc A T

    2015-04-23

    Long non-coding RNAs (lncRNAs) have been defined as mRNA-like transcripts longer than 200 nucleotides that lack significant protein-coding potential, and many of them constitute scaffolds for ribonucleoprotein complexes with critical roles in epigenetic regulation. Various lncRNAs have been implicated in the modulation of chromatin structure, transcriptional and post-transcriptional gene regulation, and regulation of genomic stability in mammals, Caenorhabditis elegans, and Drosophila melanogaster. The purpose of this study is to identify the lncRNA landscape in the malaria vector An. gambiae and assess the evolutionary conservation of lncRNAs and their secondary structures across the Anopheles genus. Using deep RNA sequencing of multiple Anopheles gambiae life stages, we have identified 2,949 lncRNAs and more than 300 previously unannotated putative protein-coding genes. The lncRNAs exhibit differential expression profiles across life stages and adult genders. We find that across the genus Anopheles, lncRNAs display much lower sequence conservation than protein-coding genes. Additionally, we find that lncRNA secondary structure is highly conserved within the Gambiae complex, but diverges rapidly across the rest of the genus Anopheles. This study offers one of the first lncRNA secondary structure analyses in vector insects. Our description of lncRNAs in An. gambiae offers the most comprehensive genome-wide insights to date into lncRNAs in this vector mosquito, and defines a set of potential targets for the development of vector-based interventions that may further curb the human malaria burden in disease-endemic countries.

  6. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  7. Long Non-Coding RNA in Cancer

    Directory of Open Access Journals (Sweden)

    Damjan Glavač

    2013-02-01

    Full Text Available Long non-coding RNAs (lncRNAs are pervasively transcribed in the genome and are emerging as new players in tumorigenesis due to their various functions in transcriptional, posttranscriptional and epigenetic mechanisms of gene regulation. LncRNAs are deregulated in a number of cancers, demonstrating both oncogenic and tumor suppressive roles, thus suggesting their aberrant expression may be a substantial contributor in cancer development. In this review, we will summarize their emerging role in human cancer and discuss their perspectives in diagnostics as potential biomarkers.

  8. Phylogenetic footprinting of non-coding RNA: hammerhead ribozyme sequences in a satellite DNA family of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae

    Directory of Open Access Journals (Sweden)

    Venanzetti Federica

    2010-01-01

    Full Text Available Abstract Background The great variety in sequence, length, complexity, and abundance of satellite DNA has made it difficult to ascribe any function to this genome component. Recent studies have shown that satellite DNA can be transcribed and be involved in regulation of chromatin structure and gene expression. Some satellite DNAs, such as the pDo500 sequence family in Dolichopoda cave crickets, have a catalytic hammerhead (HH ribozyme structure and activity embedded within each repeat. Results We assessed the phylogenetic footprints of the HH ribozyme within the pDo500 sequences from 38 different populations representing 12 species of Dolichopoda. The HH region was significantly more conserved than the non-hammerhead (NHH region of the pDo500 repeat. In addition, stems were more conserved than loops. In stems, several compensatory mutations were detected that maintain base pairing. The core region of the HH ribozyme was affected by very few nucleotide substitutions and the cleavage position was altered only once among 198 sequences. RNA folding of the HH sequences revealed that a potentially active HH ribozyme can be found in most of the Dolichopoda populations and species. Conclusions The phylogenetic footprints suggest that the HH region of the pDo500 sequence family is selected for function in Dolichopoda cave crickets. However, the functional role of HH ribozymes in eukaryotic organisms is unclear. The possible functions have been related to trans cleavage of an RNA target by a ribonucleoprotein and regulation of gene expression. Whether the HH ribozyme in Dolichopoda is involved in similar functions remains to be investigated. Future studies need to demonstrate how the observed nucleotide changes and evolutionary constraint have affected the catalytic efficiency of the hammerhead.

  9. A Case for Dynamic Reverse-code Generation to Debug Non-deterministic Programs

    Directory of Open Access Journals (Sweden)

    Jooyong Yi

    2013-09-01

    Full Text Available Backtracking (i.e., reverse execution helps the user of a debugger to naturally think backwards along the execution path of a program, and thinking backwards makes it easy to locate the origin of a bug. So far backtracking has been implemented mostly by state saving or by checkpointing. These implementations, however, inherently do not scale. Meanwhile, a more recent backtracking method based on reverse-code generation seems promising because executing reverse code can restore the previous states of a program without state saving. In the literature, there can be found two methods that generate reverse code: (a static reverse-code generation that pre-generates reverse code through static analysis before starting a debugging session, and (b dynamic reverse-code generation that generates reverse code by applying dynamic analysis on the fly during a debugging session. In particular, we espoused the latter one in our previous work to accommodate non-determinism of a program caused by e.g., multi-threading. To demonstrate the usefulness of our dynamic reverse-code generation, this article presents a case study of various backtracking methods including ours. We compare the memory usage of various backtracking methods in a simple but nontrivial example, a bounded-buffer program. In the case of non-deterministic programs such as this bounded-buffer program, our dynamic reverse-code generation outperforms the existing backtracking methods in terms of memory efficiency.

  10. Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.

    Science.gov (United States)

    Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M; Agarwala, Vineeta; Gaulton, Kyle J; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Dennis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana Cn; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Altshuler, David; Burtt, Noël P; Florez, Jose C; Boehnke, Michael; McCarthy, Mark I

    2017-12-19

    To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.

  11. CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC

    International Nuclear Information System (INIS)

    Congrains, Ada; Kamide, Kei; Katsuya, Tomohiro; Yasuda, Osamu; Oguro, Ryousuke; Yamamoto, Koichi; Ohishi, Mitsuru; Rakugi, Hiromi

    2012-01-01

    Highlights: ► ANRIL maps in the strongest susceptibility locus for cardiovascular disease. ► Silencing of ANRIL leads to altered expression of tissue remodeling-related genes. ► The effects of ANRIL on gene expression are splicing variant specific. ► ANRIL affects progression of cardiovascular disease by regulating proliferation and apoptosis pathways. -- Abstract: ANRIL is a newly discovered non-coding RNA lying on the strongest genetic susceptibility locus for cardiovascular disease (CVD) in the chromosome 9p21 region. Genome-wide association studies have been linking polymorphisms in this locus with CVD and several other major diseases such as diabetes and cancer. The role of this non-coding RNA in atherosclerosis progression is still poorly understood. In this study, we investigated the implication of ANRIL in the modulation of gene sets directly involved in atherosclerosis. We designed and tested siRNA sequences to selectively target two exons (exon 1 and exon 19) of the transcript and successfully knocked down expression of ANRIL in human aortic vascular smooth muscle cells (HuAoVSMC). We used a pathway-focused RT-PCR array to profile gene expression changes caused by ANRIL knock down. Notably, the genes affected by each of the siRNAs were different, suggesting that different splicing variants of ANRIL might have distinct roles in cell physiology. Our results suggest that ANRIL splicing variants play a role in coordinating tissue remodeling, by modulating the expression of genes involved in cell proliferation, apoptosis, extra-cellular matrix remodeling and inflammatory response to finally impact in the risk of cardiovascular disease and other pathologies.

  12. Primate-specific spliced PMCHL RNAs are non-protein coding in human and macaque tissues

    Directory of Open Access Journals (Sweden)

    Delerue-Audegond Audrey

    2008-12-01

    Full Text Available Abstract Background Brain-expressed genes that were created in primate lineage represent obvious candidates to investigate molecular mechanisms that contributed to neural reorganization and emergence of new behavioural functions in Homo sapiens. PMCHL1 arose from retroposition of a pro-melanin-concentrating hormone (PMCH antisense mRNA on the ancestral human chromosome 5p14 when platyrrhines and catarrhines diverged. Mutations before divergence of hylobatidae led to creation of new exons and finally PMCHL1 duplicated in an ancestor of hominids to generate PMCHL2 at the human chromosome 5q13. A complex pattern of spliced and unspliced PMCHL RNAs were found in human brain and testis. Results Several novel spliced PMCHL transcripts have been characterized in human testis and fetal brain, identifying an additional exon and novel splice sites. Sequencing of PMCHL genes in several non-human primates allowed to carry out phylogenetic analyses revealing that the initial retroposition event took place within an intron of the brain cadherin (CDH12 gene, soon after platyrrhine/catarrhine divergence, i.e. 30–35 Mya, and was concomitant with the insertion of an AluSg element. Sequence analysis of the spliced PMCHL transcripts identified only short ORFs of less than 300 bp, with low (VMCH-p8 and protein variants or no evolutionary conservation. Western blot analyses of human and macaque tissues expressing PMCHL RNA failed to reveal any protein corresponding to VMCH-p8 and protein variants encoded by spliced transcripts. Conclusion Our present results improve our knowledge of the gene structure and the evolutionary history of the primate-specific chimeric PMCHL genes. These genes produce multiple spliced transcripts, bearing short, non-conserved and apparently non-translated ORFs that may function as mRNA-like non-coding RNAs.

  13. Long Non-Coding RNAs: A Novel Paradigm for Toxicology.

    Science.gov (United States)

    Dempsey, Joseph L; Cui, Julia Yue

    2017-01-01

    Long non-coding RNAs (lncRNAs) are over 200 nucleotides in length and are transcribed from the mammalian genome in a tissue-specific and developmentally regulated pattern. There is growing recognition that lncRNAs are novel biomarkers and/or key regulators of toxicological responses in humans and animal models. Lacking protein-coding capacity, the numerous types of lncRNAs possess a myriad of transcriptional regulatory functions that include cis and trans gene expression, transcription factor activity, chromatin remodeling, imprinting, and enhancer up-regulation. LncRNAs also influence mRNA processing, post-transcriptional regulation, and protein trafficking. Dysregulation of lncRNAs has been implicated in various human health outcomes such as various cancers, Alzheimer's disease, cardiovascular disease, autoimmune diseases, as well as intermediary metabolism such as glucose, lipid, and bile acid homeostasis. Interestingly, emerging evidence in the literature over the past five years has shown that lncRNA regulation is impacted by exposures to various chemicals such as polycyclic aromatic hydrocarbons, benzene, cadmium, chlorpyrifos-methyl, bisphenol A, phthalates, phenols, and bile acids. Recent technological advancements, including next-generation sequencing technologies and novel computational algorithms, have enabled the profiling and functional characterizations of lncRNAs on a genomic scale. In this review, we summarize the biogenesis and general biological functions of lncRNAs, highlight the important roles of lncRNAs in human diseases and especially during the toxicological responses to various xenobiotics, evaluate current methods for identifying aberrant lncRNA expression and molecular target interactions, and discuss the potential to implement these tools to address fundamental questions in toxicology. © The Author 2016. Published by Oxford University Press on behalf of the Society of Toxicology. All rights reserved. For Permissions, please e

  14. Diagnostic and prognostic signatures from the small non-coding RNA transcriptome in prostate cancer

    DEFF Research Database (Denmark)

    Martens-Uzunova, E S; Jalava, S E; Dits, N F

    2011-01-01

    Prostate cancer (PCa) is the most frequent male malignancy and the second most common cause of cancer-related death in Western countries. Current clinical and pathological methods are limited in the prediction of postoperative outcome. It is becoming increasingly evident that small non-coding RNA...... signatures of 102 fresh-frozen patient samples during PCa progression by miRNA microarrays. Both platforms were cross-validated by quantitative reverse transcriptase-PCR. Besides the altered expression of several miRNAs, our deep sequencing analyses revealed strong differential expression of small nucleolar...... RNAs (snoRNAs) and transfer RNAs (tRNAs). From microarray analysis, we derived a miRNA diagnostic classifier that accurately distinguishes normal from cancer samples. Furthermore, we were able to construct a PCa prognostic predictor that independently forecasts postoperative outcome. Importantly...

  15. Advanced GF(32) nonbinary LDPC coded modulation with non-uniform 9-QAM outperforming star 8-QAM.

    Science.gov (United States)

    Liu, Tao; Lin, Changyu; Djordjevic, Ivan B

    2016-06-27

    In this paper, we first describe a 9-symbol non-uniform signaling scheme based on Huffman code, in which different symbols are transmitted with different probabilities. By using the Huffman procedure, prefix code is designed to approach the optimal performance. Then, we introduce an algorithm to determine the optimal signal constellation sets for our proposed non-uniform scheme with the criterion of maximizing constellation figure of merit (CFM). The proposed nonuniform polarization multiplexed signaling 9-QAM scheme has the same spectral efficiency as the conventional 8-QAM. Additionally, we propose a specially designed GF(32) nonbinary quasi-cyclic LDPC code for the coded modulation system based on the 9-QAM non-uniform scheme. Further, we study the efficiency of our proposed non-uniform 9-QAM, combined with nonbinary LDPC coding, and demonstrate by Monte Carlo simulation that the proposed GF(23) nonbinary LDPC coded 9-QAM scheme outperforms nonbinary LDPC coded uniform 8-QAM by at least 0.8dB.

  16. Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae

    Directory of Open Access Journals (Sweden)

    Bowman Sharen

    2008-05-01

    Full Text Available Abstract Background Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu gene and possesses a trnS-derived 'trnK(uuu', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher

  17. ICRPfinder: a fast pattern design algorithm for coding sequences and its application in finding potential restriction enzyme recognition sites

    Directory of Open Access Journals (Sweden)

    Stafford Phillip

    2009-09-01

    Full Text Available Abstract Background Restriction enzymes can produce easily definable segments from DNA sequences by using a variety of cut patterns. There are, however, no software tools that can aid in gene building -- that is, modifying wild-type DNA sequences to express the same wild-type amino acid sequences but with enhanced codons, specific cut sites, unique post-translational modifications, and other engineered-in components for recombinant applications. A fast DNA pattern design algorithm, ICRPfinder, is provided in this paper and applied to find or create potential recognition sites in target coding sequences. Results ICRPfinder is applied to find or create restriction enzyme recognition sites by introducing silent mutations. The algorithm is shown capable of mapping existing cut-sites but importantly it also can generate specified new unique cut-sites within a specified region that are guaranteed not to be present elsewhere in the DNA sequence. Conclusion ICRPfinder is a powerful tool for finding or creating specific DNA patterns in a given target coding sequence. ICRPfinder finds or creates patterns, which can include restriction enzyme recognition sites, without changing the translated protein sequence. ICRPfinder is a browser-based JavaScript application and it can run on any platform, in on-line or off-line mode.

  18. Python Radiative Transfer Emission code (PyRaTE): non-LTE spectral lines simulations

    Science.gov (United States)

    Tritsis, A.; Yorke, H.; Tassis, K.

    2018-05-01

    We describe PyRaTE, a new, non-local thermodynamic equilibrium (non-LTE) line radiative transfer code developed specifically for post-processing astrochemical simulations. Population densities are estimated using the escape probability method. When computing the escape probability, the optical depth is calculated towards all directions with density, molecular abundance, temperature and velocity variations all taken into account. A very easy-to-use interface, capable of importing data from simulations outputs performed with all major astrophysical codes, is also developed. The code is written in PYTHON using an "embarrassingly parallel" strategy and can handle all geometries and projection angles. We benchmark the code by comparing our results with those from RADEX (van der Tak et al. 2007) and against analytical solutions and present case studies using hydrochemical simulations. The code will be released for public use.

  19. Complete coding sequence of the human raf oncogene and the corresponding structure of the c-raf-1 gene

    Energy Technology Data Exchange (ETDEWEB)

    Bonner, T I; Oppermann, H; Seeburg, P; Kerby, S B; Gunnell, M A; Young, A C; Rapp, U R

    1986-01-24

    The complete 648 amino acid sequence of the human raf oncogene was deduced from the 2977 nucleotide sequence of a fetal liver cDNA. The cDNA has been used to obtain clones which extend the human c-raf-1 locus by an additional 18.9 kb at the 5' end and contain all the remaining coding exons.

  20. Differentiation of Populus species using chloroplast single nucleotide polymorphism (SNP) markers--essential for comprehensible and reliable poplar breeding.

    Science.gov (United States)

    Schroeder, H; Hoeltken, A M; Fladung, M

    2012-03-01

    Within the genus Populus several species belonging to different sections are cross-compatible. Hence, high numbers of interspecies hybrids occur naturally and, additionally, have been artificially produced in huge breeding programmes during the last 100 years. Therefore, determination of a single poplar species, used for the production of 'multi-species hybrids' is often difficult, and represents a great challenge for the use of molecular markers in species identification. Within this study, over 20 chloroplast regions, both intergenic spacers and coding regions, have been tested for their ability to differentiate different poplar species using 23 already published barcoding primer combinations and 17 newly designed primer combinations. About half of the published barcoding primers yielded amplification products, whereas the new primers designed on the basis of the total sequenced cpDNA genome of Populus trichocarpa Torr. & Gray yielded much higher amplification success. Intergenic spacers were found to be more variable than coding regions within the genus Populus. The highest discrimination power of Populus species was found in the combination of two intergenic spacers (trnG-psbK, psbK-psbl) and the coding region rpoC. In barcoding projects, the coding regions matK and rbcL are often recommended, but within the genus Populus they only show moderate variability and are not efficient in species discrimination. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.

  1. Law of Iterated Logarithm for NA Sequences with Non-Identical ...

    Indian Academy of Sciences (India)

    Based on a law of the iterated logarithm for independent random variables sequences, an iterated logarithm theorem for NA sequences with non-identical distributions is obtained. The proof is based on a Kolmogrov-type exponential inequality.

  2. Decoding the non-coding genome: elucidating genetic risk outside the coding genome.

    Science.gov (United States)

    Barr, C L; Misener, V L

    2016-01-01

    Current evidence emerging from genome-wide association studies indicates that the genetic underpinnings of complex traits are likely attributable to genetic variation that changes gene expression, rather than (or in combination with) variation that changes protein-coding sequences. This is particularly compelling with respect to psychiatric disorders, as genetic changes in regulatory regions may result in differential transcriptional responses to developmental cues and environmental/psychosocial stressors. Until recently, however, the link between transcriptional regulation and psychiatric genetic risk has been understudied. Multiple obstacles have contributed to the paucity of research in this area, including challenges in identifying the positions of remote (distal from the promoter) regulatory elements (e.g. enhancers) and their target genes and the underrepresentation of neural cell types and brain tissues in epigenome projects - the availability of high-quality brain tissues for epigenetic and transcriptome profiling, particularly for the adolescent and developing brain, has been limited. Further challenges have arisen in the prediction and testing of the functional impact of DNA variation with respect to multiple aspects of transcriptional control, including regulatory-element interaction (e.g. between enhancers and promoters), transcription factor binding and DNA methylation. Further, the brain has uncommon DNA-methylation marks with unique genomic distributions not found in other tissues - current evidence suggests the involvement of non-CG methylation and 5-hydroxymethylation in neurodevelopmental processes but much remains unknown. We review here knowledge gaps as well as both technological and resource obstacles that will need to be overcome in order to elucidate the involvement of brain-relevant gene-regulatory variants in genetic risk for psychiatric disorders. © 2015 John Wiley & Sons Ltd and International Behavioural and Neural Genetics Society.

  3. Diversity in non-repetitive human sequences not found in the reference genome.

    Science.gov (United States)

    Kehr, Birte; Helgadottir, Anna; Melsted, Pall; Jonsson, Hakon; Helgason, Hannes; Jonasdottir, Adalbjörg; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Gylfason, Arnaldur; Halldorsson, Gisli H; Kristmundsdottir, Snaedis; Thorgeirsson, Gudmundur; Olafsson, Isleifur; Holm, Hilma; Thorsteinsdottir, Unnur; Sulem, Patrick; Helgason, Agnar; Gudbjartsson, Daniel F; Halldorsson, Bjarni V; Stefansson, Kari

    2017-04-01

    Genomes usually contain some non-repetitive sequences that are missing from the reference genome and occur only in a population subset. Such non-repetitive, non-reference (NRNR) sequences have remained largely unexplored in terms of their characterization and downstream analyses. Here we describe 3,791 breakpoint-resolved NRNR sequence variants called using PopIns from whole-genome sequence data of 15,219 Icelanders. We found that over 95% of the 244 NRNR sequences that are 200 bp or longer are present in chimpanzees, indicating that they are ancestral. Furthermore, 149 variant loci are in linkage disequilibrium (r 2 > 0.8) with a genome-wide association study (GWAS) catalog marker, suggesting disease relevance. Additionally, we report an association (P = 3.8 × 10 -8 , odds ratio (OR) = 0.92) with myocardial infarction (23,360 cases, 300,771 controls) for a 766-bp NRNR sequence variant. Our results underline the importance of including variation of all complexity levels when searching for variants that associate with disease.

  4. A ribosomal RNA gene intergenic spacer based PCR and DGGE fingerprinting method for the analysis of specific rhizobial communities in soil.

    Science.gov (United States)

    de Oliveira, Valéria Maia; Manfio, Gilson Paulo; da Costa Coutinho, Heitor Luiz; Keijzer-Wolters, Anneke Christina; van Elsas, Jan Dirk

    2006-03-01

    A direct molecular method for assessing the diversity of specific populations of rhizobia in soil, based on nested PCR amplification of 16S-23S ribosomal RNA gene (rDNA) intergenic spacer (IGS) sequences, was developed. Initial generic amplification of bacterial rDNA IGS sequences from soil DNA was followed by specific amplification of (1) sequences affiliated with Rhizobium leguminosarum "sensu lato" and (2) R. tropici. Using analysis of the amplified sequences in clone libraries obtained on the basis of soil DNA, this two-sided method was shown to be very specific for rhizobial subpopulations in soil. It was then further validated as a direct fingerprinting tool of the target rhizobia based on denaturing gradient gel electrophoresis (DGGE). The PCR-DGGE approach was applied to soils from fields in Brazil cultivated with common bean (Phaseolus vulgaris) under conventional or no-tillage practices. The community fingerprints obtained allowed the direct analysis of the respective rhizobial community structures in soil samples from the two contrasting agricultural practices. Data obtained with both primer sets revealed clustering of the community structures of the target rhizobial types along treatment. Moreover, the DGGE profiles obtained with the R. tropici primer set indicated that the abundance and diversity of these organisms were favoured under NT practices. These results suggest that the R. leguminosarum-as well as R. tropici-targeted IGS-based nested PCR and DGGE are useful tools for monitoring the effect of agricultural practices on these and related rhizobial subpopulations in soils.

  5. Codon size reduction as the origin of the triplet genetic code.

    Directory of Open Access Journals (Sweden)

    Pavel V Baranov

    Full Text Available The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon

  6. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    Science.gov (United States)

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensible for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis,C. pneumoniae,C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA,pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis,C. pneumoniae,C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species. © 2016 S. Karger AG, Basel.

  7. The oestrogen receptor alpha-regulated lncRNA NEAT1 is a critical modulator of prostate cancer

    Science.gov (United States)

    Chakravarty, Dimple; Sboner, Andrea; Nair, Sujit S.; Giannopoulou, Eugenia; Li, Ruohan; Hennig, Sven; Mosquera, Juan Miguel; Pauwels, Jonathan; Park, Kyung; Kossai, Myriam; MacDonald, Theresa Y.; Fontugne, Jacqueline; Erho, Nicholas; Vergara, Ismael A.; Ghadessi, Mercedeh; Davicioni, Elai; Jenkins, Robert B.; Palanisamy, Nallasivam; Chen, Zhengming; Nakagawa, Shinichi; Hirose, Tetsuro; Bander, Neil H.; Beltran, Himisha; Fox, Archa H.; Elemento, Olivier; Rubin, Mark A.

    2014-01-01

    The androgen receptor (AR) plays a central role in establishing an oncogenic cascade that drives prostate cancer progression. Some prostate cancers escape androgen dependence and are often associated with an aggressive phenotype. The oestrogen receptor alpha (ERα) is expressed in prostate cancers, independent of AR status. However, the role of ERα remains elusive. Using a combination of chromatin immunoprecipitation (ChIP) and RNA-sequencing data, we identified an ERα-specific non-coding transcriptome signature. Among putatively ERα-regulated intergenic long non-coding RNAs (lncRNAs), we identified nuclear enriched abundant transcript 1 (NEAT1) as the most significantly overexpressed lncRNA in prostate cancer. Analysis of two large clinical cohorts also revealed that NEAT1 expression is associated with prostate cancer progression. Prostate cancer cells expressing high levels of NEAT1 were recalcitrant to androgen or AR antagonists. Finally, we provide evidence that NEAT1 drives oncogenic growth by altering the epigenetic landscape of target gene promoters to favour transcription. PMID:25415230

  8. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  9. Instruction sequences and non-uniform complexity theory

    NARCIS (Netherlands)

    Bergstra, J.A.; Middelburg, C.A.

    2008-01-01

    We develop theory concerning non-uniform complexity in a setting in which the notion of single-pass instruction sequence considered in program algebra is the central notion. We define counterparts of the complexity classes P/poly and NP/poly and formulate a counterpart of the complexity theoretic

  10. CVD-associated non-coding RNA, ANRIL, modulates expression of atherogenic pathways in VSMC

    Energy Technology Data Exchange (ETDEWEB)

    Congrains, Ada; Kamide, Kei [Department of Geriatric Medicine and Nephrology, Osaka University Graduate School of Medicine (Japan); Katsuya, Tomohiro [Clinical Gene Therapy, Osaka University Graduate School of Medicine (Japan); Yasuda, Osamu [Department of Cardiovascular Clinical and Translational Research, Kumamoto University Hospital (Japan); Oguro, Ryousuke; Yamamoto, Koichi [Department of Geriatric Medicine and Nephrology, Osaka University Graduate School of Medicine (Japan); Ohishi, Mitsuru, E-mail: ohishi@geriat.med.osaka-u.ac.jp [Department of Geriatric Medicine and Nephrology, Osaka University Graduate School of Medicine (Japan); Rakugi, Hiromi [Department of Geriatric Medicine and Nephrology, Osaka University Graduate School of Medicine (Japan)

    2012-03-23

    Highlights: Black-Right-Pointing-Pointer ANRIL maps in the strongest susceptibility locus for cardiovascular disease. Black-Right-Pointing-Pointer Silencing of ANRIL leads to altered expression of tissue remodeling-related genes. Black-Right-Pointing-Pointer The effects of ANRIL on gene expression are splicing variant specific. Black-Right-Pointing-Pointer ANRIL affects progression of cardiovascular disease by regulating proliferation and apoptosis pathways. -- Abstract: ANRIL is a newly discovered non-coding RNA lying on the strongest genetic susceptibility locus for cardiovascular disease (CVD) in the chromosome 9p21 region. Genome-wide association studies have been linking polymorphisms in this locus with CVD and several other major diseases such as diabetes and cancer. The role of this non-coding RNA in atherosclerosis progression is still poorly understood. In this study, we investigated the implication of ANRIL in the modulation of gene sets directly involved in atherosclerosis. We designed and tested siRNA sequences to selectively target two exons (exon 1 and exon 19) of the transcript and successfully knocked down expression of ANRIL in human aortic vascular smooth muscle cells (HuAoVSMC). We used a pathway-focused RT-PCR array to profile gene expression changes caused by ANRIL knock down. Notably, the genes affected by each of the siRNAs were different, suggesting that different splicing variants of ANRIL might have distinct roles in cell physiology. Our results suggest that ANRIL splicing variants play a role in coordinating tissue remodeling, by modulating the expression of genes involved in cell proliferation, apoptosis, extra-cellular matrix remodeling and inflammatory response to finally impact in the risk of cardiovascular disease and other pathologies.

  11. Comprehensive reconstruction andvisualization of non-coding regulatorynetworks in human

    Directory of Open Access Journals (Sweden)

    Vincenzo eBonnici

    2014-12-01

    Full Text Available Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs. Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established online repositories. The interactions involve RNA, DNA, proteins and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command line and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape.

  12. Therapeutic targeting of non-coding RNAs in cancer

    Czech Academy of Sciences Publication Activity Database

    Slabý, O.; Laga, Richard; Sedláček, Ondřej

    2017-01-01

    Roč. 474, č. 24 (2017), s. 4219-4251 ISSN 0264-6021 R&D Projects: GA MŠk(CZ) LQ1604; GA MŠk(CZ) ED1.1.00/02.0109 Institutional support: RVO:61389013 Keywords : non-coding RNA * RNA delivery * polymer carriers Subject RIV: EB - Genetics ; Molecular Biology OBOR OECD: Biochemical research methods Impact factor: 3.797, year: 2016

  13. Exploration of small RNA-seq data for small non-coding RNAs in Human Colorectal Cancer.

    Science.gov (United States)

    Koduru, Srinivas V; Tiwari, Amit K; Hazard, Sprague W; Mahajan, Milind; Ravnic, Dino J

    2017-01-01

    Background: Improved healthcare and recent breakthroughs in technology have substantially reduced cancer mortality rates worldwide. Recent advancements in next-generation sequencing (NGS) have allowed genomic analysis of the human transcriptome. Now, using NGS we can further look into small non-coding regions of RNAs (sncRNAs) such as microRNAs (miRNAs), Piwi-interacting-RNAs (piRNAs), long non-coding RNAs (lncRNAs), and small nuclear/nucleolar RNAs (sn/snoRNAs) among others. Recent studies looking at sncRNAs indicate their role in important biological processes such as cancer progression and predict their role as biomarkers for disease diagnosis, prognosis, and therapy. Results: In the present study, we data mined publically available small RNA sequencing data from colorectal tissue samples of eight matched patients (benign, tumor, and metastasis) and remapped the data for various small RNA annotations. We identified aberrant expression of 13 miRNAs in tumor and metastasis specimens [tumor vs benign group (19 miRNAs) and metastasis vs benign group (38 miRNAs)] of which five were upregulated, and eight were downregulated, during disease progression. Pathway analysis of aberrantly expressed miRNAs showed that the majority of miRNAs involved in colon cancer were also involved in other cancers. Analysis of piRNAs revealed six to be over-expressed in the tumor vs benign cohort and 24 in the metastasis vs benign group. Only two piRNAs were shared between the two cohorts. Examining other types of small RNAs [sn/snoRNAs, mt_rRNA, miscRNA, nonsense mediated decay (NMD), and rRNAs] identified 15 sncRNAs in the tumor vs benign group and 104 in the metastasis vs benign group, with only four others being commonly expressed. Conclusion: In summary, our comprehensive analysis on publicly available small RNA-seq data identified multiple differentially expressed sncRNAs during colorectal cancer progression at different stages compared to normal colon tissue. We speculate that

  14. Constacyclic codes over the ring F_q+v{F}_q+v2F_q and their applications of constructing new non-binary quantum codes

    Science.gov (United States)

    Ma, Fanghui; Gao, Jian; Fu, Fang-Wei

    2018-06-01

    Let R={F}_q+v{F}_q+v2{F}_q be a finite non-chain ring, where q is an odd prime power and v^3=v. In this paper, we propose two methods of constructing quantum codes from (α +β v+γ v2)-constacyclic codes over R. The first one is obtained via the Gray map and the Calderbank-Shor-Steane construction from Euclidean dual-containing (α +β v+γ v2)-constacyclic codes over R. The second one is obtained via the Gray map and the Hermitian construction from Hermitian dual-containing (α +β v+γ v2)-constacyclic codes over R. As an application, some new non-binary quantum codes are obtained.

  15. Long Intergenic Noncoding RNAs Mediate the Human Chondrocyte Inflammatory Response and Are Differentially Expressed in Osteoarthritis Cartilage.

    Science.gov (United States)

    Pearson, Mark J; Philp, Ashleigh M; Heward, James A; Roux, Benoit T; Walsh, David A; Davis, Edward T; Lindsay, Mark A; Jones, Simon W

    2016-04-01

    To identify long noncoding RNAs (lncRNAs), including long intergenic noncoding RNAs (lincRNAs), antisense RNAs, and pseudogenes, associated with the inflammatory response in human primary osteoarthritis (OA) chondrocytes and to explore their expression and function in OA. OA cartilage was obtained from patients with hip or knee OA following joint replacement surgery. Non-OA cartilage was obtained from postmortem donors and patients with fracture of the neck of the femur. Primary OA chondrocytes were isolated by collagenase digestion. LncRNA expression analysis was performed by RNA sequencing (RNAseq) and quantitative reverse transcriptase-polymerase chain reaction. Modulation of lncRNA chondrocyte expression was achieved using LNA longRNA GapmeRs (Exiqon). Cytokine production was measured with Luminex. RNAseq identified 983 lncRNAs in primary human hip OA chondrocytes, 183 of which had not previously been identified. Following interleukin-1β (IL-1β) stimulation, we identified 125 lincRNAs that were differentially expressed. The lincRNA p50-associated cyclooxygenase 2-extragenic RNA (PACER) and 2 novel chondrocyte inflammation-associated lincRNAs (CILinc01 and CILinc02) were differentially expressed in both knee and hip OA cartilage compared to non-OA cartilage. In primary OA chondrocytes, these lincRNAs were rapidly and transiently induced in response to multiple proinflammatory cytokines. Knockdown of CILinc01 and CILinc02 expression in human chondrocytes significantly enhanced the IL-1-stimulated secretion of proinflammatory cytokines. The inflammatory response in human OA chondrocytes is associated with widespread changes in the profile of lncRNAs, including PACER, CILinc01, and CILinc02. Differential expression of CILinc01 and CIinc02 in hip and knee OA cartilage, and their role in modulating cytokine production during the chondrocyte inflammatory response, suggest that they may play an important role in mediating inflammation-driven cartilage degeneration in

  16. The Complete Plastid Genome Sequence of Madagascar Periwinkle Catharanthus roseus (L.) G. Don: Plastid Genome Evolution, Molecular Marker Identification, and Phylogenetic Implications in Asterids

    Science.gov (United States)

    Ku, Chuan; Chung, Wan-Chia; Chen, Ling-Ling; Kuo, Chih-Horng

    2013-01-01

    The Madagascar periwinkle ( Catharanthus roseus in the family Apocynaceae) is an important medicinal plant and is the source of several widely marketed chemotherapeutic drugs. It is also commonly grown for its ornamental values and, due to ease of infection and distinctiveness of symptoms, is often used as the host for studies on phytoplasmas, an important group of uncultivated plant pathogens. To gain insights into the characteristics of apocynaceous plastid genomes (plastomes), we used a reference-assisted approach to assemble the complete plastome of C . roseus , which could be applied to other C . roseus -related studies. The C . roseus plastome is the second completely sequenced plastome in the asterid order Gentianales. We performed comparative analyses with two other representative sequences in the same order, including the complete plastome of Coffea arabica (from the basal Gentianales family Rubiaceae) and the nearly complete plastome of Asclepias syriaca (Apocynaceae). The results demonstrated considerable variations in gene content and plastome organization within Apocynaceae, including the presence/absence of three essential genes (i.e., accD, clpP, and ycf1) and large size changes in non-coding regions (e.g., rps2-rpoC2 and IRb-ndhF). To find plastome markers of potential utility for Catharanthus breeding and phylogenetic analyses, we identified 41 C . roseus -specific simple sequence repeats. Furthermore, five intergenic regions with high divergence between C . roseus and three other euasterids I taxa were identified as candidate markers. To resolve the euasterids I interordinal relationships, 82 plastome genes were used for phylogenetic inference. With the addition of representatives from Apocynaceae and sampling of most other asterid orders, a sister relationship between Gentianales and Solanales is supported. PMID:23825699

  17. Diversity of antisense and other non-coding RNAs in Archaea revealed by comparative small RNA sequencing in four Pyrobaculum species

    Directory of Open Access Journals (Sweden)

    David L Bernick

    2012-07-01

    Full Text Available A great diversity of small, non-coding RNA molecules with roles in gene regulation and RNA processing have been intensely studied in eukaryotic and bacterial model organisms, yet our knowledge of possible parallel roles for small RNAs in archaea is limited. We employed RNA-seq to identify novel small RNA across multiple species of the hyperthermophilic genus Pyrobaculum, known for unusual RNA gene characteristics. By comparing transcriptional data collected in parallel among four species, we were able to identify conserved RNA genes fitting into known and novel families. Among our findings, we highlight three novel cis-antisense small RNAs encoded opposite to key regulatory (ferric uptake regulator, metabolic (triose-phosphate isomerase, and core transcriptional apparatus genes (transcription factor B. We also found a large increase in the number of conserved C/D box small RNA genes over what had been previously recognized; many of these genes are encoded antisense to protein coding genes. The conserved opposition to orthologous genes across the Pyrobaculum genus suggests similarities to other cis-antisense regulatory systems. Furthermore, the genus-specific nature of these small RNAs indicates they are relatively recent, stable adaptations.

  18. Exome sequencing and genetic testing for MODY.

    Directory of Open Access Journals (Sweden)

    Stefan Johansson

    Full Text Available Genetic testing for monogenic diabetes is important for patient care. Given the extensive genetic and clinical heterogeneity of diabetes, exome sequencing might provide additional diagnostic potential when standard Sanger sequencing-based diagnostics is inconclusive.The aim of the study was to examine the performance of exome sequencing for a molecular diagnosis of MODY in patients who have undergone conventional diagnostic sequencing of candidate genes with negative results.We performed exome enrichment followed by high-throughput sequencing in nine patients with suspected MODY. They were Sanger sequencing-negative for mutations in the HNF1A, HNF4A, GCK, HNF1B and INS genes. We excluded common, non-coding and synonymous gene variants, and performed in-depth analysis on filtered sequence variants in a pre-defined set of 111 genes implicated in glucose metabolism.On average, we obtained 45 X median coverage of the entire targeted exome and found 199 rare coding variants per individual. We identified 0-4 rare non-synonymous and nonsense variants per individual in our a priori list of 111 candidate genes. Three of the variants were considered pathogenic (in ABCC8, HNF4A and PPARG, respectively, thus exome sequencing led to a genetic diagnosis in at least three of the nine patients. Approximately 91% of known heterozygous SNPs in the target exomes were detected, but we also found low coverage in some key diabetes genes using our current exome sequencing approach. Novel variants in the genes ARAP1, GLIS3, MADD, NOTCH2 and WFS1 need further investigation to reveal their possible role in diabetes.Our results demonstrate that exome sequencing can improve molecular diagnostics of MODY when used as a complement to Sanger sequencing. However, improvements will be needed, especially concerning coverage, before the full potential of exome sequencing can be realized.

  19. Construction of Fixed Rate Non-Binary WOM Codes Based on Integer Programming

    Science.gov (United States)

    Fujino, Yoju; Wadayama, Tadashi

    In this paper, we propose a construction of non-binary WOM (Write-Once-Memory) codes for WOM storages such as flash memories. The WOM codes discussed in this paper are fixed rate WOM codes where messages in a fixed alphabet of size $M$ can be sequentially written in the WOM storage at least $t^*$-times. In this paper, a WOM storage is modeled by a state transition graph. The proposed construction has the following two features. First, it includes a systematic method to determine the encoding regions in the state transition graph. Second, the proposed construction includes a labeling method for states by using integer programming. Several novel WOM codes for $q$ level flash memories with 2 cells are constructed by the proposed construction. They achieve the worst numbers of writes $t^*$ that meet the known upper bound in many cases. In addition, we constructed fixed rate non-binary WOM codes with the capability to reduce ICI (inter cell interference) of flash cells. One of the advantages of the proposed construction is its flexibility. It can be applied to various storage devices, to various dimensions (i.e, number of cells), and various kind of additional constraints.

  20. Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology

    Science.gov (United States)

    Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming

    2018-01-01

    Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were attained from three PacBio sequencing runs and reads >500 bp with a quality value of >0.75 were merged together into a single dataset. This resultant highly-contiguous de novo assembly has no genome gaps, and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNA, transposon and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNA and transposon, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.

  1. Asymptotic behaviour of firmly non expansive sequences

    International Nuclear Information System (INIS)

    Rouhani, B.D.

    1993-04-01

    We introduce the notion of firmly non expansive sequences in a Banach space and present several results concerning their asymptotic behaviour extending previous results and giving an affirmative answer to an open question raised by S. Reich and I. Shafir. Applications to averaged mappings are also given. (author). 16 refs

  2. A versatile palindromic amphipathic repeat coding sequence horizontally distributed among diverse bacterial and eucaryotic microbes

    Directory of Open Access Journals (Sweden)

    Glass John I

    2010-07-01

    repeat may be disseminated by HGT and intra-genomic shuffling. Conclusions We describe novel features of PARCELs (Palindromic Amphipathic Repeat Coding ELements, a set of widely distributed repeat protein domains and coding sequences that were likely acquired through HGT by diverse unicellular microbes, further mobilized and diversified within genomes, and co-opted for expression in the membrane proteome of some taxa. Disseminated by multiple gene-centric vehicles, ORFs harboring these elements enhance accessory gene pools as part of the "mobilome" connecting genomes of various clades, in taxa sharing common niches.

  3. A Sequence-Specific Interaction between the Saccharomyces cerevisiae rRNA Gene Repeats and a Locus Encoding an RNA Polymerase I Subunit Affects Ribosomal DNA Stability

    Science.gov (United States)

    Cahyani, Inswasti; Cridge, Andrew G.; Engelke, David R.; Ganley, Austen R. D.

    2014-01-01

    The spatial organization of eukaryotic genomes is linked to their functions. However, how individual features of the global spatial structure contribute to nuclear function remains largely unknown. We previously identified a high-frequency interchromosomal interaction within the Saccharomyces cerevisiae genome that occurs between the intergenic spacer of the ribosomal DNA (rDNA) repeats and the intergenic sequence between the locus encoding the second largest RNA polymerase I subunit and a lysine tRNA gene [i.e., RPA135-tK(CUU)P]. Here, we used quantitative chromosome conformation capture in combination with replacement mapping to identify a 75-bp sequence within the RPA135-tK(CUU)P intergenic region that is involved in the interaction. We demonstrate that the RPA135-IGS1 interaction is dependent on the rDNA copy number and the Msn2 protein. Surprisingly, we found that the interaction does not govern RPA135 transcription. Instead, replacement of a 605-bp region within the RPA135-tK(CUU)P intergenic region results in a reduction in the RPA135-IGS1 interaction level and fluctuations in rDNA copy number. We conclude that the chromosomal interaction that occurs between the RPA135-tK(CUU)P and rDNA IGS1 loci stabilizes rDNA repeat number and contributes to the maintenance of nucleolar stability. Our results provide evidence that the DNA loci involved in chromosomal interactions are composite elements, sections of which function in stabilizing the interaction or mediating a functional outcome. PMID:25421713

  4. Female-biased expression of long non-coding RNAs in domains that escape X-inactivation in mouse

    Directory of Open Access Journals (Sweden)

    Lu Lu

    2010-11-01

    Full Text Available Abstract Background Sexual dimorphism in brain gene expression has been recognized in several animal species. However, the relevant regulatory mechanisms remain poorly understood. To investigate whether sex-biased gene expression in mammalian brain is globally regulated or locally regulated in diverse brain structures, and to study the genomic organisation of brain-expressed sex-biased genes, we performed a large scale gene expression analysis of distinct brain regions in adult male and female mice. Results This study revealed spatial specificity in sex-biased transcription in the mouse brain, and identified 173 sex-biased genes in the striatum; 19 in the neocortex; 12 in the hippocampus and 31 in the eye. Genes located on sex chromosomes were consistently over-represented in all brain regions. Analysis on a subset of genes with sex-bias in more than one tissue revealed Y-encoded male-biased transcripts and X-encoded female-biased transcripts known to escape X-inactivation. In addition, we identified novel coding and non-coding X-linked genes with female-biased expression in multiple tissues. Interestingly, the chromosomal positions of all of the female-biased non-coding genes are in close proximity to protein-coding genes that escape X-inactivation. This defines X-chromosome domains each of which contains a coding and a non-coding female-biased gene. Lack of repressive chromatin marks in non-coding transcribed loci supports the possibility that they escape X-inactivation. Moreover, RNA-DNA combined FISH experiments confirmed the biallelic expression of one such novel domain. Conclusion This study demonstrated that the amount of genes with sex-biased expression varies between individual brain regions in mouse. The sex-biased genes identified are localized on many chromosomes. At the same time, sexually dimorphic gene expression that is common to several parts of the brain is mostly restricted to the sex chromosomes. Moreover, the study uncovered

  5. Targeted sequencing of large genomic regions with CATCH-Seq.

    Directory of Open Access Journals (Sweden)

    Kenneth Day

    Full Text Available Current target enrichment systems for large-scale next-generation sequencing typically require synthetic oligonucleotides used as capture reagents to isolate sequences of interest. The majority of target enrichment reagents are focused on gene coding regions or promoters en masse. Here we introduce development of a customizable targeted capture system using biotinylated RNA probe baits transcribed from sheared bacterial artificial chromosome clone templates that enables capture of large, contiguous blocks of the genome for sequencing applications. This clone adapted template capture hybridization sequencing (CATCH-Seq procedure can be used to capture both coding and non-coding regions of a gene, and resolve the boundaries of copy number variations within a genomic target site. Furthermore, libraries constructed with methylated adapters prior to solution hybridization also enable targeted bisulfite sequencing. We applied CATCH-Seq to diverse targets ranging in size from 125 kb to 3.5 Mb. Our approach provides a simple and cost effective alternative to other capture platforms because of template-based, enzymatic probe synthesis and the lack of oligonucleotide design costs. Given its similarity in procedure, CATCH-Seq can also be performed in parallel with commercial systems.

  6. Long Non-coding RNAs in Response to Genotoxic Stress

    Institute of Scientific and Technical Information of China (English)

    Xiaoman Li; Dong Pan; Baoquan Zhao; Burong Hu

    2016-01-01

    Long non-coding RNAs(lncRNAs) are increasingly involved in diverse biological processes.Upon DNA damage,the DNA damage response(DDR) elicits a complex signaling cascade,which includes the induction of lncRNAs.LncRNA-mediated DDR is involved in non-canonical and canonical manners.DNA-damage induced lncRNAs contribute to the regulation of cell cycle,apoptosis,and DNA repair,thereby playing a key role in maintaining genome stability.This review summarizes the emerging role of lncRNAs in DNA damage and repair.

  7. Development of a 3D non-linear implicit MHD code

    International Nuclear Information System (INIS)

    Nicolas, T.; Ichiguchi, K.

    2016-06-01

    This paper details the on-going development of a 3D non-linear implicit MHD code, which aims at making possible large scale simulations of the non-linear phase of the interchange mode. The goal of the paper is to explain the rationale behind the choices made along the development, and the technical difficulties encountered. At the present stage, the development of the code has not been completed yet. Most of the discussion is concerned with the first approach, which utilizes cartesian coordinates in the poloidal plane. This approach shows serious difficulties in writing the preconditioner, closely related to the choice of coordinates. A second approach, based on curvilinear coordinates, also faced significant difficulties, which are detailed. The third and last approach explored involves unstructured tetrahedral grids, and indicates the possibility to solve the problem. The issue to domain meshing is addressed. (author)

  8. REMap: Operon map of M. tuberculosis based on RNA sequence data.

    Science.gov (United States)

    Pelly, Shaaretha; Winglee, Kathryn; Xia, Fang Fang; Stevens, Rick L; Bishai, William R; Lamichhane, Gyanu

    2016-07-01

    A map of the transcriptional organization of genes of an organism is a basic tool that is necessary to understand and facilitate a more accurate genetic manipulation of the organism. Operon maps are largely generated by computational prediction programs that rely on gene conservation and genome architecture and may not be physiologically relevant. With the widespread use of RNA sequencing (RNAseq), the prediction of operons based on actual transcriptome sequencing rather than computational genomics alone is much needed. Here, we report a validated operon map of Mycobacterium tuberculosis, developed using RNAseq data from both the exponential and stationary phases of growth. At least 58.4% of M. tuberculosis genes are organized into 749 operons. Our prediction algorithm, REMap (RNA Expression Mapping of operons), considers the many cases of transcription coverage of intergenic regions, and avoids dependencies on functional annotation and arbitrary assumptions about gene structure. As a result, we demonstrate that REMap is able to more accurately predict operons, especially those that contain long intergenic regions or functionally unrelated genes, than previous operon prediction programs. The REMap algorithm is publicly available as a user-friendly tool that can be readily modified to predict operons in other bacteria. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.

    Science.gov (United States)

    Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi

    2016-03-01

    Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine-and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.

  10. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis.

    Directory of Open Access Journals (Sweden)

    Valentina Tosetti

    Full Text Available The complex architecture of adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR is a transcription factor that is already expressed during early embryonic days. However, AR role in the regulation of gene expression at early embryonic stage is yet to be determinate. Long non-coding RNA (lncRNA Sox2 overlapping transcript (Sox2OT plays a crucial role in gene expression control during development but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide in order to pharmacologically inactivated AR, we investigated whether AR participates in the regulation of the transcription of the lncRNASox2OTat early embryonic stage. We identified a new DNA binding region upstream of Sox2 locus containing three androgen response elements (ARE, and found that AR binds such a sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT.

  11. Non-Coding RNA in the Pathogenesis, Progression and Treatment of Hypertension

    Directory of Open Access Journals (Sweden)

    Christiana Leimena

    2018-03-01

    Full Text Available Hypertension is a complex, multifactorial disease that involves the coexistence of multiple risk factors, environmental factors and physiological systems. The complexities extend to the treatment and management of hypertension, which are still the pursuit of many researchers. In the last two decades, various genes have emerged as possible biomarkers and have become the target for investigations of specialized drug design based on its risk factors and the primary cause. Owing to the growing technology of microarrays and next-generation sequencing, the non-protein-coding RNAs (ncRNAs have increasingly gained attention, and their status of redundancy has flipped to importance in normal cellular processes, as well as in disease progression. The ncRNA molecules make up a significant portion of the human genome, and their role in diseases continues to be uncovered. Specifically, the cellular role of these ncRNAs has played a part in the pathogenesis of hypertension and its progression to heart failure. This review explores the function of the ncRNAs, their types and biology, the current update of their association with hypertension pathology and the potential new therapeutic regime for hypertension.

  12. Nucleotide sequence of tomato ringspot virus RNA-2.

    Science.gov (United States)

    Rott, M E; Tremaine, J H; Rochon, D M

    1991-07-01

    The sequence of tomato ringspot virus (TomRSV) RNA-2 has been determined. It is 7273 nucleotides in length excluding the 3' poly(A) tail and contains a single long open reading frame (ORF) of 5646 nucleotides in the positive sense beginning at position 78 and terminating at position 5723. A second in-frame AUG at position 441 is in a more favourable context for initiation of translation and may act as a site for initiation of translation. The TomRSV RNA-2 3' noncoding region is 1550 nucleotides in length. The coat protein is located in the C-terminal region of the large polypeptide and shows significant but limited amino acid sequence similarity to the putative coat proteins of the nepoviruses tomato black ring (TBRV), Hungarian grapevine chrome mosaic (GCMV) and grapevine fanleaf (GFLV). Comparisons of the coding and non-coding regions of TomRSV RNA-2 and the RNA components of TBRV, GCMV, GFLV and the comovirus cowpea mosaic virus revealed significant similarity for over 300 amino acids between the coding region immediately to the N-terminal side of the putative coat proteins of TomRSV and GFLV; very little similarity could be detected among the non-coding regions of TomRSV and any of these viruses.

  13. Copy Number Variations in Candidate Genes and Intergenic Regions Affect Body Mass Index and Abdominal Obesity in Mexican Children

    Science.gov (United States)

    Burguete-García, Ana Isabel; Bonnefond, Amélie; Peralta-Romero, Jesús; Froguel, Philippe

    2017-01-01

    Introduction. Increase in body weight is a gradual process that usually begins in childhood and in adolescence as a result of multiple interactions among environmental and genetic factors. This study aimed to analyze the relationship between copy number variants (CNVs) in five genes and four intergenic regions with obesity in Mexican children. Methods. We studied 1423 children aged 6–12 years. Anthropometric measurements and blood levels of biochemical parameters were obtained. Identification of CNVs was performed by real-time PCR. The effect of CNVs on obesity or body composition was assessed using regression models adjusted for age, gender, and family history of obesity. Results. Gains in copy numbers of LEPR and NEGR1 were associated with decreased body mass index (BMI), waist circumference (WC), and risk of abdominal obesity, whereas gain in ARHGEF4 and CPXCR1 and the intergenic regions 12q15c, 15q21.1a, and 22q11.21d and losses in INS were associated with increased BMI and WC. Conclusion. Our results indicate a possible contribution of CNVs in LEPR, NEGR1, ARHGEF4, and CPXCR1 and the intergenic regions 12q15c, 15q21.1a, and 22q11.21d to the development of obesity, particularly abdominal obesity in Mexican children. PMID:28428959

  14. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    Science.gov (United States)

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P 0.05) with the advantage of being able to use less structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.

  15. Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation.

    Science.gov (United States)

    Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D

    2018-01-04

    The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  16. ncRNA-class Web Tool: Non-coding RNA feature extraction and pre-miRNA classification web tool

    KAUST Repository

    Kleftogiannis, Dimitrios A.; Theofilatos, Konstantinos A.; Papadimitriou, Stergios; Tsakalidis, Athanasios K.; Likothanassis, Spiridon D.; Mavroudi, Seferina P.

    2012-01-01

    Until recently, it was commonly accepted that most genetic information is transacted by proteins. Recent evidence suggests that the majority of the genomes of mammals and other complex organisms are in fact transcribed into non-coding RNAs (ncRNAs), many of which are alternatively spliced and/or processed into smaller products. Non coding RNA genes analysis requires the calculation of several sequential, thermodynamical and structural features. Many independent tools have already been developed for the efficient calculation of such features but to the best of our knowledge there does not exist any integrative approach for this task. The most significant amount of existing work is related to the miRNA class of non-coding RNAs. MicroRNAs (miRNAs) are small non-coding RNAs that play a significant role in gene regulation and their prediction is a challenging bioinformatics problem. Non-coding RNA feature extraction and pre-miRNA classification Web Tool (ncRNA-class Web Tool) is a publicly available web tool ( http://150.140.142.24:82/Default.aspx ) which provides a user friendly and efficient environment for the effective calculation of a set of 58 sequential, thermodynamical and structural features of non-coding RNAs, plus a tool for the accurate prediction of miRNAs. © 2012 IFIP International Federation for Information Processing.

  17. Confinement of Reinforced-Concrete Columns with Non-Code Compliant Confining Reinforcement plus Supplemental Pen-Binder

    Directory of Open Access Journals (Sweden)

    Anang Kristianto

    2012-11-01

    Full Text Available One of the important requirements for earthquake resistant building related to confinement is the use of seismic hooks in the hoop or confining reinforcement of reinforced-concrete column elements. However, installation of a confining reinforcement with a 135-degree hook is not easy. Therefore, in practice, many construction workers apply a confining reinforcement with a 90-degreehook (non-code compliant. Based on research and records of recent earthquakes in Indonesia, the use of a non-code compliant confining reinforcement for concrete columns produces structures with poor seismic performance. This paper presents a study that introduces an additional element that is expected to improve the effectiveness of concrete columns confined with a non-code compliant confining reinforcement. The additional element, named a pen-binder, is used to keep the non-code compliant confining reinforcement in place. The effectiveness of this element under pure axial concentric loading was investigatedcomprehensively.The specimens tested in this study were 18 concrete columns,with a cross-section of 170 mm x 170 mm and a height of 480 mm. The main test variables were the material type of the pen-binder, the angle of the hook, and the confining reinforcement configuration.The test results indicate that adding pen-binders can effectively improve the strength and ductility of the column specimens confined with a non-code compliant confining reinforcement

  18. Genetic diversity in breonadia salicina based on intra-species sequence variation of chloroplast dna spacer sequence

    International Nuclear Information System (INIS)

    Qurainy, F.A.; Gaafar, A.R.Z.

    2014-01-01

    Assessment and knowledge of the genetic diversity and variation within and between populations of rare and endangered plants is very important for effective conservation. Intergenic spacer sequences variation of psbA-trnH locus of chloroplast genome was assessed within Breonadia salicina (Rubiaceae), a critically endangered and endemic plant species to South western part of Kingdom of Saudi Arabia. The obtained sequence data from 19 individuals in three populations revealed nine haplotypes. The aligned sequences obtained from the overall Saudi accessions extended to 355 bp, revealing nine haplotypes. A high level of haplotype diversity (Hd = 0.842) and low level of nucleotide diversity (Pi = 0.0058) were detected. Consistently, both hierarchical analysis of molecular variance (AMOVA) and constructed neighbor-joining tree indicated null genetic differentiation among populations. This level of differentiation between populations or between regions in psbA-trnH sequences may be due to effects of the abundance of ancestral haplotype sharing and the presence of private haplotypes fixed for each population. Furthermore, the results revealed almost the same level of genetic diversity in comparison with Yemeni accessions, in which Saudi accessions were sharing three haplotypes from the four haplotypes found in Yemeni accessions. (author)

  19. Mind the gap; seven reasons to close fragmented genome assemblies.

    Science.gov (United States)

    Thomma, Bart P H J; Seidl, Michael F; Shi-Kunne, Xiaoqian; Cook, David E; Bolton, Melvin D; van Kan, Jan A L; Faino, Luigi

    2016-05-01

    Like other domains of life, research into the biology of filamentous microbes has greatly benefited from the advent of whole-genome sequencing. Next-generation sequencing (NGS) technologies have revolutionized sequencing, making genomic sciences accessible to many academic laboratories including those that study non-model organisms. Thus, hundreds of fungal genomes have been sequenced and are publically available today, although these initiatives have typically yielded considerably fragmented genome assemblies that often lack large contiguous genomic regions. Many important genomic features are contained in intergenic DNA that is often missing in current genome assemblies, and recent studies underscore the significance of non-coding regions and repetitive elements for the life style, adaptability and evolution of many organisms. The study of particular types of genetic elements, such as telomeres, centromeres, repetitive elements, effectors, and clusters of co-regulated genes, but also of phenomena such as structural rearrangements, genome compartmentalization and epigenetics, greatly benefits from having a contiguous and high-quality, preferably even complete and gapless, genome assembly. Here we discuss a number of important reasons to produce gapless, finished, genome assemblies to help answer important biological questions. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cervisivae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.

  1. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    Science.gov (United States)

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3??? N-U-P-M-F-HN-L 5???, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  2. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Science.gov (United States)

    Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

    2009-01-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

  3. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Directory of Open Access Journals (Sweden)

    Carmen Yea

    2009-06-01

    Full Text Available Although the human parainfluenza virus 4 (HPIV4 has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada. The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97% with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized.

  4. Description and applicability of the BEFEM-CODE

    Energy Technology Data Exchange (ETDEWEB)

    Groth, T.

    1980-05-15

    The BEFEM-CODE, developed for rock mechanics problems in hard rock with joints, is a simple FEM code constructed using triangular and quadrilateral elements. As an option, a joint element of the Goodman type may be used. The Cook-Pian type quadrilateral stress hybrid element was introduced into the version of the code used for the Naesliden project, to replace the constant stress quadrilateral elements. This hybrid element, derived with assumed stress distributions, simplifies the excavation process for use in non-linear models. The shear behavior of the Goodman 1976 joint element has been replaced by Goodman's 1968 formulation. This element makes it possible to take dilation into account, but it was not considered necessary to use dilation to simulate proper joint behavior in the Naesliden project. The code uses Barton's shear strength criteria. Excessive nodal forces due to failure and non-linearities in the joint elements are redistributed with stress transfer iterations. Convergence can be speeded up by dividing each excavation sequence into several loadsteps in which the stiffness matrix is recalculated.

  5. The sequence coding and search system: An approach for constructing and analyzing event sequences at commercial nuclear power plants

    International Nuclear Information System (INIS)

    Mays, G.T.

    1989-04-01

    The US Nuclear Regulatory Commission (NRC) has recognized the importance of the collection, assessment, and feedstock of operating experience data from commercial nuclear power plants and has centralized these activities in the Office for Analysis and Evaluation of Operational Data (AEOD). Such data is essential for performing safety and reliability analyses, especially analyses of trends and patterns to identify undesirable changes in plant performance at the earliest opportunity to implement corrective measures to preclude the occurrences of a more serious event. One of NRC's principal tools for collecting and evaluating operating experience data is the Sequence Coding and Search System (SCSS). The SCSS consists of a methodology for structuring event sequences and the requisite computer system to store and search the data. The source information for SCSS is the Licensee Event Report (LER), which is a legally required document. This paper describes the objective SCSS, the information it contains, and the format and approach for constructuring SCSS event sequences. Examples are presented demonstrating the use SCSS to support the analysis of LER data. The SCSS contains over 30,000 LERs describing events from 1980 through the present. Insights gained from working with a complex data system from the initial developmental stage to the point of a mature operating system are highlighted

  6. LZW-Kernel: fast kernel utilizing variable length code blocks from LZW compressors for protein sequence classification.

    Science.gov (United States)

    Filatov, Gleb; Bauwens, Bruno; Kertész-Farkas, Attila

    2018-05-07

    Bioinformatics studies often rely on similarity measures between sequence pairs, which often pose a bottleneck in large-scale sequence analysis. Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified with the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method, it is always symmetric, is positive, always provides 1.0 for self-similarity and it can directly be used with Support Vector Machines (SVMs) in classification problems, contrary to normalized compression distance (NCD), which often violates the distance metric properties in practice and requires further techniques to be used with SVMs. The LZW-Kernel is a one-pass algorithm, which makes it particularly plausible for big data applications. Our experimental studies on remote protein homology detection and protein classification tasks reveal that the LZW-Kernel closely approaches the performance of the Local Alignment Kernel (LAK) and the SVM-pairwise method combined with Smith-Waterman (SW) scoring at a fraction of the time. Moreover, the LZW-Kernel outperforms the SVM-pairwise method when combined with BLAST scores, which indicates that the LZW code words might be a better basis for similarity measures than local alignment approximations found with BLAST. In addition, the LZW-Kernel outperforms n-gram based mismatch kernels, hidden Markov model based SAM and Fisher kernel, and protein family based PSI-BLAST, among others. Further advantages include the LZW-Kernel's reliance on a simple idea, its ease of implementation, and its high speed, three times faster than BLAST and several magnitudes faster than SW or LAK in our tests. LZW-Kernel is implemented as a standalone C code and is a free open-source program distributed under GPLv3 license and can be downloaded from https://github.com/kfattila/LZW-Kernel. akerteszfarkas@hse.ru. Supplementary data are available at Bioinformatics Online.

  7. Coding chaotic billiards: I-Non-Compact billiards on a negative curvature manifold

    International Nuclear Information System (INIS)

    Giannoni, M.J.; Ullmo, D.

    1989-03-01

    This paper presents a method for coding billiards. The main device is to use a proper surface of section, the bounce mapping, and foliate the reduced phase space into regions associated with a given code. The alphabet is merely the ensemble of the labels of the sides of the billiard. The procedure is applied here to non-compact polygonal billiard defined on a manifold of constant negative curvature, with all vertices at infinity. A simple grammar rule is necessary and sufficient to insure existence and uniqueness of the coding

  8. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change

    Directory of Open Access Journals (Sweden)

    Uzilov Andrew V

    2006-03-01

    Full Text Available Abstract Background Non-coding RNAs (ncRNAs have a multitude of roles in the cell, many of which remain to be discovered. However, it is difficult to detect novel ncRNAs in biochemical screens. To advance biological knowledge, computational methods that can accurately detect ncRNAs in sequenced genomes are therefore desirable. The increasing number of genomic sequences provides a rich dataset for computational comparative sequence analysis and detection of novel ncRNAs. Results Here, Dynalign, a program for predicting secondary structures common to two RNA sequences on the basis of minimizing folding free energy change, is utilized as a computational ncRNA detection tool. The Dynalign-computed optimal total free energy change, which scores the structural alignment and the free energy change of folding into a common structure for two RNA sequences, is shown to be an effective measure for distinguishing ncRNA from randomized sequences. To make the classification as a ncRNA, the total free energy change of an input sequence pair can either be compared with the total free energy changes of a set of control sequence pairs, or be used in combination with sequence length and nucleotide frequencies as input to a classification support vector machine. The latter method is much faster, but slightly less sensitive at a given specificity. Additionally, the classification support vector machine method is shown to be sensitive and specific on genomic ncRNA screens of two different Escherichia coli and Salmonella typhi genome alignments, in which many ncRNAs are known. The Dynalign computational experiments are also compared with two other ncRNA detection programs, RNAz and QRNA. Conclusion The Dynalign-based support vector machine method is more sensitive for known ncRNAs in the test genomic screens than RNAz and QRNA. Additionally, both Dynalign-based methods are more sensitive than RNAz and QRNA at low sequence pair identities. Dynalign can be used as a

  9. Inheritance of the yeast mitochondrial genome

    DEFF Research Database (Denmark)

    Piskur, Jure

    1994-01-01

    Mitochondrion, extrachromosomal genetics, intergenic sequences, genome size, mitochondrial DNA, petite mutation, yeast......Mitochondrion, extrachromosomal genetics, intergenic sequences, genome size, mitochondrial DNA, petite mutation, yeast...

  10. Facial Expression Recognition via Non-Negative Least-Squares Sparse Coding

    Directory of Open Access Journals (Sweden)

    Ying Chen

    2014-05-01

    Full Text Available Sparse coding is an active research subject in signal processing, computer vision, and pattern recognition. A novel method of facial expression recognition via non-negative least squares (NNLS sparse coding is presented in this paper. The NNLS sparse coding is used to form a facial expression classifier. To testify the performance of the presented method, local binary patterns (LBP and the raw pixels are extracted for facial feature representation. Facial expression recognition experiments are conducted on the Japanese Female Facial Expression (JAFFE database. Compared with other widely used methods such as linear support vector machines (SVM, sparse representation-based classifier (SRC, nearest subspace classifier (NSC, K-nearest neighbor (KNN and radial basis function neural networks (RBFNN, the experiment results indicate that the presented NNLS method performs better than other used methods on facial expression recognition tasks.

  11. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for the local and non-local structural segmentation in DNA sequences. The results are illustrated with the model examples and analysis of intervening exon-intron segments in the protein-coding regions

  12. Bat white-nose syndrome: a real-time TaqMan polymerase chain reaction test targeting the intergenic spacer region of Geomyces destructanstructans.

    Science.gov (United States)

    Muller, Laura K.; Lorch, Jeffrey M.; Lindner, Daniel L.; O'Connor, Michael; Gargas, Andrea; Blehert, David S.

    2013-01-01

    The fungus Geomyces destructans is the causative agent of white-nose syndrome (WNS), a disease that has killed millions of North American hibernating bats. We describe a real-time TaqMan PCR test that detects DNA from G. destructans by targeting a portion of the multicopy intergenic spacer region of the rRNA gene complex. The test is highly sensitive, consistently detecting as little as 3.3 fg of genomic DNA from G. destructans. The real-time PCR test specifically amplified genomic DNA from G. destructans but did not amplify target sequence from 54 closely related fungal isolates (including 43 Geomyces spp. isolates) associated with bats. The test was further qualified by analyzing DNA extracted from 91 bat wing skin samples, and PCR results matched histopathology findings. These data indicate the real-time TaqMan PCR method described herein is a sensitive, specific, and rapid test to detect DNA from G. destructans and provides a valuable tool for WNS diagnostics and research.

  13. Molecular profiling of microbial communities from contaminated sources: Use of subtractive cloning methods and rDNA spacer sequences. 1998 annual progress report

    International Nuclear Information System (INIS)

    Robb, F.T.

    1998-01-01

    'The major objective of the research is to provide appropriate sequences and to assemble a high-density DNA array of oligonucleotides that can be used for rapid profiling of microbial populations from polluted areas. The sequences to be assigned to the DNA array are chosen from from cloned genomic DNA sequences (the ribosomal operon, described below) from groundwater at DOE sites containing organic solvents. The sites, Hanford Nuclear Plant and Lawrence Livermore Site 300, have well characterized pollutant histories, which have been provided by the collaborators. At this mid-point of the project, over 60 unique sequence classes of intergenic spacer region have been identified from the first sample site. The use of these sequences as hybridization probes, and their frequency of occurrence, allow a clear distinction between bacterial communities before and after remediation by acetate/nitrate pumping. The authors have developed the hybridization conditions for identifying PCR products in a 96 well format, a versatile alignment and visualization program (acronym: MALIGN) developed by Dr. Dennis Maeder, has been used to align the ISRs, which are variable in length and sometimes in position of the tRNAs. Finally, in collaboration with Dr. W. Chen and Dr. J. Zhou at ORNL, they have significant evidence that mass spectrometer analysis can be used to determine the lengths of PCR amplified intergenic spacer DNA.'

  14. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    Science.gov (United States)

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  15. Learning and Recognition of a Non-conscious Sequence of Events in Human Primary Visual Cortex.

    Science.gov (United States)

    Rosenthal, Clive R; Andrews, Samantha K; Antoniades, Chrystalina A; Kennard, Christopher; Soto, David

    2016-03-21

    Human primary visual cortex (V1) has long been associated with learning simple low-level visual discriminations [1] and is classically considered outside of neural systems that support high-level cognitive behavior in contexts that differ from the original conditions of learning, such as recognition memory [2, 3]. Here, we used a novel fMRI-based dichoptic masking protocol-designed to induce activity in V1, without modulation from visual awareness-to test whether human V1 is implicated in human observers rapidly learning and then later (15-20 min) recognizing a non-conscious and complex (second-order) visuospatial sequence. Learning was associated with a change in V1 activity, as part of a temporo-occipital and basal ganglia network, which is at variance with the cortico-cerebellar network identified in prior studies of "implicit" sequence learning that involved motor responses and visible stimuli (e.g., [4]). Recognition memory was associated with V1 activity, as part of a temporo-occipital network involving the hippocampus, under conditions that were not imputable to mechanisms associated with conscious retrieval. Notably, the V1 responses during learning and recognition separately predicted non-conscious recognition memory, and functional coupling between V1 and the hippocampus was enhanced for old retrieval cues. The results provide a basis for novel hypotheses about the signals that can drive recognition memory, because these data (1) identify human V1 with a memory network that can code complex associative serial visuospatial information and support later non-conscious recognition memory-guided behavior (cf. [5]) and (2) align with mouse models of experience-dependent V1 plasticity in learning and memory [6]. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. An integrated semiconductor device enabling non-optical genome sequencing.

    Science.gov (United States)

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-20

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

  17. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    Energy Technology Data Exchange (ETDEWEB)

    Bejerman, Nicolás, E-mail: n.bejerman@uq.edu.au [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina); Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia); Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio [Instituto de Patología Vegetal (IPAVE), Centro de Investigaciones Agropecuarias (CIAP), Instituto Nacional de Tecnología Agropecuaria INTA, Camino a 60 Cuadras k 5,5, Córdoba X5020ICA (Argentina); Dietzgen, Ralf G. [Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, QLD 4072 (Australia)

    2015-09-15

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown to have a homotypic, and heterotypic nuclear interactions with N, P3 and M proteins by bimolecular fluorescence complementation. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has a genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses.

  18. Complete genome sequence and integrated protein localization and interaction map for alfalfa dwarf virus, which combines properties of both cytoplasmic and nuclear plant rhabdoviruses

    International Nuclear Information System (INIS)

    Bejerman, Nicolás; Giolitti, Fabián; Breuil, Soledad de; Trucco, Verónica; Nome, Claudia; Lenardon, Sergio; Dietzgen, Ralf G.

    2015-01-01

    Summary: We have determined the full-length 14,491-nucleotide genome sequence of a new plant rhabdovirus, alfalfa dwarf virus (ADV). Seven open reading frames (ORFs) were identified in the antigenomic orientation of the negative-sense, single-stranded viral RNA, in the order 3′-N-P-P3-M-G-P6-L-5′. The ORFs are separated by conserved intergenic regions and the genome coding region is flanked by complementary 3′ leader and 5′ trailer sequences. Phylogenetic analysis of the nucleoprotein amino acid sequence indicated that this alfalfa-infecting rhabdovirus is related to viruses in the genus Cytorhabdovirus. When transiently expressed as GFP fusions in Nicotiana benthamiana leaves, most ADV proteins accumulated in the cell periphery, but unexpectedly P protein was localized exclusively in the nucleus. ADV P protein was shown to have a homotypic, and heterotypic nuclear interactions with N, P3 and M proteins by bimolecular fluorescence complementation. ADV appears unique in that it combines properties of both cytoplasmic and nuclear plant rhabdoviruses. - Highlights: • The complete genome of alfalfa dwarf virus is obtained. • An integrated localization and interaction map for ADV is determined. • ADV has a genome sequence similarity and evolutionary links with cytorhabdoviruses. • ADV protein localization and interaction data show an association with the nucleus. • ADV combines properties of both cytoplasmic and nuclear plant rhabdoviruses

  19. Transcriptator: An Automated Computational Pipeline to Annotate Assembled Reads and Identify Non Coding RNA.

    Directory of Open Access Journals (Sweden)

    Kumar Parijat Tripathi

    Full Text Available RNA-seq is a new tool to measure RNA transcript counts, using high-throughput sequencing at an extraordinary accuracy. It provides quantitative means to explore the transcriptome of an organism of interest. However, interpreting this extremely large data into biological knowledge is a problem, and biologist-friendly tools are lacking. In our lab, we developed Transcriptator, a web application based on a computational Python pipeline with a user-friendly Java interface. This pipeline uses the web services available for BLAST (Basis Local Search Alignment Tool, QuickGO and DAVID (Database for Annotation, Visualization and Integrated Discovery tools. It offers a report on statistical analysis of functional and Gene Ontology (GO annotation's enrichment. It helps users to identify enriched biological themes, particularly GO terms, pathways, domains, gene/proteins features and protein-protein interactions related informations. It clusters the transcripts based on functional annotations and generates a tabular report for functional and gene ontology annotations for each submitted transcript to the web server. The implementation of QuickGo web-services in our pipeline enable the users to carry out GO-Slim analysis, whereas the integration of PORTRAIT (Prediction of transcriptomic non coding RNA (ncRNA by ab initio methods helps to identify the non coding RNAs and their regulatory role in transcriptome. In summary, Transcriptator is a useful software for both NGS and array data. It helps the users to characterize the de-novo assembled reads, obtained from NGS experiments for non-referenced organisms, while it also performs the functional enrichment analysis of differentially expressed transcripts/genes for both RNA-seq and micro-array experiments. It generates easy to read tables and interactive charts for better understanding of the data. The pipeline is modular in nature, and provides an opportunity to add new plugins in the future. Web application is

  20. An intergenic non-coding rRNA correlated with expression of the rRNA and frequency of an rRNA single nucleotide polymorphism in lung cancer cells.

    Directory of Open Access Journals (Sweden)

    Yih-Horng Shiao

    Full Text Available BACKGROUND: Ribosomal RNA (rRNA is a central regulator of cell growth and may control cancer development. A cis noncoding rRNA (nc-rRNA upstream from the 45S rRNA transcription start site has recently been implicated in control of rRNA transcription in mouse fibroblasts. We investigated whether a similar nc-rRNA might be expressed in human cancer epithelial cells, and related to any genomic characteristics. METHODOLOGY/PRINCIPAL FINDINGS: Using quantitative rRNA measurement, we demonstrated that a nc-rRNA is transcribed in human lung epithelial and lung cancer cells, starting from approximately -1000 nucleotides upstream of the rRNA transcription start site (+1 and extending at least to +203. This nc-rRNA was significantly more abundant in the majority of lung cancer cell lines, relative to a nontransformed lung epithelial cell line. Its abundance correlated negatively with total 45S rRNA in 12 of 13 cell lines (P = 0.014. During sequence analysis from -388 to +306, we observed diverse, frequent intercopy single nucleotide polymorphisms (SNPs in rRNA, with a frequency greater than predicted by chance at 12 sites. A SNP at +139 (U/C in the 5' leader sequence varied among the cell lines and correlated negatively with level of the nc-rRNA (P = 0.014. Modelling of the secondary structure of the rRNA 5'-leader sequence indicated a small increase in structural stability due to the +139 U/C SNP and a minor shift in local configuration occurrences. CONCLUSIONS/SIGNIFICANCE: The results demonstrate occurrence of a sense nc-rRNA in human lung epithelial and cancer cells, and imply a role in regulation of the rRNA gene, which may be affected by a +139 SNP in the 5' leader sequence of the primary rRNA transcript.

  1. RANDNA: a random DNA sequence generator.

    Science.gov (United States)

    Piva, Francesco; Principato, Giovanni

    2006-01-01

    Monte Carlo simulations are useful to verify the significance of data. Genomic regularities, such as the nucleotide correlations or the not uniform distribution of the motifs throughout genomic or mature mRNA sequences, exist and their significance can be checked by means of the Monte Carlo test. The test needs good quality random sequences in order to work, moreover they should have the same nucleotide distribution as the sequences in which the regularities have been found. Random DNA sequences are also useful to estimate the background score of an alignment, that is a threshold below which the resulting score is merely due to chance. We have developed RANDNA, a free software which allows to produce random DNA or RNA sequences setting both their length and the percentage of nucleotide composition. Sequences having the same nucleotide distribution of exonic, intronic or intergenic sequences can be generated. Its graphic interface makes it possible to easily set the parameters that characterize the sequences being produced and saved in a text format file. The pseudo-random number generator function of Borland Delphi 6 is used, since it guarantees a good randomness, a long cycle length and a high speed. We have checked the quality of sequences generated by the software, by means of well-known tests, both by themselves and versus genuine random sequences. We show the good quality of the generated sequences. The software, complete with examples and documentation, is freely available to users from: http://www.introni.it/en/software.

  2. Development of 3D CFD code based on structured non-orthogonal grids

    International Nuclear Information System (INIS)

    Vaidya, Abhijeet Mohan; Maheshwari, Naresh Kumar; Rama Rao, A.

    2016-01-01

    Most of the nuclear industry problems involve complex geometries. Solution of flow and heat transfer over complex geometries is a very important requirement for designing new reactor systems. Hence development of a general purpose three dimensional (3D) CFD code is undertaken. For handling complex shape of computational domain, implementation on structured non-orthogonal coordinates is being done. The code is validated by comparing its results for 3D inclined lid driven cavity at different inclination angles and Reynolds numbers with OpenFOAM results. This paper contains formulation and validation of the new code developed. (author)

  3. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Full Text Available Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47 required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNP arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75% steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3, a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  4. Exome sequencing generates high quality data in non-target regions

    Directory of Open Access Journals (Sweden)

    Guo Yan

    2012-05-01

    Full Text Available Abstract Background Exome sequencing using next-generation sequencing technologies is a cost efficient approach to selectively sequencing coding regions of human genome for detection of disease variants. A significant amount of DNA fragments from the capture process fall outside target regions, and sequence data for positions outside target regions have been mostly ignored after alignment. Result We performed whole exome sequencing on 22 subjects using Agilent SureSelect capture reagent and 6 subjects using Illumina TrueSeq capture reagent. We also downloaded sequencing data for 6 subjects from the 1000 Genomes Project Pilot 3 study. Using these data, we examined the quality of SNPs detected outside target regions by computing consistency rate with genotypes obtained from SNP chips or the Hapmap database, transition-transversion (Ti/Tv ratio, and percentage of SNPs inside dbSNP. For all three platforms, we obtained high-quality SNPs outside target regions, and some far from target regions. In our Agilent SureSelect data, we obtained 84,049 high-quality SNPs outside target regions compared to 65,231 SNPs inside target regions (a 129% increase. For our Illumina TrueSeq data, we obtained 222,171 high-quality SNPs outside target regions compared to 95,818 SNPs inside target regions (a 232% increase. For the data from the 1000 Genomes Project, we obtained 7,139 high-quality SNPs outside target regions compared to 1,548 SNPs inside target regions (a 461% increase. Conclusions These results demonstrate that a significant amount of high quality genotypes outside target regions can be obtained from exome sequencing data. These data should not be ignored in genetic epidemiology studies.

  5. Complete motif analysis of sequence requirements for translation initiation at non-AUG start codons.

    Science.gov (United States)

    Diaz de Arce, Alexander J; Noderer, William L; Wang, Clifford L

    2018-01-25

    The initiation of mRNA translation from start codons other than AUG was previously believed to be rare and of relatively low impact. More recently, evidence has suggested that as much as half of all translation initiation utilizes non-AUG start codons, codons that deviate from AUG by a single base. Furthermore, non-AUG start codons have been shown to be involved in regulation of expression and disease etiology. Yet the ability to gauge expression based on the sequence of a translation initiation site (start codon and its flanking bases) has been limited. Here we have performed a comprehensive analysis of translation initiation sites that utilize non-AUG start codons. By combining genetic-reporter, cell-sorting, and high-throughput sequencing technologies, we have analyzed the expression associated with all possible variants of the -4 to +4 positions of non-AUG translation initiation site motifs. This complete motif analysis revealed that 1) with the right sequence context, certain non-AUG start codons can generate expression comparable to that of AUG start codons, 2) sequence context affects each non-AUG start codon differently, and 3) initiation at non-AUG start codons is highly sensitive to changes in the flanking sequences. Complete motif analysis has the potential to be a key tool for experimental and diagnostic genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Basic RNases of wild almond (Prunus webbii): cloning and characterization of six new S-RNase and one "non-S RNase" genes.

    Science.gov (United States)

    Banović, Bojana; Surbanovski, Nada; Konstantinović, Miroslav; Maksimović, Vesna

    2009-03-01

    In order to investigate the S-RNase allele structure of a Prunus webbii population from the Montenegrin region of the Balkans, we analyzed 10 Prunus webbii accessions. We detected 10 different S-RNase allelic variants and obtained the nucleotide sequences for six S-RNases. The BLAST analysis showed that these six sequences were new Prunus webbii S-RNase alleles. It also revealed that one of sequenced alleles, S(9)-RNase, coded for an amino acid sequence identical to that for Prunus dulcis S(14)-RNase, except for a single conservative amino acid replacement in the signal peptide region. Another, S(3)-RNase, was shown to differ by only three amino acid residues from Prunus salicina Se-RNase. The allele S(7)-RNase was found to be inactive by stylar protein isoelectric focusing followed by RNase-specific staining, but the reason for the inactivity was not at the coding sequence level. Further, in five of the 10 analyzed accessions, we detected the presence of one active basic RNase (marked PW(1)) that did not amplify with S-RNase-specific DNA primers. However, it was amplified with primers designed from the PA1 RNase nucleotide sequence (basic "non-S RNase" of Prunus avium) and the obtained sequence showed high homology (80%) with the PA1 allele. Although homologs of PA1 "non-S RNases" have been reported in four other Prunus species, this is the first recorded homolog in Prunus webbii. The evolutionary implications of the data are discussed.

  7. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  8. Self-complementary circular codes in coding theory.

    Science.gov (United States)

    Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz

    2018-04-01

    Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.

  9. Computational Approaches Reveal New Insights into Regulation and Function of Non; coding RNAs and their Targets

    KAUST Repository

    Alam, Tanvir

    2016-01-01

    Regulation and function of protein-coding genes are increasingly well-understood, but no comparable evidence exists for non-coding RNA (ncRNA) genes, which appear to be more numerous than protein-coding genes. We developed a novel machine

  10. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3' UTRs and coding sequences.

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M; Robins, Harlan S; Vaníček, Jiří

    2015-07-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3' untranslated regions (3' UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3' UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA-mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA-mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. The sequence coding and search system: an approach for constructing and analyzing event sequences at commercial nuclear power plants

    International Nuclear Information System (INIS)

    Mays, G.T.

    1990-01-01

    The U.S. Nuclear Regulatory Commission (NRC) has recognized the importance of the collection, assessment, and feedback of operating experience data from commercial nuclear power plants and has centralized these activities in the Office for Analysis and Evaluation of Operational Data (AEOD). Such data is essential for performing safety and reliability analyses, especially analyses of trends and patterns to identify undesirable changes in plant performance at the earliest opportunity to implement corrective measures to preclude the occurrence of a more serious event. One of NRC's principal tools for collecting and evaluating operating experience data is the Sequence Coding and Search System (SCSS). The SCSS consists of a methodology for structuring event sequences and the requisite computer system to store and search the data. The source information for SCSS is the Licensee Event Report (LER), which is a legally required document. This paper describes the objectives of SCSS, the information it contains, and the format and approach for constructing SCSS event sequences. Examples are presented demonstrating the use of SCSS to support the analysis of LER data. The SCSS contains over 30,000 LERs describing events from 1980 through the present. Insights gained from working with a complex data system from the initial developmental stage to the point of a mature operating system are highlighted. Considerable experience has been gained in the areas of evolving and changing data requirements, staffing requirements, and quality control and quality assurance procedures for addressing consistency, software/hardware considerations for developing and maintaining a complex system, documentation requirements, and end-user needs. Two other approaches for constructing and evaluating event sequences are examined including the Accident Precursor Program (ASP) where sequences having the potential for core damage are identified and analyzed, and the Significant Event Compilation Tree

  12. Development of non-orthogonal and 2-dimensional numerical code TFC2D-BFC for fluid flow

    International Nuclear Information System (INIS)

    Park, Ju Yeop; In, Wang Kee; Chun, Tae Hyun; Oh, Dong Seok

    2000-09-01

    The development of algorithm for three dimensional non-orthogonal coordinate system has been made. The algorithm adopts a non-staggered grid system, Cartesian velocity components for independent variables of momentum equations and a SIMPLER algorithm for a pressure correction equation. Except the pressure correction method, the selected grid system and the selected independent variables for momentum equations have been widely used in a commercial code. It is well known that the SIMPLER is superior to the SIMPLE algorithm in the view of convergence rate. Using this algorithm, a two dimensional non-orthogonal numerical code has been completed. The code adopts a structured single square block in a computational domain with a uniform mesh interval. Consequently, any solid body existing in a flow field can be implemented in the numerical code through a blocked-off method which was devised by Patankar

  13. Circular codes revisited: a statistical approach.

    Science.gov (United States)

    Gonzalez, D L; Giannerini, S; Rosa, R

    2011-04-21

    In 1996 Arquès and Michel [1996. A complementary circular code in the protein coding genes. J. Theor. Biol. 182, 45-58] discovered the existence of a common circular code in eukaryote and prokaryote genomes. Since then, circular code theory has provoked great interest and underwent a rapid development. In this paper we discuss some theoretical issues related to the synchronization properties of coding sequences and circular codes with particular emphasis on the problem of retrieval and maintenance of the reading frame. Motivated by the theoretical discussion, we adopt a rigorous statistical approach in order to try to answer different questions. First, we investigate the covering capability of the whole class of 216 self-complementary, C(3) maximal codes with respect to a large set of coding sequences. The results indicate that, on average, the code proposed by Arquès and Michel has the best covering capability but, still, there exists a great variability among sequences. Second, we focus on such code and explore the role played by the proportion of the bases by means of a hierarchy of permutation tests. The results show the existence of a sort of optimization mechanism such that coding sequences are tailored as to maximize or minimize the coverage of circular codes on specific reading frames. Such optimization clearly relates the function of circular codes with reading frame synchronization. Copyright © 2011 Elsevier Ltd. All rights reserved.

  14. Evaluation of temporary non-code repairs in safety class 3 piping systems

    International Nuclear Information System (INIS)

    Godha, P.C.; Kupinski, M.; Azevedo, N.F.

    1996-01-01

    Temporary non-ASME Code repairs in safety class 3 pipe and piping components are permissible during plant operation in accordance with Nuclear Regulatory Commission Generic Letter 90-05. However, regulatory acceptance of such repairs requires the licensee to undertake several timely actions. Consistent with the requirements of GL 90-05, this paper presents an overview of the detailed evaluation and relief request process. The technical criteria encompasses both ductile and brittle piping materials. It also lists appropriate evaluation methods that a utility engineer can select to perform a structural integrity assessment for design basis loading conditions to support the use of temporary non-Code repair for degraded piping components. Most use of temporary non-code repairs at a nuclear generating station is in the service water system which is an essential safety related system providing the ultimate heat sink for various plant systems. Depending on the plant siting, the service water system may use fresh water or salt water as the cooling medium. Various degradation mechanisms including general corrosion, erosion/corrosion, pitting, microbiological corrosion, galvanic corrosion, under-deposit corrosion or a combination thereof continually challenge the pressure boundary structural integrity. A good source for description of corrosion degradation in cooling water systems is provided in a cited reference

  15. Application of whole genome sequence data in analyzing the molecular epidemiology of Shiga toxin-producing Escherichia coli O157:H7/H.

    Science.gov (United States)

    Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi

    2018-01-02

    Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters included 136 strains containing strains from nine outbreaks, with each outbreak caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data of these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: short read data were directly mapped to a reference genome (mapping derived SNPs) and common SNPs between the mapping derived SNPs and SNPs in assembled data of short read data (common SNPs). Among both SNPs, those that were detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNPs was investigated among all the concatenated SNPs that were detected (whole SNP set); SNPs were divided into three categories based on the genes in which they were located (i.e., backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single source derived outbreaks were analyzed using an unweighted pair group method with arithmetic mean tree (UPGMA) and a minimum spanning tree (MST), the maximum pair-wise distances of the backbone SNP set of the mapping derived SNPs were significantly smaller than those of the whole and intergenic region SNP set on both UPGMAs and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs were examined (Steel-Dwass test, P≤0.01). When the maximum pair-wise distances were compared between the mapping derived and common SNPs, significant differences were observed in those of the whole, mobile element, and intergenic region SNP set (Wilcoxon signed rank test, P≤0.01). When all

  16. Serotype identification and VP1 coding sequence analysis of foot-and-mouth disease virus from outbreaks in Eastern and Northern Uganda in 2008/9

    DEFF Research Database (Denmark)

    Kasambula, L.; Belsham, Graham; Siegismund, H. R.

    2012-01-01

    regions, and the presence of FMDV RNA in these samples was determined using a standard diagnostic RT-PCR assay. From the total of 27 positive samples, the VP1 coding region was amplified and sequenced. Each of these sequences showed >99% identity to each other, and just five distinct sequences were...

  17. Functional Characterization of MC1R-TUBB3 Intergenic Splice Variants of the Human Melanocortin 1 Receptor.

    Directory of Open Access Journals (Sweden)

    Cecilia Herraiz

    Full Text Available The melanocortin 1 receptor gene (MC1R expressed in melanocytes is a major determinant of skin pigmentation. It encodes a Gs protein-coupled receptor activated by α-melanocyte stimulating hormone (αMSH. Human MC1R has an inefficient poly(A site allowing intergenic splicing with its downstream neighbour Tubulin-β-III (TUBB3. Intergenic splicing produces two MC1R isoforms, designated Iso1 and Iso2, bearing the complete seven transmembrane helices from MC1R fused to TUBB3-derived C-terminal extensions, in-frame for Iso1 and out-of-frame for Iso2. It has been reported that exposure to ultraviolet radiation (UVR might promote an isoform switch from canonical MC1R (MC1R-001 to the MC1R-TUBB3 chimeras, which might lead to novel phenotypes required for tanning. We expressed the Flag epitope-tagged intergenic isoforms in heterologous HEK293T cells and human melanoma cells, for functional characterization. Iso1 was expressed with the expected size. Iso2 yielded a doublet of Mr significantly lower than predicted, and impaired intracellular stability. Although Iso1- and Iso2 bound radiolabelled agonist with the same affinity as MC1R-001, their plasma membrane expression was strongly reduced. Decreased surface expression mostly resulted from aberrant forward trafficking, rather than high rates of endocytosis. Functional coupling of both isoforms to cAMP was lower than wild-type, but ERK activation upon binding of αMSH was unimpaired, suggesting imbalanced signaling from the splice variants. Heterodimerization of differentially labelled MC1R-001 with the splicing isoforms analyzed by co-immunoprecipitation was efficient and caused decreased surface expression of binding sites. Thus, UVR-induced MC1R isoforms might contribute to fine-tune the tanning response by modulating MC1R-001 availability and functional parameters.

  18. Promoter Analysis Reveals Globally Differential Regulation of Human Long Non-Coding RNA and Protein-Coding Genes

    KAUST Repository

    Alam, Tanvir

    2014-10-02

    Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

  19. Event-Related Potential Correlates of Declarative and Non-Declarative Sequence Knowledge

    Science.gov (United States)

    Ferdinand, Nicola K.; Runger, Dennis; Frensch, Peter A.; Mecklinger, Axel

    2010-01-01

    The goal of the present study was to demonstrate that declarative and non-declarative knowledge acquired in an incidental sequence learning task contributes differentially to memory retrieval and leads to dissociable ERP signatures in a recognition memory task. For this purpose, participants performed a sequence learning task and were classified…

  20. Evolutionary dynamics of microsatellite distribution in plants: insight from the comparison of sequenced brassica, Arabidopsis and other angiosperm species.

    Directory of Open Access Journals (Sweden)

    Jiaqin Shi

    Full Text Available Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than those for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable elements contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences. The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type the angiosperm species (aside from a few species all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite

  1. Evolutionary Dynamics of Microsatellite Distribution in Plants: Insight from the Comparison of Sequenced Brassica, Arabidopsis and Other Angiosperm Species

    Science.gov (United States)

    Shi, Jiaqin; Huang, Shunmou; Fu, Donghui; Yu, Jinyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2013-01-01

    Despite their ubiquity and functional importance, microsatellites have been largely ignored in comparative genomics, mostly due to the lack of genomic information. In the current study, microsatellite distribution was characterized and compared in the whole genomes and both the coding and non-coding DNA sequences of the sequenced Brassica, Arabidopsis and other angiosperm species to investigate their evolutionary dynamics in plants. The variation in the microsatellite frequencies of these angiosperm species was much smaller than those for their microsatellite numbers and genome sizes, suggesting that microsatellite frequency may be relatively stable in plants. The microsatellite frequencies of these angiosperm species were significantly negatively correlated with both their genome sizes and transposable elements contents. The pattern of microsatellite distribution may differ according to the different genomic regions (such as coding and non-coding sequences). The observed differences in many important microsatellite characteristics (especially the distribution with respect to motif length, type and repeat number) of these angiosperm species were generally accordant with their phylogenetic distance, which suggested that the evolutionary dynamics of microsatellite distribution may be generally consistent with plant divergence/evolution. Importantly, by comparing these microsatellite characteristics (especially the distribution with respect to motif type) the angiosperm species (aside from a few species) all clustered into two obviously different groups that were largely represented by monocots and dicots, suggesting a complex and generally dichotomous evolutionary pattern of microsatellite distribution in angiosperms. Polyploidy may lead to a slight increase in microsatellite frequency in the coding sequences and a significant decrease in microsatellite frequency in the whole genome/non-coding sequences, but have little effect on the microsatellite distribution with

  2. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Abstract Background Genome survey sequences (GSS offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454-reads of an Escheria coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  3. Improved resolution of reef-coral endosymbiont (Symbiodinium species diversity, ecology, and evolution through psbA non-coding region genotyping.

    Directory of Open Access Journals (Sweden)

    Todd C LaJeunesse

    Full Text Available Ribosomal DNA sequence data abounds from numerous studies on the dinoflagellate endosymbionts of corals, and yet the multi-copy nature and intragenomic variability of rRNA genes and spacers confound interpretations of symbiont diversity and ecology. Making consistent sense of extensive sequence variation in a meaningful ecological and evolutionary context would benefit from the application of additional genetic markers. Sequences of the non-coding region of the plastid psbA minicircle (psbA(ncr were used to independently examine symbiont genotypic and species diversity found within and between colonies of Hawaiian reef corals in the genus Montipora. A single psbA(ncr haplotype was recovered in most samples through direct sequencing (~80-90% and members of the same internal transcribed spacer region 2 (ITS2 type were phylogenetically differentiated from other ITS2 types by substantial psbA(ncr sequence divergence. The repeated sequencing of bacterially-cloned fragments of psbA(ncr from samples and clonal cultures often recovered a single numerically common haplotype accompanied by rare, highly-similar, sequence variants. When sequence artifacts of cloning and intragenomic variation are factored out, these data indicate that most colonies harbored one dominant Symbiodinium genotype. The cloning and sequencing of ITS2 DNA amplified from these same samples recovered numerically abundant variants (that are diagnostic of distinct Symbiodinium lineages, but also generated a large amount of sequences comprising PCR/cloning artifacts combined with ancestral and/or rare variants that, if incorporated into phylogenetic reconstructions, confound how small sequence differences are interpreted. Finally, psbA(ncr sequence data from a broad sampling of Symbiodinium diversity obtained from various corals throughout the Indo-Pacific were concordant with ITS lineage membership (defined by denaturing gradient gel electrophoresis screening, yet exhibited

  4. Molecular characterization of Banana streak virus isolate from Musa Acuminata in China.

    Science.gov (United States)

    Zhuang, Jun; Wang, Jian-Hua; Zhang, Xin; Liu, Zhi-Xin

    2011-12-01

    Banana streak virus (BSV), a member of genus Badnavirus, is a causal agent of banana streak disease throughout the world. The genetic diversity of BSVs from different regions of banana plantations has previously been investigated, but there are relatively few reports of the genetic characteristic of episomal (non-integrated) BSV genomes isolated from China. Here, the complete genome, a total of 7722bp (GenBank accession number DQ092436), of an isolate of Banana streak virus (BSV) on cultivar Cavendish (BSAcYNV) in Yunnan, China was determined. The genome organises in the typical manner of badnaviruses. The intergenic region of genomic DNA contains a large stem-loop, which may contribute to the ribosome shift into the following open reading frames (ORFs). The coding region of BSAcYNV consists of three overlapping ORFs, ORF1 with a non-AUG start codon and ORF2 encoding two small proteins are individually involved in viral movement and ORF3 encodes a polyprotein. Besides the complete genome, a defective genome lacking the whole RNA leader region and a majority of ORF1 and which encompasses 6525bp was also isolated and sequenced from this BSV DNA reservoir in infected banana plants. Sequence analyses showed that BSAcYNV has closest similarity in terms of genome organization and the coding assignments with an BSV isolate from Vietnam (BSAcVNV). The corresponding coding regions shared identities of 88% and -95% at nucleotide and amino acid levels, respectively. Phylogenetic analysis also indicated BSAcYNV shared the closest geographical evolutionary relationship to BSAcVNV among sequenced banana streak badnaviruses.

  5. Selection, Recombination and History in a Parasitic Flatworm (Echinococcus Inferred from Nucleotide Sequences

    Directory of Open Access Journals (Sweden)

    Haag KL

    1998-01-01

    Full Text Available Three species of flatworms from the genus Echinococcus (E. granulosus, E. multilocularis and E. vogeli and four strains of E. granulosus (cattle, horse, pig and sheep strains were analysed by the PCR-SSCP method followed by sequencing, using as targets two non-coding and two coding (one nuclear and one mitochondrial genomic regions. The sequencing data was used to evaluate hypothesis about the parasite breeding system and the causes of genetic diversification. The calculated recombination parameters suggested that cross-fertilisation was rare in the history of the group. However, the relative rates of substitution in the coding sequences showed that positive selection (instead of purifying selection drove the evolution of an elastase and neutrophil chemotaxis inhibitor gene (AgB/1. The phylogenetic analyses revealed several ambiguities, indicating that the taxonomic status of the E. granulosus horse strain should be revised

  6. The emerging role of non-coding RNAs in drug addiction

    Directory of Open Access Journals (Sweden)

    Gregory Charles Sartor

    2012-06-01

    Full Text Available Prolonged drug use causes long-lasting neuroadaptations in reward-related brain areas that contribute to addiction. Despite significant amount of research dedicated to understanding the underlying mechanisms of addiction, the molecular underpinnings remain unclear. At the same time, much of the pervasive transcription that encompasses the human genome occurs in the nervous system and contributes to its heterogeneity and complexity. Recent evidence suggests that non-coding RNAs (ncRNAs play an important and dynamic role in transcriptional regulation, epigenetic signaling, stress response, and plasticity in the nervous system. Dysregulation of ncRNAs are thought to contribute to many, and perhaps all, neurological disorders, including addiction. Here, we review recent insights in the functional relevance of ncRNAs, including both microRNAs (miRNAs and long non-coding RNAs (lncRNAs, and then illustrate specific examples of ncRNA regulation in the context of drug addiction. We conclude that ncRNAs are importantly involved in the persistent neuroadaptations associated with addiction-related behaviors, and that therapies that target specific ncRNAs may represent new avenues for the treatment of drug addiction.

  7. The roles of non-coding RNAs in cardiac regenerative medicine

    Directory of Open Access Journals (Sweden)

    Oi Kuan Choong

    2017-06-01

    Full Text Available The emergence of non-coding RNAs (ncRNAs has challenged the central dogma of molecular biology that dictates that the decryption of genetic information starts from transcription of DNA to RNA, with subsequent translation into a protein. Large numbers of ncRNAs with biological significance have now been identified, suggesting that ncRNAs are important in their own right and their roles extend far beyond what was originally envisaged. ncRNAs do not only regulate gene expression, but are also involved in chromatin architecture and structural conformation. Several studies have pointed out that ncRNAs participate in heart disease; however, the functions of ncRNAs still remain unclear. ncRNAs are involved in cellular fate, differentiation, proliferation and tissue regeneration, hinting at their potential therapeutic applications. Here, we review the current understanding of both the biological functions and molecular mechanisms of ncRNAs in heart disease and describe some of the ncRNAs that have potential heart regeneration effects. Keywords: Non-coding RNAs, Cardiac regeneration, Cardiac fate, Proliferation, Differentiation, Reprograming

  8. Improvements to Busquet's Non LTE algorithm in NRL's Hydro code

    Science.gov (United States)

    Klapisch, M.; Colombant, D.

    1996-11-01

    Implementation of the Non LTE model RADIOM (M. Busquet, Phys. Fluids B, 5, 4191 (1993)) in NRL's RAD2D Hydro code in conservative form was reported previously(M. Klapisch et al., Bull. Am. Phys. Soc., 40, 1806 (1995)).While the results were satisfactory, the algorithm was slow and not always converging. We describe here modifications that address the latter two shortcomings. This method is quicker and more stable than the original. It also gives information about the validity of the fitting. It turns out that the number and distribution of groups in the multigroup diffusion opacity tables - a basis for the computation of radiation effects in the ionization balance in RADIOM- has a large influence on the robustness of the algorithm. These modifications give insight about the algorithm, and allow to check that the obtained average charge state is the true average. In addition, code optimization resulted in greatly reduced computing time: The ratio of Non LTE to LTE computing times being now between 1.5 and 2.

  9. Construction and Analysis of a Novel 2-D Optical Orthogonal Codes Based on Modified One-coincidence Sequence

    Science.gov (United States)

    Ji, Jianhua; Wang, Yanfen; Wang, Ke; Xu, Ming; Zhang, Zhipeng; Yang, Shuwen

    2013-09-01

    A new two-dimensional OOC (optical orthogonal codes) named PC/MOCS is constructed, using PC (prime code) for time spreading and MOCS (modified one-coincidence sequence) for wavelength hopping. Compared with PC/PC, the number of wavelengths for PC/MOCS is not limited to a prime number. Compared with PC/OCS, the length of MOCS need not be expanded to the same length of PC. PC/MOCS can be constructed flexibly, and also can use available wavelengths effectively. Theoretical analysis shows that PC/MOCS can reduce the bit error rate (BER) of OCDMA system, and can support more users than PC/PC and PC/OCS.

  10. History and genomic sequence analysis of the herpes simplex virus 1 KOS and KOS1.1 sub-strains.

    Science.gov (United States)

    Colgrove, Robert C; Liu, Xueqiao; Griffiths, Anthony; Raja, Priya; Deluca, Neal A; Newman, Ruchi M; Coen, Donald M; Knipe, David M

    2016-01-01

    A collection of genomic DNA sequences of herpes simplex virus (HSV) strains has been defined and analyzed, and some information is available about genomic stability upon limited passage of viruses in culture. The nature of genomic change upon extensive laboratory passage remains to be determined. In this report we review the history of the HSV-1 KOS laboratory strain and the related KOS1.1 laboratory sub-strain, also called KOS (M), and determine the complete genomic sequence of an early passage stock of the KOS laboratory sub-strain and a laboratory stock of the KOS1.1 sub-strain. The genomes of the two sub-strains are highly similar with only five coding changes, 20 non-coding changes, and about twenty non-ORF sequence changes. The coding changes could potentially explain the KOS1.1 phenotypic properties of increased replication at high temperature and reduced neuroinvasiveness. The study also provides sequence markers to define the provenance of specific laboratory KOS virus stocks. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. On fuzzy semantic similarity measure for DNA coding.

    Science.gov (United States)

    Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin

    2016-02-01

    A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons׳ clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides׳ fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. Copyright © 2015 Elsevier Ltd. All rights reserved.

  12. Deciphering the genetic regulatory code using an inverse error control coding framework.

    Energy Technology Data Exchange (ETDEWEB)

    Rintoul, Mark Daniel; May, Elebeoba Eni; Brown, William Michael; Johnston, Anna Marie; Watson, Jean-Paul

    2005-03-01

    We have found that developing a computational framework for reconstructing error control codes for engineered data and ultimately for deciphering genetic regulatory coding sequences is a challenging and uncharted area that will require advances in computational technology for exact solutions. Although exact solutions are desired, computational approaches that yield plausible solutions would be considered sufficient as a proof of concept to the feasibility of reverse engineering error control codes and the possibility of developing a quantitative model for understanding and engineering genetic regulation. Such evidence would help move the idea of reconstructing error control codes for engineered and biological systems from the high risk high payoff realm into the highly probable high payoff domain. Additionally this work will impact biological sensor development and the ability to model and ultimately develop defense mechanisms against bioagents that can be engineered to cause catastrophic damage. Understanding how biological organisms are able to communicate their genetic message efficiently in the presence of noise can improve our current communication protocols, a continuing research interest. Towards this end, project goals include: (1) Develop parameter estimation methods for n for block codes and for n, k, and m for convolutional codes. Use methods to determine error control (EC) code parameters for gene regulatory sequence. (2) Develop an evolutionary computing computational framework for near-optimal solutions to the algebraic code reconstruction problem. Method will be tested on engineered and biological sequences.

  13. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    Science.gov (United States)

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  14. Wind Noise Reduction using Non-negative Sparse Coding

    DEFF Research Database (Denmark)

    Schmidt, Mikkel N.; Larsen, Jan; Hsiao, Fu-Tien

    2007-01-01

    We introduce a new speaker independent method for reducing wind noise in single-channel recordings of noisy speech. The method is based on non-negative sparse coding and relies on a wind noise dictionary which is estimated from an isolated noise recording. We estimate the parameters of the model ...... and discuss their sensitivity. We then compare the algorithm with the classical spectral subtraction method and the Qualcomm-ICSI-OGI noise reduction method. We optimize the sound quality in terms of signal-to-noise ratio and provide results on a noisy speech recognition task....

  15. Identification of developmentally regulated PCP-responsive non-coding RNA, prt6, in the rat thalamus.

    Directory of Open Access Journals (Sweden)

    Hironao Takebayashi

    Full Text Available Schizophrenia and similar psychoses induced by NMDA-type glutamate receptor antagonists, such as phencyclidine (PCP and ketamine, usually develop after adolescence. Moreover, adult-type behavioral disturbance following NMDA receptor antagonist application in rodents is observed after a critical period at around 3 postnatal weeks. These observations suggest that the schizophrenic symptoms caused by and psychotomimetic effects of NMDA antagonists require the maturation of certain brain neuron circuits and molecular networks, which differentially respond to NMDA receptor antagonists across adolescence and the critical period. From this viewpoint, we have identified a novel developmentally regulated phencyclidine-responsive transcript from the rat thalamus, designated as prt6, as a candidate molecule involved in the above schizophrenia-related systems using a DNA microarray technique. The transcript is a non-coding RNA that includes sequences of at least two microRNAs, miR132 and miR212, and is expressed strongly in the brain and testis, with trace or non-detectable levels in the spleen, heart, liver, kidney, lung and skeletal muscle, as revealed by Northern blot analysis. The systemic administration of PCP (7.5 mg/kg, subcutaneously (s.c. significantly elevated the expression of prt6 mRNA in the thalamus at postnatal days (PD 32 and 50, but not at PD 8, 13, 20, or 24 as compared to saline-treated controls. At PD 50, another NMDA receptor antagonist, dizocilpine (0.5 mg/kg, s.c., and a schizophrenomimetic dopamine agonist, methamphetamine (4.8 mg/kg, s.c., mimicked a significant increase in the levels of thalamic prt6 mRNAs, while a D2 dopmamine receptor antagonist, haloperidol, partly inhibited the increasing influence of PCP on thalamic prt6 expression without its own effects. These data indicate that prt6 may be involved in the pathophysiology of the onset of drug-induced schizophrenia-like symptoms and schizophrenia through the possible

  16. Reconstruction of ancestral RNA sequences under multiple structural constraints

    OpenAIRE

    Tremblay-Savard, Olivier; Reinharz, Vladimir; Waldisp?hl, J?r?me

    2016-01-01

    Background Secondary structures form the scaffold of multiple sequence alignment of non-coding RNA (ncRNA) families. An accurate reconstruction of ancestral ncRNAs must use this structural signal. However, the inference of ancestors of a single ncRNA family with a single consensus structure may bias the results towards sequences with high affinity to this structure, which are far from the true ancestors. Methods In this paper, we introduce achARNement, a maximum parsimony approach that, given...

  17. A method of non-contact reading code based on computer vision

    Science.gov (United States)

    Zhang, Chunsen; Zong, Xiaoyu; Guo, Bingxuan

    2018-03-01

    With the purpose of guarantee the computer information exchange security between internal and external network (trusted network and un-trusted network), A non-contact Reading code method based on machine vision has been proposed. Which is different from the existing network physical isolation method. By using the computer monitors, camera and other equipment. Deal with the information which will be on exchanged, Include image coding ,Generate the standard image , Display and get the actual image , Calculate homography matrix, Image distort correction and decoding in calibration, To achieve the computer information security, Non-contact, One-way transmission between the internal and external network , The effectiveness of the proposed method is verified by experiments on real computer text data, The speed of data transfer can be achieved 24kb/s. The experiment shows that this algorithm has the characteristics of high security, fast velocity and less loss of information. Which can meet the daily needs of the confidentiality department to update the data effectively and reliably, Solved the difficulty of computer information exchange between Secret network and non-secret network, With distinctive originality, practicability, and practical research value.

  18. Comprehensive Reconstruction and Visualization of Non-Coding Regulatory Networks in Human

    Science.gov (United States)

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape. PMID:25540777

  19. Comprehensive reconstruction and visualization of non-coding regulatory networks in human.

    Science.gov (United States)

    Bonnici, Vincenzo; Russo, Francesco; Bombieri, Nicola; Pulvirenti, Alfredo; Giugno, Rosalba

    2014-01-01

    Research attention has been powered to understand the functional roles of non-coding RNAs (ncRNAs). Many studies have demonstrated their deregulation in cancer and other human disorders. ncRNAs are also present in extracellular human body fluids such as serum and plasma, giving them a great potential as non-invasive biomarkers. However, non-coding RNAs have been relatively recently discovered and a comprehensive database including all of them is still missing. Reconstructing and visualizing the network of ncRNAs interactions are important steps to understand their regulatory mechanism in complex systems. This work presents ncRNA-DB, a NoSQL database that integrates ncRNAs data interactions from a large number of well established on-line repositories. The interactions involve RNA, DNA, proteins, and diseases. ncRNA-DB is available at http://ncrnadb.scienze.univr.it/ncrnadb/. It is equipped with three interfaces: web based, command-line, and a Cytoscape app called ncINetView. By accessing only one resource, users can search for ncRNAs and their interactions, build a network annotated with all known ncRNAs and associated diseases, and use all visual and mining features available in Cytoscape.

  20. Computationally Efficient Chaotic Spreading Sequence Selection for Asynchronous DS-CDMA

    Directory of Open Access Journals (Sweden)

    Litviņenko Anna

    2017-12-01

    Full Text Available The choice of the spreading sequence for asynchronous direct-sequence code-division multiple-access (DS-CDMA systems plays a crucial role for the mitigation of multiple-access interference. Considering the rich dynamics of chaotic sequences, their use for spreading allows overcoming the limitations of the classical spreading sequences. However, to ensure low cross-correlation between the sequences, careful selection must be performed. This paper presents a novel exhaustive search algorithm, which allows finding sets of chaotic spreading sequences of required length with a particularly low mutual cross-correlation. The efficiency of the search is verified by simulations, which show a significant advantage compared to non-selected chaotic sequences. Moreover, the impact of sequence length on the efficiency of the selection is studied.

  1. Conservation patterns in different functional sequence categoriesof divergent Drosophila species

    Energy Technology Data Exchange (ETDEWEB)

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conservedungapped blocks in genome-wide pairwise alignments of recently completedspecies of Drosophila: D.yakuba, D.ananassae, D.pseudoobscura, D.virilisand D.mojavensis. Based on these distributions we have found that nearlyevery functional sequence category possesses its own distinctiveconservation pattern, sometimes independent of the overall sequenceconservation level. In the coding and regulatory regions, the ungappedblocks were longer than in introns, UTRs and non-functional sequences. Atthe same time, the blocks in the coding regions carried 3N+2 signaturecharacteristic to synonymic substitutions in the 3rd codon positions.Larger block sizes in transcription regulatory regions can be explainedby the presence of conserved arrays of binding sites for transcriptionfactors. We also have shown that the longest ungapped blocks, or'ultraconserved' sequences, are associated with specific gene groups,including those encoding ion channels and components of the cytoskeleton.We discussed how restrained conservation patterns may help in mappingfunctional sequence categories and improving genomeannotation.

  2. Differentiation between Aspergillus flavus and Aspergillus parasiticus from Pure Culture and Aflatoxin-Contaminated Grapes Using PCR-RFLP Analysis of aflR-aflJ Intergenic Spacer

    International Nuclear Information System (INIS)

    El Khoury, A.; Atoui, A.; Lebrihi, A.; Rizk, T.; Lteif, R.; Kallassy, M.

    2011-01-01

    Aflatoxins (AFs) represent the most important single mycotoxin-related food safety problem in developed and developing countries as they have adverse effects on human and animal health. They are produced mainly by Aspergillus flavus and A. parasiticus. Both species have different aflatoxinogenic profile. In order to distinguish between A. flavus and A. parasiticus, gene-specific primers were designed to target the intergenic spacer (IGS) for the AF biosynthesis genes, aflJ and aflR. Polymerase chain reaction (PCR) products were subjected to restriction endonuclease analysis using BglII to look for restriction fragment length polymorphisms (RFLPs). Our result showed that both species displayed different PCR-based RFLP (PCR-RFLP) profile. PCR products from A. flavus cleaved into 3 fragments of 362, 210, and 102 bp. However, there is only one restriction site for this enzyme in the sequence of A. parasiticus that produced only 2 fragments of 363 and 311 bp. The method was successfully applied to contaminated grapes samples. This approach of differentiating these 2 species would be simpler, less costly, and quicker than conventional sequencing of PCR products and/or morphological identification. (author)

  3. Non-Coding RNAs in Castration-Resistant Prostate Cancer: Regulation of Androgen Receptor Signaling and Cancer Metabolism.

    Science.gov (United States)

    Shih, Jing-Wen; Wang, Ling-Yu; Hung, Chiu-Lien; Kung, Hsing-Jien; Hsieh, Chia-Ling

    2015-12-04

    Hormone-refractory prostate cancer frequently relapses from therapy and inevitably progresses to a bone-metastatic status with no cure. Understanding of the molecular mechanisms conferring resistance to androgen deprivation therapy has the potential to lead to the discovery of novel therapeutic targets for type of prostate cancer with poor prognosis. Progression to castration-resistant prostate cancer (CRPC) is characterized by aberrant androgen receptor (AR) expression and persistent AR signaling activity. Alterations in metabolic activity regulated by oncogenic pathways, such as c-Myc, were found to promote prostate cancer growth during the development of CRPC. Non-coding RNAs represent a diverse family of regulatory transcripts that drive tumorigenesis of prostate cancer and various other cancers by their hyperactivity or diminished function. A number of studies have examined differentially expressed non-coding RNAs in each stage of prostate cancer. Herein, we highlight the emerging impacts of microRNAs and long non-coding RNAs linked to reactivation of the AR signaling axis and reprogramming of the cellular metabolism in prostate cancer. The translational implications of non-coding RNA research for developing new biomarkers and therapeutic strategies for CRPC are also discussed.

  4. Comparison of four polymerase chain reaction assays for the detection of Brucella spp. in clinical samples from dogs

    Directory of Open Access Journals (Sweden)

    Eduardo J. Boeri

    2018-02-01

    Full Text Available Aim: This study aimed to compare the sensitivity (S, specificity (Sp, and positive likelihood ratios (LR+ of four polymerase chain reaction (PCR assays for the detection of Brucella spp. in dog's clinical samples. Materials and Methods: A total of 595 samples of whole blood, urine, and genital fluids were evaluated between October 2014 and November 2016. To compare PCR assays, the gold standard was defined using a combination of different serological and microbiological test. Bacterial isolation from urine and blood cultures was carried out. Serological methods such as rapid slide agglutination test, indirect enzyme-linked immunosorbent assay, agar gel immunodiffusion test, and buffered plate antigen test were performed. Four genes were evaluated: (i The gene coding for the BCSP31 protein, (ii the ribosomal gene coding for the 16S-23S intergenic spacer region, (iii the gene coding for porins omp2a/omp2b, and (iv the gene coding for the insertion sequence IS711. Results: The results obtained were as follows: (1 For the primers that amplify the gene coding for the BCSP31 protein: S: 45.64% (confidence interval [CI] 39.81-51.46, Sp: 95.62% (CI 93.13-98.12, and LR+: 10.43 (CI 6.04-18; (2 for the primers that amplify the ribosomal gene of the 16S-23S rDNA intergenic spacer region: S: 69.80% (CI 64.42-75.18, Sp: 95.62 % (CI 93.13-98.12, and LR+: 11.52 (CI 7.31-18.13; (3 for the primers that amplify the omp2a and omp2b genes: S: 39.26% (CI 33.55-44.97, Sp: 97.31% (CI 95.30-99.32, and LR+ 14.58 (CI 7.25-29.29; and (4 for the primers that amplify the insertion sequence IS711: S: 22.82% (CI 17.89 - 27.75, Sp: 99.66% (CI 98.84-100, and LR+ 67.77 (CI 9.47-484.89. Conclusion: We concluded that the gene coding for the 16S-23S rDNA intergenic spacer region was the one that best detected Brucella spp. in canine clinical samples.

  5. cDNA sequence of human transforming gene hst and identification of the coding sequence required for transforming activity

    International Nuclear Information System (INIS)

    Taira, M.; Yoshida, T.; Miyagawa, K.; Sakamoto, H.; Terada, M.; Sugimura, T.

    1987-01-01

    The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A) + RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity

  6. Complete chloroplast genome sequences of Drimys, Liriodendron, andPiper: Implications for the phylogeny of magnoliids and the evolution ofGC content

    Energy Technology Data Exchange (ETDEWEB)

    Zhengqiu, C.; Penaflor, C.; Kuehl, J.V.; Leebens-Mack, J.; Carlson, J.; dePamphilis, C.W.; Boore, J.L.; Jansen, R.K.

    2006-06-01

    The magnoliids represent the largest basal angiosperm clade with four orders, 19 families and 8,500 species. Although several recent angiosperm molecular phylogenies have supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence has resulted in phylogenies supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. This is one of the most important remaining issues concerning relationships among basal angiosperms. We sequenced the chloroplast genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other completed angiosperm chloroplast genomes to assess phylogenetic relationships among magnoliids. The Drimys and Piper chloroplast genomes are nearly identical in size at 160,606 and 160,624 bp, respectively. The genomes include a pair of inverted repeats of 26,649 bp (Drimys) and 27,039 (Piper), separated by a small single copy region of 18,621 (Drimys) and 18,878 (Piper) and a large single copy region of 88,685 bp (Drimys) and 87,666 bp (Piper). The gene order of both taxa is nearly identical to many other unrearranged angiosperm chloroplast genomes, including Calycanthus, the other published magnoliid genome. Comparisons of angiosperm chloroplast genomes indicate that GC content is not uniformly distributed across the genome. Overall GC content ranges from 34-39%, and coding regions have a substantially higher GC content than non-coding regions (both intergenic spacers and introns). Among protein-coding genes, GC content varies by codon position with 1st codon > 2nd codon > 3rd codon, and it varies by functional group with photosynthetic genes having the highest percentage and NADH genes the lowest. Across the genome, GC content is highest in

  7. Evaluation of Automated Ribosomal Intergenic Spacer Analysis for Bacterial Fingerprinting of Rumen Microbiome Compared to Pyrosequencing Technology

    Directory of Open Access Journals (Sweden)

    Elie Jami

    2014-01-01

    Full Text Available The mammalian gut houses a complex microbial community which is believed to play a significant role in host physiology. In recent years, several microbial community analysis methods have been implemented to study the whole gut microbial environment, in contrast to classical microbiological methods focusing on bacteria which can be cultivated. One of these is automated ribosomal intergenic spacer analysis (ARISA, an inexpensive and popular way of analyzing bacterial diversity and community fingerprinting in ecological samples. ARISA uses the natural variability in length of the DNA fragment found between the 16S and 23S genes in different bacterial lineages to infer diversity. This method is now being supplanted by affordable next-generation sequencing technologies that can also simultaneously annotate operational taxonomic units for taxonomic identification. We compared ARISA and pyrosequencing of samples from the rumen microbiome of cows, previously sampled at different stages of development and varying in microbial complexity using several ecological parameters. We revealed close agreement between ARISA and pyrosequencing outputs, especially in their ability to discriminate samples from different ecological niches. In contrast, the ARISA method seemed to underestimate sample richness. The good performance of the relatively inexpensive ARISA makes it relevant for straightforward use in bacterial fingerprinting analysis as well as for quick cross-validation of pyrosequencing data.

  8. Non-coding RNAs and heme oxygenase-1 in vaccinia virus infection

    International Nuclear Information System (INIS)

    Meseda, Clement A.; Srinivasan, Kumar; Wise, Jasen; Catalano, Jennifer; Yamada, Kenneth M.; Dhawan, Subhash

    2014-01-01

    Highlights: • Heme oxygenase-1 (HO-1) induction inhibited vaccinia virus infection of macrophages. • Reduced infectivity inversely correlated with increased expression of non-coding RNAs. • The regulation of HO-1 and ncRNAs suggests a novel host defense response against vaccinia virus infection. - Abstract: Small nuclear RNAs (snRNAs) are <200 nucleotide non-coding uridylate-rich RNAs. Although the functions of many snRNAs remain undetermined, a population of snRNAs is produced during the early phase of infection of cells by vaccinia virus. In the present study, we demonstrate a direct correlation between expression of the cytoprotective enzyme heme oxygenase-1 (HO-1), suppression of selective snRNA expression, and inhibition of vaccinia virus infection of macrophages. Hemin induced HO-1 expression, completely reversed virus-induced host snRNA expression, and suppressed vaccinia virus infection. This involvement of specific virus-induced snRNAs and associated gene clusters suggests a novel HO-1-dependent host-defense pathway in poxvirus infection

  9. Long non-coding RNA Loc554202 regulates proliferation and migration in breast cancer cells

    Energy Technology Data Exchange (ETDEWEB)

    Shi, Yongguo, E-mail: 1138303166@qq.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); Lu, Jianwei, E-mail: jianwei2010077@163.com [Cancer Hospital of Jiangsu Province, Nanjing, Jiangsu (China); Zhou, Jing, E-mail: 2310848@163.com [Department of Oncology, Taizhou People’ Hospital, Taizhou, Jiangsu (China); Tan, Xueming, E-mail: 843039795@qq.com [Department of Gastroenterology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); He, Ye, E-mail: 2825636@qq.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); Ding, Jie, E-mail: 9111165@qq.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); Tian, Yun, E-mail: 1815857@qq.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); Wang, Li, E-mail: 2376737@qq.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China); Wang, Keming, E-mail: wkmys@sohu.com [Department of Oncology, Second Affiliated Hospital, Nanjing Medical University, Nanjing, Jiangsu (China)

    2014-04-04

    Highlights: • First, we have shown that upregulated of the Loc554202 in breast cancer tissues. • Second, we demonstrated the function of Loc554202 in breast cancer cell. • Finally, we demonstrated that LOC554202 knockdown could inhibit tumor growth in vivo. - Abstract: Data derived from massive cloning and traditional sequencing methods have revealed that long non-coding RNAs (lncRNA) play important roles in the development and progression of cancer. Although many studies suggest that the lncRNAs have different cellular functions, many of them are not yet to be identified and characterized for the mechanism of their functions. To address this question, we assay the expression level of lncRNAs–Loc554202 in breast cancer tissues and find that Loc554202 is significantly increased compared with normal control, and associated with advanced pathologic stage and tumor size. Moreover, knockdown of Loc554202 decreased breast cancer cell proliferation, induced apoptosis and inhibits migration/invasion in vitro and impeded tumorigenesis in vivo. These data suggest an important role of Loc554202 in breast tumorigenesis.

  10. Identification of Small RNAs in Desulfovibrio vulgaris Hildenborough

    International Nuclear Information System (INIS)

    Burns, Andrew; Joachimiak, Marcin; Deutschbauer, Adam; Arkin, Adam; Bender, Kelly

    2010-01-01

    Desulfovibrio vulgaris is an anaerobic sulfate-reducing bacterium capable of facilitating the removal of toxic metals such as uranium from contaminated sites via reduction. As such, it is essential to understand the intricate regulatory cascades involved in how D. vulgaris and its relatives respond to stressors in such sites. One approach is the identification and analysis of small non-coding RNAs (sRNAs); molecules ranging in size from 20-200 nucleotides that predominantly affect gene regulation by binding to complementary mRNA in an anti-sense fashion and therefore provide an immediate regulatory response. To identify sRNAs in D. vulgaris, a bacterium that does not possess an annotated hfq gene, RNA was pooled from stationary and exponential phases, nitrate exposure, and biofilm conditions. The subsequent RNA was size fractionated, modified, and converted to cDNA for high throughput transcriptomic deep sequencing. A computational approach to identify sRNAs via the alignment of seven separate Desulfovibrio genomes was also performed. From the deep sequencing analysis, 2,296 reads between 20 and 250 nt were identified with expression above genome background. Analysis of those reads limited the number of candidates to ∼87 intergenic, while ∼140 appeared to be antisense to annotated open reading frames (ORFs). Further BLAST analysis of the intergenic candidates and other Desulfovibrio genomes indicated that eight candidates were likely portions of ORFs not previously annotated in the D. vulgaris genome. Comparison of the intergenic and antisense data sets to the bioinformatical predicted candidates, resulted in ∼54 common candidates. Current approaches using Northern analysis and qRT-PCR are being used toverify expression of the candidates and to further develop the role these sRNAs play in D. vulgaris regulation.

  11. In Silico Mining of Microsatellites in Coding Sequences of the Date Palm (Arecaceae Genome, Characterization, and Transferability

    Directory of Open Access Journals (Sweden)

    Frédérique Aberlenc-Bertossi

    2014-01-01

    Full Text Available Premise of the study: To complement existing sets of primarily dinucleotide microsatellite loci from noncoding sequences of date palm, we developed primers for tri- and hexanucleotide microsatellite loci identified within genes. Due to their conserved genomic locations, the primers should be useful in other palm taxa, and their utility was tested in seven other Phoenix species and in Chamaerops, Livistona, and Hyphaene. Methods and Results: Tandem repeat motifs of 3–6 bp were searched using a simple sequence repeat (SSR–pipeline package in coding portions of the date palm draft genome sequence. Fifteen loci produced highly consistent amplification, intraspecific polymorphisms, and stepwise mutation patterns. Conclusions: These microsatellite loci showed sufficient levels of variability and transferability to make them useful for population genetic, selection signature, and interspecific gene flow studies in Phoenix and other Coryphoideae genera.

  12. Functional Diets Modulate lncRNA-Coding RNAs and Gene Interactions in the Intestine of Rainbow Trout Oncorhynchus mykiss.

    Science.gov (United States)

    Núñez-Acuña, Gustavo; Détrée, Camille; Gallardo-Escárate, Cristian; Gonçalves, Ana Teresa

    2017-06-01

    The advent of functional genomics has sparked the interest in inferring the function of non-coding regions from the transcriptome in non-model species. However, numerous biological processes remain understudied from this perspective, including intestinal immunity in farmed fish. The aim of this study was to infer long non-coding RNA (lncRNAs) expression profiles in rainbow trout (Oncorhynchus mykiss) fed for 30 days with functional diets based on pre- and probiotics. For this, whole transcriptome sequencing was conducted through Illumina technology, and lncRNAs were mined to evaluate transcriptional activity in conjunction with known protein sequences. To detect differentially expressed transcripts, 880 novels and 9067 previously described O. mykiss lncRNAs were used. Expression levels and genome co-localization correlations with coding genes were also analyzed. Significant differences in gene expression were primarily found in the probiotic diet, which had a twofold downregulation of lncRNAs compared to other treatments. Notable differences by diet were also evidenced between the coding genes of distinct metabolic processes. In contrast, genome co-localization of lncRNAs with coding genes was similar for all diets. This study contributes novel knowledge regarding lncRNAs in fish, suggesting key roles in salmons fed with in-feed additives with the capacity to modulate the intestinal homeostasis and host health.

  13. Supplementary Material for: CRISPR/Cas9-mediated viral interference in plants

    KAUST Repository

    Ali, Zahir; Abulfaraj, Aala A.; Idris, Ali; Ali, Shakila; Tashkandi, Manal; Mahfouz, Magdy

    2015-01-01

    Abstract Background The CRISPR/Cas9 system provides bacteria and archaea with molecular immunity against invading phages and conjugative plasmids. Recently, CRISPR/Cas9 has been used for targeted genome editing in diverse eukaryotic species. Results In this study, we investigate whether the CRISPR/Cas9 system could be used in plants to confer molecular immunity against DNA viruses. We deliver sgRNAs specific for coding and non-coding sequences of tomato yellow leaf curl virus (TYLCV) into Nicotiana benthamiana plants stably overexpressing the Cas9 endonuclease, and subsequently challenge these plants with TYLCV. Our data demonstrate that the CRISPR/Cas9 system targeted TYLCV for degradation and introduced mutations at the target sequences. All tested sgRNAs exhibit interference activity, but those targeting the stem-loop sequence within the TYLCV origin of replication in the intergenic region (IR) are the most effective. N. benthamiana plants expressing CRISPR/Cas9 exhibit delayed or reduced accumulation of viral DNA, abolishing or significantly attenuating symptoms of infection. Moreover, this system could simultaneously target multiple DNA viruses. Conclusions These data establish the efficacy of the CRISPR/Cas9 system for viral interference in plants, thereby extending the utility of this technology and opening the possibility of producing plants resistant to multiple viral infections.

  14. CRISPR/Cas9-mediated viral interference in plants

    KAUST Repository

    Ali, Zahir

    2015-11-11

    Background The CRISPR/Cas9 system provides bacteria and archaea with molecular immunity against invading phages and conjugative plasmids. Recently, CRISPR/Cas9 has been used for targeted genome editing in diverse eukaryotic species. Results In this study, we investigate whether the CRISPR/Cas9 system could be used in plants to confer molecular immunity against DNA viruses. We deliver sgRNAs specific for coding and non-coding sequences of tomato yellow leaf curl virus (TYLCV) into Nicotiana benthamiana plants stably overexpressing the Cas9 endonuclease, and subsequently challenge these plants with TYLCV. Our data demonstrate that the CRISPR/Cas9 system targeted TYLCV for degradation and introduced mutations at the target sequences. All tested sgRNAs exhibit interference activity, but those targeting the stem-loop sequence within the TYLCV origin of replication in the intergenic region (IR) are the most effective. N. benthamiana plants expressing CRISPR/Cas9 exhibit delayed or reduced accumulation of viral DNA, abolishing or significantly attenuating symptoms of infection. Moreover, this system could simultaneously target multiple DNA viruses. Conclusions These data establish the efficacy of the CRISPR/Cas9 system for viral interference in plants, thereby extending the utility of this technology and opening the possibility of producing plants resistant to multiple viral infections.

  15. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae: a traditional herbal medicinal genus

    Directory of Open Access Journals (Sweden)

    Hanghui Kong

    2017-11-01

    Full Text Available The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises of two subgenera, A. subgenus Lycoctonum and A. subg. Aconitum. The complete chloroplast (cp genome sequences were characterized in three species: A. angustius, A. finetianum, and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The length of the chloroplast genome sequences were 156,109 bp in A. angustius, 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum, with each species possessing 126 genes with 84 protein coding genes (PCGs. While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψrps19 and Ψycf1 were in the LSC/IR/SSC boundaries, Ψrps16 and ΨinfA in the LSC region, and Ψycf15 in the IRb region. The nucleotide variability (Pi of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58–62 simple sequence repeats (SSRs were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum, respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum. Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.

  16. A comparison of chloroplast genome sequences in Aconitum (Ranunculaceae): a traditional herbal medicinal genus.

    Science.gov (United States)

    Kong, Hanghui; Liu, Wanzhen; Yao, Gang; Gong, Wei

    2017-01-01

    The herbal medicinal genus Aconitum L., belonging to the Ranunculaceae family, represents the earliest diverging lineage within the eudicots. It currently comprises of two subgenera, A . subgenus Lycoctonum and A . subg. Aconitum . The complete chloroplast (cp) genome sequences were characterized in three species: A. angustius , A. finetianum , and A. sinomontanum in subg. Lycoctonum and compared to other Aconitum species to clarify their phylogenetic relationship and provide molecular information for utilization of Aconitum species particularly in Eastern Asia. The length of the chloroplast genome sequences were 156,109 bp in A. angustius , 155,625 bp in A. finetianum and 157,215 bp in A. sinomontanum , with each species possessing 126 genes with 84 protein coding genes (PCGs). While genomic rearrangements were absent, structural variation was detected in the LSC/IR/SSC boundaries. Five pseudogenes were identified, among which Ψ rps 19 and Ψ ycf 1 were in the LSC/IR/SSC boundaries, Ψ rps 16 and Ψ inf A in the LSC region, and Ψ ycf 15 in the IRb region. The nucleotide variability ( Pi ) of Aconitum was estimated to be 0.00549, with comparably higher variations in the LSC and SSC than the IR regions. Eight intergenic regions were revealed to be highly variable and a total of 58-62 simple sequence repeats (SSRs) were detected in all three species. More than 80% of SSRs were present in the LSC region. Altogether, 64.41% and 46.81% of SSRs are mononucleotides in subg. Lycoctonum and subg. Aconitum , respectively, while a higher percentage of di-, tri-, tetra-, and penta- SSRs were present in subg. Aconitum . Most species of subg. Aconitum in Eastern Asia were first used for phylogenetic analyses. The availability of the complete cp genome sequences of these species in subg. Lycoctonum will benefit future phylogenetic analyses and aid in germplasm utilization in Aconitum species.

  17. Biocomputational prediction of small non-coding RNAs in Streptomyces

    Czech Academy of Sciences Publication Activity Database

    Pánek, Josef; Bobek, Jan; Mikulík, Karel; Basler, Marek; Vohradský, Jiří

    2008-01-01

    Roč. 9, č. 217 (2008), s. 1-14 ISSN 1471-2164 R&D Projects: GA ČR GP204/07/P361; GA ČR GA203/05/0106; GA ČR GA310/07/1009 Grant - others:XE(XE) EC Integrated Project ActinoGEN, LSHM-CT-2004-005224. Institutional research plan: CEZ:AV0Z50200510 Keywords : non-coding RNA * streptomyces * biocomputational prediction Subject RIV: IN - Informatics, Computer Science Impact factor: 3.926, year: 2008

  18. Changes in the Coding and Non-coding Transcriptome and DNA Methylome that Define the Schwann Cell Repair Phenotype after Nerve Injury.

    Science.gov (United States)

    Arthur-Farraj, Peter J; Morgan, Claire C; Adamowicz, Martyna; Gomez-Sanchez, Jose A; Fazal, Shaline V; Beucher, Anthony; Razzaghi, Bonnie; Mirsky, Rhona; Jessen, Kristjan R; Aitman, Timothy J

    2017-09-12

    Repair Schwann cells play a critical role in orchestrating nerve repair after injury, but the cellular and molecular processes that generate them are poorly understood. Here, we perform a combined whole-genome, coding and non-coding RNA and CpG methylation study following nerve injury. We show that genes involved in the epithelial-mesenchymal transition are enriched in repair cells, and we identify several long non-coding RNAs in Schwann cells. We demonstrate that the AP-1 transcription factor C-JUN regulates the expression of certain micro RNAs in repair Schwann cells, in particular miR-21 and miR-34. Surprisingly, unlike during development, changes in CpG methylation are limited in injury, restricted to specific locations, such as enhancer regions of Schwann cell-specific genes (e.g., Nedd4l), and close to local enrichment of AP-1 motifs. These genetic and epigenomic changes broaden our mechanistic understanding of the formation of repair Schwann cell during peripheral nervous system tissue repair. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  19. PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3′ UTRs and coding sequences

    Science.gov (United States)

    Šulc, Miroslav; Marín, Ray M.; Robins, Harlan S.; Vaníček, Jiří

    2015-01-01

    The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3′ untranslated regions (3′ UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3′ UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA–mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA–mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. PMID:25948580

  20. Non-coding RNA detection methods combined to improve usability, reproducibility and precision

    Directory of Open Access Journals (Sweden)

    Kreikemeyer Bernd

    2010-09-01

    Full Text Available Abstract Background Non-coding RNAs gain more attention as their diverse roles in many cellular processes are discovered. At the same time, the need for efficient computational prediction of ncRNAs increases with the pace of sequencing technology. Existing tools are based on various approaches and techniques, but none of them provides a reliable ncRNA detector yet. Consequently, a natural approach is to combine existing tools. Due to a lack of standard input and output formats combination and comparison of existing tools is difficult. Also, for genomic scans they often need to be incorporated in detection workflows using custom scripts, which decreases transparency and reproducibility. Results We developed a Java-based framework to integrate existing tools and methods for ncRNA detection. This framework enables users to construct transparent detection workflows and to combine and compare different methods efficiently. We demonstrate the effectiveness of combining detection methods in case studies with the small genomes of Escherichia coli, Listeria monocytogenes and Streptococcus pyogenes. With the combined method, we gained 10% to 20% precision for sensitivities from 30% to 80%. Further, we investigated Streptococcus pyogenes for novel ncRNAs. Using multiple methods--integrated by our framework--we determined four highly probable candidates. We verified all four candidates experimentally using RT-PCR. Conclusions We have created an extensible framework for practical, transparent and reproducible combination and comparison of ncRNA detection methods. We have proven the effectiveness of this approach in tests and by guiding experiments to find new ncRNAs. The software is freely available under the GNU General Public License (GPL, version 3 at http://www.sbi.uni-rostock.de/moses along with source code, screen shots, examples and tutorial material.

  1. The architecture of the chloroplast trnH-psbA non-coding region in angiosperms

    Czech Academy of Sciences Publication Activity Database

    Štorchová, Helena; Olson, M.S.

    2007-01-01

    Roč. 268, 1-4 (2007), s. 235-256 ISSN 0378-2697 R&D Projects: GA MŠk(CZ) LC06004 Grant - others:ESPSCor Visiting Scholar Research Grant(US) NSF DEB 0317115 Institutional research plan: CEZ:AV0Z50380511 Source of funding: V - iné verejné zdroje ; V - iné verejné zdroje Keywords : Chloroplast DNA * psbA-trnH intergenic region * Silene * deletions * insertions and inversions in stem-loop region * psbA 3´untranslated region * RNA secondary structure Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.492, year: 2007

  2. Spontaneous processing of functional and non-functional action sequences

    DEFF Research Database (Denmark)

    Nielbo, Kristoffer Laigaard; Sørensen, Jesper

    2011-01-01

    as sub-categories of non-functional behavior (i.e., actions lacking causal coherence and a necessary integration between subparts). New insights in human action processing can help us explain how cognition might vary depending on the type of behavior processed. Using an event segmentation paradigm, we...... conducted two experiments eliciting differences in participants' response patterns to functional and non-functional actions. Participants consistently segmented non-functional action sequences into smaller units indicating either an attentional shift to the level of gesture analysis or a problem...... of representational integration. Experimental studies of non-functional behavior can strengthen explanations of recurrent features of human action processing, such as ritual and ritualized behavior, as well as indicate potential sources and effects of breakdown of the system....

  3. Mutation of miRNA target sequences during human evolution

    DEFF Research Database (Denmark)

    Gardner, Paul P; Vinther, Jeppe

    2008-01-01

    It has long-been hypothesized that changes in non-protein-coding genes and the regulatory sequences controlling expression could undergo positive selection. Here we identify 402 putative microRNA (miRNA) target sequences that have been mutated specifically in the human lineage and show that genes...... containing such deletions are more highly expressed than their mouse orthologs. Our findings indicate that some miRNA target mutations are fixed by positive selection and might have been involved in the evolution of human-specific traits....

  4. DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences.

    Science.gov (United States)

    Quang, Daniel; Xie, Xiaohui

    2016-06-20

    Modeling the properties and functions of DNA sequences is an important, but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory 'grammar' to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the github repository http://github.com/uci-cbcl/DanQ. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. p53-dependent non-coding RNA networks in chronic lymphocytic leukemia

    NARCIS (Netherlands)

    Blume, C. J.; Hotz-Wagenblatt, A.; Hüllein, J.; Sellner, L.; Jethwa, A.; Stolz, T.; Slabicki, M.; Lee, K.; Sharathchandra, A.; Benner, A.; Dietrich, S.; Oakes, C. C.; Dreger, P.; te Raa, D.; Kater, A. P.; Jauch, A.; Merkel, O.; Oren, M.; Hielscher, T.; Zenz, T.

    2015-01-01

    Mutations of the tumor suppressor p53 lead to chemotherapy resistance and a dismal prognosis in chronic lymphocytic leukemia (CLL). Whereas p53 targets are used to identify patient subgroups with impaired p53 function, a comprehensive assessment of non-coding RNA targets of p53 in CLL is missing. We

  6. Securing optical code-division multiple-access networks with a postswitching coding scheme of signature reconfiguration

    Science.gov (United States)

    Huang, Jen-Fa; Meng, Sheng-Hui; Lin, Ying-Chen

    2014-11-01

    The optical code-division multiple-access (OCDMA) technique is considered a good candidate for providing optical layer security. An enhanced OCDMA network security mechanism with a pseudonoise (PN) random digital signals type of maximal-length sequence (M-sequence) code switching to protect against eavesdropping is presented. Signature codes unique to individual OCDMA-network users are reconfigured according to the register state of the controlling electrical shift registers. Examples of signature reconfiguration following state switching of the controlling shift register for both the network user and the eavesdropper are numerically illustrated. Dynamically changing the PN state of the shift register to reconfigure the user signature sequence is shown; this hinders eavesdroppers' efforts to decode correct data sequences. The proposed scheme increases the probability of eavesdroppers committing errors in decoding and thereby substantially enhances the degree of an OCDMA network's confidentiality.

  7. A standardized framework for accurate, high-throughput genotyping of recombinant and non-recombinant viral sequences.

    Science.gov (United States)

    Alcantara, Luiz Carlos Junior; Cassol, Sharon; Libin, Pieter; Deforche, Koen; Pybus, Oliver G; Van Ranst, Marc; Galvão-Castro, Bernardo; Vandamme, Anne-Mieke; de Oliveira, Tulio

    2009-07-01

    Human immunodeficiency virus type-1 (HIV-1), hepatitis B and C and other rapidly evolving viruses are characterized by extremely high levels of genetic diversity. To facilitate diagnosis and the development of prevention and treatment strategies that efficiently target the diversity of these viruses, and other pathogens such as human T-lymphotropic virus type-1 (HTLV-1), human herpes virus type-8 (HHV8) and human papillomavirus (HPV), we developed a rapid high-throughput-genotyping system. The method involves the alignment of a query sequence with a carefully selected set of pre-defined reference strains, followed by phylogenetic analysis of multiple overlapping segments of the alignment using a sliding window. Each segment of the query sequence is assigned the genotype and sub-genotype of the reference strain with the highest bootstrap (>70%) and bootscanning (>90%) scores. Results from all windows are combined and displayed graphically using color-coded genotypes. The new Virus-Genotyping Tools provide accurate classification of recombinant and non-recombinant viruses and are currently being assessed for their diagnostic utility. They have incorporated into several HIV drug resistance algorithms including the Stanford (http://hivdb.stanford.edu) and two European databases (http://www.umcutrecht.nl/subsite/spread-programme/ and http://www.hivrdb.org.uk/) and have been successfully used to genotype a large number of sequences in these and other databases. The tools are a PHP/JAVA web application and are freely accessible on a number of servers including: http://bioafrica.mrc.ac.za/rega-genotype/html/, http://lasp.cpqgm.fiocruz.br/virus-genotype/html/, http://jose.med.kuleuven.be/genotypetool/html/.

  8. Catalog of Differentially Expressed Long Non-Coding RNA following Activation of Human and Mouse Innate Immune Response

    Directory of Open Access Journals (Sweden)

    Benoit T. Roux

    2017-08-01

    Full Text Available Despite increasing evidence to indicate that long non-coding RNAs (lncRNAs are novel regulators of immunity, there has been no systematic attempt to identify and characterize the lncRNAs whose expression is changed following the induction of the innate immune response. To address this issue, we have employed next-generation sequencing data to determine the changes in the lncRNA profile in four human (monocytes, macrophages, epithelium, and chondrocytes and four mouse cell types (RAW 264.7 macrophages, bone marrow-derived macrophages, peritoneal macrophages, and splenic dendritic cells following exposure to the pro-inflammatory mediators, lipopolysaccharides (LPS, or interleukin-1β. We show differential expression of 204 human and 210 mouse lncRNAs, with positional analysis demonstrating correlation with immune-related genes. These lncRNAs are predominantly cell-type specific, composed of large regions of repeat sequences, and show poor evolutionary conservation. Comparison within the human and mouse sequences showed less than 1% sequence conservation, although we identified multiple conserved motifs. Of the 204 human lncRNAs, 21 overlapped with syntenic mouse lncRNAs, of which five were differentially expressed in both species. Among these syntenic lncRNA was IL7-AS (antisense, which was induced in multiple cell types and shown to regulate the production of the pro-inflammatory mediator interleukin-6 in both human and mouse cells. In summary, we have identified and characterized those lncRNAs that are differentially expressed following activation of the human and mouse innate immune responses and believe that these catalogs will provide the foundation for the future analysis of the role of lncRNAs in immune and inflammatory responses.

  9. Comparative anatomy of chromosomal domains with imprinted and non-imprinted allele-specific DNA methylation.

    Science.gov (United States)

    Paliwal, Anupam; Temkin, Alexis M; Kerkel, Kristi; Yale, Alexander; Yotova, Iveta; Drost, Natalia; Lax, Simon; Nhan-Chang, Chia-Ling; Powell, Charles; Borczuk, Alain; Aviv, Abraham; Wapner, Ronald; Chen, Xiaowei; Nagy, Peter L; Schork, Nicholas; Do, Catherine; Torkamani, Ali; Tycko, Benjamin

    2013-08-01

    Allele-specific DNA methylation (ASM) is well studied in imprinted domains, but this type of epigenetic asymmetry is actually found more commonly at non-imprinted loci, where the ASM is dictated not by parent-of-origin but instead by the local haplotype. We identified loci with strong ASM in human tissues from methylation-sensitive SNP array data. Two index regions (bisulfite PCR amplicons), one between the C3orf27 and RPN1 genes in chromosome band 3q21 and the other near the VTRNA2-1 vault RNA in band 5q31, proved to be new examples of imprinted DMRs (maternal alleles methylated) while a third, between STEAP3 and C2orf76 in chromosome band 2q14, showed non-imprinted haplotype-dependent ASM. Using long-read bisulfite sequencing (bis-seq) in 8 human tissues we found that in all 3 domains the ASM is restricted to single differentially methylated regions (DMRs), each less than 2kb. The ASM in the C3orf27-RPN1 intergenic region was placenta-specific and associated with allele-specific expression of a long non-coding RNA. Strikingly, the discrete DMRs in all 3 regions overlap with binding sites for the insulator protein CTCF, which we found selectively bound to the unmethylated allele of the STEAP3-C2orf76 DMR. Methylation mapping in two additional genes with non-imprinted haplotype-dependent ASM, ELK3 and CYP2A7, showed that the CYP2A7 DMR also overlaps a CTCF site. Thus, two features of imprinted domains, highly localized DMRs and allele-specific insulator occupancy by CTCF, can also be found in chromosomal domains with non-imprinted ASM. Arguing for biological importance, our analysis of published whole genome bis-seq data from hES cells revealed multiple genome-wide association study (GWAS) peaks near CTCF binding sites with ASM.

  10. Comparative anatomy of chromosomal domains with imprinted and non-imprinted allele-specific DNA methylation.

    Directory of Open Access Journals (Sweden)

    Anupam Paliwal

    2013-08-01

    Full Text Available Allele-specific DNA methylation (ASM is well studied in imprinted domains, but this type of epigenetic asymmetry is actually found more commonly at non-imprinted loci, where the ASM is dictated not by parent-of-origin but instead by the local haplotype. We identified loci with strong ASM in human tissues from methylation-sensitive SNP array data. Two index regions (bisulfite PCR amplicons, one between the C3orf27 and RPN1 genes in chromosome band 3q21 and the other near the VTRNA2-1 vault RNA in band 5q31, proved to be new examples of imprinted DMRs (maternal alleles methylated while a third, between STEAP3 and C2orf76 in chromosome band 2q14, showed non-imprinted haplotype-dependent ASM. Using long-read bisulfite sequencing (bis-seq in 8 human tissues we found that in all 3 domains the ASM is restricted to single differentially methylated regions (DMRs, each less than 2kb. The ASM in the C3orf27-RPN1 intergenic region was placenta-specific and associated with allele-specific expression of a long non-coding RNA. Strikingly, the discrete DMRs in all 3 regions overlap with binding sites for the insulator protein CTCF, which we found selectively bound to the unmethylated allele of the STEAP3-C2orf76 DMR. Methylation mapping in two additional genes with non-imprinted haplotype-dependent ASM, ELK3 and CYP2A7, showed that the CYP2A7 DMR also overlaps a CTCF site. Thus, two features of imprinted domains, highly localized DMRs and allele-specific insulator occupancy by CTCF, can also be found in chromosomal domains with non-imprinted ASM. Arguing for biological importance, our analysis of published whole genome bis-seq data from hES cells revealed multiple genome-wide association study (GWAS peaks near CTCF binding sites with ASM.

  11. New tools to analyze overlapping coding regions.

    Science.gov (United States)

    Bayegan, Amir H; Garcia-Martin, Juan Antonio; Clote, Peter

    2016-12-13

    Retroviruses transcribe messenger RNA for the overlapping Gag and Gag-Pol polyproteins, by using a programmed -1 ribosomal frameshift which requires a slippery sequence and an immediate downstream stem-loop secondary structure, together called frameshift stimulating signal (FSS). It follows that the molecular evolution of this genomic region of HIV-1 is highly constrained, since the retroviral genome must contain a slippery sequence (sequence constraint), code appropriate peptides in reading frames 0 and 1 (coding requirements), and form a thermodynamically stable stem-loop secondary structure (structure requirement). We describe a unique computational tool, RNAsampleCDS, designed to compute the number of RNA sequences that code two (or more) peptides p,q in overlapping reading frames, that are identical (or have BLOSUM/PAM similarity that exceeds a user-specified value) to the input peptides p,q. RNAsampleCDS then samples a user-specified number of messenger RNAs that code such peptides; alternatively, RNAsampleCDS can exactly compute the position-specific scoring matrix and codon usage bias for all such RNA sequences. Our software allows the user to stipulate overlapping coding requirements for all 6 possible reading frames simultaneously, even allowing IUPAC constraints on RNA sequences and fixing GC-content. We generalize the notion of codon preference index (CPI) to overlapping reading frames, and use RNAsampleCDS to generate control sequences required in the computation of CPI. Moreover, by applying RNAsampleCDS, we are able to quantify the extent to which the overlapping coding requirement in HIV-1 [resp. HCV] contribute to the formation of the stem-loop [resp. double stem-loop] secondary structure known as the frameshift stimulating signal. Using our software, we confirm that certain experimentally determined deleterious HCV mutations occur in positions for which our software RNAsampleCDS and RNAiFold both indicate a single possible nucleotide. We

  12. RNA2 of grapevine fanleaf virus: sequence analysis and coat protein cistron location.

    Science.gov (United States)

    Serghini, M A; Fuchs, M; Pinck, M; Reinbolt, J; Walter, B; Pinck, L

    1990-07-01

    The nucleotide sequence of the genomic RNA2 (3774 nucleotides) of grapevine fanleaf virus strain F13 was determined from overlapping cDNA clones and its genetic organization was deduced. Two rapid and efficient methods were used for cDNA cloning of the 5' region of RNA2. The complete sequence contained only one long open reading frame of 3555 nucleotides (1184 codons, 131K product). The analysis of the N-terminal sequence of purified coat protein (CP) and identification of its C-terminal residue have allowed the CP cistron to be precisely positioned within the polyprotein. The CP produced by proteolytic cleavage at the Arg/Gly site between residues 680 and 681 contains 504 amino acids (Mr 56019) and has hydrophobic properties. The Arg/Gly cleavage site deduced by N-terminal amino acid sequence analysis is the first for a nepovirus coat protein and for plant viruses expressing their genomic RNAs by polyprotein synthesis. Comparison of GFLV RNA2 with M RNA of cowpea mosaic comovirus and with RNA2 of two closely related nepoviruses, tomato black ring virus and Hungarian grapevine chrome mosaic virus, showed strong similarities among the 3' non-coding regions but less similarity among the 5' end non-coding sequences than reported among other nepovirus RNAs.

  13. Long non-coding RNA expression profiling of mouse testis during postnatal development.

    Directory of Open Access Journals (Sweden)

    Jin Sun

    Full Text Available Mammalian testis development and spermatogenesis play critical roles in male fertility and continuation of a species. Previous research into the molecular mechanisms of testis development and spermatogenesis has largely focused on the role of protein-coding genes and small non-coding RNAs, such as microRNAs and piRNAs. Recently, it has become apparent that large numbers of long (>200 nt non-coding RNAs (lncRNAs are transcribed from mammalian genomes and that lncRNAs perform important regulatory functions in various developmental processes. However, the expression of lncRNAs and their biological functions in post-natal testis development remain unknown. In this study, we employed microarray technology to examine lncRNA expression profiles of neonatal (6-day-old and adult (8-week-old mouse testes. We found that 8,265 lncRNAs were expressed above background levels during post-natal testis development, of which 3,025 were differentially expressed. Candidate lncRNAs were identified for further characterization by an integrated examination of genomic context, gene ontology (GO enrichment of their associated protein-coding genes, promoter analysis for epigenetic modification, and evolutionary conservation of elements. Many lncRNAs overlapped or were adjacent to key transcription factors and other genes involved in spermatogenesis, such as Ovol1, Ovol2, Lhx1, Sox3, Sox9, Plzf, c-Kit, Wt1, Sycp2, Prm1 and Prm2. Most differentially expressed lncRNAs exhibited epigenetic modification marks similar to protein-coding genes and tend to be expressed in a tissue-specific manner. In addition, the majority of differentially expressed lncRNAs harbored evolutionary conserved elements. Taken together, our findings represent the first systematic investigation of lncRNA expression in the mammalian testis and provide a solid foundation for further research into the molecular mechanisms of lncRNAs function in mammalian testis development and spermatogenesis.

  14. The Intergenic Recombinant HLA-B*46:01 Has a Distinctive Peptidome that Includes KIR2DL3 Ligands

    DEFF Research Database (Denmark)

    Hilton, Hugo G.; McMurtrey, Curtis P.; Han, Alex S.

    2017-01-01

    HLA-B*46:01 was formed by an intergenic mini-conversion, between HLA-B*15:01 and HLA-C*01:02, in Southeast Asia during the last 50,000 years, and it has since become the most common HLA-B allele in the region. A functional effect of the mini-conversion was introduction of the C1 epitope into HLA-...

  15. RNA-Seq of human neurons derived from iPS cells reveals candidate long non-coding RNAs involved in neurogenesis and neuropsychiatric disorders.

    Directory of Open Access Journals (Sweden)

    Mingyan Lin

    Full Text Available Genome-wide expression analysis using next generation sequencing (RNA-Seq provides an opportunity for in-depth molecular profiling of fundamental biological processes, such as cellular differentiation and malignant transformation. Differentiating human neurons derived from induced pluripotent stem cells (iPSCs provide an ideal system for RNA-Seq since defective neurogenesis caused by abnormalities in transcription factors, DNA methylation, and chromatin modifiers lie at the heart of some neuropsychiatric disorders. As a preliminary step towards applying next generation sequencing using neurons derived from patient-specific iPSCs, we have carried out an RNA-Seq analysis on control human neurons. Dramatic changes in the expression of coding genes, long non-coding RNAs (lncRNAs, pseudogenes, and splice isoforms were seen during the transition from pluripotent stem cells to early differentiating neurons. A number of genes that undergo radical changes in expression during this transition include candidates for schizophrenia (SZ, bipolar disorder (BD and autism spectrum disorders (ASD that function as transcription factors and chromatin modifiers, such as POU3F2 and ZNF804A, and genes coding for cell adhesion proteins implicated in these conditions including NRXN1 and NLGN1. In addition, a number of novel lncRNAs were found to undergo dramatic changes in expression, one of which is HOTAIRM1, a regulator of several HOXA genes during myelopoiesis. The increase we observed in differentiating neurons suggests a role in neurogenesis as well. Finally, several lncRNAs that map near SNPs associated with SZ in genome wide association studies also increase during neuronal differentiation, suggesting that these novel transcripts may be abnormally regulated in a subgroup of patients.

  16. TASS code topical report. V.1 TASS code technical manual

    International Nuclear Information System (INIS)

    Sim, Suk K.; Chang, W. P.; Kim, K. D.; Kim, H. C.; Yoon, H. Y.

    1997-02-01

    TASS 1.0 code has been developed at KAERI for the initial and reload non-LOCA safety analysis for the operating PWRs as well as the PWRs under construction in Korea. TASS code will replace various vendor's non-LOCA safety analysis codes currently used for the Westinghouse and ABB-CE type PWRs in Korea. This can be achieved through TASS code input modifications specific to each reactor type. The TASS code can be run interactively through the keyboard operation. A simimodular configuration used in developing the TASS code enables the user easily implement new models. TASS code has been programmed using FORTRAN77 which makes it easy to install and port for different computer environments. The TASS code can be utilized for the steady state simulation as well as the non-LOCA transient simulations such as power excursions, reactor coolant pump trips, load rejections, loss of feedwater, steam line breaks, steam generator tube ruptures, rod withdrawal and drop, and anticipated transients without scram (ATWS). The malfunctions of the control systems, components, operator actions and the transients caused by the malfunctions can be easily simulated using the TASS code. This technical report describes the TASS 1.0 code models including reactor thermal hydraulic, reactor core and control models. This TASS code models including reactor thermal hydraulic, reactor core and control models. This TASS code technical manual has been prepared as a part of the TASS code manual which includes TASS code user's manual and TASS code validation report, and will be submitted to the regulatory body as a TASS code topical report for a licensing non-LOCA safety analysis for the Westinghouse and ABB-CE type PWRs operating and under construction in Korea. (author). 42 refs., 29 tabs., 32 figs

  17. Methods for Using Small Non-Coding RNAs to Improve Recombinant Protein Expression in Mammalian Cells

    Directory of Open Access Journals (Sweden)

    Sarah Inwood

    2018-01-01

    Full Text Available The ability to produce recombinant proteins by utilizing different “cell factories” revolutionized the biotherapeutic and pharmaceutical industry. Chinese hamster ovary (CHO cells are the dominant industrial producer, especially for antibodies. Human embryonic kidney cells (HEK, while not being as widely used as CHO cells, are used where CHO cells are unable to meet the needs for expression, such as growth factors. Therefore, improving recombinant protein expression from mammalian cells is a priority, and continuing effort is being devoted to this topic. Non-coding RNAs are RNA segments that are not translated into a protein and often have a regulatory role. Since their discovery, major progress has been made towards understanding their functions. Non-coding RNA has been investigated extensively in relation to disease, especially cancer, and recently they have also been used as a method for engineering cells to improve their protein expression capability. In this review, we provide information about methods used to identify non-coding RNAs with the potential of improving recombinant protein expression in mammalian cell lines.

  18. Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

    Science.gov (United States)

    Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

    2018-06-03

    Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.

  19. Deduced amino acid sequence of the small hydrophobic protein of US avian pneumovirus has greater identity with that of human metapneumovirus than those of non-US avian pneumoviruses.

    Science.gov (United States)

    Yunus, Abdul S; Govindarajan, Dhanasekaran; Huang, Zhuhui; Samal, Siba K

    2003-05-01

    We report here the nucleotide and deduced amino acid (aa) sequences of the small hydrophobic (SH) gene of the avian pneumovirus strain Colorado (APV/CO). The SH gene of APV/CO is 628 nucleotides in length from gene-start to gene-end. The longest ORF of the SH gene encoded a protein of 177 aas in length. Comparison of the deduced aa sequence of the SH protein of APV/CO with the corresponding published sequences of other members of genera metapneumovirus showed 28% identity with the newly discovered human metapneumovirus (hMPV), but no discernable identity with the APV subgroup A or B. Collectively, this data supports the hypothesis that: (i) APV/CO is distinct from European APV subgroups and belongs to the novel subgroup APV/C (APV/US); (ii) APV/CO is more closely related to hMPV, a mammalian metapneumovirus, than to either APV subgroup A or B. The SH gene of APV/CO was cloned using a genomic walk strategy which initiated cDNA synthesis from genomic RNA that traversed the genes in the order 3'-M-F-M2-SH-G-5', thus confirming that gene-order of APV/CO conforms in the genus Metapneumovirus. We also provide the sequences of transcription-signals and the M-F, F-M2, M2-SH and SH-G intergenic regions of APV/CO.

  20. Flavivirus RNAi suppression: decoding non-coding RNA.

    Science.gov (United States)

    Pijlman, Gorben P

    2014-08-01

    Flaviviruses are important human pathogens that are transmitted by invertebrate vectors, mostly mosquitoes and ticks. During replication in their vector, flaviviruses are subject to a potent innate immune response known as antiviral RNA interference (RNAi). This defense mechanism is associated with the production of small interfering (si)RNA that lead to degradation of viral RNA. To what extent flaviviruses would benefit from counteracting antiviral RNAi is subject of debate. Here, the experimental evidence to suggest the existence of flavivirus RNAi suppressors is discussed. I will highlight the putative role of non-coding, subgenomic flavivirus RNA in suppression of RNAi in insect and mammalian cells. Novel insights from ongoing research will reveal how arthropod-borne viruses modulate innate immunity including antiviral RNAi. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Viewing multiple sequence alignments with the JavaScript Sequence Alignment Viewer (JSAV).

    Science.gov (United States)

    Martin, Andrew C R

    2014-01-01

    The JavaScript Sequence Alignment Viewer (JSAV) is designed as a simple-to-use JavaScript component for displaying sequence alignments on web pages. The display of sequences is highly configurable with options to allow alternative coloring schemes, sorting of sequences and 'dotifying' repeated amino acids. An option is also available to submit selected sequences to another web site, or to other JavaScript code. JSAV is implemented purely in JavaScript making use of the JQuery and JQuery-UI libraries. It does not use any HTML5-specific options to help with browser compatibility. The code is documented using JSDOC and is available from http://www.bioinf.org.uk/software/jsav/.

  2. Complete nucleotide sequences of avian metapneumovirus subtype B genome.

    Science.gov (United States)

    Sugiyama, Miki; Ito, Hiroshi; Hata, Yusuke; Ono, Eriko; Ito, Toshihiro

    2010-12-01

    Complete nucleotide sequences were determined for subtype B avian metapneumovirus (aMPV), the attenuated vaccine strain VCO3/50 and its parental pathogenic strain VCO3/60616. The genomes of both strains comprised 13,508 nucleotides (nt), with a 42-nt leader at the 3'-end and a 46-nt trailer at the 5'-end. The genome contains eight genes in the order 3'-N-P-M-F-M2-SH-G-L-5', which is the same order shown in the other metapneumoviruses. The genes are flanked on either side by conserved transcriptional start and stop signals and have intergenic sequences varying in length from 1 to 88 nt. Comparison of nt and predicted amino acid (aa) sequences of VCO3/60616 with those of other metapneumoviruses revealed higher homology with aMPV subtype A virus than with other metapneumoviruses. A total of 18 nt and 10 deduced aa differences were seen between the strains, and one or a combination of several differences could be associated with attenuation of VCO3/50.

  3. Joint beam design and user selection over non-binary coded MIMO interference channel

    Science.gov (United States)

    Li, Haitao; Yuan, Haiying

    2013-03-01

    In this paper, we discuss the problem of sum rate improvement for coded MIMO interference system, and propose joint beam design and user selection over interference channel. Firstly, we have formulated non-binary LDPC coded MIMO interference networks model. Then, the least square beam design for MIMO interference system is derived, and the low complexity user selection is presented. Simulation results confirm that the sum rate can be improved by the joint user selection and beam design comparing with single interference aligning beamformer.

  4. Long non-coding RNA expression profiles in hereditary haemorrhagic telangiectasia

    DEFF Research Database (Denmark)

    Tørring, Pernille M; Larsen, Martin Jakob; Kjeldsen, Anette D

    2014-01-01

    transcriptome, we wanted to assess whether lncRNAs play a role in the molecular pathogenesis of HHT manifestations. By microarray technology, we profiled lncRNA transcripts from HHT nasal telangiectasial and non-telangiectasial tissue using a paired design. The microarray probes were annotated using the GENCODE...... v.16 dataset, identifying 4,810 probes mapping to 2,811 lncRNAs. Comparing HHT telangiectasial tissue with HHT non-telangiectasial tissue, we identified 42 lncRNAs that are differentially expressed (qUsing GREAT, a tool that assumes cis-regulation, we showed that differently expressed lncRNAs...... to the TGF-β signalling pathway. The exact mechanism of how haploinsufficiency of ENG and ACVRL1 leads to HHT manifestations remains to be identified. As long non-coding RNAs (lncRNAs) are increasingly recognized as key regulators of gene expression and constitute a sizable fraction of the human...

  5. Sequence comparison for non-enhanced MRA of the lower extremity arteries at 7 Tesla.

    Directory of Open Access Journals (Sweden)

    Sören Johst

    Full Text Available In this study three sequences for non-contrast-enhanced MRA of the lower extremity arteries at 7T were compared. Cardiac triggering was used with the aim to reduce signal variations in the arteries. Two fast single-shot 2D sequences, a modified Ultrafast Spoiled Gradient Echo (UGRE sequence and a variant of the Quiescent-Interval Single-Shot (QISS sequence were triggered via phonocardiogram and compared in volunteer examinations to a non-triggered 2D gradient echo (GRE sequence. For image acquisition, a 16-channel transmit/receive coil and a manually positionable AngioSURF table were used. To tackle B1 inhomogeneities at 7T, Time-Interleaved Acquisition of Modes (TIAMO was integrated in GRE and UGRE. To compare the three sequences quantitatively, a vessel-to-background ratio (VBR was measured in all volunteers and stations. In conclusion, cardiac triggering was able to suppress flow artifacts satisfactorily. The modified UGRE showed only moderate image artifacts. Averaged over all volunteers and stations, GRE reached a VBR of 4.18±0.05, UGRE 5.20±0.06, and QISS 2.72±0.03. Using cardiac triggering and TIAMO imaging technique was essential to perform non-enhanced MRA of the lower extremities vessels at 7T. The modified UGRE performed best, as observed artifacts were only moderate and the highest average VBR was reached.

  6. Code Cactus; Code Cactus

    Energy Technology Data Exchange (ETDEWEB)

    Fajeau, M; Nguyen, L T; Saunier, J [Commissariat a l' Energie Atomique, Centre d' Etudes Nucleaires de Saclay, 91 - Gif-sur-Yvette (France)

    1966-09-01

    This code handles the following problems: -1) Analysis of thermal experiments on a water loop at high or low pressure; steady state or transient behavior; -2) Analysis of thermal and hydrodynamic behavior of water-cooled and moderated reactors, at either high or low pressure, with boiling permitted; fuel elements are assumed to be flat plates: - Flowrate in parallel channels coupled or not by conduction across plates, with conditions of pressure drops or flowrate, variable or not with respect to time is given; the power can be coupled to reactor kinetics calculation or supplied by the code user. The code, containing a schematic representation of safety rod behavior, is a one dimensional, multi-channel code, and has as its complement (FLID), a one-channel, two-dimensional code. (authors) [French] Ce code permet de traiter les problemes ci-dessous: 1. Depouillement d'essais thermiques sur boucle a eau, haute ou basse pression, en regime permanent ou transitoire; 2. Etudes thermiques et hydrauliques de reacteurs a eau, a plaques, a haute ou basse pression, ebullition permise: - repartition entre canaux paralleles, couples on non par conduction a travers plaques, pour des conditions de debit ou de pertes de charge imposees, variables ou non dans le temps; - la puissance peut etre couplee a la neutronique et une representation schematique des actions de securite est prevue. Ce code (Cactus) a une dimension d'espace et plusieurs canaux, a pour complement Flid qui traite l'etude d'un seul canal a deux dimensions. (auteurs)

  7. The Aster code; Code Aster

    Energy Technology Data Exchange (ETDEWEB)

    Delbecq, J.M

    1999-07-01

    The Aster code is a 2D or 3D finite-element calculation code for structures developed by the R and D direction of Electricite de France (EdF). This dossier presents a complete overview of the characteristics and uses of the Aster code: introduction of version 4; the context of Aster (organisation of the code development, versions, systems and interfaces, development tools, quality assurance, independent validation); static mechanics (linear thermo-elasticity, Euler buckling, cables, Zarka-Casier method); non-linear mechanics (materials behaviour, big deformations, specific loads, unloading and loss of load proportionality indicators, global algorithm, contact and friction); rupture mechanics (G energy restitution level, restitution level in thermo-elasto-plasticity, 3D local energy restitution level, KI and KII stress intensity factors, calculation of limit loads for structures), specific treatments (fatigue, rupture, wear, error estimation); meshes and models (mesh generation, modeling, loads and boundary conditions, links between different modeling processes, resolution of linear systems, display of results etc..); vibration mechanics (modal and harmonic analysis, dynamics with shocks, direct transient dynamics, seismic analysis and aleatory dynamics, non-linear dynamics, dynamical sub-structuring); fluid-structure interactions (internal acoustics, mass, rigidity and damping); linear and non-linear thermal analysis; steels and metal industry (structure transformations); coupled problems (internal chaining, internal thermo-hydro-mechanical coupling, chaining with other codes); products and services. (J.S.)

  8. The Complete Chloroplast Genome of Ye-Xing-Ba (Scrophularia dentata; Scrophulariaceae), an Alpine Tibetan Herb.

    Science.gov (United States)

    Ni, Lianghong; Zhao, Zhili; Dorje, Gaawe; Ma, Mi

    2016-01-01

    Scrophularia dentata is an important Tibetan medicinal plant and traditionally used for the treatment of exanthema and fever in Traditional Tibetan Medicine (TTM). However, there is little sequence and genomic information available for S. dentata. In this paper, we report the complete chloroplast genome sequence of S. dentata and it is the first sequenced member of the Sect. Tomiophyllum within Scrophularia (Scrophulariaceae). The gene order and organization of the chloroplast genome of S. dentata are similar to other Lamiales chloroplast genomes. The plastome is 152,553 bp in length and includes a pair of inverted repeats (IRs) of 25,523 bp that separate a large single copy (LSC) region of 84,058 bp and a small single copy (SSC) region of 17,449 bp. It has 38.0% GC content and includes 114 unique genes, of which 80 are protein-coding, 30 are transfer RNA, and 4 are ribosomal RNA. Also, it contains 21 forward repeats, 19 palindrome repeats and 41 simple sequence repeats (SSRs). The repeats and SSRs within S. dentata were compared with those of S. takesimensis and present certain discrepancies. The chloroplast genome of S. dentata was compared with other five publicly available Lamiales genomes from different families. All the coding regions and non-coding regions (introns and intergenic spacers) within the six chloroplast genomes have been extracted and analysed. Furthermore, the genome divergent hotspot regions were identified. Our studies could provide basic data for the alpine medicinal species conservation and molecular phylogenetic researches of Scrophulariaceae and Lamiales.

  9. The Complete Chloroplast Genome of Ye-Xing-Ba (Scrophularia dentata; Scrophulariaceae, an Alpine Tibetan Herb.

    Directory of Open Access Journals (Sweden)

    Lianghong Ni

    Full Text Available Scrophularia dentata is an important Tibetan medicinal plant and traditionally used for the treatment of exanthema and fever in Traditional Tibetan Medicine (TTM. However, there is little sequence and genomic information available for S. dentata. In this paper, we report the complete chloroplast genome sequence of S. dentata and it is the first sequenced member of the Sect. Tomiophyllum within Scrophularia (Scrophulariaceae. The gene order and organization of the chloroplast genome of S. dentata are similar to other Lamiales chloroplast genomes. The plastome is 152,553 bp in length and includes a pair of inverted repeats (IRs of 25,523 bp that separate a large single copy (LSC region of 84,058 bp and a small single copy (SSC region of 17,449 bp. It has 38.0% GC content and includes 114 unique genes, of which 80 are protein-coding, 30 are transfer RNA, and 4 are ribosomal RNA. Also, it contains 21 forward repeats, 19 palindrome repeats and 41 simple sequence repeats (SSRs. The repeats and SSRs within S. dentata were compared with those of S. takesimensis and present certain discrepancies. The chloroplast genome of S. dentata was compared with other five publicly available Lamiales genomes from different families. All the coding regions and non-coding regions (introns and intergenic spacers within the six chloroplast genomes have been extracted and analysed. Furthermore, the genome divergent hotspot regions were identified. Our studies could provide basic data for the alpine medicinal species conservation and molecular phylogenetic researches of Scrophulariaceae and Lamiales.

  10. The interplay of long non-coding RNAs and MYC in cancer

    Directory of Open Access Journals (Sweden)

    Michael J. Hamilton

    2015-12-01

    Full Text Available Long non-coding RNAs (lncRNAs are a class of RNA molecules that are changing how researchers view eukaryotic gene regulation. Once considered to be non-functional products of low-level aberrant transcription from non-coding regions of the genome, lncRNAs are now viewed as important epigenetic regulators and several lncRNAs have now been demonstrated to be critical players in the development and/or maintenance of cancer. Similarly, the emerging variety of interactions between lncRNAs and MYC, a well-known oncogenic transcription factor linked to most types of cancer, have caught the attention of many biomedical researchers. Investigations exploring the dynamic interactions between lncRNAs and MYC, referred to as the lncRNA-MYC network, have proven to be especially complex. Genome-wide studies have shown that MYC transcriptionally regulates many lncRNA genes. Conversely, recent reports identified lncRNAs that regulate MYC expression both at the transcriptional and post-transcriptional levels. These findings are of particular interest because they suggest roles of lncRNAs as regulators of MYC oncogenic functions and the possibility that targeting lncRNAs could represent a novel avenue to cancer treatment. Here, we briefly review the current understanding of how lncRNAs regulate chromatin structure and gene transcription, and then focus on the new developments in the emerging field exploring the lncRNA-MYC network in cancer.

  11. The AutoProof Verifier: Usability by Non-Experts and on Standard Code

    Directory of Open Access Journals (Sweden)

    Carlo A. Furia

    2015-08-01

    Full Text Available Formal verification tools are often developed by experts for experts; as a result, their usability by programmers with little formal methods experience may be severely limited. In this paper, we discuss this general phenomenon with reference to AutoProof: a tool that can verify the full functional correctness of object-oriented software. In particular, we present our experiences of using AutoProof in two contrasting contexts representative of non-expert usage. First, we discuss its usability by students in a graduate course on software verification, who were tasked with verifying implementations of various sorting algorithms. Second, we evaluate its usability in verifying code developed for programming assignments of an undergraduate course. The first scenario represents usability by serious non-experts; the second represents usability on "standard code", developed without full functional verification in mind. We report our experiences and lessons learnt, from which we derive some general suggestions for furthering the development of verification tools with respect to improving their usability.

  12. Characterization and Analysis of Whole Transcriptome of Giant Panda Spleens: Implying Critical Roles of Long Non-Coding RNAs in Immunity.

    Science.gov (United States)

    Peng, Rui; Liu, Yuliang; Cai, Zhigang; Shen, Fujun; Chen, Jiasong; Hou, Rong; Zou, Fangdong

    2018-01-01

    Giant pandas, an endangered species, are a powerful symbol of species conservation. Giant pandas may suffer from a variety of diseases. Owing to their highly specialized diet of bamboo, giant pandas are thought to have a relatively weak ability to resist diseases. The spleen is the largest organ in the lymphatic system. However, there is little known about giant panda spleen at a molecular level. Thus, clarifying the regulatory mechanisms of spleen could help us further understand the immune system of the giant panda as well as its conservation. The two giant panda spleens were from two male individuals, one newborn and one an adult, in a non-pathological condition. The whole transcriptomes of mRNA, lncRNA, miRNA, and circRNA in the two spleens were sequenced using the Illumina HiSeq platform. EBseq and IDEG6 were used to observe the differentially expressed genes (DEGs) between these two spleens. Gene Ontology and KEGG analyses were used to annotate the function of DEGs. Furthermore, networks between non-coding RNAs and protein-coding genes were constructed to investigate the relationship between non-coding RNAs and immune-associated genes. By comparative analysis of the whole transcriptomes of these two spleens, we found that one of the major roles of lncRNAs could be involved in the regulation of immune responses of giant panda spleens. In addition, our results also revealed that microRNAs and circRNAs may have evolved to regulate a large set of biological processes of giant panda spleens, and circRNAs may function as miRNA sponges. To our knowledge, this is the first report of lncRNAs and circRNAs in giant panda, which could be a useful resource for further giant panda research. Our study reveals the potential functional roles of miRNAs, lncRNAs, and circRNAs in giant panda spleen. © 2018 The Author(s). Published by S. Karger AG, Basel.

  13. Characterization and Analysis of Whole Transcriptome of Giant Panda Spleens: Implying Critical Roles of Long Non-Coding RNAs in Immunity

    Directory of Open Access Journals (Sweden)

    Rui Peng

    2018-04-01

    Full Text Available Background/Aims: Giant pandas, an endangered species, are a powerful symbol of species conservation. Giant pandas may suffer from a variety of diseases. Owing to their highly specialized diet of bamboo, giant pandas are thought to have a relatively weak ability to resist diseases. The spleen is the largest organ in the lymphatic system. However, there is little known about giant panda spleen at a molecular level. Thus, clarifying the regulatory mechanisms of spleen could help us further understand the immune system of the giant panda as well as its conservation. Methods: The two giant panda spleens were from two male individuals, one newborn and one an adult, in a non-pathological condition. The whole transcriptomes of mRNA, lncRNA, miRNA, and circRNA in the two spleens were sequenced using the Illumina HiSeq platform. EBseq and IDEG6 were used to observe the differentially expressed genes (DEGs between these two spleens. Gene Ontology and KEGG analyses were used to annotate the function of DEGs. Furthermore, networks between non-coding RNAs and protein-coding genes were constructed to investigate the relationship between non-coding RNAs and immune-associated genes. Results: By comparative analysis of the whole transcriptomes of these two spleens, we found that one of the major roles of lncRNAs could be involved in the regulation of immune responses of giant panda spleens. In addition, our results also revealed that microRNAs and circRNAs may have evolved to regulate a large set of biological processes of giant panda spleens, and circRNAs may function as miRNA sponges. Conclusion: To our knowledge, this is the first report of lncRNAs and circRNAs in giant panda, which could be a useful resource for further giant panda research. Our study reveals the potential functional roles of miRNAs, lncRNAs, and circRNAs in giant panda spleen.

  14. Long Non-Coding RNAs in Cancer and Development: Where Do We Go from Here?

    Directory of Open Access Journals (Sweden)

    Monika Haemmerle

    2015-01-01

    Full Text Available Recent genome-wide expression profiling studies have uncovered a huge amount of novel, long non-protein-coding RNA transcripts (lncRNA. In general, these transcripts possess a low, but tissue-specific expression, and their nucleotide sequences are often poorly conserved. However, several studies showed that lncRNAs can have important roles for normal tissue development and regulate cellular pluripotency as well as differentiation. Moreover, lncRNAs are implicated in the control of multiple molecular pathways leading to gene expression changes and thus, ultimately modulate cell proliferation, migration and apoptosis. Consequently, deregulation of lncRNA expression contributes to carcinogenesis and is associated with human diseases, e.g., neurodegenerative disorders like Alzheimer’s Disease. Here, we will focus on some major challenges of lncRNA research, especially loss-of-function studies. We will delineate strategies for lncRNA gene targeting in vivo, and we will briefly discuss important consideration and pitfalls when investigating lncRNA functions in knockout animal models. Finally, we will highlight future opportunities for lncRNAs research by applying the concept of cross-species comparison, which might contribute to novel disease biomarker discovery and might identify lncRNAs as potential therapeutic targets.

  15. Genetic diagnosis of Mendelian disorders via RNA sequencing.

    Science.gov (United States)

    Kremer, Laura S; Bader, Daniel M; Mertes, Christian; Kopajtich, Robert; Pichler, Garwin; Iuso, Arcangela; Haack, Tobias B; Graf, Elisabeth; Schwarzmayr, Thomas; Terrile, Caterina; Koňaříková, Eliška; Repp, Birgit; Kastenmüller, Gabi; Adamski, Jerzy; Lichtner, Peter; Leonhardt, Christoph; Funalot, Benoit; Donati, Alice; Tiranti, Valeria; Lombes, Anne; Jardel, Claude; Gläser, Dieter; Taylor, Robert W; Ghezzi, Daniele; Mayr, Johannes A; Rötig, Agnes; Freisinger, Peter; Distelmaier, Felix; Strom, Tim M; Meitinger, Thomas; Gagneur, Julien; Prokisch, Holger

    2017-06-12

    Across a variety of Mendelian disorders, ∼50-75% of patients do not receive a genetic diagnosis by exome sequencing indicating disease-causing variants in non-coding regions. Although genome sequencing in principle reveals all genetic variants, their sizeable number and poorer annotation make prioritization challenging. Here, we demonstrate the power of transcriptome sequencing to molecularly diagnose 10% (5 of 48) of mitochondriopathy patients and identify candidate genes for the remainder. We find a median of one aberrantly expressed gene, five aberrant splicing events and six mono-allelically expressed rare variants in patient-derived fibroblasts and establish disease-causing roles for each kind. Private exons often arise from cryptic splice sites providing an important clue for variant prioritization. One such event is found in the complex I assembly factor TIMMDC1 establishing a novel disease-associated gene. In conclusion, our study expands the diagnostic tools for detecting non-exonic variants and provides examples of intronic loss-of-function variants with pathological relevance.

  16. Ultrafast all-optical code-division multiple-access networks

    Science.gov (United States)

    Kwong, Wing C.; Prucnal, Paul R.; Liu, Yanming

    1992-12-01

    In optical code-division multiple access (CDMA), the architecture of optical encoders/decoders is another important factor that needs to be considered, besides the correlation properties of those already extensively studied optical codes. The architecture of optical encoders/decoders affects, for example, the amount of power loss and length of optical delays that are associated with code sequence generation and correlation, which, in turn, affect the power budget, size, and cost of an optical CDMA system. Various CDMA coding architectures are studied in the paper. In contrast to the encoders/decoders used in prime networks (i.e., prime encodes/decoders), which generate, select, and correlate code sequences by a parallel combination of fiber-optic delay-lines, and in 2n networks (i.e., 2n encoders/decoders), which generate and correlate code sequences by a serial combination of 2 X 2 passive couplers and fiber delays with sequence selection performed in a parallel fashion, the modified 2n encoders/decoders generate, select, and correlate code sequences by a serial combination of directional couplers and delays. The power and delay- length requirements of the modified 2n encoders/decoders are compared to that of the prime and 2n encoders/decoders. A 100 Mbit/s optical CDMA experiment in free space demonstrating the feasibility of the all-serial coding architecture using a serial combination of 50/50 beam splitters and retroreflectors at 10 Tchip/s (i.e., 100,000 chip/bit) with 100 fs laser pulses is reported.

  17. Application of MELCOR Code to a French PWR 900 MWe Severe Accident Sequence and Evaluation of Models Performance Focusing on In-Vessel Thermal Hydraulic Results

    International Nuclear Information System (INIS)

    De Rosa, Felice

    2006-01-01

    In the ambit of the Severe Accident Network of Excellence Project (SARNET), funded by the European Union, 6. FISA (Fission Safety) Programme, one of the main tasks is the development and validation of the European Accident Source Term Evaluation Code (ASTEC Code). One of the reference codes used to compare ASTEC results, coming from experimental and Reactor Plant applications, is MELCOR. ENEA is a SARNET member and also an ASTEC and MELCOR user. During the first 18 months of this project, we performed a series of MELCOR and ASTEC calculations referring to a French PWR 900 MWe and to the accident sequence of 'Loss of Steam Generator (SG) Feedwater' (known as H2 sequence in the French classification). H2 is an accident sequence substantially equivalent to a Station Blackout scenario, like a TMLB accident, with the only difference that in H2 sequence the scram is forced to occur with a delay of 28 seconds. The main events during the accident sequence are a loss of normal and auxiliary SG feedwater (0 s), followed by a scram when the water level in SG is equal or less than 0.7 m (after 28 seconds). There is also a main coolant pumps trip when ΔTsat < 10 deg. C, a total opening of the three relief valves when Tric (core maximal outlet temperature) is above 603 K (330 deg. C) and accumulators isolation when primary pressure goes below 1.5 MPa (15 bar). Among many other points, it is worth noting that this was the first time that a MELCOR 1.8.5 input deck was available for a French PWR 900. The main ENEA effort in this period was devoted to prepare the MELCOR input deck using the code version v.1.8.5 (build QZ Oct 2000 with the latest patch 185003 Oct 2001). The input deck, completely new, was prepared taking into account structure, data and same conditions as those found inside ASTEC input decks. The main goal of the work presented in this paper is to put in evidence where and when MELCOR provides good enough results and why, in some cases mainly referring to its

  18. A deep learning method for lincRNA detection using auto-encoder algorithm.

    Science.gov (United States)

    Yu, Ning; Yu, Zeng; Pan, Yi

    2017-12-06

    RNA sequencing technique (RNA-seq) enables scientists to develop novel data-driven methods for discovering more unidentified lincRNAs. Meantime, knowledge-based technologies are experiencing a potential revolution ignited by the new deep learning methods. By scanning the newly found data set from RNA-seq, scientists have found that: (1) the expression of lincRNAs appears to be regulated, that is, the relevance exists along the DNA sequences; (2) lincRNAs contain some conversed patterns/motifs tethered together by non-conserved regions. The two evidences give the reasoning for adopting knowledge-based deep learning methods in lincRNA detection. Similar to coding region transcription, non-coding regions are split at transcriptional sites. However, regulatory RNAs rather than message RNAs are generated. That is, the transcribed RNAs participate the biological process as regulatory units instead of generating proteins. Identifying these transcriptional regions from non-coding regions is the first step towards lincRNA recognition. The auto-encoder method achieves 100% and 92.4% prediction accuracy on transcription sites over the putative data sets. The experimental results also show the excellent performance of predictive deep neural network on the lincRNA data sets compared with support vector machine and traditional neural network. In addition, it is validated through the newly discovered lincRNA data set and one unreported transcription site is found by feeding the whole annotated sequences through the deep learning machine, which indicates that deep learning method has the extensive ability for lincRNA prediction. The transcriptional sequences of lincRNAs are collected from the annotated human DNA genome data. Subsequently, a two-layer deep neural network is developed for the lincRNA detection, which adopts the auto-encoder algorithm and utilizes different encoding schemes to obtain the best performance over intergenic DNA sequence data. Driven by those newly

  19. Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae.

    Science.gov (United States)

    Oggioni, M R; Claverys, J P

    1999-10-01

    A survey of all Streptococcus pneumoniae GenBank/EMBL DNA sequence entries and of the public domain sequence (representing more than 90% of the genome) of an S. pneumoniae type 4 strain allowed identification of 108 copies of a 107-bp-long highly repeated intergenic element called RUP (for repeat unit of pneumococcus). Several features of the element, revealed in this study, led to the proposal that RUP is an insertion sequence (IS)-derivative that could still be mobile. Among these features are: (1) a highly significant homology between the terminal inverted repeats (IRs) of RUPs and of IS630-Spn1, a new putative IS of S. pneumoniae; and (2) insertion at a TA dinucleotide, a characteristic target of several members of the IS630 family. Trans-mobilization of RUP is therefore proposed to be mediated by the transposase of IS630-Spn1. To account for the observation that RUPs are distributed among four subtypes which exhibit different degrees of sequence homogeneity, a scenario is invoked based on successive stages of RUP mobility and non-mobility, depending on whether an active transposase is present or absent. In the latter situation, an active transposase could be reintroduced into the species through natural transformation. Examination of sequences flanking RUP revealed a preferential association with ISs. It also provided evidence that RUPs promote sequence rearrangements, thereby contributing to genome flexibility. The possibility that RUP preferentially targets transforming DNA of foreign origin and subsequently favours disruption/rearrangement of exogenous sequences is discussed.

  20. The complete chloroplast genome sequence of strawberry (Fragaria  × ananassa Duch.) and comparison with related species of Rosaceae.

    Science.gov (United States)

    Cheng, Hui; Li, Jinfeng; Zhang, Hong; Cai, Binhua; Gao, Zhihong; Qiao, Yushan; Mi, Lin

    2017-01-01

    Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F . ×  ananassa 'Benihoppe' using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp) separated by large (LSC, 85,531 bp) and small (SSC, 18,146 bp) single-copy (SC) regions. The length of the F . ×  ananassa 'Benihoppe' chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria , particularly among three octoploid strawberries which were F . ×  ananassa 'Benihoppe', F . chiloensis (GP33) and F . virginiana (O477). However, when the sequences of the coding and non-coding regions of F . ×  ananassa 'Benihoppe' were compared in detail with those of F . chiloensis (GP33) and F . virginiana (O477), a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions ( trnK - matK , trnS - trnG , atpF - atpH , trnC - petN , trnT - psbD and trnP - psaJ ) with a percentage of variable sites greater than

  1. The complete chloroplast genome sequence of strawberry (Fragaria  × ananassa Duch. and comparison with related species of Rosaceae

    Directory of Open Access Journals (Sweden)

    Hui Cheng

    2017-10-01

    Full Text Available Compared with other members of the family Rosaceae, the chloroplast genomes of Fragaria species exhibit low variation, and this situation has limited phylogenetic analyses; thus, complete chloroplast genome sequencing of Fragaria species is needed. In this study, we sequenced the complete chloroplast genome of F. × ananassa ‘Benihoppe’ using the Illumina HiSeq 2500-PE150 platform and then performed a combination of de novo assembly and reference-guided mapping of contigs to generate complete chloroplast genome sequences. The chloroplast genome exhibits a typical quadripartite structure with a pair of inverted repeats (IRs, 25,936 bp separated by large (LSC, 85,531 bp and small (SSC, 18,146 bp single-copy (SC regions. The length of the F. × ananassa ‘Benihoppe’ chloroplast genome is 155,549 bp, representing the smallest Fragaria chloroplast genome observed to date. The genome encodes 112 unique genes, comprising 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Comparative analysis of the overall nucleotide sequence identity among ten complete chloroplast genomes confirmed that for both coding and non-coding regions in Rosaceae, SC regions exhibit higher sequence variation than IRs. The Ka/Ks ratio of most genes was less than 1, suggesting that most genes are under purifying selection. Moreover, the mVISTA results also showed a high degree of conservation in genome structure, gene order and gene content in Fragaria, particularly among three octoploid strawberries which were F. × ananassa ‘Benihoppe’, F. chiloensis (GP33 and F. virginiana (O477. However, when the sequences of the coding and non-coding regions of F. × ananassa ‘Benihoppe’ were compared in detail with those of F. chiloensis (GP33 and F. virginiana (O477, a number of SNPs and InDels were revealed by MEGA 7. Six non-coding regions (trnK-matK, trnS-trnG, atpF-atpH, trnC-petN, trnT-psbD and trnP-psaJ with a percentage of variable sites greater than 1

  2. Quantification of non-coding RNA target localization diversity and its application in cancers.

    Science.gov (United States)

    Cheng, Lixin; Leung, Kwong-Sak

    2018-04-01

    Subcellular localization is pivotal for RNAs and proteins to implement biological functions. The localization diversity of protein interactions has been studied as a crucial feature of proteins, considering that the protein-protein interactions take place in various subcellular locations. Nevertheless, the localization diversity of non-coding RNA (ncRNA) target proteins has not been systematically studied, especially its characteristics in cancers. In this study, we provide a new algorithm, non-coding RNA target localization coefficient (ncTALENT), to quantify the target localization diversity of ncRNAs based on the ncRNA-protein interaction and protein subcellular localization data. ncTALENT can be used to calculate the target localization coefficient of ncRNAs and measure how diversely their targets are distributed among the subcellular locations in various scenarios. We focus our study on long non-coding RNAs (lncRNAs), and our observations reveal that the target localization diversity is a primary characteristic of lncRNAs in different biotypes. Moreover, we found that lncRNAs in multiple cancers, differentially expressed cancer lncRNAs, and lncRNAs with multiple cancer target proteins are prone to have high target localization diversity. Furthermore, the analysis of gastric cancer helps us to obtain a better understanding that the target localization diversity of lncRNAs is an important feature closely related to clinical prognosis. Overall, we systematically studied the target localization diversity of the lncRNAs and uncovered its association with cancer.

  3. HLA-E regulatory and coding region variability and haplotypes in a Brazilian population sample.

    Science.gov (United States)

    Ramalho, Jaqueline; Veiga-Castelli, Luciana C; Donadi, Eduardo A; Mendes-Junior, Celso T; Castelli, Erick C

    2017-11-01

    The HLA-E gene is characterized by low but wide expression on different tissues. HLA-E is considered a conserved gene, being one of the least polymorphic class I HLA genes. The HLA-E molecule interacts with Natural Killer cell receptors and T lymphocytes receptors, and might activate or inhibit immune responses depending on the peptide associated with HLA-E and with which receptors HLA-E interacts to. Variable sites within the HLA-E regulatory and coding segments may influence the gene function by modifying its expression pattern or encoded molecule, thus, influencing its interaction with receptors and the peptide. Here we propose an approach to evaluate the gene structure, haplotype pattern and the complete HLA-E variability, including regulatory (promoter and 3'UTR) and coding segments (with introns), by using massively parallel sequencing. We investigated the variability of 420 samples from a very admixed population such as Brazilians by using this approach. Considering a segment of about 7kb, 63 variable sites were detected, arranged into 75 extended haplotypes. We detected 37 different promoter sequences (but few frequent ones), 27 different coding sequences (15 representing new HLA-E alleles) and 12 haplotypes at the 3'UTR segment, two of them presenting a summed frequency of 90%. Despite the number of coding alleles, they encode mainly two different full-length molecules, known as E*01:01 and E*01:03, which corresponds to about 90% of all. In addition, differently from what has been previously observed for other non classical HLA genes, the relationship among the HLA-E promoter, coding and 3'UTR haplotypes is not straightforward because the same promoter and 3'UTR haplotypes were many times associated with different HLA-E coding haplotypes. This data reinforces the presence of only two main full-length HLA-E molecules encoded by the many HLA-E alleles detected in our population sample. In addition, this data does indicate that the distal HLA-E promoter is by

  4. Non-coding RNAs: New therapeutic targets and opportunities for hepatocellular carcinoma

    Directory of Open Access Journals (Sweden)

    Yu Cuiyun

    2016-02-01

    Full Text Available Non-coding RNAs (ncRNA are RNA molecules without protein coding functions owing to the lack of an open reading frame (ORF. Based on the length, ncRNAs can be divided into long and short ncRNAs; short ncRNAs include miRNAs and piRNAs. Hepatocellular carcinoma (HCC is among the most frequent forms of cancer worldwide and its incidence is increasing rapidly. Studies have found that ncRNAs are likely to play a crucial role in a variety of biological processes including the pathogenesis and progression of HCC. In this review, we summarized the regulation mechanism and biological functions of ncRNAs in HCC with respect to its application in HCC diagnosis, therapy and prognosis.

  5. DNA barcode goes two-dimensions: DNA QR code web server.

    Science.gov (United States)

    Liu, Chang; Shi, Linchun; Xu, Xiaolan; Li, Huan; Xing, Hang; Liang, Dong; Jiang, Kun; Pang, Xiaohui; Song, Jingyuan; Chen, Shilin

    2012-01-01

    The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR) code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.

  6. DNA barcode goes two-dimensions: DNA QR code web server.

    Directory of Open Access Journals (Sweden)

    Chang Liu

    Full Text Available The DNA barcoding technology uses a standard region of DNA sequence for species identification and discovery. At present, "DNA barcode" actually refers to DNA sequences, which are not amenable to information storage, recognition, and retrieval. Our aim is to identify the best symbology that can represent DNA barcode sequences in practical applications. A comprehensive set of sequences for five DNA barcode markers ITS2, rbcL, matK, psbA-trnH, and CO1 was used as the test data. Fifty-three different types of one-dimensional and ten two-dimensional barcode symbologies were compared based on different criteria, such as coding capacity, compression efficiency, and error detection ability. The quick response (QR code was found to have the largest coding capacity and relatively high compression ratio. To facilitate the further usage of QR code-based DNA barcodes, a web server was developed and is accessible at http://qrfordna.dnsalias.org. The web server allows users to retrieve the QR code for a species of interests, convert a DNA sequence to and from a QR code, and perform species identification based on local and global sequence similarities. In summary, the first comprehensive evaluation of various barcode symbologies has been carried out. The QR code has been found to be the most appropriate symbology for DNA barcode sequences. A web server has also been constructed to allow biologists to utilize QR codes in practical DNA barcoding applications.

  7. Improving performance of DS-CDMA systems using chaotic complex Bernoulli spreading codes

    Science.gov (United States)

    Farzan Sabahi, Mohammad; Dehghanfard, Ali

    2014-12-01

    The most important goal of spreading spectrum communication system is to protect communication signals against interference and exploitation of information by unintended listeners. In fact, low probability of detection and low probability of intercept are two important parameters to increase the performance of the system. In Direct Sequence Code Division Multiple Access (DS-CDMA) systems, these properties are achieved by multiplying the data information in spreading sequences. Chaotic sequences, with their particular properties, have numerous applications in constructing spreading codes. Using one-dimensional Bernoulli chaotic sequence as spreading code is proposed in literature previously. The main feature of this sequence is its negative auto-correlation at lag of 1, which with proper design, leads to increase in efficiency of the communication system based on these codes. On the other hand, employing the complex chaotic sequences as spreading sequence also has been discussed in several papers. In this paper, use of two-dimensional Bernoulli chaotic sequences is proposed as spreading codes. The performance of a multi-user synchronous and asynchronous DS-CDMA system will be evaluated by applying these sequences under Additive White Gaussian Noise (AWGN) and fading channel. Simulation results indicate improvement of the performance in comparison with conventional spreading codes like Gold codes as well as similar complex chaotic spreading sequences. Similar to one-dimensional Bernoulli chaotic sequences, the proposed sequences also have negative auto-correlation. Besides, construction of complex sequences with lower average cross-correlation is possible with the proposed method.

  8. Deep sequencing of small RNAs identifies canonical and non-canonical miRNA and endogenous siRNAs in mammalian somatic tissues.

    Science.gov (United States)

    Castellano, Leandro; Stebbing, Justin

    2013-03-01

    MicroRNAs (miRNAs) are small RNA molecules that regulate gene expression. They are characterized by specific maturation processes defined by canonical and non-canonical biogenic pathways. Analysis of ∼0.5 billion sequences from mouse data sets derived from different tissues, developmental stages and cell types, partly characterized by either ablation or mutation of the main proteins belonging to miRNA processor complexes, reveals 66 high-confidence new genomic loci coding for miRNAs that could be processed in a canonical or non-canonical manner. A proportion of the newly discovered miRNAs comprises mirtrons, for which we define a new sub-class. Notably, some of these newly discovered miRNAs are generated from untranslated and open reading frames of coding genes, and we experimentally validate these. We also show that many annotated miRNAs do not present miRNA-like features, as they are neither processed by known processing complexes nor loaded on AGO2; this indicates that the current miRNA miRBase database list should be refined and re-defined. Accordingly, a group of them map on ribosomal RNA molecules, whereas others cannot undergo genuine miRNA biogenesis. Notably, a group of annotated miRNAs are Dgcr8 independent and DICER dependent endogenous small interfering RNAs that derive from a unique hairpin formed from a short interspersed nuclear element.

  9. Random sequences are an abundant source of bioactive RNAs or peptides

    DEFF Research Database (Denmark)

    Neme, Rafik; Amador, Cristina; Yildirim, Burcin

    2017-01-01

    It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA is so far considered to be negligible, as it seems unlikely that such an RNA or protein sequence could have ...

  10. SINEUPs are modular antisense long-non coding RNAs that increase synthesis of target proteins in cells

    Directory of Open Access Journals (Sweden)

    Silvia eZucchelli

    2015-05-01

    Full Text Available Despite recent efforts in discovering novel long non-coding RNAs (lncRNAs and unveiling their functions in a wide range of biological processes their applications as biotechnological or therapeutic tools are still at their infancy. We have recently shown that AS Uchl1, a natural lncRNA antisense to the Parkinson’s disease-associated gene Ubiquitin carboxyl-terminal esterase L1 (Uchl1, is able to increase UchL1 protein synthesis at post-transcriptional level. Its activity requires two RNA elements: an embedded inverted SINEB2 sequence to increase translation and the overlapping region to target its sense mRNA. This functional organization is shared with several mouse lncRNAs antisense to protein coding genes. The potential use of AS Uchl1-derived lncRNAs as enhancers of target mRNA translation remains unexplored. Here we define AS Uchl1 as the representative member of a new functional class of natural and synthetic antisense lncRNAs that activate translation. We named this class of RNAs SINEUPs for their requirement of the inverted SINEB2 sequence to UP-regulate translation in a gene-specific manner. The overlapping region is indicated as the Binding Doman (BD while the embedded inverted SINEB2 element is the Effector Domain (ED. By swapping BD, synthetic SINEUPs are designed targeting mRNAs of interest. SINEUPs function in an array of cell lines and can be efficiently directed towards N-terminally tagged proteins. Their biological activity is retained in a miniaturized version within the range of small RNAs length. Its modular structure was exploited to successfully design synthetic SINEUPs targeting endogenous Parkinson’s disease-associated DJ-1 and proved to be active in different neuronal cell lines.In summary, SINEUPs represent the first scalable tool to increase synthesis of proteins of interest. We propose SINEUPs as reagents for molecular biology experiments, in protein manufacturing as well as in therapy of haploinsufficiencies.

  11. A Biologically Based Approach to the Mutation of Code

    Science.gov (United States)

    1999-09-01

    instructions called a genome. This genome contains the master blueprint for all cellular structures and functions within the organism for the duration of...structures known as chromosomes, which are found in the nucleus of all non-somatic cells. Many procaryotic organisms have single-stranded DNA. An...coding sequence of a gene, or by an aberrant cellular recombination process. One way to reduce the chances of a harmful mutation occuring is to

  12. Beamforming-Based Physical Layer Network Coding for Non-Regenerative Multi-Way Relaying

    Directory of Open Access Journals (Sweden)

    Klein Anja

    2010-01-01

    Full Text Available We propose non-regenerative multi-way relaying where a half-duplex multi-antenna relay station (RS assists multiple single-antenna nodes to communicate with each other. The required number of communication phases is equal to the number of the nodes, N. There are only one multiple-access phase, where the nodes transmit simultaneously to the RS, and broadcast (BC phases. Two transmission methods for the BC phases are proposed, namely, multiplexing transmission and analog network coded transmission. The latter is a cooperation method between the RS and the nodes to manage the interference in the network. Assuming that perfect channel state information is available, the RS performs transceive beamforming to the received signals and transmits simultaneously to all nodes in each BC phase. We address the optimum transceive beamforming maximising the sum rate of non-regenerative multi-way relaying. Due to the nonconvexity of the optimization problem, we propose suboptimum but practical signal processing schemes. For multiplexing transmission, we propose suboptimum schemes based on zero forcing, minimising the mean square error, and maximising the signal to noise ratio. For analog network coded transmission, we propose suboptimum schemes based on matched filtering and semidefinite relaxation of maximising the minimum signal to noise ratio. It is shown that analog network coded transmission outperforms multiplexing transmission.

  13. The complete mitochondrial genome sequence of Oceanic whitetip shark, Carcharhinus longimanus (Carcharhiniformes: Carcharhinidae).

    Science.gov (United States)

    Li, Weiwen; Dai, Xiaojie; Xu, Qianghua; Wu, Feng; Gao, Chunxia; Zhang, Yanbo

    2016-05-01

    The complete mitochondrial DNA sequence of Carcharhinus longimanus was determined and analyzed. The complete mtDNA genome sequence of C. longimanus was 16,706 bp in length. It contained 22 tRNA genes, 2 rRNA genes, 13 protein-coding genes and 2 non-conding regions: control region (D-loop) and origin of light-strand replication (OL). The complete mitogenome sequence information of C. longimanus can provide a useful data for further studies on molecular systematics, stock evaluation, taxonomic status and conservation genetics.

  14. FEAST: a two-dimensional non-linear finite element code for calculating stresses

    International Nuclear Information System (INIS)

    Tayal, M.

    1986-06-01

    The computer code FEAST calculates stresses, strains, and displacements. The code is two-dimensional. That is, either plane or axisymmetric calculations can be done. The code models elastic, plastic, creep, and thermal strains and stresses. Cracking can also be simulated. The finite element method is used to solve equations describing the following fundamental laws of mechanics: equilibrium; compatibility; constitutive relations; yield criterion; and flow rule. FEAST combines several unique features that permit large time-steps in even severely non-linear situations. The features include a special formulation for permitting many finite elements to simultaneously cross the boundary from elastic to plastic behaviour; accomodation of large drops in yield-strength due to changes in local temperature and a three-step predictor-corrector method for plastic analyses. These features reduce computing costs. Comparisons against twenty analytical solutions and against experimental measurements show that predictions of FEAST are generally accurate to ± 5%

  15. Comprehensive Identification of Long Non-coding RNAs in Purified Cell Types from the Brain Reveals Functional LncRNA in OPC Fate Determination.

    Directory of Open Access Journals (Sweden)

    Xiaomin Dong

    2015-12-01

    Full Text Available Long non-coding RNAs (lncRNAs (> 200 bp play crucial roles in transcriptional regulation during numerous biological processes. However, it is challenging to comprehensively identify lncRNAs, because they are often expressed at low levels and with more cell-type specificity than are protein-coding genes. In the present study, we performed ab initio transcriptome reconstruction using eight purified cell populations from mouse cortex and detected more than 5000 lncRNAs. Predicting the functions of lncRNAs using cell-type specific data revealed their potential functional roles in Central Nervous System (CNS development. We performed motif searches in ENCODE DNase I digital footprint data and Mouse ENCODE promoters to infer transcription factor (TF occupancy. By integrating TF binding and cell-type specific transcriptomic data, we constructed a novel framework that is useful for systematically identifying lncRNAs that are potentially essential for brain cell fate determination. Based on this integrative analysis, we identified lncRNAs that are regulated during Oligodendrocyte Precursor Cell (OPC differentiation from Neural Stem Cells (NSCs and that are likely to be involved in oligodendrogenesis. The top candidate, lnc-OPC, shows highly specific expression in OPCs and remarkable sequence conservation among placental mammals. Interestingly, lnc-OPC is significantly up-regulated in glial progenitors from experimental autoimmune encephalomyelitis (EAE mouse models compared to wild-type mice. OLIG2-binding sites in the upstream regulatory region of lnc-OPC were identified by ChIP (chromatin immunoprecipitation-Sequencing and validated by luciferase assays. Loss-of-function experiments confirmed that lnc-OPC plays a functional role in OPC genesis. Overall, our results substantiated the role of lncRNA in OPC fate determination and provided an unprecedented data source for future functional investigations in CNS cell types. We present our datasets and

  16. Multispacer sequence typing relapsing fever Borreliae in Africa.

    Directory of Open Access Journals (Sweden)

    Haitham Elbir

    Full Text Available BACKGROUND: In Africa, relapsing fevers are neglected arthropod-borne infections caused by closely related Borrelia species. They cause mild to deadly undifferentiated fever particularly severe in pregnant women. Lack of a tool to genotype these Borrelia organisms limits knowledge regarding their reservoirs and their epidemiology. METHODOLOGY/PRINCIPAL FINDINGS: Genome sequence analysis of Borrelia crocidurae, Borrelia duttonii and Borrelia recurrentis yielded 5 intergenic spacers scattered between 10 chromosomal genes that were incorporated into a multispacer sequence typing (MST approach. Sequencing these spacers directly from human blood specimens previously found to be infected by B. recurrentis (30 specimens, B. duttonii (17 specimens and B. crocidurae (13 specimens resolved these 60 strains and the 3 type strains into 13 species-specific spacer types in the presence of negative controls. B. crocidurae comprised of 8 spacer types, B. duttonii of 3 spacer types and B. recurrentis of 2 spacer types. CONCLUSIONS/SIGNIFICANCE: Phylogenetic analyses of MST data suggested that B. duttonii, B. crocidurae and B. recurrentis are variants of a unique ancestral Borrelia species. MST proved to be a suitable approach for identifying and genotyping relapsing fever borreliae in Africa. It could be applied to both vectors and clinical specimens.

  17. Molecular Profiling of Microbial Communities from Contaminated Sources: Use of Subtractive Cloning Methods and rDNA Spacer Sequences; FINAL

    International Nuclear Information System (INIS)

    Robb, Frank T.

    2001-01-01

    The major objective of this research was to provide appropriate sequences and assemble a DNA array of oligonucleotides to be used for rapid profiling of microbial populations from polluted areas and other areas of interest. The sequences to be assigned to the DNA array were chosen from cloned genomic DNA taken from groundwater sites having well characterized pollutant histories at Hanford Nuclear Plant and Lawrence Livermore Site 300. Glass-slide arrays were made and tested; and a new multiplexed, bead-based method was developed that uses nucleic acid hybridization on the surface of microscopic polystyrene spheres to identify specific sequences in heterogeneous mixtures of DNA sequences. The test data revealed considerable strain variation between sample sites showing a striking distribution of sequences. It also suggests that diversity varies greatly with bioremediation, and that there are many bacterial intergenic spacer region sequences that can indicate its effects. The bead method exhibited superior sequence discrimination and has features for easier and more accurate measurement

  18. Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

    Science.gov (United States)

    2013-01-01

    Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitution but none of them were associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680

  19. Non-declarative sequence learning does not show savings in relearning.

    Science.gov (United States)

    Keisler, Aysha; Willingham, Daniel T

    2007-04-01

    Researchers have utilized the savings in relearning paradigm in a variety of settings since Ebbinghaus developed the tool over a century ago. In spite of its widespread use, we do not yet understand what type(s) of memory are measurable by savings. Specifically, can savings measure both declarative and non-declarative memories? The lack of conscious recollection of the encoded material in some studies indicates that non-declarative memories may show savings effects, but as all studies to date have used declarative tasks, we cannot be certain. Here, we administer a non-declarative task and then measure savings in relearning the material declaratively. Our results show that while material outside of awareness may show savings effects, non-declarative sequence memory does not. These data highlight the important distinction between memory without awareness and non-declarative memory.

  20. Full Genome Sequence and sfRNA Interferon Antagonist Activity of Zika Virus from Recife, Brazil.

    Directory of Open Access Journals (Sweden)

    Claire L Donald

    2016-10-01

    Full Text Available The outbreak of Zika virus (ZIKV in the Americas has transformed a previously obscure mosquito-transmitted arbovirus of the Flaviviridae family into a major public health concern. Little is currently known about the evolution and biology of ZIKV and the factors that contribute to the associated pathogenesis. Determining genomic sequences of clinical viral isolates and characterization of elements within these are an important prerequisite to advance our understanding of viral replicative processes and virus-host interactions.We obtained a ZIKV isolate from a patient who presented with classical ZIKV-associated symptoms, and used high throughput sequencing and other molecular biology approaches to determine its full genome sequence, including non-coding regions. Genome regions were characterized and compared to the sequences of other isolates where available. Furthermore, we identified a subgenomic flavivirus RNA (sfRNA in ZIKV-infected cells that has antagonist activity against RIG-I induced type I interferon induction, with a lesser effect on MDA-5 mediated action.The full-length genome sequence including non-coding regions of a South American ZIKV isolate from a patient with classical symptoms will support efforts to develop genetic tools for this virus. Detection of sfRNA that counteracts interferon responses is likely to be important for further understanding of pathogenesis and virus-host interactions.