WorldWideScience

Sample records for novo protein-coding gene

  1. De novo origin of human protein-coding genes.

    Directory of Open Access Journals (Sweden)

    Dong-Dong Wu

    2011-11-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA-seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes.

  2. De Novo Origin of Human Protein-Coding Genes

    Science.gov (United States)

    Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping

    2011-01-01

    The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831

  3. Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Chen Xie

    2012-09-01

    Tinkering with pre-existing genes has long been known as a major way to create new genes. Recently, however, motherless protein-coding genes have been found to have emerged de novo from ancestral non-coding DNAs. How these genes originated is not well addressed to date. Here we identified 24 hominoid-specific de novo protein-coding genes with precise origination timing in vertebrate phylogeny. Strand-specific RNA-Seq analyses were performed in five rhesus macaque tissues (liver, prefrontal cortex, skeletal muscle, adipose, and testis), which were then integrated with public transcriptome data from human, chimpanzee, and rhesus macaque. On the basis of comparing the RNA expression profiles in the three species, we found that most of the hominoid-specific de novo protein-coding genes encoded polyadenylated non-coding RNAs in rhesus macaque or chimpanzee with a similar transcript structure and correlated tissue expression profile. According to the rule of parsimony, the majority of these hominoid-specific de novo protein-coding genes appear to have acquired a regulated transcript structure and expression profile before acquiring coding potential. Interestingly, although the expression profile was largely correlated, the coding genes in human often showed higher transcriptional abundance than their non-coding counterparts in rhesus macaque. The major findings we report in this manuscript are robust and insensitive to the parameters used in the identification and analysis of de novo genes. Our results suggest that at least a portion of long non-coding RNAs, especially those with active and regulated transcription, may serve as a birth pool for protein-coding genes, which are then further optimized at the transcriptional level.

  4. A human-specific de novo protein-coding gene associated with human brain functions.

    Directory of Open Access Journals (Sweden)

    Chuan-Yun Li

    2010-03-01

    To understand whether any human-specific new genes may be associated with human brain functions, we computationally screened the genetic vulnerable factors identified through Genome-Wide Association Studies and linkage analyses of nicotine addiction and found one human-specific de novo protein-coding gene, FLJ33706 (alternative gene symbol C20orf203). Cross-species analysis revealed interesting evolutionary paths of how this gene had originated from noncoding DNA sequences: insertion of repeat elements especially Alu contributed to the formation of the first coding exon and six standard splice junctions on the branch leading to humans and chimpanzees, and two subsequent substitutions in the human lineage escaped two stop codons and created an open reading frame of 194 amino acids. We experimentally verified FLJ33706's mRNA and protein expression in the brain. Real-Time PCR in multiple tissues demonstrated that FLJ33706 was most abundantly expressed in brain. Human polymorphism data suggested that FLJ33706 encodes a protein under purifying selection. A specifically designed antibody detected its protein expression across human cortex, cerebellum and midbrain. Immunohistochemistry study in normal human brain cortex revealed the localization of FLJ33706 protein in neurons. Elevated expressions of FLJ33706 were detected in Alzheimer's brain samples, suggesting the role of this novel gene in human-specific pathogenesis of Alzheimer's disease. FLJ33706 provided the strongest evidence so far that human-specific de novo genes can have protein-coding potential and differential protein expression, and be involved in human brain functions.
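The stop-codon step described in this abstract can be illustrated with a minimal sketch. The sequences below are invented toy examples, not FLJ33706 sequence, and `orf_length_codons` is a hypothetical helper. It only shows how substitutions that remove in-frame stop codons lengthen an open reading frame.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length_codons(seq: str) -> int:
    """Count codons from the first ATG up to, but not including, the first
    in-frame stop codon. Returns 0 if the sequence has no ATG. If no stop
    codon is found, counting runs to the last complete codon."""
    seq = seq.upper()
    start = seq.find("ATG")
    if start == -1:
        return 0
    n = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        n += 1
    return n


# Toy "ancestral" sequence with two premature in-frame stops (TGA, TAG).
ancestral = "".join(["ATG", "GCA", "TGA", "CCT", "TAG", "GGA", "AAA", "TAA"])
# Toy "derived" sequence: two substitutions (TGA->CGA, TAG->CAG) remove both stops.
derived = "".join(["ATG", "GCA", "CGA", "CCT", "CAG", "GGA", "AAA", "TAA"])

print(orf_length_codons(ancestral))  # 2
print(orf_length_codons(derived))    # 7
```

In this toy case the reading frame grows from 2 to 7 codons once both premature stops are replaced, which is the same kind of change the abstract reports at a much larger scale.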

  5. A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements.

    Directory of Open Access Journals (Sweden)

    Eugeny A Elisaphenko

    2008-06-01

    X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons including those with simple tandem repeats detectable in their structure have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA.

  6. Inheritance-mode specific pathogenicity prioritization (ISPP) for human protein coding genes.

    Science.gov (United States)

    Hsu, Jacob Shujui; Kwan, Johnny S H; Pan, Zhicheng; Garcia-Barcelo, Maria-Mercè; Sham, Pak Chung; Li, Miaoxin

    2016-10-15

    Exome sequencing studies have facilitated the detection of causal genetic variants in yet-unsolved Mendelian diseases. However, the identification of disease causal genes among a list of candidates in an exome sequencing study is still not fully settled, and it is often difficult to prioritize candidate genes for follow-up studies. The inheritance mode provides crucial information for understanding Mendelian diseases, but none of the existing gene prioritization tools fully utilize this information. We examined the characteristics of Mendelian disease genes under different inheritance modes. The results suggest that Mendelian disease genes with autosomal dominant (AD) inheritance mode are more sensitive to haploinsufficiency and de novo mutations, whereas autosomal recessive (AR) genes have significantly more non-synonymous variants and regulatory transcript isoforms. In addition, the X-linked (XL) Mendelian disease genes have fewer non-synonymous and synonymous variants. As a result, we derived a new scoring system for prioritizing candidate genes for Mendelian diseases according to the inheritance mode. Our scoring system assigned to each annotated protein-coding gene (N = 18 859) three pathogenic scores according to the inheritance mode (AD, AR and XL). This inheritance mode-specific framework achieved higher accuracy (area under curve = 0.84) in XL mode. The inheritance-mode specific pathogenicity prioritization (ISPP) outperformed other well-known methods including Haploinsufficiency, Recessive, Network centrality, Genic Intolerance, Gene Damage Index and Gene Constraint scores. This systematic study suggests that genes manifesting disease inheritance modes tend to have unique characteristics. ISPP is included in KGGSeq v1.0 (http://grass.cgs.hku.hk/limx/kggseq/), and source code is available from https://github.com/jacobhsu35/ISPP.git. Contact: mxli@hku.hk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author

  7. mPUMA: a computational approach to microbiota analysis by de novo assembly of operational taxonomic units based on protein-coding barcode sequences.

    Science.gov (United States)

    Links, Matthew G; Chaban, Bonnie; Hemmingsen, Sean M; Muirhead, Kevin; Hill, Janet E

    2013-08-15

    Formation of operational taxonomic units (OTU) is a common approach to data aggregation in microbial ecology studies based on amplification and sequencing of individual gene targets. The de novo assembly of OTU sequences has been recently demonstrated as an alternative to widely used clustering methods, providing robust information from experimental data alone, without any reliance on an external reference database. Here we introduce mPUMA (microbial Profiling Using Metagenomic Assembly, http://mpuma.sourceforge.net), a software package for identification and analysis of protein-coding barcode sequence data. It was developed originally for Cpn60 universal target sequences (also known as GroEL or Hsp60). Using an unattended process that is independent of external reference sequences, mPUMA forms OTUs by DNA sequence assembly and is capable of tracking OTU abundance. mPUMA processes microbial profiles both in terms of the direct DNA sequence as well as in the translated amino acid sequence for protein coding barcodes. By forming OTUs and calculating abundance through an assembly approach, mPUMA is capable of generating inputs for several popular microbiota analysis tools. Using SFF data from sequencing of a synthetic community of Cpn60 sequences derived from the human vaginal microbiome, we demonstrate that mPUMA can faithfully reconstruct all expected OTU sequences and produce compositional profiles consistent with actual community structure. mPUMA enables analysis of microbial communities while empowering the discovery of novel organisms through OTU assembly.

  8. Expression of protein-coding genes embedded in ribosomal DNA

    DEFF Research Database (Denmark)

    Johansen, Steinar D; Haugen, Peik; Nielsen, Henrik

    2007-01-01

    Ribosomal DNA (rDNA) is a specialised chromosomal location that is dedicated to high-level transcription of ribosomal RNA genes. Interestingly, rDNAs are frequently interrupted by parasitic elements, some of which carry protein genes. These are non-LTR retrotransposons and group II introns that e...... in the nucleolus....

  9. Codon usage and expression level of human mitochondrial 13 protein coding genes across six continents.

    Science.gov (United States)

    Chakraborty, Supriyo; Uddin, Arif; Mazumder, Tarikul Huda; Choudhury, Monisha Nath; Malakar, Arup Kumar; Paul, Prosenjit; Halder, Binata; Deka, Himangshu; Mazumder, Gulshana Akthar; Barbhuiya, Riazul Ahmed; Barbhuiya, Masuk Ahmed; Devi, Warepam Jesmi

    2017-12-02

    The study of codon usage coupled with phylogenetic analysis is an important tool for understanding the genetic and evolutionary relationships of a gene. The 13 protein coding genes of human mitochondria are involved in the electron transport chain for the generation of the energy currency (ATP). However, no work has yet been reported on the codon usage of the mitochondrial protein coding genes across six continents. To understand the patterns of codon usage in mitochondrial genes across six different continents, we used bioinformatic analyses to analyze the protein coding genes. The codon usage bias was low, as revealed by the high ENC value. Correlation between codon usage and GC3 suggested that all the codons ending with G/C were positively correlated with GC3, whereas the reverse held for A/T-ending codons, with the exception of the ND4L and ND5 genes. The neutrality plot revealed that for the genes ATP6, COI, COIII, CYB, ND4 and ND4L, natural selection might have played a major role, while mutation pressure might have played a dominant role in the codon usage bias of the ATP8, COII, ND1, ND2, ND3, ND5 and ND6 genes. Phylogenetic analysis indicated that evolutionary relationships in each of the 13 protein coding genes of human mitochondria were different across six continents and further suggested that geographical distance was an important factor for the origin and evolution of the 13 protein coding genes of human mitochondria. Copyright © 2017 Elsevier B.V. and Mitochondria Research Society. All rights reserved.
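GC3, mentioned in this abstract, is the G+C fraction at third codon positions. The following is a minimal sketch of that calculation on an invented toy sequence; `gc3` is a hypothetical helper, not code from the study. As a simplifying assumption it counts every complete codon, whereas published analyses often exclude Met, Trp and stop codons.

```python
def gc3(cds: str) -> float:
    """Fraction of complete codons whose third base is G or C.

    Simplifying assumption: all complete codons are counted, including
    start and stop codons. Trailing bases that do not form a full codon
    are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy coding sequence: ATG GCC AAA TTC has third bases G, C, A, C.
print(gc3("ATGGCCAAATTC"))  # 0.75
```

Correlating each codon's frequency with this value across genes is the kind of comparison the abstract describes for G/C-ending and A/T-ending codons.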

  10. Selfish DNA in protein-coding genes of Rickettsia.

    Science.gov (United States)

    Ogata, H; Audic, S; Barbe, V; Artiguenave, F; Fournier, P E; Raoult, D; Claverie, J M

    2000-10-13

    Rickettsia conorii, the aetiological agent of Mediterranean spotted fever, is an intracellular bacterium transmitted by ticks. Preliminary analyses of the nearly complete genome sequence of R. conorii have revealed 44 occurrences of a previously undescribed palindromic repeat (150 base pairs long) throughout the genome. Unexpectedly, this repeat was found inserted in-frame within 19 different R. conorii open reading frames likely to encode functional proteins. We found the same repeat in proteins of other Rickettsia species. The finding of a mobile element inserted in many unrelated genes suggests the potential role of selfish DNA in the creation of new protein sequences.

  11. Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

    Science.gov (United States)

    Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

    2015-12-11

    High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.

  12. Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.

    Science.gov (United States)

    Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis

    2014-12-01

    Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides and lacks a homolog in other methylobacteria, but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroquinoline quinone synthesis, essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the genome of the model methylotroph ME-AM1. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Revisiting the missing protein-coding gene catalog of the domestic dog

    Directory of Open Access Journals (Sweden)

    Galibert Francis

    2009-02-01

    Background: Among mammals for which there is high sequence coverage, the whole genome assembly of the dog is unique in that it predicts a low number of protein-coding genes, ~19,000, compared to the over 20,000 reported for other mammalian species. Of particular interest are the more than 400 genes annotated in primate and rodent genomes but missing in dog. Results: Using over 14,000 orthologous genes between human, chimpanzee, mouse, rat and dog, we built multiple pairwise synteny maps to infer short orthologous intervals that were targeted for characterizing the canine missing genes. Based on gene prediction and a functionality test using the ratio of replacement to silent nucleotide substitution rates (dN/dS), we provide compelling structural and functional evidence for the identification of 232 new protein-coding genes in the canine genome and 69 gene losses, characterized as undetected genes or pseudogenes. Gene loss phyletic pattern analysis using ten species from chicken to human allowed us to characterize 28 canine-specific gene losses that have functional orthologs continuously from chicken or marsupials through human, and 10 genes that arose specifically in the evolutionary lineage leading to rodents and primates. Conclusion: This study demonstrates the central role of comparative genomics for refining gene catalogs and exploring the evolutionary history of gene repertoires, particularly as applied to the characterization of species-specific gene gains and losses.

  14. Promoter Analysis Reveals Globally Differential Regulation of Human Long Non-Coding RNA and Protein-Coding Genes

    KAUST Repository

    Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui; Brown, James B.; Lipovich, Leonard; Bajic, Vladimir B.

    2014-01-01

    raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted

  15. Proteogenomics of rare taxonomic phyla: A prospective treasure trove of protein coding genes.

    Science.gov (United States)

    Kumar, Dhirendra; Mondal, Anupam Kumar; Kutum, Rintu; Dash, Debasis

    2016-01-01

    Sustainable innovations in sequencing technologies have resulted in a torrent of microbial genome sequencing projects. However, the prokaryotic genomes sequenced so far are unequally distributed along their phylogenetic tree; few phyla contain the majority, the rest only a few representatives. Accurate genome annotation lags far behind genome sequencing. While automated computational prediction, aided by comparative genomics, remains a popular choice for genome annotation, a substantial fraction of these annotations are erroneous. Proteogenomics utilizes protein level experimental observations to annotate protein coding genes on a genome wide scale. Benefits of proteogenomics include discovery and correction of gene annotations regardless of their phylogenetic conservation. This not only allows detection of common, conserved proteins but also the discovery of protein products of rare genes that may be horizontally transferred or taxonomy specific. The chances of encountering such genes are higher in rare phyla that comprise a small number of complete genome sequences. We collated all bacterial and archaeal proteogenomic studies carried out to date and reviewed them in the context of genome sequencing projects. Here, we present a comprehensive list of microbial proteogenomic studies and their taxonomic distribution, and also urge targeted proteogenomics of underexplored taxa to build an extensive reference of protein coding genes. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Both noncoding and protein-coding RNAs contribute to gene expression evolution in the primate brain.

    Science.gov (United States)

    Babbitt, Courtney C; Fedrigo, Olivier; Pfefferle, Adam D; Boyle, Alan P; Horvath, Julie E; Furey, Terrence S; Wray, Gregory A

    2010-01-18

    Despite striking differences in cognition and behavior between humans and our closest primate relatives, several studies have found little evidence for adaptive change in protein-coding regions of genes expressed primarily in the brain. Instead, changes in gene expression may underlie many cognitive and behavioral differences. Here, we used digital gene expression: tag profiling (here called Tag-Seq, also called DGE:tag profiling) to assess changes in global transcript abundance in the frontal cortex of the brains of 3 humans, 3 chimpanzees, and 3 rhesus macaques. A substantial fraction of transcripts we identified as differentially transcribed among species were not assayed in previous studies based on microarrays. Differentially expressed tags within coding regions are enriched for gene functions involved in synaptic transmission, transport, oxidative phosphorylation, and lipid metabolism. Importantly, because Tag-Seq technology provides strand-specific information about all polyadenylated transcripts, we were able to assay expression in noncoding intragenic regions, including both sense and antisense noncoding transcripts (relative to nearby genes). We find that many noncoding transcripts are conserved in both location and expression level between species, suggesting a possible functional role. Lastly, we examined the overlap between differential gene expression and signatures of positive selection within putative promoter regions, a sign that these differences represent adaptations during human evolution. Comparative approaches may provide important insights into genes responsible for differences in cognitive functions between humans and nonhuman primates, as well as highlighting new candidate genes for studies investigating neurological disorders.

  17. A study on climatic adaptation of dipteran mitochondrial protein coding genes

    Directory of Open Access Journals (Sweden)

    Debajyoti Kabiraj

    2017-10-01

    Diptera, the true flies, are frequently found in nature, and their habitat extends all over the world, including Antarctica and the polar regions. The number of documented species in the order Diptera is quite high and is thought to be 14% of all animals on Earth [1]. Most studies of Diptera have focused on taxa of economic and medical importance, such as the fruit flies Ceratitis capitata and Bactrocera spp. (Tephritidae), which are serious agricultural pests; the blowflies (Calliphoridae) and oestrid flies (Oestridae), which can cause myiasis; the Anopheles mosquitoes (Culicidae), which are the vectors of malaria; and leaf-miners (Agromyzidae), which are vegetable and horticultural pests [2]. The insect mitochondrial genome consists of 13 protein coding genes, 22 tRNAs and 2 rRNAs. It is a remnant of an alpha-proteobacterial genome and is responsible for the simultaneous functions of energy production and thermoregulation of the cell through the bi-genomic system; thus, differing adaptability to different climatic conditions might have been compensated by complementary changes in both genomes [3,4]. In this study we collected complete mitochondrial genomes and occurrence data for one hundred thirteen such dipteran insects from different databases and a literature survey. Our understanding of the genetic basis of climatic adaptation in Diptera is limited to basic information on the occurrence locations of those species and the mitogenetic factors underlying changes in conspicuous phenotypes. To examine this hypothesis, we performed nucleotide substitution analysis for the 13 protein coding genes of mitochondrial DNA, individually and combined, using different software, for monophyletic as well as paraphyletic groups of dipteran species. Moreover, we also calculated the codon adaptation index for all dipteran mitochondrial protein coding genes. Following this work, we classified our sample organisms according to their location data from GBIF (https

  18. Natural selection in avian protein-coding genes expressed in brain.

    Science.gov (United States)

    Axelsson, Erik; Hultin-Rosenberg, Lina; Brandström, Mikael; Zwahlén, Martin; Clayton, David F; Ellegren, Hans

    2008-06-01

    The evolution of birds from theropod dinosaurs took place approximately 150 million years ago, and was associated with a number of specific adaptations that are still evident among extant birds, including feathers, song and extravagant secondary sexual characteristics. Knowledge about the molecular evolutionary background to such adaptations is lacking. Here, we analyse the evolution of > 5000 protein-coding gene sequences expressed in zebra finch brain by comparison to orthologous sequences in chicken. Mean d(N)/d(S) is 0.085 and genes with their maximal expression in the eye and central nervous system have the lowest mean d(N)/d(S) value, while those expressed in digestive and reproductive tissues exhibit the highest. We find that fast-evolving genes (those which have higher than expected rate of nonsynonymous substitution, indicative of adaptive evolution) are enriched for biological functions such as fertilization, muscle contraction, defence response, response to stress, wounding and endogenous stimulus, and cell death. After alignment to mammalian orthologues, we identify a catalogue of 228 genes that show a significantly higher rate of protein evolution in the two bird lineages than in mammals. These accelerated bird genes, representing candidates for avian-specific adaptations, include genes implicated in vocal learning and other cognitive processes. Moreover, colouration genes evolve faster in birds than in mammals, which may have been driven by sexual selection for extravagant plumage characteristics.

  19. Conserved syntenic clusters of protein coding genes are missing in birds.

    Science.gov (United States)

    Lovell, Peter V; Wirthlin, Morgan; Wilhelm, Larry; Minx, Patrick; Lazar, Nathan H; Carbone, Lucia; Warren, Wesley C; Mello, Claudio V

    2014-01-01

    Birds are one of the most highly successful and diverse groups of vertebrates, having evolved a number of distinct characteristics, including feathers and wings, a sturdy lightweight skeleton and unique respiratory and urinary/excretion systems. However, the genetic basis of these traits is poorly understood. Using comparative genomics based on extensive searches of 60 avian genomes, we have found that birds lack approximately 274 protein coding genes that are present in the genomes of most vertebrate lineages and are for the most part organized in conserved syntenic clusters in non-avian sauropsids and in humans. These genes are located in regions associated with chromosomal rearrangements, and are largely present in crocodiles, suggesting that their loss occurred subsequent to the split of dinosaurs/birds from crocodilians. Many of these genes are associated with lethality in rodents, human genetic disorders, or biological functions targeting various tissues. Functional enrichment analysis combined with orthogroup analysis and paralog searches revealed enrichments that were shared by non-avian species, present only in birds, or shared between all species. Together these results provide a clearer definition of the genetic background of extant birds, extend the findings of previous studies on missing avian genes, and provide clues about molecular events that shaped avian evolution. They also have implications for fields that largely benefit from avian studies, including development, immune system, oncogenesis, and brain function and cognition. With regards to the missing genes, birds can be considered ‘natural knockouts’ that may become invaluable model organisms for several human diseases.

  20. PanCoreGen - Profiling, detecting, annotating protein-coding genes in microbial genomes.

    Science.gov (United States)

    Paul, Sandip; Bhardwaj, Archana; Bag, Sumit K; Sokurenko, Evgeni V; Chattopadhyay, Sujay

    2015-12-01

    A large amount of genomic data, especially from multiple isolates of a single species, has opened new vistas for microbial genomics analysis. Analyzing the pan-genome (i.e. the sum of genetic repertoire) of microbial species is crucial in understanding the dynamics of molecular evolution, where virulence evolution is of major interest. Here we present PanCoreGen - a standalone application for pan- and core-genomic profiling of microbial protein-coding genes. PanCoreGen overcomes key limitations of the existing pan-genomic analysis tools, and develops an integrated annotation-structure for a species-specific pan-genomic profile. It provides important new features for annotating draft genomes/contigs and detecting unidentified genes in annotated genomes. It also generates user-defined group-specific datasets within the pan-genome. Interestingly, analyzing an example-set of Salmonella genomes, we detect potential footprints of adaptive convergence of horizontally transferred genes in two human-restricted pathogenic serovars - Typhi and Paratyphi A. Overall, PanCoreGen represents a state-of-the-art tool for microbial phylogenomics and pathogenomics study. Copyright © 2015 Elsevier Inc. All rights reserved.
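The pan- and core-genome idea in this abstract can be sketched with plain set operations. This is an illustration only, not PanCoreGen's method: the isolate and gene names are invented, `pan_and_core` is a hypothetical helper, and genes are assumed to be already grouped under shared identifiers, whereas real tools first match genes across genomes by sequence similarity.

```python
def pan_and_core(gene_sets):
    """Return (pan_genome, core_genome) for a mapping of isolate name to
    the set of gene identifiers found in that isolate.

    pan-genome  = union of all gene sets (every gene seen in any isolate)
    core-genome = intersection of all gene sets (genes seen in every isolate)
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set(), set()
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return pan, core


# Invented toy data: three isolates with partly overlapping gene content.
isolates = {
    "isolate_A": {"geneA", "geneB", "geneC"},
    "isolate_B": {"geneA", "geneB", "geneD"},
    "isolate_C": {"geneA", "geneB"},
}

pan, core = pan_and_core(isolates)
print(sorted(pan))   # ['geneA', 'geneB', 'geneC', 'geneD']
print(sorted(core))  # ['geneA', 'geneB']
```

Genes in the pan-genome but not the core (here `geneC` and `geneD`) are the accessory fraction, which is where group-specific datasets of the kind the abstract mentions would be drawn from.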

  1. PanCoreGen – profiling, detecting, annotating protein-coding genes in microbial genomes

    Science.gov (United States)

    Bhardwaj, Archana; Bag, Sumit K; Sokurenko, Evgeni V.

    2015-01-01

    A large amount of genomic data, especially from multiple isolates of a single species, has opened new vistas for microbial genomics analysis. Analyzing pan-genome (i.e. the sum of genetic repertoire) of microbial species is crucial in understanding the dynamics of molecular evolution, where virulence evolution is of major interest. Here we present PanCoreGen – a standalone application for pan- and core-genomic profiling of microbial protein-coding genes. PanCoreGen overcomes key limitations of the existing pan-genomic analysis tools, and develops an integrated annotation-structure for species-specific pan-genomic profile. It provides important new features for annotating draft genomes/contigs and detecting unidentified genes in annotated genomes. It also generates user-defined group-specific datasets within the pan-genome. Interestingly, analyzing an example-set of Salmonella genomes, we detect potential footprints of adaptive convergence of horizontally transferred genes in two human-restricted pathogenic serovars – Typhi and Paratyphi A. Overall, PanCoreGen represents a state-of-the-art tool for microbial phylogenomics and pathogenomics study. PMID:26456591

  2. Ribosome Profiling Reveals Pervasive Translation Outside of Annotated Protein-Coding Genes

    Directory of Open Access Journals (Sweden)

    Nicholas T. Ingolia

    2014-09-01

    Ribosome profiling suggests that ribosomes occupy many regions of the transcriptome thought to be noncoding, including 5′ UTRs and long noncoding RNAs (lncRNAs). Apparent ribosome footprints outside of protein-coding regions raise the possibility of artifacts unrelated to translation, particularly when they occupy multiple, overlapping open reading frames (ORFs). Here, we show hallmarks of translation in these footprints: copurification with the large ribosomal subunit, response to drugs targeting elongation, trinucleotide periodicity, and initiation at early AUGs. We develop a metric for distinguishing between 80S footprints and nonribosomal sources using footprint size distributions, which validates the vast majority of footprints outside of coding regions. We present evidence for polypeptide production beyond annotated genes, including the induction of immune responses following human cytomegalovirus (HCMV) infection. Translation is pervasive on cytosolic transcripts outside of conserved reading frames, and direct detection of this expanded universe of translated products enables efforts at understanding how cells manage and exploit its consequences.
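    Trinucleotide periodicity, one of the hallmarks listed above, can be checked by counting how footprint positions distribute over the three reading frames of a candidate ORF. The sketch below is a simplified illustration, not the authors' metric: it assumes positions are already assigned to a single nucleotide per footprint (real analyses typically apply a read-length-dependent P-site offset first), and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def frame_fractions(positions: Iterable[int], orf_start: int = 0) -> Tuple[float, float, float]:
    """Fraction of footprint positions falling in frame 0, 1 and 2 of an ORF.

    `positions` are transcript coordinates of footprints; `orf_start` is the
    coordinate of the first nucleotide of the start codon. A translated ORF is
    expected to show a strong bias towards one frame, whereas non-ribosomal
    background should be spread roughly evenly.
    """
    counts = Counter((p - orf_start) % 3 for p in positions)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts.get(frame, 0) / total for frame in range(3))
```

    For example, `frame_fractions([0, 3, 6, 9])` puts every position in frame 0, while an even spread across frames would argue against ribosomal origin.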

  3. Promoter Analysis Reveals Globally Differential Regulation of Human Long Non-Coding RNA and Protein-Coding Genes

    KAUST Repository

    Alam, Tanvir

    2014-10-02

    Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.

  4. On the Origin of De Novo Genes in Arabidopsis thaliana Populations.

    Science.gov (United States)

    Li, Zi-Wen; Chen, Xi; Wu, Qiong; Hagmann, Jörg; Han, Ting-Shen; Zou, Yu-Pan; Ge, Song; Guo, Ya-Long

    2016-08-03

    De novo genes, which originate from ancestral nongenic sequences, are one of the most important sources of protein-coding genes. This origination process is crucial for the adaptation of organisms. However, how de novo genes arise and become fixed in a population or species remains largely unknown. Here, we identified 782 de novo genes from the model plant Arabidopsis thaliana and divided them into three types based on the availability of translational evidence, transcriptional evidence, and neither transcriptional nor translational evidence for their origin. Importantly, by integrating multiple types of omics data, including data from genomes, epigenomes, transcriptomes, and translatomes, we found that epigenetic modifications (DNA methylation and histone modification) play an important role in the origination process of de novo genes. Intriguingly, using the transcriptomes and methylomes from the same population of 84 accessions, we found that de novo genes that are transcribed in approximately half of the total accessions within the population are highly methylated, with lower levels of transcription than those transcribed at other frequencies within the population. We hypothesized that, during the origin of de novo gene alleles, those neutralized to low expression states via DNA methylation have relatively high probabilities of spreading and becoming fixed in a population. Our results highlight the process underlying the origin of de novo genes at the population level, as well as the importance of DNA methylation in this process. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  5. Evaluation of the efficacy of twelve mitochondrial protein-coding genes as barcodes for mollusk DNA barcoding.

    Science.gov (United States)

    Yu, Hong; Kong, Lingfeng; Li, Qi

    2016-01-01

    In this study, we evaluated the efficacy of 12 mitochondrial protein-coding genes from 238 mitochondrial genomes of 140 molluscan species as potential DNA barcodes for mollusks. Three barcoding methods (distance, monophyly and character-based methods) were used in species identification. The species recovery rates based on genetic distances for the 12 genes ranged from 70.83 to 83.33%. There were no significant differences in intra- or interspecific variability among the 12 genes. The monophyly and character-based methods provided higher resolution than the distance-based method in species delimitation. Especially in closely related taxa, the character-based method showed some advantages. The results suggested that, besides the standard COI barcode, the other 11 mitochondrial protein-coding genes could also potentially be used as molecular diagnostics for molluscan species discrimination. Our results also showed that combining mitochondrial genes did not enhance the efficacy of species identification and that a single mitochondrial gene would be fully competent.
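    As an illustration of what a distance-based barcoding method involves, the sketch below computes an uncorrected p-distance between aligned sequences and assigns a query to its nearest reference. It is a minimal example under stated assumptions, not the procedure used in the study: sequences are assumed to be pre-aligned, sites with gaps or ambiguous bases are skipped, and the function names and species labels are hypothetical.

```python
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences carry an unambiguous base.
    compared = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
                if a in "ACGT" and b in "ACGT"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def closest_species(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return (species label, distance) of the reference nearest to the query."""
    return min(((name, p_distance(query, seq)) for name, seq in references.items()),
               key=lambda item: item[1])
```

    A nearest-reference rule is only one of several distance-based criteria; thresholds on intra- versus interspecific distance are a common alternative.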

  6. Dataset of the first transcriptome assembly of the tree crop “yerba mate” (Ilex paraguariensis) and systematic characterization of protein coding genes

    Directory of Open Access Journals (Sweden)

    Patricia M. Aguilera

    2018-04-01

    This contribution contains data associated with the research article entitled “Exploring the genes of yerba mate (Ilex paraguariensis A. St.-Hil.) by NGS and de novo transcriptome assembly” (Debat et al., 2014) [1]. By means of a bioinformatic approach involving extensive NGS data analyses, we provide a resource encompassing the full transcriptome assembly of yerba mate, the first available reference for the Ilex L. genus. This dataset (Supplementary files 1 and 2) consolidates the transcriptome-wide assembled sequences of I. paraguariensis with further comprehensive annotation of the protein coding genes of yerba mate via the integration of Arabidopsis thaliana databases. The generated data are pivotal for the characterization of agronomically relevant genes in the tree crop yerba mate (a non-model species) and related taxa in Ilex. The raw sequencing data dissected here are available at the DDBJ/ENA/GenBank (NCBI Resource Coordinators, 2016) [2] Sequence Read Archive (SRA) under the accession SRP043293, and the assembled sequences have been deposited at the Transcriptome Shotgun Assembly Sequence Database (TSA) under the accession GFHV00000000.

  7. Non-Protein Coding RNAs

    CERN Document Server

    Walter, Nils G; Batey, Robert T

    2009-01-01

    This book assembles chapters from experts in the Biophysics of RNA to provide a broadly accessible snapshot of the current status of this rapidly expanding field. The 2006 Nobel Prize in Physiology or Medicine was awarded to the discoverers of RNA interference, highlighting just one example of a large number of non-protein coding RNAs. Because non-protein coding RNAs outnumber protein coding genes in mammals and other higher eukaryotes, it is now thought that the complexity of organisms is correlated with the fraction of their genome that encodes non-protein coding RNAs. Essential biological processes as diverse as cell differentiation, suppression of infecting viruses and parasitic transposons, higher-level organization of eukaryotic chromosomes, and gene expression itself are found to largely be directed by non-protein coding RNAs. The biophysical study of these RNAs employs X-ray crystallography, NMR, ensemble and single molecule fluorescence spectroscopy, optical tweezers, cryo-electron microscopy, and ot...

  8. The small RNA content of human sperm reveals pseudogene-derived piRNAs complementary to protein-coding genes

    DEFF Research Database (Denmark)

    Pantano, Lorena; Jodar, Meritxell; Bak, Mads

    2015-01-01

    ...-specific genes. The most abundant class of small noncoding RNAs in sperm are PIWI-interacting RNAs (piRNAs). Surprisingly, we found that human sperm cells contain piRNAs processed from pseudogenes. Clusters of piRNAs from human testes contain pseudogenes transcribed in the antisense strand and processed into small RNAs. Several human protein-coding genes contain antisense predicted targets of pseudogene-derived piRNAs in the male germline and these piRNAs are still found in mature sperm. Our study provides the most extensive data set and annotation of human sperm small RNAs to date and is a resource for further functional studies on the roles of sperm small RNAs. In addition, we propose that some of the pseudogene-derived human piRNAs may regulate expression of their parent gene in the male germline.

  9. A systematic genome-wide analysis of zebrafish protein-coding gene function

    NARCIS (Netherlands)

    Kettleborough, R.N.; Busch-Nentwich, E.M.; Harvey, S.A.; Dooley, C.M.; de Bruijn, E.; van Eeden, F.; Sealy, I.; White, R.J.; Herd, C.; Nijman, I.J.; Fenyes, F.; Mehroke, S.; Scahill, C.; Gibbons, R.; Wali, N.; Carruthers, S.; Hall, A.; Yen, J.; Cuppen, E.; Stemple, D.L.

    2013-01-01

    Since the publication of the human reference genome, the identities of specific genes associated with human diseases are being discovered at a rapid rate. A central problem is that the biological activity of these genes is often unclear. Detailed investigations in model vertebrate organisms,

  10. Computational prediction of over-annotated protein-coding genes in the genome of Agrobacterium tumefaciens strain C58

    Science.gov (United States)

    Yu, Jia-Feng; Sui, Tian-Xiang; Wang, Hong-Mei; Wang, Chun-Ling; Jing, Li; Wang, Ji-Hua

    2015-12-01

    Agrobacterium tumefaciens strain C58 is a type of pathogen that can cause tumors in some dicotyledonous plants. Ever since the genome of A. tumefaciens strain C58 was sequenced, the quality of annotation of its protein-coding genes has been queried continually, because the annotation varies greatly among different databases. In this paper, the questionable hypothetical genes were re-predicted by integrating the TN curve and Z curve methods. As a result, 30 genes originally annotated as “hypothetical” were discriminated as being non-coding sequences. By testing the re-prediction program 10 times on data sets composed of the function-known genes, the mean accuracy of 99.99% and mean Matthews correlation coefficient value of 0.9999 were obtained. Further sequence analysis and COG analysis showed that the re-annotation results were very reliable. This work can provide an efficient tool and data resources for future studies of A. tumefaciens strain C58. Project supported by the National Natural Science Foundation of China (Grant Nos. 61302186 and 61271378) and the Funding from the State Key Laboratory of Bioelectronics of Southeast University.
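    The two evaluation measures quoted in this abstract, accuracy and the Matthews correlation coefficient (MCC), are both computed from the confusion-matrix counts of a binary classifier (here, coding versus non-coding). The following sketch uses the standard definitions; the function names are illustrative and the counts in any call are the caller's own, not values from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator is zero,
    a common convention for degenerate confusion matrices.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

    Unlike accuracy, MCC stays informative when the two classes are very unequal in size, which is why the two measures are often reported together.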

  11. Computational prediction of over-annotated protein-coding genes in the genome of Agrobacterium tumefaciens strain C58

    International Nuclear Information System (INIS)

    Yu Jia-Feng; Sui Tian-Xiang; Wang Ji-Hua; Wang Hong-Mei; Wang Chun-Ling; Jing Li

    2015-01-01

    Agrobacterium tumefaciens strain C58 is a type of pathogen that can cause tumors in some dicotyledonous plants. Ever since the genome of A. tumefaciens strain C58 was sequenced, the quality of annotation of its protein-coding genes has been queried continually, because the annotation varies greatly among different databases. In this paper, the questionable hypothetical genes were re-predicted by integrating the TN curve and Z curve methods. As a result, 30 genes originally annotated as “hypothetical” were discriminated as being non-coding sequences. By testing the re-prediction program 10 times on data sets composed of the function-known genes, the mean accuracy of 99.99% and mean Matthews correlation coefficient value of 0.9999 were obtained. Further sequence analysis and COG analysis showed that the re-annotation results were very reliable. This work can provide an efficient tool and data resources for future studies of A. tumefaciens strain C58. (special topic)

  12. Comparison of protein coding gene contents of the fungal phyla Pezizomycotina and Saccharomycotina

    DEFF Research Database (Denmark)

    Arvas, Mikko; Kivioja, Teemu; Mitchell, Alex

    2007-01-01

    Saccharomycotina are slightly better characterised and predicted to encode mainly enzymes. The genes specific to Saccharomycotina are enriched in transcription and mitochondrion related functions. Especially mitochondrial ribosomal proteins seem to have diverged from those of Pezizomycotina. In addition, we...

  13. The small RNA content of human sperm reveals pseudogene-derived piRNAs complementary to protein-coding genes

    Science.gov (United States)

    Pantano, Lorena; Jodar, Meritxell; Bak, Mads; Ballescà, Josep Lluís; Tommerup, Niels; Oliva, Rafael; Vavouri, Tanya

    2015-01-01

    At the end of mammalian sperm development, sperm cells expel most of their cytoplasm and dispose of the majority of their RNA. Yet, hundreds of RNA molecules remain in mature sperm. The biological significance of the vast majority of these molecules is unclear. To better understand the processes that generate sperm small RNAs and what roles they may have, we sequenced and characterized the small RNA content of sperm samples from two human fertile individuals. We detected 182 microRNAs, some of which are highly abundant. The most abundant microRNA in sperm is miR-1246 with predicted targets among sperm-specific genes. The most abundant class of small noncoding RNAs in sperm are PIWI-interacting RNAs (piRNAs). Surprisingly, we found that human sperm cells contain piRNAs processed from pseudogenes. Clusters of piRNAs from human testes contain pseudogenes transcribed in the antisense strand and processed into small RNAs. Several human protein-coding genes contain antisense predicted targets of pseudogene-derived piRNAs in the male germline and these piRNAs are still found in mature sperm. Our study provides the most extensive data set and annotation of human sperm small RNAs to date and is a resource for further functional studies on the roles of sperm small RNAs. In addition, we propose that some of the pseudogene-derived human piRNAs may regulate expression of their parent gene in the male germline. PMID:25904136

  14. Novel methods for the molecular discrimination of Fasciola spp. on the basis of nuclear protein-coding genes.

    Science.gov (United States)

    Shoriki, Takuya; Ichikawa-Seki, Madoka; Suganuma, Keisuke; Naito, Ikunori; Hayashi, Kei; Nakao, Minoru; Aita, Junya; Mohanta, Uday Kumar; Inoue, Noboru; Murakami, Kenji; Itagaki, Tadashi

    2016-06-01

    Fasciolosis is an economically important disease of livestock caused by Fasciola hepatica, Fasciola gigantica, and aspermic Fasciola flukes. The aspermic Fasciola flukes have been discriminated morphologically from the two other species by the absence of sperm in their seminal vesicles. To date, the molecular discrimination of F. hepatica and F. gigantica has relied on the nucleotide sequences of the internal transcribed spacer 1 (ITS1) region. However, ITS1 genotypes of aspermic Fasciola flukes cannot be clearly differentiated from those of F. hepatica and F. gigantica. Therefore, more precise and robust methods are required to discriminate Fasciola spp. In this study, we developed PCR restriction fragment length polymorphism and multiplex PCR methods to discriminate F. hepatica, F. gigantica, and aspermic Fasciola flukes on the basis of the nuclear protein-coding genes, phosphoenolpyruvate carboxykinase and DNA polymerase delta, which are single locus genes in most eukaryotes. All aspermic Fasciola flukes used in this study had mixed fragment pattern of F. hepatica and F. gigantica for both of these genes, suggesting that the flukes are descended through hybridization between the two species. These molecular methods will facilitate the identification of F. hepatica, F. gigantica, and aspermic Fasciola flukes, and will also prove useful in etiological studies of fasciolosis. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  15. Natural selection on protein-coding genes in the human genome

    DEFF Research Database (Denmark)

    Bustamante, Carlos D.; Fledel-Alon, Adi; Williamson, Scott

    2005-01-01

    Comparisons of DNA polymorphism within species to divergence between species enable the discovery of molecular adaptation in evolutionarily constrained genes as well as the differentiation of weak from strong purifying selection [1, 2, 3, 4]. The extent to which weak negative and positive darwinian selection have driven the molecular evolution of different species varies greatly [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16], with some species, such as Drosophila melanogaster, showing strong evidence of pervasive positive selection [6, 7, 8, 9], and others, such as the selfing weed Arabidopsis thaliana, showing an excess of deleterious variation within local populations [9, 10]. Here we contrast patterns of coding sequence polymorphism identified by direct sequencing of 39 humans for over 11,000 genes to divergence between humans and chimpanzees, and find strong evidence that natural selection has shaped...

  16. Bioinformatics analysis identify novel OB fold protein coding genes in C. elegans.

    Directory of Open Access Journals (Sweden)

    Daryanaz Dargahi

    BACKGROUND: The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the identification of novel genes in silico in this model organism is becoming more challenging, requiring new approaches. The oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family, in which protein sequences, in spite of having the same fold, share very little sequence identity (5-25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n=46) compared to other evolutionarily related eukaryotes, such as yeast S. cerevisiae (n=344) or fruit fly D. melanogaster (n=84). Gene loss during evolution or differences in the level of annotation for this protein family may explain these discrepancies. METHODOLOGY/PRINCIPAL FINDINGS: This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods followed by 3D-structure prediction as a filtering step to eliminate false positive candidate sequences. We have predicted 18 coding genes containing the OB-fold that, remarkably, have been only partially characterized in C. elegans. CONCLUSIONS/SIGNIFICANCE: This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large scale analysis by the WormBase consortium when novel versions of the genome sequence of C. elegans, or other evolutionarily related species, are being released. This approach is of general interest to the scientific community since it can be used to annotate any genome.

  17. Phylogenetic relationships within Echinococcus and Taenia tapeworms (Cestoda: Taeniidae): an inference from nuclear protein-coding genes.

    Science.gov (United States)

    Knapp, Jenny; Nakao, Minoru; Yanagida, Tetsuya; Okamoto, Munehiro; Saarma, Urmas; Lavikainen, Antti; Ito, Akira

    2011-12-01

    The family Taeniidae of tapeworms is composed of two genera, Echinococcus and Taenia, which obligately parasitize mammals including humans. Inferring phylogeny via molecular markers is the only way to trace back their evolutionary histories. However, molecular dating approaches are lacking so far. Here we established new markers from nuclear protein-coding genes for RNA polymerase II second largest subunit (rpb2), phosphoenolpyruvate carboxykinase (pepck) and DNA polymerase delta (pold). Bayesian inference and maximum likelihood analyses of the concatenated gene sequences allowed us to reconstruct phylogenetic trees for taeniid parasites. The tree topologies clearly demonstrated that Taenia is paraphyletic and that the clade of Echinococcus oligarthrus and Echinococcus vogeli is sister to all other members of Echinococcus. Both species are endemic in Central and South America, and their definitive hosts originated from carnivores that immigrated from North America after the formation of the Panamanian land bridge about 3 million years ago (Ma). A time-calibrated phylogeny was estimated by a Bayesian relaxed-clock method based on the assumption that the most recent common ancestor of E. oligarthrus and E. vogeli existed during the late Pliocene (3.0 Ma). The results suggest that a clade of Taenia including human-pathogenic species diversified primarily in the late Miocene (11.2 Ma), whereas Echinococcus started to diversify later, at the end of the Miocene (5.8 Ma). Close genetic relationships among the members of Echinococcus imply that the genus is a young group in which speciation and global radiation occurred rapidly. Copyright © 2011 Elsevier Inc. All rights reserved.

  18. Histone modification profiles are predictive for tissue/cell-type specific expression of both protein-coding and microRNA genes

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2011-05-01

    Background: Gene expression is regulated at both the DNA sequence level and through modification of chromatin. However, the effect of chromatin on tissue/cell-type specific gene regulation (TCSR) is largely unknown. In this paper, we present a method to elucidate the relationship between histone modification/variation (HMV) and TCSR. Results: A classifier for differentiating CD4+ T cell-specific genes from housekeeping genes using HMV data was built. We found HMV in both promoter and gene body regions to be predictive of genes which are targets of TCSR. For example, the histone modification types H3K4me3 and H3K27ac were identified as the most predictive for CpG-related promoters, whereas H3K4me3 and H3K79me3 were the most predictive for nonCpG-related promoters. However, genes targeted by TCSR can be predicted using other types of HMVs as well. Such redundancy implies that multiple types of underlying regulatory elements, such as enhancers or intragenic alternative promoters, which can regulate gene expression in a tissue/cell-type specific fashion, may be marked by the HMVs. Finally, we show that the predictive power of HMV for TCSR is not limited to protein-coding genes in CD4+ T cells, as we successfully predicted TCSR targeted genes in muscle cells, as well as microRNA genes with expression specific to CD4+ T cells, by the same classifier which was trained on HMV data of protein-coding genes in CD4+ T cells. Conclusion: We have begun to understand the HMV patterns that guide gene expression in both a tissue/cell-type specific and a ubiquitous manner.

  19. Emerging putative associations between non-coding RNAs and protein-coding genes in Neuropathic Pain. Added value from re-using microarray data.

    Directory of Open Access Journals (Sweden)

    Enrico Capobianco

    2016-10-01

    Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve injury, and studied in a rat model, using two neuronal tissues, namely dorsal root ganglion (DRG) and sciatic nerve (SN). We have developed a methodology to identify differentially expressed bioentities starting from microarray probes, and re-purposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parent genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to neuropathic pain. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN, and 8 in DRG), antisense RNA (31 asRNA in SN, and 12 in DRG) and pseudogenes (456 in SN, 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly
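    The contextual-analysis step in this abstract defines neighbor genes by a fixed genomic distance from an ncRNA locus. A minimal sketch of such a neighborhood query follows. The abstract does not state the distance used or how it was measured, so the gap definition (distance between interval edges, zero when overlapping), the `max_distance` parameter and all names here are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Locus(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


def interval_gap(a: Locus, b: Locus) -> int:
    """Distance in bp between two intervals; 0 if they overlap or touch."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def neighbor_genes(ncrna: Locus, genes: List[Locus], max_distance: int) -> List[str]:
    """Names of genes on the same chromosome within `max_distance` bp of the ncRNA."""
    return [gene.name for gene in genes
            if gene.chrom == ncrna.chrom and interval_gap(ncrna, gene) <= max_distance]
```

    Genes returned by such a query would then be checked for differential expression, as the abstract describes; strand is ignored here, although an antisense analysis would normally take it into account.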

  20. A novel TaqMan® assay for Nosema ceranae quantification in honey bee, based on the protein coding gene Hsp70.

    Science.gov (United States)

    Cilia, Giovanni; Cabbri, Riccardo; Maiorana, Giacomo; Cardaio, Ilaria; Dall'Olio, Raffaele; Nanetti, Antonio

    2018-04-01

    Nosema ceranae is now a widespread honey bee pathogen with high incidence in apiculture. Rapid and reliable detection and quantification methods are a matter of concern for the research community, nowadays mainly relying on the use of biomolecular techniques such as PCR, RT-PCR or HRMA. The aim of this technical paper is to provide a new qPCR assay, based on the highly-conserved protein coding gene Hsp70, to detect and quantify the microsporidian Nosema ceranae affecting the western honey bee Apis mellifera. The validation steps to assess efficiency, sensitivity, specificity and robustness of the assay are also described. Copyright © 2018 Elsevier GmbH. All rights reserved.

  1. Nucleotide sequence of the Escherichia coli pyrE gene and of the DNA in front of the protein-coding region

    DEFF Research Database (Denmark)

    Poulsen, Peter; Jensen, Kaj Frank; Valentin-Hansen, Poul

    1983-01-01

    Orotate phosphoribosyltransferase (EC 2.4.2.10) was purified to electrophoretic homogeneity from a strain of Escherichia coli containing the pyrE gene cloned on a multicopy plasmid. The relative molecular masses (Mr) of the native enzyme and its subunit were estimated by means of gel filtration... leader segment in front of the protein-coding region. This leader contains a structure with features characteristic for a (translated?) rho-independent transcriptional terminator, which is preceded by a cluster of uridylate residues. This indicates that the frequency of pyrE transcription is regulated...

  2. Genes from scratch--the evolutionary fate of de novo genes.

    Science.gov (United States)

    Schlötterer, Christian

    2015-04-01

    Although considered an extremely unlikely event, many genes emerge from previously noncoding genomic regions. This review covers the entire life cycle of such de novo genes. Two competing hypotheses about the process of de novo gene birth are discussed as well as the high death rate of de novo genes. Despite the high death rate, some de novo genes are retained and remain functional, even in distantly related species, through their integration into gene networks. Further studies combining gene expression with ribosome profiling in multiple populations across different species will be instrumental for an improved understanding of the evolutionary processes operating on de novo genes. Copyright © 2015 The Author. Published by Elsevier Ltd. All rights reserved.

  3. De novo ORFs in Drosophila are important to organismal fitness and evolved rapidly from previously non-coding sequences.

    Directory of Open Access Journals (Sweden)

    Josephine A Reinhardt

    How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval, suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important.

  4. The complete mitochondrial genome of the land snail Cornu aspersum (Helicidae: Mollusca): intra-specific divergence of protein-coding genes and phylogenetic considerations within Euthyneura.

    Directory of Open Access Journals (Sweden)

    Juan Diego Gaitán-Espitia

    The complete sequences of three mitochondrial genomes from the land snail Cornu aspersum were determined. The mitogenome has a length of 14050 bp, and it encodes 13 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes. It also includes nine small intergenic spacers and a large AT-rich intergenic spacer. The intra-specific divergence analysis revealed that COX1 has the lowest genetic differentiation, while the most divergent genes were NADH1, NADH3 and NADH4. With the exception of Euhadra herklotsi, the structural comparisons showed the same gene order within the family Helicidae, and a gene organization nearly identical to that found in the order Pulmonata. Phylogenetic reconstruction recovered Basommatophora as a polyphyletic group, and Eupulmonata and Pulmonata as paraphyletic groups. Bayesian and Maximum Likelihood analyses showed that C. aspersum is a close relative of Cepaea nemoralis and, together with the other Helicidae species, forms a sister group of Albinaria caerulea, supporting the monophyly of the Stylommatophora clade.

  5. Arabidopsis RNASE THREE LIKE2 Modulates the Expression of Protein-Coding Genes via 24-Nucleotide Small Interfering RNA-Directed DNA Methylation.

    Science.gov (United States)

    Elvira-Matelot, Emilie; Hachet, Mélanie; Shamandi, Nahid; Comella, Pascale; Sáez-Vásquez, Julio; Zytnicki, Matthias; Vaucheret, Hervé

    2016-02-01

    RNaseIII enzymes catalyze the cleavage of double-stranded RNA (dsRNA) and have diverse functions in RNA maturation. Arabidopsis thaliana RNASE THREE LIKE2 (RTL2), which carries one RNaseIII and two dsRNA binding (DRB) domains, is a unique Arabidopsis RNaseIII enzyme resembling the budding yeast small interfering RNA (siRNA)-producing Dcr1 enzyme. Here, we show that RTL2 modulates the production of a subset of small RNAs and that this activity depends on both its RNaseIII and DRB domains. However, the mode of action of RTL2 differs from that of Dcr1. Whereas Dcr1 directly cleaves dsRNAs into 23-nucleotide siRNAs, RTL2 likely cleaves dsRNAs into longer molecules, which are subsequently processed into small RNAs by the DICER-LIKE enzymes. Depending on the dsRNA considered, RTL2-mediated maturation either improves (RTL2-dependent loci) or reduces (RTL2-sensitive loci) the production of small RNAs. Because the vast majority of RTL2-regulated loci correspond to transposons and intergenic regions producing 24-nucleotide siRNAs that guide DNA methylation, RTL2 depletion modifies DNA methylation in these regions. Nevertheless, 13% of RTL2-regulated loci correspond to protein-coding genes. We show that changes in 24-nucleotide siRNA levels also affect DNA methylation levels at such loci and inversely correlate with mRNA steady state levels, thus implicating RTL2 in the regulation of protein-coding gene expression. © 2016 American Society of Plant Biologists. All rights reserved.

  6. A novel bidirectional expression system for simultaneous expression of both the protein-coding genes and short hairpin RNAs in mammalian cells

    International Nuclear Information System (INIS)

    Hung, C.-F.; Cheng, T.-L.; Wu, R.-H.; Teng, C.-F.; Chang, W.-T.

    2006-01-01

    RNA interference (RNAi) is an extremely powerful and widely used gene silencing approach for reverse functional genomics and molecular therapeutics. In mammals, the conserved poly(ADP-ribose) polymerase 2 (PARP-2)/RNase P bidirectional control promoter simultaneously expresses both the PARP-2 protein and RNase P RNA by RNA polymerase II- and III-dependent mechanisms, respectively. To explore this unique bidirectional control system in RNAi-mediated gene silencing strategy, we have constructed two novel bidirectional expression vectors, pbiHsH1 and pbiMmH1, which contained the PARP-2/RNase P bidirectional control promoters from human and mouse, for simultaneous expression of both the protein-coding genes and short hairpin RNAs. Analyses of the dual transcriptional activities indicated that these two bidirectional expression vectors could not only express enhanced green fluorescent protein as a functional reporter but also simultaneously transcribe shLuc for inhibiting the firefly luciferase expression. In addition, to extend its utility for the establishment of inherited stable clones, we have also reconstructed this bidirectional expression system with the blasticidin S deaminase gene, an effective dominant drug resistance selectable marker, and examined both the selection and inhibition efficiencies in drug resistance and gene expression. Moreover, we have further demonstrated that this bidirectional expression system could efficiently co-regulate the functionally important genes, such as overexpression of tumor suppressor protein p53 and inhibition of anti-apoptotic protein Bcl-2 at the same time. In summary, the bidirectional expression vectors, pbiHsH1 and pbiMmH1, should provide a simple, convenient, and efficient novel tool for manipulating the gene function in mammalian cells

  7. Nonsynonymous substitution rate (Ka) is a relatively consistent parameter for defining fast-evolving and slow-evolving protein-coding genes

    Directory of Open Access Journals (Sweden)

    Wang Lei

    2011-02-01

    Full Text Available Background: Mammalian genome sequence data are being acquired in large quantities and at enormous speeds. We now have a tremendous opportunity to better understand which genes are the most variable or conserved, and what their particular functions and evolutionary dynamics are, through comparative genomics. Results: We chose human and eleven other high-coverage mammalian genome data, as well as an avian genome as an outgroup, to analyze orthologous protein-coding genes using nonsynonymous (Ka) and synonymous (Ks) substitution rates. After evaluating eight commonly-used methods of Ka and Ks calculation, we observed that these methods yielded a nearly uniform result when estimating Ka, but not Ks (or Ka/Ks). When sorting genes based on Ka, we noticed that fast-evolving and slow-evolving genes often belonged to different functional classes, with respect to species-specificity and lineage-specificity. In particular, we identified two functional classes of genes in the acquired immune system. Fast-evolving genes coded for signal-transducing proteins, such as receptors, ligands, cytokines, and CDs (cluster of differentiation, mostly surface proteins), whereas the slow-evolving genes were for function-modulating proteins, such as kinases and adaptor proteins. In addition, among slow-evolving genes that had functions related to the central nervous system, neurodegenerative disease-related pathways were enriched significantly in most mammalian species. We also confirmed that gene expression was negatively correlated with evolution rate, i.e. slow-evolving genes were expressed at higher levels than fast-evolving genes. Our results indicated that the functional specializations of the three major mammalian clades were: sensory perception and oncogenesis in primates, reproduction and hormone regulation in large mammals, and immunity and angiotensin in rodents. Conclusion: Our study suggests that Ka calculation, which is less biased compared to Ks and Ka

  8. Origins of De Novo Genes in Human and Chimpanzee.

    Science.gov (United States)

    Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M Mar

    2015-12-01

    The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence are still unclear. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently-occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.

  9. A compendium of transcription factor and Transcriptionally active protein coding gene families in cowpea (Vigna unguiculata L.).

    Science.gov (United States)

    Misra, Vikram A; Wang, Yu; Timko, Michael P

    2017-11-22

    Cowpea (Vigna unguiculata (L.) Walp.) is the most important food and forage legume in the semi-arid tropics of sub-Saharan Africa, where approximately 80% of worldwide production takes place, primarily on low-input, subsistence farm sites. Among the major goals of cowpea breeding and improvement programs are the rapid manipulation of agronomic traits for seed size and quality and improved resistance to abiotic and biotic stresses to enhance productivity. Knowing the suite of transcription factors (TFs) and transcriptionally active proteins (TAPs) that control various critical plant cellular processes would contribute tremendously to these improvement aims. We used a computational approach that employed three different predictive pipelines to data mine the cowpea genome and identified over 4400 genes representing 136 different TF and TAP families. We compare the information content of cowpea to two evolutionarily close species, common bean (Phaseolus vulgaris) and soybean (Glycine max), to gauge the relative informational content. Our data indicate that, correcting for genome size, cowpea has fewer TF and TAP genes than common bean (4408/5291) and soybean (4408/11,065). Members of the GROWTH-REGULATING FACTOR (GRF) and Auxin/indole-3-acetic acid (Aux/IAA) gene families appear to be over-represented in the genome relative to common bean and soybean, whereas members of the MADS (Minichromosome maintenance deficient 1 (MCM1), AGAMOUS, DEFICIENS, and serum response factor (SRF)) and C2C2-YABBY families appear to be under-represented. Analysis of the APETALA2-Ethylene Responsive Element Binding Protein (AP2-EREBP), NAC (NAM (no apical meristem), ATAF1, 2 (Arabidopsis transcription activation factor), CUC (cup-shaped cotyledon)), and WRKY families, known to be important in defense signaling, revealed changes and phylogenetic rearrangements relative to common bean and soybean that suggest these groups may have evolved different functions. The availability of detailed

  10. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

    Science.gov (United States)

    Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2017-01-04

    The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents an approximately 20-fold expansion compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed a 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Analysis of antisense expression by whole genome tiling microarrays and siRNAs suggests mis-annotation of Arabidopsis orphan protein-coding genes.

    Directory of Open Access Journals (Sweden)

    Casey R Richardson

    2010-05-01

    Full Text Available MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20-22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis 'orphan' hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the "ancient" (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for "new" rapidly-evolving MIRNA genes. Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other

  12. NCYM, a Cis-antisense gene of MYCN, encodes a de novo evolved protein that inhibits GSK3β resulting in the stabilization of MYCN in human neuroblastomas.

    Directory of Open Access Journals (Sweden)

    Yusuke Suenaga

    2014-01-01

    Full Text Available The rearrangement of pre-existing genes has long been thought of as the major mode of new gene generation. Recently, de novo gene birth from non-genic DNA was found to be an alternative mechanism to generate novel protein-coding genes. However, its functional role in human disease remains largely unknown. Here we show that NCYM, a cis-antisense gene of the MYCN oncogene, initially thought to be a large non-coding RNA, encodes a de novo evolved protein regulating the pathogenesis of human cancers, particularly neuroblastoma. The NCYM gene is evolutionally conserved only in the taxonomic group containing humans and chimpanzees. In primary human neuroblastomas, NCYM is 100% co-amplified and co-expressed with MYCN, and NCYM mRNA expression is associated with poor clinical outcome. MYCN directly transactivates both NCYM and MYCN mRNA, whereas NCYM stabilizes MYCN protein by inhibiting the activity of GSK3β, a kinase that promotes MYCN degradation. In contrast to MYCN transgenic mice, neuroblastomas in MYCN/NCYM double transgenic mice were frequently accompanied by distant metastases, behavior reminiscent of human neuroblastomas with MYCN amplification. The NCYM protein also interacts with GSK3β, thereby stabilizing the MYCN protein in the tumors of the MYCN/NCYM double transgenic mice. Thus, these results suggest that GSK3β inhibition by NCYM stabilizes the MYCN protein both in vitro and in vivo. Furthermore, the survival of MYCN transgenic mice bearing neuroblastoma was improved by treatment with NVP-BEZ235, a dual PI3K/mTOR inhibitor shown to destabilize MYCN via GSK3β activation. In contrast, tumors caused in MYCN/NCYM double transgenic mice showed chemo-resistance to the drug. Collectively, our results show that NCYM is the first de novo evolved protein known to act as an oncopromoting factor in human cancer, and suggest that de novo evolved proteins may functionally characterize human disease.

  13. De novo mutation in the dopamine transporter gene associates dopamine dysfunction with autism spectrum disorder

    DEFF Research Database (Denmark)

    Hamilton, P J; Campbell, N G; Sharma, S

    2013-01-01

    De novo genetic variation is an important class of risk factors for autism spectrum disorder (ASD). Recently, whole-exome sequencing of ASD families has identified a novel de novo missense mutation in the human dopamine (DA) transporter (hDAT) gene, which results in a Thr to Met substitution...

  14. Expression of the Long Intergenic Non-Protein Coding RNA 665 (LINC00665) Gene and the Cell Cycle in Hepatocellular Carcinoma Using The Cancer Genome Atlas, the Gene Expression Omnibus, and Quantitative Real-Time Polymerase Chain Reaction.

    Science.gov (United States)

    Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong

    2018-05-05

    BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.477; 95% CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.
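
    The survival figure in the RESULTS above, read as a hazard ratio of 1.477 with a 95% confidence interval of 1.046-2.086, can be sanity-checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log scale (an assumption, not something the abstract states); under that assumption the point estimate should sit at the geometric mean of the two bounds. The function names are illustrative only.

```python
import math

def hr_from_ci(lower, upper):
    """Geometric mean of the CI bounds.

    For an interval that is symmetric on the log scale (assumed here),
    this recovers the hazard-ratio point estimate.
    """
    return math.sqrt(lower * upper)

def se_log_hr(lower, upper, z=1.959964):
    """Standard error of log(HR) implied by a 95% interval (z is the normal quantile)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Bounds as printed in the abstract.
lower, upper = 1.046, 2.086
hr = hr_from_ci(lower, upper)
se = se_log_hr(lower, upper)

print(f"implied HR = {hr:.3f}")
print(f"implied SE of log(HR) = {se:.3f}")
```

    The implied estimate agrees with the printed value to three decimals, which supports reading the figure as "HR = 1.477; 95% CI" and not as a percentage.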

  15. De novo mutations in synaptic transmission genes including DNM1 cause epileptic encephalopathies

    DEFF Research Database (Denmark)

    2014-01-01

    ... analyzed exome-sequencing data of 356 trios with the "classical" epileptic encephalopathies, infantile spasms and Lennox Gastaut syndrome, including 264 trios previously analyzed by the Epi4K/EPGP consortium. In this expanded cohort, we find 429 de novo mutations, including de novo mutations in DNM1 in five individuals and de novo mutations in GABBR2, FASN, and RYR3 in two individuals each. Unlike previous studies, this cohort is sufficiently large to show a significant excess of de novo mutations in epileptic encephalopathy probands compared to the general population using a likelihood analysis (p = 8.2 × 10^-4), supporting a prominent role for de novo mutations in epileptic encephalopathies. We bring statistical evidence that mutations in DNM1 cause epileptic encephalopathy, find suggestive evidence for a role of three additional genes, and show that at least 12% of analyzed individuals have...

  16. A rice gene of de novo origin negatively regulates pathogen-induced defense response.

    Directory of Open Access Journals (Sweden)

    Wenfei Xiao

    Full Text Available How defense genes originated with the evolution of their specific pathogen-responsive traits remains an important problem. It is generally known that a form of duplication can generate new genes, suggesting that a new gene usually evolves from an ancestral gene. However, we show that a new defense gene in plants may evolve by de novo origination, resulting in sophisticated disease-resistant functions in rice. Analyses of gene evolution showed that this new gene, OsDR10, had homologs only in the closest relative, the Leersia genus, but not in other subfamilies of the grass family; therefore, it is a rice tribe-specific gene that may have originated de novo in the tribe. We further show that this gene may evolve a highly conservative rice-specific function that contributes to the regulation difference between rice and other plant species in response to pathogen infections. Biologic analyses including gene silencing, pathologic analysis, and mutant characterization by transformation showed that the OsDR10-suppressed plants had enhanced resistance to a broad spectrum of Xanthomonas oryzae pv. oryzae strains, which cause bacterial blight disease. This enhanced disease resistance was accompanied by increased accumulation of endogenous salicylic acid (SA) and suppressed accumulation of endogenous jasmonic acid (JA), as well as modified expression of a subset of defense-responsive genes functioning both upstream and downstream of SA and JA. These data and analyses provide fresh insights into the new biologic and evolutionary processes of a de novo gene recruited rapidly.

  17. In-depth comparative analysis of malaria parasite genomes reveals protein-coding genes linked to human disease in Plasmodium falciparum genome.

    Science.gov (United States)

    Liu, Xuewu; Wang, Yuanyuan; Liang, Jiao; Wang, Luojun; Qin, Na; Zhao, Ya; Zhao, Gang

    2018-05-02

    Plasmodium falciparum is the most virulent malaria parasite capable of parasitizing human erythrocytes. The identification of genes related to this capability can enhance our understanding of the molecular mechanisms underlying human malaria and lead to the development of new therapeutic strategies for malaria control. With the availability of several malaria parasite genome sequences, performing computational analysis is now a practical strategy to identify genes contributing to this disease. Here, we developed and used a virtual genome method to assign 33,314 genes from three human malaria parasites, namely, P. falciparum, P. knowlesi and P. vivax, and three rodent malaria parasites, namely, P. berghei, P. chabaudi and P. yoelii, to 4605 clusters. Each cluster consisted of genes whose protein sequences were significantly similar and was considered as a virtual gene. Comparing the enriched values of all clusters in human malaria parasites with those in rodent malaria parasites revealed 115 P. falciparum genes putatively responsible for parasitizing human erythrocytes. These genes are mainly located in the chromosome internal regions and participate in many biological processes, including membrane protein trafficking and thiamine biosynthesis. Meanwhile, 289 P. berghei genes were included in the rodent parasite-enriched clusters. Most are located in subtelomeric regions and encode erythrocyte surface proteins. Comparing cluster values in P. falciparum with those in P. vivax and P. knowlesi revealed 493 candidate genes linked to virulence. Some of them encode proteins present on the erythrocyte surface and participate in cytoadhesion, virulence factor trafficking, or erythrocyte invasion, but many genes with unknown function were also identified. Cerebral malaria is characterized by accumulation of infected erythrocytes at the trophozoite stage in the brain microvasculature. To discover cerebral malaria-related genes, fast Fourier transformation (FFT) was introduced to extract

  18. De novo origin of VCY2 from autosome to Y-transposed amplicon.

    Directory of Open Access Journals (Sweden)

    Peng-Rong Cao

    Full Text Available The formation of new genes is a primary driving force of evolution in all organisms. The de novo evolution of new genes from non-protein-coding genomic regions is emerging as an important additional mechanism for novel gene creation. Y chromosomes underlie sex determination in mammals and contain genes that are required for male-specific functions. In this study, a search was undertaken for Y chromosome de novo genes derived from non-protein-coding sequences. The Y chromosome orphan gene variable charge, Y-linked (VCY2) is an autosome-derived gene that has sequence similarity to large autosomal fragments but lacks an autosomal protein-coding homolog. VCY2 is located in the amplicon containing long DNA fragments that were transposed from autosomes to the Y chromosome before the ape-monkey split. We confirmed that VCY2 cannot be encoded by autosomes due to the presence of multiple disablers that disrupt the open reading frame, such as the absence of start or stop codons and the presence of premature stop codons. Similar observations have been made for homologs in the autosomes of the chimpanzee, gorilla, rhesus macaque, baboon and out-group marmoset, which suggests that there was a non-protein-coding ancestral VCY2 that was common to apes and monkeys and that predated the transposition event. Furthermore, while protein-coding orthologs are absent, a putative non-protein-coding VCY2 with conserved disablers was identified in the rhesus macaque Y chromosome male-specific region. This finding implies that VCY2 might not have acquired its protein-coding ability before the ape-monkey split. VCY2 encodes a testis-specific expressed protein and is involved in the pathologic process of male infertility, and the acquisition of this gene might improve male fertility. This is the first evidence that de novo genes can be generated from transposed autosomal non-protein-coding segments, and this evidence provides novel insights into the evolutionary history of the Y
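
    The "disablers" named in this abstract (a missing start codon, a missing stop codon, or a premature stop codon) can be illustrated with a small check on a candidate reading frame. This is a minimal sketch and not the authors' pipeline: the function name is hypothetical, the sequence is read in frame 0 only, the standard stop codons are assumed, and the test sequences are invented toy strings.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code

def orf_disablers(seq):
    """List the reasons why `seq` (DNA, read in frame 0) is not an intact ORF.

    An empty list means: starts with ATG, ends with a stop codon,
    has no in-frame stop before the end, and has a length divisible by 3.
    """
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("missing start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("missing stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("premature stop codon")
    return problems

# Toy sequences (hypothetical, for illustration only).
print(orf_disablers("ATGGCCTAA"))     # intact frame
print(orf_disablers("ATGTAAGCCTAA"))  # in-frame stop before the end
print(orf_disablers("GCCGCCGCC"))     # no start and no stop
```

    A real comparison across species would first align the homologous regions so that the same frame is examined in each genome; that step is omitted here.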

  19. Annotation of the protein coding regions of the equine genome

    DEFF Research Database (Denmark)

    Hestand, Matthew S.; Kalbfleisch, Theodore S.; Coleman, Stephen J.

    2015-01-01

    Current gene annotation of the horse genome is largely derived from in silico predictions and cross-species alignments. Only a small number of genes are annotated based on equine EST and mRNA sequences. To expand the number of equine genes annotated from equine experimental evidence, we sequenced m...... and appear to be small errors in the equine reference genome, since they are also identified as homozygous variants by genomic DNA resequencing of the reference horse. Taken together, we provide a resource of equine mRNA structures and protein coding variants that will enhance equine and cross...

  20. Retrotransposons and non-protein coding RNAs

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2009-01-01

    does not merely represent spurious transcription. We review examples of functional RNAs transcribed from retrotransposons, and address the collection of non-protein coding RNAs derived from transposable element sequences, including numerous human microRNAs and the neuronal BC RNAs. Finally, we review...

  1. De novo mutation in the dopamine transporter gene associates dopamine dysfunction with autism spectrum disorder.

    Science.gov (United States)

    Hamilton, P J; Campbell, N G; Sharma, S; Erreger, K; Herborg Hansen, F; Saunders, C; Belovich, A N; Sahai, M A; Cook, E H; Gether, U; McHaourab, H S; Matthies, H J G; Sutcliffe, J S; Galli, A

    2013-12-01

    De novo genetic variation is an important class of risk factors for autism spectrum disorder (ASD). Recently, whole-exome sequencing of ASD families has identified a novel de novo missense mutation in the human dopamine (DA) transporter (hDAT) gene, which results in a Thr to Met substitution at site 356 (hDAT T356M). The dopamine transporter (DAT) is a presynaptic membrane protein that regulates dopaminergic tone in the central nervous system by mediating the high-affinity reuptake of synaptically released DA, making it a crucial regulator of DA homeostasis. Here, we report the first functional, structural and behavioral characterization of an ASD-associated de novo mutation in the hDAT. We demonstrate that the hDAT T356M displays anomalous function, characterized as a persistent reverse transport of DA (substrate efflux). Importantly, in the bacterial homolog leucine transporter, substitution of A289 (the homologous site to T356) with a Met promotes an outward-facing conformation upon substrate binding. In the substrate-bound state, an outward-facing transporter conformation is required for substrate efflux. In Drosophila melanogaster, the expression of hDAT T356M in DA neurons lacking Drosophila DAT leads to hyperlocomotion, a trait associated with DA dysfunction and ASD. Taken together, our findings demonstrate that alterations in DA homeostasis, mediated by aberrant DAT function, may confer risk for ASD and related neuropsychiatric conditions.

  2. De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

    Science.gov (United States)

    2013-01-01

    Background: Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results: The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion: The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
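
    The N50 statistic quoted in this abstract is the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal sketch of that definition follows; the function name and the toy lengths are illustrative, and the study's own assembly software may apply slightly different conventions (for example, minimum-length filters).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest; the N50 is the length of
    the sequence at which the running total first reaches half of all bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")

# Toy example: 54 bases in total, half is 27; 10 + 9 + 8 reaches 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

    Because a few long sequences can dominate the total, N50 says more about contiguity than the mean length does, which is why assemblies usually report it alongside the sequence count.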

  3. The Rickettsia Endosymbiont of Ixodes pacificus Contains All the Genes of De Novo Folate Biosynthesis

    Science.gov (United States)

    Bodnar, James; Mortazavi, Bobak; Laurent, Timothy; Deason, Jeff; Thephavongsa, Khanhkeo; Zhong, Jianmin

    2015-01-01

    Ticks and other arthropods often are hosts to nutrient providing bacterial endosymbionts, which contribute to their host’s fitness by supplying nutrients such as vitamins and amino acids. It has been detected, in our lab, that Ixodes pacificus is host to Rickettsia species phylotype G021. This endosymbiont is predominantly present, and 100% maternally transmitted in I. pacificus. To study roles of phylotype G021 in I. pacificus, bioinformatic and molecular approaches were carried out. MUMmer genome alignments of whole genome sequence of I. scapularis, a close relative to I. pacificus, against completely sequenced genomes of R. bellii OSU85-389, R. conorii, and R. felis, identified 8,190 unique sequences that are homologous to Rickettsia sequences in the NCBI Trace Archive. MetaCyc metabolic reconstructions revealed that all folate gene orthologues (folA, folC, folE, folKP, ptpS) required for de novo folate biosynthesis are present in the genome of Rickettsia buchneri in I. scapularis. To examine the metabolic capability of phylotype G021 in I. pacificus, genes of the folate biosynthesis pathway of the bacterium were PCR amplified using degenerate primers. BLAST searches identified that nucleotide sequences of the folA, folC, folE, folKP, and ptpS genes possess 98.6%, 98.8%, 98.9%, 98.5% and 99.0% identity respectively to the corresponding genes of Rickettsia buchneri. Phylogenetic tree constructions show that the folate genes of phylotype G021 and homologous genes from various Rickettsia species are monophyletic. This study has shown that all folate genes exist in the genome of Rickettsia species phylotype G021 and that this bacterium has the genetic capability for de novo folate synthesis. PMID:26650541

  4. A de novo variant in the ASPRV1 gene in a dog with ichthyosis.

    Science.gov (United States)

    Bauer, Anina; Waluk, Dominik P; Galichet, Arnaud; Timm, Katrin; Jagannathan, Vidhya; Sayar, Beyza S; Wiener, Dominique J; Dietschi, Elisabeth; Müller, Eliane J; Roosje, Petra; Welle, Monika M; Leeb, Tosso

    2017-03-01

    Ichthyoses are a heterogeneous group of inherited cornification disorders characterized by generalized dry skin, scaling and/or hyperkeratosis. Ichthyosis vulgaris is the most common form of ichthyosis in humans and caused by genetic variants in the FLG gene encoding filaggrin. Filaggrin is a key player in the formation of the stratum corneum, the uppermost layer of the epidermis and therefore crucial for barrier function. During terminal differentiation of keratinocytes, the precursor profilaggrin is cleaved by several proteases into filaggrin monomers and eventually processed into free amino acids contributing to the hydration of the cornified layer. We studied a German Shepherd dog with a novel form of ichthyosis. Comparing the genome sequence of the affected dog with 288 genomes from genetically diverse non-affected dogs we identified a private heterozygous variant in the ASPRV1 gene encoding "aspartic peptidase, retroviral-like 1", which is also known as skin aspartic protease (SASPase). The variant was absent in both parents and therefore due to a de novo mutation event. It was a missense variant, c.1052T>C, affecting a conserved residue close to an autoprocessing cleavage site, p.(Leu351Pro). ASPRV1 encodes a retroviral-like protease involved in profilaggrin-to-filaggrin processing. By immunofluorescence staining we showed that the filaggrin expression pattern was altered in the affected dog. Thus, our findings provide strong evidence that the identified de novo variant is causative for the ichthyosis in the affected dog and that ASPRV1 plays an essential role in skin barrier formation. ASPRV1 is thus a novel candidate gene for unexplained human forms of ichthyoses.
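    The variant notation in this abstract, c.1052T>C giving p.(Leu351Pro), can be checked with simple codon arithmetic. The sketch below is illustrative only: the helper names `codon_of` and `missense_outcomes` are hypothetical, the standard genetic code is assumed, and the actual ASPRV1 reference codon is not given in the abstract, so the code enumerates every leucine codon instead of asserting which one is present.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_of(cds_pos):
    """Map a 1-based coding position to (codon number, position within codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def missense_outcomes(ref_aa, pos_in_codon, ref_base, alt_base):
    """For each codon of ref_aa carrying ref_base at pos_in_codon,
    return the amino acid encoded after substituting alt_base."""
    outcomes = {}
    for codon, aa in CODON_TABLE.items():
        if aa == ref_aa and codon[pos_in_codon - 1] == ref_base:
            mutant = codon[: pos_in_codon - 1] + alt_base + codon[pos_in_codon:]
            outcomes[codon] = CODON_TABLE[mutant]
    return outcomes


# c.1052 falls in codon 351, at the second base of that codon.
codon_number, pos_in_codon = codon_of(1052)
print(codon_number, pos_in_codon)

# A T>C change at the second base turns any CTN leucine codon into CCN (proline).
outcomes = missense_outcomes("L", pos_in_codon, "T", "C")
print(outcomes)
```

    Under these assumptions, position 1052 maps to the second base of codon 351, and a T>C change there converts every CTN leucine codon to a proline codon, which is consistent with the reported p.(Leu351Pro).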

  5. A de novo variant in the ASPRV1 gene in a dog with ichthyosis.

    Directory of Open Access Journals (Sweden)

    Anina Bauer

    2017-03-01

    Full Text Available Ichthyoses are a heterogeneous group of inherited cornification disorders characterized by generalized dry skin, scaling and/or hyperkeratosis. Ichthyosis vulgaris is the most common form of ichthyosis in humans and caused by genetic variants in the FLG gene encoding filaggrin. Filaggrin is a key player in the formation of the stratum corneum, the uppermost layer of the epidermis and therefore crucial for barrier function. During terminal differentiation of keratinocytes, the precursor profilaggrin is cleaved by several proteases into filaggrin monomers and eventually processed into free amino acids contributing to the hydration of the cornified layer. We studied a German Shepherd dog with a novel form of ichthyosis. Comparing the genome sequence of the affected dog with 288 genomes from genetically diverse non-affected dogs we identified a private heterozygous variant in the ASPRV1 gene encoding "aspartic peptidase, retroviral-like 1", which is also known as skin aspartic protease (SASPase). The variant was absent in both parents and therefore due to a de novo mutation event. It was a missense variant, c.1052T>C, affecting a conserved residue close to an autoprocessing cleavage site, p.(Leu351Pro). ASPRV1 encodes a retroviral-like protease involved in profilaggrin-to-filaggrin processing. By immunofluorescence staining we showed that the filaggrin expression pattern was altered in the affected dog. Thus, our findings provide strong evidence that the identified de novo variant is causative for the ichthyosis in the affected dog and that ASPRV1 plays an essential role in skin barrier formation. ASPRV1 is thus a novel candidate gene for unexplained human forms of ichthyoses.

  6. Human native lipoprotein-induced de novo DNA methylation is associated with repression of inflammatory genes in THP-1 macrophages.

    Science.gov (United States)

    Rangel-Salazar, Rubén; Wickström-Lindholm, Marie; Aguilar-Salinas, Carlos A; Alvarado-Caudillo, Yolanda; Døssing, Kristina B V; Esteller, Manel; Labourier, Emmanuel; Lund, Gertrud; Nielsen, Finn C; Rodríguez-Ríos, Dalia; Solís-Martínez, Martha O; Wrobel, Katarzyna; Wrobel, Kazimierz; Zaina, Silvio

    2011-11-25

    We previously showed that a VLDL- and LDL-rich mix of human native lipoproteins induces a set of repressive epigenetic marks, i.e. de novo DNA methylation, histone 4 hypoacetylation and histone 4 lysine 20 (H4K20) hypermethylation in THP-1 macrophages. Here, we: 1) ask what gene expression changes accompany these epigenetic responses; 2) test the involvement of candidate factors mediating the latter. We exploited genome expression arrays to identify target genes for lipoprotein-induced silencing, in addition to RNAi and expression studies to test the involvement of candidate mediating factors. The study was conducted in human THP-1 macrophages. Native lipoprotein-induced de novo DNA methylation was associated with a general repression of various critical genes for macrophage function, including pro-inflammatory genes. Lipoproteins showed differential effects on epigenetic marks, as de novo DNA methylation was induced by VLDL and to a lesser extent by LDL, but not by HDL, and VLDL induced H4K20 hypermethylation, while HDL caused H4 deacetylation. The analysis of candidate factors mediating VLDL-induced DNA hypermethylation revealed that this response was: 1) surprisingly, mediated exclusively by the canonical maintenance DNA methyltransferase DNMT1, and 2) independent of the Dicer/micro-RNA pathway. Our work provides novel insights into epigenetic gene regulation by native lipoproteins. Furthermore, we provide an example of DNMT1 acting as a de novo DNA methyltransferase independently of canonical de novo enzymes, and show proof of principle that de novo DNA methylation can occur independently of a functional Dicer/micro-RNA pathway in mammals.

  7. Human native lipoprotein-induced de novo DNA methylation is associated with repression of inflammatory genes in THP-1 macrophages

    Directory of Open Access Journals (Sweden)

    Rangel-Salazar Rubén

    2011-11-01

    Full Text Available Background: We previously showed that a VLDL- and LDL-rich mix of human native lipoproteins induces a set of repressive epigenetic marks, i.e. de novo DNA methylation, histone 4 hypoacetylation and histone 4 lysine 20 (H4K20) hypermethylation in THP-1 macrophages. Here, we: 1) ask what gene expression changes accompany these epigenetic responses; 2) test the involvement of candidate factors mediating the latter. We exploited genome expression arrays to identify target genes for lipoprotein-induced silencing, in addition to RNAi and expression studies to test the involvement of candidate mediating factors. The study was conducted in human THP-1 macrophages. Results: Native lipoprotein-induced de novo DNA methylation was associated with a general repression of various critical genes for macrophage function, including pro-inflammatory genes. Lipoproteins showed differential effects on epigenetic marks, as de novo DNA methylation was induced by VLDL and to a lesser extent by LDL, but not by HDL, and VLDL induced H4K20 hypermethylation, while HDL caused H4 deacetylation. The analysis of candidate factors mediating VLDL-induced DNA hypermethylation revealed that this response was: 1) surprisingly, mediated exclusively by the canonical maintenance DNA methyltransferase DNMT1, and 2) independent of the Dicer/micro-RNA pathway. Conclusions: Our work provides novel insights into epigenetic gene regulation by native lipoproteins. Furthermore, we provide an example of DNMT1 acting as a de novo DNA methyltransferase independently of canonical de novo enzymes, and show proof of principle that de novo DNA methylation can occur independently of a functional Dicer/micro-RNA pathway in mammals.

  8. XX male sex reversal with genital abnormalities associated with a de novo SOX3 gene duplication.

    Science.gov (United States)

    Moalem, Sharon; Babul-Hirji, Riyana; Stavropolous, Dmitri J; Wherrett, Diane; Bägli, Darius J; Thomas, Paul; Chitayat, David

    2012-07-01

    Differentiation of the bipotential gonad into testis is initiated by the Y chromosome-linked gene SRY (Sex-determining Region Y) through upregulation of its autosomal direct target gene SOX9 (Sry-related HMG box-containing gene 9). Sequence and chromosome homology studies have shown that SRY most probably evolved from SOX3, which in humans is located at Xq27.1. Mutations causing SOX3 loss-of-function do not affect the sex determination in mice or humans. However, transgenic mouse studies have shown that ectopic expression of Sox3 in the bipotential gonad results in upregulation of Sox9, resulting in testicular induction and XX male sex reversal. However, the mechanism by which these rearrangements cause sex reversal and the frequency with which they are associated with disorders of sex development remains unclear. Rearrangements of the SOX3 locus were identified recently in three cases of human XX male sex reversal. We report on a case of XX male sex reversal associated with a novel de novo duplication of the SOX3 gene. These data provide additional evidence that SOX3 gain-of-function in the XX bipotential gonad causes XX male sex reversal and further support the hypothesis that SOX3 is the evolutionary antecedent of SRY. Copyright © 2012 Wiley Periodicals, Inc.

  9. Transcriptional regulator-mediated activation of adaptation genes triggers CRISPR de novo spacer acquisition

    DEFF Research Database (Denmark)

    Liu, Tao; Li, Yingjun; Wang, Xiaodi

    2015-01-01

    Acquisition of de novo spacer sequences confers CRISPR-Cas with a memory to defend against invading genetic elements. However, the mechanism of regulation of CRISPR spacer acquisition remains unknown. Here we examine the transcriptional regulation of the conserved spacer acquisition genes in Type I......, it was demonstrated that the transcription level of csa1, cas1, cas2 and cas4 was significantly enhanced in a csa3a-overexpression strain and, moreover, the Csa1 and Cas1 protein levels were increased in this strain. Furthermore, we demonstrated the hyperactive uptake of unique spacers within both CRISPR loci...... in the presence of the csa3a overexpression vector. The spacer acquisition process is dependent on the CCN PAM sequence and protospacer selection is random and non-directional. These results suggested a regulation mechanism of CRISPR spacer acquisition where a single transcriptional regulator senses the presence...

  10. Transferência do fator caturra para o cultivar Mundo Novo de Coffea arabica Transfer of the CT gene to Mundo Novo cultivar

    Directory of Open Access Journals (Sweden)

    A. Carvalho

    1972-01-01

    Full Text Available This work reports studies aimed at introducing the Ct (caturra) gene, which contributes to reducing plant height, into the Mundo Novo cultivar of Coffea arabica. The F1, F2, F3 and F4 populations were studied in yield trials. In these populations, and mainly among the descendants of coffee plants H 2077-2-5 and H 2077-2-12, plants were selected that were homozygous for the Ct alleles and also for the fruit-color alleles xc or Xc. These combinations were named 'Catuaí Amarelo' and 'Catuaí Vermelho', respectively, and their characteristics are presented. The new cultivars are proving to be of economic interest for coffee-growing regions not only for their small stature but also for their yield, vegetative vigor and earliness. The successful transfer of the Ct gene for short internode to the tall Coffea arabica cultivar 'Mundo Novo' is reported. Individual selections were carried out in the F1, F2, F3 and F4 generations. It was found that early selection in the F2 generation was quite effective. A remarkably good correlation was found between the productivity of F2 plants and the yield of the F3 and F4 generations. Plants of the F4 generation have shown reasonable uniformity and high yield in several trials. The new selections proved to be early producers. Two new cultivars were released, namely 'Catuaí Amarelo' and 'Catuaí Vermelho'. The former has yellow fruits whereas the latter has red fruits. The plants are much shorter than those of Mundo Novo. The new cultivars have very strong secondary and tertiary branching. Because of these characteristics, Catuaí Amarelo and Catuaí Vermelho are being planted on a large scale, replacing the tall cultivars.

  11. Synergistic interactions between Drosophila orthologues of genes spanned by de novo human CNVs support multiple-hit models of autism.

    Science.gov (United States)

    Grice, Stuart J; Liu, Ji-Long; Webber, Caleb

    2015-03-01

    Autism spectrum disorders (ASDs) are highly heritable and characterised by deficits in social interaction and communication, as well as restricted and repetitive behaviours. Although a number of highly penetrant ASD gene variants have been identified, there is growing evidence to support a causal role for combinatorial effects arising from the contributions of multiple loci. By examining synaptic and circadian neurological phenotypes resulting from the dosage variants of unique human:fly orthologues in Drosophila, we observe numerous synergistic interactions between pairs of informatically-identified candidate genes whose orthologues are jointly affected by large de novo copy number variants (CNVs). These CNVs were found in the genomes of individuals with autism, including a patient carrying a 22q11.2 deletion. We first demonstrate that dosage alterations of the unique Drosophila orthologues of candidate genes from de novo CNVs that harbour only a single candidate gene display neurological defects similar to those previously reported in Drosophila models of ASD-associated variants. We then considered pairwise dosage changes within the set of orthologues of candidate genes that were affected by the same single human de novo CNV. For three of four CNVs with complete orthologous relationships, we observed significant synergistic effects following the simultaneous dosage change of gene pairs drawn from a single CNV. The phenotypic variation observed at the Drosophila synapse that results from these interacting genetic variants supports a concordant phenotypic outcome across all interacting gene pairs following the direction of human gene copy number change. We observe both specificity and transitivity between interactors, both within and between CNV candidate gene sets, supporting shared and distinct genetic aetiologies. We then show that different interactions affect divergent synaptic processes, demonstrating distinct molecular aetiologies. Our study illustrates

  12. Foldability of a Natural De Novo Evolved Protein.

    Science.gov (United States)

    Bungard, Dixie; Copple, Jacob S; Yan, Jing; Chhun, Jimmy J; Kumirov, Vlad K; Foy, Scott G; Masel, Joanna; Wysocki, Vicki H; Cordes, Matthew H J

    2017-11-07

    The de novo evolution of protein-coding genes from noncoding DNA is emerging as a source of molecular innovation in biology. Studies of random sequence libraries, however, suggest that young de novo proteins will not fold into compact, specific structures typical of native globular proteins. Here we show that Bsc4, a functional, natural de novo protein encoded by a gene that evolved recently from noncoding DNA in the yeast S. cerevisiae, folds to a partially specific three-dimensional structure. Bsc4 forms soluble, compact oligomers with high β sheet content and a hydrophobic core, and undergoes cooperative, reversible denaturation. Bsc4 lacks a specific quaternary state, however, existing instead as a continuous distribution of oligomer sizes, and binds dyes indicative of amyloid oligomers or molten globules. The combination of native-like and non-native-like properties suggests a rudimentary fold that could potentially act as a functional intermediate in the emergence of new folded proteins de novo. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. De Novo Discovery of Structured ncRNA Motifs in Genomic Sequences

    DEFF Research Database (Denmark)

    Ruzzo, Walter L; Gorodkin, Jan

    2014-01-01

    De novo discovery of "motifs" capturing the commonalities among related noncoding structured RNAs (ncRNAs) is among the most difficult problems in computational biology. This chapter outlines the challenges presented by this problem, together with some approaches towards solving them, with an emphasis on an approach based on the CMfinder program as a case study. Applications to genomic screens for novel de novo structured ncRNAs, including structured RNA elements in untranslated portions of protein-coding genes, are presented.

  14. De novo dominant mutation of SOX10 gene in a Chinese family with Waardenburg syndrome type II.

    Science.gov (United States)

    Chen, Kaitian; Zong, Ling; Liu, Min; Zhan, Yuan; Wu, Xuan; Zou, Wenting; Jiang, Hongyan

    2014-06-01

    Waardenburg syndrome is a rare genetic disorder, inherited as an autosomal dominant trait. The condition is characterized by sensorineural hearing loss and pigment disturbances of the hair, skin, and iris. The de novo mutation in the SOX10 gene, responsible for Waardenburg syndrome type II, is rarely seen. The present study aimed to identify the genetic causes of Waardenburg syndrome type II in a Chinese family. Clinical and molecular evaluations were conducted in a Chinese family with Waardenburg syndrome type II. A novel SOX10 heterozygous c.259-260delCT mutation was identified. Heterozygosity was not observed in the parents and sister of the proband, indicating that the mutation has arisen de novo. The novel frameshift mutation, located in exon 3 of the SOX10 gene, disrupted normal amino acid coding from Leu87, leading to premature termination at nucleotide 396 (TGA). The high mobility group domain of SOX10 was inferred to be partially impaired. The novel heterozygous c.259-260delCT mutation in the SOX10 gene was considered to be the cause of Waardenburg syndrome in the proband. The clinical and genetic characterization of this family would help elucidate the genetic heterogeneity of SOX10 in Waardenburg syndrome type II. Moreover, the de novo pattern expanded the mutation data of SOX10. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
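    The frameshift logic in this abstract can be illustrated with a toy example. The sketch below is not the SOX10 sequence: `toy_cds` is a made-up 24-base coding sequence, and `translate`, `delete_range` and `first_affected_codon` are hypothetical helper names. It shows that a 2-bp deletion starting at c.259 first affects codon 87 (matching the reported disruption from Leu87), and how deleting the first two bases "CT" of a leucine codon shifts the reading frame onto a premature TGA stop.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def first_affected_codon(cds_pos):
    """Codon number (1-based) containing the 1-based coding position."""
    return (cds_pos - 1) // 3 + 1


def delete_range(cds, start, end):
    """Delete 1-based inclusive positions start..end from a coding sequence."""
    return cds[: start - 1] + cds[end:]


def translate(cds):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i : i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Position 259 is the first base of codon 87.
print(first_affected_codon(259))

# Hypothetical toy sequence: ATG GCC CTG AAT GAC CGG ATT TAA
toy_cds = "ATGGCCCTGAATGACCGGATTTAA"
mutant = delete_range(toy_cds, 7, 8)  # remove "CT" from the Leu codon CTG

print(translate(toy_cds))  # wild-type peptide
print(translate(mutant))   # truncated peptide after the frameshift
```

    In the toy sequence, the deletion shifts the frame so that the downstream bases read GAA TGA, and translation stops three codons in. The real mutant protein would have to be derived from the SOX10 reference transcript, which is not reproduced here.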

  15. Examining the process of de novo gene birth: an educational primer on "integration of new genes into cellular networks, and their structural maturation".

    Science.gov (United States)

    Frietze, Seth; Leatherman, Judith

    2014-03-01

    New genes that arise from modification of the noncoding portion of a genome rather than being duplicated from parent genes are called de novo genes. These genes, identified by their brief evolution and lack of parent genes, provide an opportunity to study the timeframe in which emerging genes integrate into cellular networks, and how the characteristics of these genes change as they mature into bona fide genes. An article by G. Abrusán provides an opportunity to introduce students to fundamental concepts in evolutionary and comparative genetics and to provide a technical background by which to discuss systems biology approaches when studying the evolutionary process of gene birth. Basic background needed to understand the Abrusán study and details on comparative genomic concepts tailored for a classroom discussion are provided, including discussion questions and a supplemental exercise on navigating a genome database.

  16. Transcriptomic identification of salt-related genes and de novo assembly in common buckwheat (F. esculentum).

    Science.gov (United States)

    Lu, Qi-Huan; Wang, Ya-Qi; Song, Jin-Nan; Yang, Hong-Bing

    2018-06-01

    Common buckwheat (F. esculentum), an annual herbaceous crop, has become increasingly common in daily life as economies develop. Compared with wheat, it is prized for its high rutin and flavonoid content. Common buckwheat is recognized as a healthy food with good taste; its products, such as noodles, flour and bread, are priced higher than wheat products, and its seeds are larger than those of tartary buckwheat, so planting common buckwheat more widely would let people spend less on this healthy and delicious food. However, soil salinity is a major problem for agricultural production. Cultivating salt-tolerant crop varieties is an effective way to make full use of saline-alkali land. The highest salinity at which common buckwheat can be sown is 6.0%, so we chose 100 mM as the NaCl concentration for treatment. We then compared the transcriptomes of control and treatment groups and identified potential regulatory genes related to salt stress in common buckwheat. A total of 29.36 million clean reads were produced using an Illumina sequencing approach. We de novo assembled these reads into a transcriptome dataset containing 43,772 unigenes with an N50 length of 1778 bp. A total of 26,672 unigenes had matches in public databases. GO, KEGG and Swiss-Prot classification suggested the enrichment of these unigenes in 47 sub-categories, 25 KOG and 129 pathways, respectively. We obtained 385 differentially expressed genes (DEGs) by comparing the transcriptome data between salt treatment and control groups. Genes annotated as involved in response to stimulus, cell killing, metabolic process, signaling, multi-organism process, growth and cellular process may be relevant to salt stress in common buckwheat; they provide a valuable reference for studying the mechanism of salt tolerance and genetic information for breeding strongly salt-tolerant common buckwheat varieties in
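    The N50 statistic quoted for this assembly (1778 bp over 43,772 unigenes) is a length-weighted summary. As a minimal sketch, assuming the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of all assembled bases), it can be computed as follows. The function name `n50` and the toy lengths are illustrative, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the total bases.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() requires at least one sequence length")


# Toy unigene lengths (hypothetical): total = 54, half = 27.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 10 + 9 + 8 = 27 reaches half, so N50 is 8
```

    Because the statistic is weighted by length, a few long unigenes can raise N50 well above the median unigene length.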

  17. Transduplication resulted in the incorporation of two protein-coding sequences into the Turmoil-1 transposable element of C. elegans

    Directory of Open Access Journals (Sweden)

    Pupko Tal

    2008-10-01

    Full Text Available Transposable elements may acquire unrelated gene fragments into their sequences in a process called transduplication. Transduplication of protein-coding genes is common in plants, but is unknown in animals. Here, we report that the Turmoil-1 transposable element in C. elegans has incorporated two protein-coding sequences into its inverted terminal repeat (ITR) sequences. The ITRs of Turmoil-1 contain a conserved RNA recognition motif (RRM) that originated from the rsp-2 gene and a fragment from the protein-coding region of the cpg-3 gene. We further report that an open reading frame specific to C. elegans may have been created as a result of a Turmoil-1 insertion. Mutations at the 5' splice site of this open reading frame may have reactivated the transduplicated RRM motif. Reviewers: This article was reviewed by Dan Graur and William Martin. For the full reviews, please go to the Reviewers' Reports section.

  18. Investigation of de novo unique differentially expressed genes related to evolution in exercise response during domestication in Thoroughbred race horses.

    Directory of Open Access Journals (Sweden)

    Woncheoul Park

    Full Text Available Previous studies of horse RNA-seq were performed by mapping sequence reads to the reference genome during transcriptome analysis. However in this study, we focused on two main ideas. First, differentially expressed genes (DEGs) were identified by de novo-based analysis (DBA) in RNA-seq data from six Thoroughbreds before and after exercise, here-after referred to as "de novo unique differentially expressed genes" (DUDEG). Second, by integrating both conventional DEGs and genes identified as being selected for during domestication of Thoroughbred and Jeju pony from whole genome re-sequencing (WGS) data, we give a new concept to the definition of DEG. We identified 1,034 and 567 DUDEGs in skeletal muscle and blood, respectively. DUDEGs in skeletal muscle were significantly related to exercise-induced stress biological process gene ontology (BP-GO) terms: 'immune system process'; 'response to stimulus'; and, 'death' and a KEGG pathways: 'JAK-STAT signaling pathway'; 'MAPK signaling pathway'; 'regulation of actin cytoskeleton'; and, 'p53 signaling pathway'. In addition, we found TIMELESS, EIF4A3 and ZNF592 in blood and CHMP4C and FOXO3 in skeletal muscle, to be in common between DUDEGs and selected genes identified by evolutionary statistics such as FST and Cross Population Extended Haplotype Homozygosity (XP-EHH). Moreover, in Thoroughbreds, three out of five genes (CHMP4C, EIF4A3 and FOXO3) related to exercise response showed relatively low nucleotide diversity compared to the Jeju pony. DUDEGs are not only conceptually new DEGs that cannot be attained from reference-based analysis (RBA) but also supports previous RBA results related to exercise in Thoroughbred. In summary, three exercise related genes which were selected for during domestication in the evolutionary history of Thoroughbred were identified as conceptually new DEGs in this study.

  19. Investigation of de novo unique differentially expressed genes related to evolution in exercise response during domestication in Thoroughbred race horses.

    Science.gov (United States)

    Park, Woncheoul; Kim, Jaemin; Kim, Hyeon Jeong; Choi, JaeYoung; Park, Jeong-Woong; Cho, Hyun-Woo; Kim, Byeong-Woo; Park, Myung Hum; Shin, Teak-Soon; Cho, Seong-Keun; Park, Jun-Kyu; Kim, Heebal; Hwang, Jae Yeon; Lee, Chang-Kyu; Lee, Hak-Kyo; Cho, Seoae; Cho, Byung-Wook

    2014-01-01

    Previous studies of horse RNA-seq were performed by mapping sequence reads to the reference genome during transcriptome analysis. However in this study, we focused on two main ideas. First, differentially expressed genes (DEGs) were identified by de novo-based analysis (DBA) in RNA-seq data from six Thoroughbreds before and after exercise, here-after referred to as "de novo unique differentially expressed genes" (DUDEG). Second, by integrating both conventional DEGs and genes identified as being selected for during domestication of Thoroughbred and Jeju pony from whole genome re-sequencing (WGS) data, we give a new concept to the definition of DEG. We identified 1,034 and 567 DUDEGs in skeletal muscle and blood, respectively. DUDEGs in skeletal muscle were significantly related to exercise-induced stress biological process gene ontology (BP-GO) terms: 'immune system process'; 'response to stimulus'; and, 'death' and a KEGG pathways: 'JAK-STAT signaling pathway'; 'MAPK signaling pathway'; 'regulation of actin cytoskeleton'; and, 'p53 signaling pathway'. In addition, we found TIMELESS, EIF4A3 and ZNF592 in blood and CHMP4C and FOXO3 in skeletal muscle, to be in common between DUDEGs and selected genes identified by evolutionary statistics such as FST and Cross Population Extended Haplotype Homozygosity (XP-EHH). Moreover, in Thoroughbreds, three out of five genes (CHMP4C, EIF4A3 and FOXO3) related to exercise response showed relatively low nucleotide diversity compared to the Jeju pony. DUDEGs are not only conceptually new DEGs that cannot be attained from reference-based analysis (RBA) but also supports previous RBA results related to exercise in Thoroughbred. In summary, three exercise related genes which were selected for during domestication in the evolutionary history of Thoroughbred were identified as conceptually new DEGs in this study.
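    The abstract compares nucleotide diversity between Thoroughbred and Jeju pony at selected genes. As a rough sketch only, nucleotide diversity (π) can be taken as the average number of pairwise differences per site among aligned sequences. The estimator, windowing and data handling actually used in the study are not described in the abstract; the function name `nucleotide_diversity` and both toy alignments below are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched sites for every unordered pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy haplotypes, not real horse data.
diverse_pop = ["ACGTACGT", "ACGTACGA", "ACCTACGT", "ACGTTCGT"]
uniform_pop = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACGA"]

pi_diverse = nucleotide_diversity(diverse_pop)
pi_uniform = nucleotide_diversity(uniform_pop)
print(pi_diverse, pi_uniform)
```

    A locus with reduced diversity in one population relative to another, as in the second toy alignment, is the kind of pattern the abstract reports for CHMP4C, EIF4A3 and FOXO3 in Thoroughbreds.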

  20. Transcriptome sequencing and de novo assembly in arecanut, Areca catechu L elucidates the secondary metabolite pathway genes

    Directory of Open Access Journals (Sweden)

    Ramaswamy Manimekalai

    2018-03-01

    Full Text Available Areca catechu L. belongs to the Arecaceae family which comprises many economically important palms. The palm is a source of alkaloids and carotenoids. The lack of ample genetic information in public databases has been a constraint for the genetic improvement of arecanut. To gain molecular insight into the palm, high throughput RNA sequencing and de novo assembly of arecanut leaf transcriptome was undertaken in the present study. A total of 56,321,907 paired end reads of 101 bp length consisting of 11.343 Gb nucleotides were generated. De novo assembly resulted in 48,783 good quality transcripts, of which 67% could be annotated against the NCBI non-redundant database. The Gene Ontology (GO) analysis with the UniProt database identified 9222 biological process, 11268 molecular function and 7574 cellular component GO terms. Large scale expression profiling through Fragments per Kilobase per Million mapped reads (FPKM) showed major genes involved in different metabolic pathways of the plant. Metabolic pathway analysis of the assembled transcripts identified 124 plant related pathways. The transcripts related to carotenoid and alkaloid biosynthetic pathways had more reads and higher FPKM values, suggesting higher expression of these genes. The arecanut transcript sequences generated in the study showed high similarity with coconut, oil palm and date palm sequences retrieved from public domains. We also identified 6853 genic SSR regions in the arecanut. Possible primers were designed for SSR detection, which would simplify future efforts in genetic characterization of arecanut.
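    The FPKM measure named in this abstract normalizes a transcript's fragment count by transcript length and by sequencing depth. A minimal sketch of the usual formula follows; the function name `fpkm` and the example numbers are hypothetical and are not values from the arecanut study, whose exact quantification pipeline is not stated in the abstract.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments / (length_bp / 1e3) / (total_mapped_fragments / 1e6)
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


# Hypothetical transcript: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
value = fpkm(fragments=500, length_bp=2000, total_mapped_fragments=10_000_000)
print(value)  # 25.0
```

    Dividing by length matters here because longer transcripts collect more fragments at the same expression level, so raw read counts alone would overstate their expression.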

  1. Locating protein-coding sequences under selection for additional, overlapping functions in 29 mammalian genomes

    DEFF Research Database (Denmark)

    Lin, Michael F; Kheradpour, Pouya; Washietl, Stefan

    2011-01-01

    conservation compared to typical protein-coding genes—especially at synonymous sites. In this study, we use genome alignments of 29 placental mammals to systematically locate short regions within human ORFs that show conspicuously low estimated rates of synonymous substitution across these species. The 29......-species alignment provides statistical power to locate more than 10,000 such regions with resolution down to nine-codon windows, which are found within more than a quarter of all human protein-coding genes and contain ~2% of their synonymous sites. We collect numerous lines of evidence that the observed...... synonymous constraint in these regions reflects selection on overlapping functional elements including splicing regulatory elements, dual-coding genes, RNA secondary structures, microRNA target sites, and developmental enhancers. Our results show that overlapping functional elements are common in mammalian...

  2. De novo transcriptome assembly and comparative analysis of differentially expressed genes in Prunus dulcis Mill. in response to freezing stress.

    Directory of Open Access Journals (Sweden)

    Sadegh Mousavi

    Full Text Available Almond (Prunus dulcis Mill.), one of the most important nut crops, requires chilling during winter to develop fruiting buds. However, early spring chilling and late spring frost may damage the reproductive tissues leading to reduction in the rate of productivity. Despite the importance of transcriptional changes and regulation, little is known about the almond's transcriptome under the cold stress conditions. In the current research, we used RNA-seq technique to study the response of the reproductive tissues of almond (anther and ovary) to frost stress. RNA sequencing resulted in more than 20 million reads from anther and ovary tissues of almond, individually. About 40,000 contigs were assembled and annotated de novo in each tissue. Profile of gene expression in ovary showed significant alterations in 5,112 genes, whereas in anther 6,926 genes were affected by freezing stress. Around two thousands of these genes were common altered genes in both ovary and anther libraries. Gene ontology indicated the involvement of differentially expressed (DE) genes, responding to freezing stress, in metabolic and cellular processes. qRT-PCR analysis verified the expression pattern of eight genes randomly selected from the DE genes. In conclusion, the almond gene index assembled in this study and the reported DE genes can provide great insights on responses of almond and other Prunus species to abiotic stresses. The obtained results from current research would add to the limited available information on almond and Rosaceae. Besides, the findings would be very useful for comparative studies as the number of DE genes reported here is much higher than that of any previous reports in this plant.

  3. De novo transcriptome assembly and comparative analysis of differentially expressed genes in Prunus dulcis Mill. in response to freezing stress.

    Science.gov (United States)

    Mousavi, Sadegh; Alisoltani, Arghavan; Shiran, Behrouz; Fallahi, Hossein; Ebrahimie, Esameil; Imani, Ali; Houshmand, Saadollah

    2014-01-01

    Almond (Prunus dulcis Mill.), one of the most important nut crops, requires chilling during winter to develop fruiting buds. However, early spring chilling and late spring frost may damage the reproductive tissues leading to reduction in the rate of productivity. Despite the importance of transcriptional changes and regulation, little is known about the almond's transcriptome under the cold stress conditions. In the current research, we used RNA-seq technique to study the response of the reproductive tissues of almond (anther and ovary) to frost stress. RNA sequencing resulted in more than 20 million reads from anther and ovary tissues of almond, individually. About 40,000 contigs were assembled and annotated de novo in each tissue. Profile of gene expression in ovary showed significant alterations in 5,112 genes, whereas in anther 6,926 genes were affected by freezing stress. Around two thousands of these genes were common altered genes in both ovary and anther libraries. Gene ontology indicated the involvement of differentially expressed (DE) genes, responding to freezing stress, in metabolic and cellular processes. qRT-PCR analysis verified the expression pattern of eight genes randomly selected from the DE genes. In conclusion, the almond gene index assembled in this study and the reported DE genes can provide great insights on responses of almond and other Prunus species to abiotic stresses. The obtained results from current research would add to the limited available information on almond and Rosaceae. Besides, the findings would be very useful for comparative studies as the number of DE genes reported here is much higher than that of any previous reports in this plant.

  4. Emergence, Retention and Selection: A Trilogy of Origination for Functional De Novo Proteins from Ancestral LncRNAs in Primates.

    Directory of Open Access Journals (Sweden)

    Jia-Yu Chen

    2015-07-01

    Full Text Available While some human-specific protein-coding genes have been proposed to originate from ancestral lncRNAs, the transition process remains poorly understood. Here we identified 64 hominoid-specific de novo genes and report a mechanism for the origination of functional de novo proteins from ancestral lncRNAs with precise splicing structures and specific tissue expression profiles. Whole-genome sequencing of dozens of rhesus macaque animals revealed that these lncRNAs are generally not more selectively constrained than other lncRNA loci. The existence of these newly-originated de novo proteins is also not beyond anticipation under neutral expectation, as they generally have longer theoretical lifespan than their current age, due to their GC-rich sequence property enabling stable ORFs with lower chance of non-sense mutations. Interestingly, although the emergence and retention of these de novo genes are likely driven by neutral forces, population genetics study in 67 human individuals and 82 macaque animals revealed signatures of purifying selection on these genes specifically in human population, indicating a proportion of these newly-originated proteins are already functional in human. We thus propose a mechanism for creation of functional de novo proteins from ancestral lncRNAs during the primate evolution, which may contribute to human-specific genetic novelties by taking advantage of existed genomic contexts.

  5. Digital gene expression analysis based on integrated de novo transcriptome assembly of sweet potato [Ipomoea batatas (L.) Lam.].

    Directory of Open Access Journals (Sweden)

    Xiang Tao

    Full Text Available BACKGROUND: Sweet potato (Ipomoea batatas L. [Lam.]) ranks among the top six most important food crops in the world. It is widely grown throughout the world with high and stable yield, strong adaptability, rich nutrient content, and multiple uses. However, little is known about the molecular biology of this important non-model organism due to lack of genomic resources. Hence, studies based on high-throughput sequencing technologies are needed to get a comprehensive and integrated genomic resource and better understanding of gene expression patterns in different tissues and at various developmental stages. METHODOLOGY/PRINCIPAL FINDINGS: Illumina paired-end (PE) RNA-Sequencing was performed, and generated 48.7 million of 75 bp PE reads. These reads were de novo assembled into 128,052 transcripts (≥ 100 bp), which correspond to 41.1 million base pairs, by using a combined assembly strategy. Transcripts were annotated by Blast2GO and 51,763 transcripts got BLASTX hits, in which 39,677 transcripts have GO terms and 14,117 have ECs that are associated with 147 KEGG pathways. Furthermore, transcriptome differences of seven tissues were analyzed by using Illumina digital gene expression (DGE) tag profiling and numerous differentially and specifically expressed transcripts were identified. Moreover, the expression characteristics of genes involved in viral genomes, starch metabolism and potential stress tolerance and insect resistance were also identified. CONCLUSIONS/SIGNIFICANCE: The combined de novo transcriptome assembly strategy can be applied to other organisms whose reference genomes are not available. The data provided here represent the most comprehensive and integrated genomic resources for cloning and identifying genes of interest in sweet potato. Characterization of sweet potato transcriptome provides an effective tool for better understanding the molecular mechanisms of cellular processes including development of leaves and storage roots

  6. De novo transcriptome assembly and analysis of differential gene expression in response to drought in European beech.

    Directory of Open Access Journals (Sweden)

    Markus Müller

    Full Text Available Despite the ecological and economic importance of European beech (Fagus sylvatica L.), genomic resources of this species are still limited. This hampers an understanding of the molecular basis of adaptation to stress. Since beech will most likely be threatened by the consequences of climate change, an understanding of adaptive processes to climate change-related drought stress is of major importance. Here, we used RNA-seq to provide the first drought stress-related transcriptome of beech. In a drought stress trial with beech saplings, 50 samples were taken for RNA extraction at five points in time during a soil desiccation experiment. De novo transcriptome assembly and analysis of differential gene expression revealed 44,335 contigs, and 662 differentially expressed genes between the stress and normally watered control group. Gene expression was specific to the different time points, and only five genes were significantly differentially expressed between the stress and control group on all five sampling days. GO term enrichment showed that mostly genes involved in lipid- and homeostasis-related processes were upregulated, whereas genes involved in oxidative stress response were downregulated in the stressed seedlings. This study gives first insights into the genomic drought stress response of European beech, and provides new genetic resources for adaptation research in this species.

  7. De novo amplification within a silent human cholinesterase gene in a family subjected to prolonged exposure to organophosphorus insecticides

    International Nuclear Information System (INIS)

    Prody, C.A.; Dreyfus, P.; Soreq, H.; Zamir, R.; Zakut, H.

    1989-01-01

    A 100-fold DNA amplification in the CHE gene, coding for serum butyrylcholinesterase (BtChoEase), was found in a farmer expressing silent CHE phenotype. Individuals homozygous for this gene display a defective serum BtChoEase and are particularly vulnerable to poisoning by agricultural organophosphorus insecticides, to which all members of this family had long been exposed. DNA blot hybridization with regional BtChoEase cDNA probes suggested that the amplification was most intense in regions encoding central sequences within BtChoEase cDNA, whereas distal sequences were amplified to a much lower extent. This is in agreement with the onion skin model, based on amplification of genes in cultured cells and primary tumors. The amplification was absent in the grandparents but present at the same extent in one of their sons and in a grandson, with similar DNA blot hybridization patterns. In situ hybridization experiments localized the amplified sequences to the long arm of chromosome 3, close to the site where the authors previously mapped the CHE gene. Altogether, these observations suggest that the initial amplification event occurred early in embryogenesis, spermatogenesis, or oogenesis, where the CHE gene is intensely active and where cholinergic functioning was indicated to be physiologically necessary. These findings demonstrate a de novo amplification in apparently healthy individuals within an autosomal gene producing a target protein to an inhibitor

  8. De novo transcriptome sequencing of Isaria cateniannulata and comparative analysis of gene expression in response to heat and cold stresses.

    Directory of Open Access Journals (Sweden)

    Dingfeng Wang

    Full Text Available Isaria cateniannulata is a very important and virulent entomopathogenic fungus that infects many insect pest species. Although I. cateniannulata is commonly exposed to extreme environmental temperature conditions, little is known about its molecular response mechanism to temperature stress. Here, we sequenced and de novo assembled the transcriptome of I. cateniannulata in response to high and low temperature stresses using Illumina RNA-Seq technology. Our assembly encompassed 17,514 unigenes (mean length = 1,197 bp), in which 11,445 unigenes (65.34%) showed significant similarities to known sequences in the NCBI non-redundant protein sequences (Nr) database. Using digital gene expression analysis, 4,483 differentially expressed genes (DEGs) were identified after heat treatment, including 2,905 up-regulated genes and 1,578 down-regulated genes. Under cold stress, 1,927 DEGs were identified, including 1,245 up-regulated genes and 682 down-regulated genes. The expression patterns of 18 randomly selected candidate DEGs resulting from quantitative real-time PCR (qRT-PCR) were consistent with their transcriptome analysis results. Although DEGs were involved in many pathways, we focused on the genes that were involved in endocytosis: in heat stress, the pathway of clathrin-dependent endocytosis (CDE) was active; however, at low temperature stresses, the pathway of clathrin-independent endocytosis (CIE) was active. Besides, four categories of DEGs acting as temperature sensors were observed, including cell-wall-major-components-metabolism-related (CWMCMR) genes, heat shock protein (Hsp) genes, intracellular-compatible-solutes-metabolism-related (ICSMR) genes and glutathione S-transferase (GST). These results enhance our understanding of the molecular mechanisms of I. cateniannulata in response to temperature stresses and provide a valuable resource for the future investigations.

  9. De Novo Transcriptomic Analysis of an Oleaginous Microalga: Pathway Description and Gene Discovery for Production of Next-Generation Biofuels

    Science.gov (United States)

    Wan, LingLin; Han, Juan; Sang, Min; Li, AiFen; Wu, Hong; Yin, ShunJi; Zhang, ChengWu

    2012-01-01

    Background: Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes, with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, and is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. Results: We performed the de novo assembly of E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acids, TAG and carotenoids biosynthesis and catabolism pathways in E. cf. polyphem. Conclusions: Our data provides the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provides a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:22536352
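    The abstract above reports an N50 of 663 bp alongside a mean unigene size of 503 bp. As a rough illustration of how the two statistics differ, the sketch below computes N50 under its usual definition: the length L such that sequences of length L or longer hold at least half of the total assembled bases. The function name `n50` and the toy lengths are assumptions made for this example; they are not the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not taken from the paper.
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print("mean:", sum(toy_lengths) / len(toy_lengths))
print("N50:", n50(toy_lengths))
```

    Because N50 is weighted by bases rather than by sequence count, a few long unigenes pull it above the mean, which is the pattern the reported figures (663 bp versus 503 bp) show.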

  10. De novo transcriptomic analysis of an oleaginous microalga: pathway description and gene discovery for production of next-generation biofuels.

    Directory of Open Access Journals (Sweden)

    LingLin Wan

    Full Text Available Eustigmatos cf. polyphem is a yellow-green unicellular soil microalga belonging to the eustigmatophytes, with high biomass and considerable production of triacylglycerols (TAGs) for biofuels, and is thus referred to as an oleaginous microalga. The paucity of microalgae genome sequences, however, limits development of gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for a non-model microalgae species, E. cf. polyphem, and identify pathways and genes of importance related to biofuel production. We performed the de novo assembly of E. cf. polyphem transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 29,199,432 sequencing reads corresponding to 2.33 Gb total nucleotides. These reads were assembled into 75,632 unigenes with a mean size of 503 bp and an N50 of 663 bp, ranging from 100 bp to >3,000 bp. Assembled unigenes were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology identifiers. These analyses identified the majority of carbohydrate, fatty acids, TAG and carotenoids biosynthesis and catabolism pathways in E. cf. polyphem. Our data provides the construction of metabolic pathways involved in the biosynthesis and catabolism of carbohydrate, fatty acids, TAG and carotenoids in E. cf. polyphem and provides a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock.

  11. De novo assembly of Eugenia uniflora L. transcriptome and identification of genes from the terpenoid biosynthesis pathway.

    Science.gov (United States)

    Guzman, Frank; Kulcheski, Franceli Rodrigues; Turchetto-Zolet, Andreia Carina; Margis, Rogerio

    2014-12-01

    Pitanga (Eugenia uniflora L.) is a member of the Myrtaceae family and is of particular interest due to its medicinal properties that are attributed to specialized metabolites with known biological activities. Among these molecules, terpenoids are the most abundant in essential oils that are found in the leaves and represent compounds with potential pharmacological benefits. The terpene diversity observed in Myrtaceae is determined by the activity of different members of the terpene synthase and oxidosqualene cyclase families. Therefore, the aim of this study was to perform a de novo assembly of transcripts from E. uniflora leaves and to annotate them in order to identify the genes potentially involved in the terpenoid biosynthesis pathway and terpene diversity. In total, 72,742 unigenes with a mean length of 1,048 bp were identified. Of these, 43,631 and 36,289 were annotated with the NCBI non-redundant protein and Swiss-Prot databases, respectively. The gene ontology categorized the sequences into 53 functional groups. A metabolic pathway analysis with KEGG revealed 8,625 unigenes assigned to 141 metabolic pathways and 40 unigenes predicted to be associated with the biosynthesis of terpenoids. Furthermore, we identified four putative full-length terpene synthase genes involved in sesquiterpenes and monoterpenes biosynthesis, and three putative full-length oxidosqualene cyclase genes involved in the triterpenes biosynthesis. The expression of these genes was validated in different E. uniflora tissues. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  12. De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

    Science.gov (United States)

    Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

    2016-01-01

    In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear and the molecular basis underlying them is still not fully characterized. In order to determine compatibility relationships, using next-generation sequencing techniques and a de novo transcriptome assembly strategy, we show that pollen tubes from different olive plants, grown in vitro in a medium containing its own pistil and in combination pollen/pistil from self-sterile and self-fertile cultivars, have a distinct gene expression profile and many of the differentially expressed sequences between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor and fibrin gene families and members of the Ca(2+) binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the differentially expressed transcripts set and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.

  13. Analysis of SNP rs16754 of WT1 gene in a series of de novo acute myeloid leukemia patients.

    Science.gov (United States)

    Luna, Irene; Such, Esperanza; Cervera, Jose; Barragán, Eva; Jiménez-Velasco, Antonio; Dolz, Sandra; Ibáñez, Mariam; Gómez-Seguí, Inés; López-Pavía, María; Llop, Marta; Fuster, Óscar; Oltra, Silvestre; Moscardó, Federico; Martínez-Cuadrón, David; Senent, M Leonor; Gascón, Adriana; Montesinos, Pau; Martín, Guillermo; Bolufer, Pascual; Sanz, Miguel A

    2012-12-01

    The single nucleotide polymorphism (SNP) rs16754 of the WT1 gene has been previously described as a possible prognostic marker in normal karyotype acute myeloid leukemia (AML) patients. Nevertheless, the findings in this field are not always reproducible in different series. One hundred and seventy-five adult de novo AML patients were screened with two different methods for the detection of SNP rs16754: high-resolution melting (HRM) and FRET hybridization probes. Direct sequencing was used to validate both techniques. The SNP was detected in 52 out of 175 patients (30 %), both by HRM and hybridization probes. Direct sequencing confirmed that every positive sample in the screening methods had a variation in the DNA sequence. Patients with the wild-type genotype (WT1(AA)) for the SNP rs16754 were significantly younger than those with the heterozygous WT1(AG) genotype. No other difference was observed for baseline characteristic or outcome between patients with or without the SNP. Both techniques are equally reliable and reproducible as screening methods for the detection of the SNP rs16754, allowing for the selection of those samples that will need to be sequenced. We were unable to confirm the suggested favorable outcome of SNP rs16754 in de novo AML.

  14. De novo transcriptome and small RNA analysis of two Chinese willow cultivars reveals stress response genes in Salix matsudana.

    Directory of Open Access Journals (Sweden)

    Guodong Rao

    Full Text Available Salix matsudana Koidz. is a deciduous, rapidly growing, and drought resistant tree and is one of the most widely distributed and commonly cultivated willow species in China. Currently little transcriptomic and small RNAomic data are available to reveal the genes involved in stress resistance in S. matsudana. Here, we report the RNA-seq analysis results of both transcriptome and small RNAome data using Illumina deep sequencing of shoot tips from two willow variants (Salix matsudana and Salix matsudana Koidz. cultivar 'Tortuosa'). De novo gene assembly was used to generate the consensus transcriptome and small RNAome, which contained 106,403 unique transcripts with an average length of 944 bp and a total length of 100.45 MB, and 166 known miRNAs representing 35 miRNA families. Comparison of transcriptomes and small RNAomes combined with quantitative real-time PCR from the two Salix libraries revealed a total of 292 differentially expressed genes (DEGs) and 36 differentially expressed miRNAs (DEMs). Among the DEGs and DEMs, 196 genes and 24 miRNAs were up-regulated, and 96 genes and 12 miRNAs were down-regulated in S. matsudana. Functional analysis of DEGs and miRNA targets showed that many genes were involved in stress resistance in S. matsudana. Our global gene expression profiling presents a comprehensive view of the transcriptome and small RNAome which provide valuable information and sequence resources for uncovering the stress response genes in S. matsudana. Moreover the transcriptome and small RNAome data provide a basis for future study of genetic resistance in Salix.

  15. Discovery of genes related to insecticide resistance in Bactrocera dorsalis by functional genomic analysis of a de novo assembled transcriptome.

    Science.gov (United States)

    Hsu, Ju-Chun; Chien, Ting-Ying; Hu, Chia-Cheng; Chen, Mei-Ju May; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S; Chen, Chien-Yu

    2012-01-01

    Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantization of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species was largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST-, and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to

  16. De novo assembly, gene annotation, and marker discovery in stored-product pest Liposcelis entomophila (Enderlein) using transcriptome sequences.

    Directory of Open Access Journals (Sweden)

    Dan-Dan Wei

    Full Text Available BACKGROUND: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular levels. METHODOLOGY/PRINCIPAL FINDINGS: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the response of L. entomophila to thermal stresses, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. CONCLUSIONS/SIGNIFICANCE: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying

  17. Inactivation of human α-globin gene expression by a de novo deletion located upstream of the α-globin gene cluster

    International Nuclear Information System (INIS)

    Liebhaber, S.A.; Weiss, I.; Cash, F.E.; Griese, E.U.; Horst, J.; Ayyub, H.; Higgs, D.R.

    1990-01-01

    Synthesis of normal human hemoglobin A, α2β2, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5' to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located ∼30 kilobases 5' from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes

  18. De novo transcriptome sequencing and comparative analysis of differentially expressed genes in Dryopteris fragrans under temperature stress

    International Nuclear Information System (INIS)

    Wang, W.Z.; Tong, W.S.; Gao, R.

    2016-01-01

    Dryopteris fragrans is a species of fern that contains flavonoid compounds with medicinal value. This study examines how temperature stress affects flavonoid synthesis in D. fragrans tissue culture seedlings at low (4 °C), high (35 °C) and moderate (25 °C) temperatures. Using Illumina HiSeq 2000 sequencing, 80.9 million raw sequence reads were de novo assembled into 66,716 non-redundant unigenes. 38,486 unigenes (57.7%) were annotated for their function. 13,973 unigenes and 29,598 unigenes were allocated to gene ontology (GO) and clusters of orthologous group (COG), respectively. 18,989 sequences mapped to 118 pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database, and 204 genes were involved in flavonoid biosynthesis, regulation and transport. 25,292 and 16,817 unigenes exhibited marked differential expression in response to temperature shifts of 25 °C to 4 °C and 25 °C to 35 °C, respectively. The 4CL and CHS genes involved in flavonoid biosynthesis were tested, and the results suggested that they are responsible for the biosynthesis of flavonoids. This study provides the first published data to describe the D. fragrans transcriptome and should accelerate understanding of flavonoid biosynthesis, regulation and transport mechanisms. Since most unigenes described here were successfully annotated, these results should facilitate future functional genomic understanding and research of D. fragrans. (author)

  19. Transcriptome Sequencing, De Novo Assembly and Differential Gene Expression Analysis of the Early Development of Acipenser baeri.

    Directory of Open Access Journals (Sweden)

    Wei Song

    Full Text Available The molecular mechanisms that drive the development of the endangered fossil fish species Acipenser baeri are difficult to study due to the lack of genomic data. Recent advances in sequencing technologies and the reducing cost of sequencing offer exclusive opportunities for exploring important molecular mechanisms underlying specific biological processes. This manuscript describes the large scale sequencing and analyses of mRNA from Acipenser baeri collected at five development time points using the Illumina Hiseq2000 platform. The sequencing reads were de novo assembled and clustered into 278167 unigenes, of which 57346 (20.62%) had 45837 known homologous proteins in Uniprot protein databases while 11509 proteins matched with at least one sequence of assembled unigenes. The remaining 79.38% of unigenes could stand for non-coding unigenes or unigenes specific to A. baeri. A total of 43062 unigenes were annotated into functional categories via Gene Ontology (GO) annotation whereas 29526 unigenes were associated with 329 pathways by mapping to the KEGG database. Subsequently, 3479 differentially expressed genes were scanned within developmental stages and clustered into 50 gene expression profiles. Genes preferentially expressed at each stage were also identified. Through GO and KEGG pathway enrichment analysis, relevant physiological variations during the early development of A. baeri could be better understood. Accordingly, the present study gives insights into the transcriptome profile of the early development of A. baeri, and the information contained in this large scale transcriptome will provide substantial references for A. baeri developmental biology and promote its aquaculture research.

  20. IN-MACA-MCC: Integrated Multiple Attractor Cellular Automata with Modified Clonal Classifier for Human Protein Coding and Promoter Prediction

    Directory of Open Access Journals (Sweden)

    Kiran Sree Pokkuluri

    2014-01-01

    Full Text Available Protein coding and promoter region predictions are very important challenges of bioinformatics (Attwood and Teresa, 2000). The identification of these regions plays a crucial role in understanding the genes. Many novel computational and mathematical methods have been introduced, and existing methods are being refined, for predicting each of the two regions separately; still, there is scope for improvement. We propose a classifier that is built with MACA (multiple attractor cellular automata) and MCC (modified clonal classifier) to predict both regions with a single classifier. The proposed classifier is trained and tested with Fickett and Tung (1992) datasets for protein coding region prediction for DNA sequences of lengths 54, 108, and 162. This classifier is trained and tested with MMCRI datasets for protein coding region prediction for DNA sequences of lengths 252 and 354. The proposed classifier is trained and tested with promoter sequences from the DBTSS (Yamashita et al., 2006) dataset and nonpromoters from the EID (Saxonov et al., 2000) and UTRdb (Pesole et al., 2002) datasets. The proposed model can predict both regions with an average accuracy of 90.5% for promoter and 89.6% for protein coding region predictions. The specificity and sensitivity values of promoter and protein coding region predictions are 0.89 and 0.92, respectively.
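
    The accuracy, sensitivity and specificity figures quoted above are standard confusion-matrix ratios. The following is a minimal sketch of how such values are computed; the counts are hypothetical toy numbers, chosen only so that the ratios resemble those in the abstract, and the balanced class sizes are an assumption, not something the record states.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true positives recovered
    specificity = tn / (tn + fp)          # fraction of true negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts (not taken from the paper): 100 positives, 100 negatives.
sens, spec, acc = classification_metrics(tp=92, fn=8, tn=89, fp=11)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.3f}")
```

    With equal numbers of positive and negative sequences, accuracy is simply the mean of sensitivity and specificity; with unbalanced test sets the three values can diverge considerably.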

  1. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    Science.gov (United States)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  2. De novo assembly and characterization of the transcriptome of the parasitic weed dodder identifies genes associated with plant parasitism.

    Science.gov (United States)

    Ranjan, Aashish; Ichihashi, Yasunori; Farhi, Moran; Zumstein, Kristina; Townsley, Brad; David-Schwartz, Rakefet; Sinha, Neelima R

    2014-11-01

    Parasitic flowering plants are one of the most destructive agricultural pests and have major impact on crop yields throughout the world. Being dependent on finding a host plant for growth, parasitic plants penetrate their host using specialized organs called haustoria. Haustoria establish vascular connections with the host, which enable the parasite to steal nutrients and water. The underlying molecular and developmental basis of parasitism by plants is largely unknown. In order to investigate the process of parasitism, RNAs from different stages (i.e. seed, seedling, vegetative strand, prehaustoria, haustoria, and flower) were used to de novo assemble and annotate the transcriptome of the obligate plant stem parasite dodder (Cuscuta pentagona). The assembled transcriptome was used to dissect transcriptional dynamics during dodder development and parasitism and identified key gene categories involved in the process of plant parasitism. Host plant infection is accompanied by increased expression of parasite genes underlying transport and transporter categories, response to stress and stimuli, as well as genes encoding enzymes involved in cell wall modifications. By contrast, expression of photosynthetic genes is decreased in the dodder infective stages compared with normal stem. In addition, genes relating to biosynthesis, transport, and response of phytohormones, such as auxin, gibberellins, and strigolactone, were differentially expressed in the dodder infective stages compared with stems and seedlings. This analysis sheds light on the transcriptional changes that accompany plant parasitism and will aid in identifying potential gene targets for use in controlling the infestation of crops by parasitic weeds. © 2014 American Society of Plant Biologists. All Rights Reserved.

  3. De Novo Transcriptome Sequencing of Desert Herbaceous Achnatherum splendens (Achnatherum) Seedlings and Identification of Salt Tolerance Genes

    Directory of Open Access Journals (Sweden)

    Jiangtao Liu

    2016-03-01

    Full Text Available Achnatherum splendens is an important forage herb in Northwestern China. It has a high tolerance to salinity and is, thus, considered one of the most important constructive plants in saline and alkaline areas of land in Northwest China. However, the mechanisms of salt stress tolerance in A. splendens remain unknown. Next-generation sequencing (NGS) technologies can be used for global gene expression profiling. In this study, we examined sequence and transcript abundance data for the root/leaf transcriptome of A. splendens obtained using an Illumina HiSeq 2500. Over 35 million clean reads were obtained from the leaf and root libraries. All of the RNA sequencing (RNA-seq) reads were assembled de novo into a total of 126,235 unigenes and 36,511 coding DNA sequences (CDS). We further identified 1663 differentially-expressed genes (DEGs) between the salt stress treatment and control. Functional annotation of the DEGs by gene ontology (GO), using Arabidopsis and rice as references, revealed enrichment of salt stress-related GO categories, including “oxidation reduction”, “transcription factor activity”, and “ion channel transporter”. Thus, this global transcriptome analysis of A. splendens has provided an important genetic resource for the study of salt tolerance in this halophyte. The identified sequences and their putative functional data will facilitate future investigations of the tolerance of Achnatherum species to various types of abiotic stress.

  4. Co-Option and De Novo Gene Evolution Underlie Molluscan Shell Diversity

    Science.gov (United States)

    Aguilera, Felipe; McDougall, Carmel

    2017-01-01

    Molluscs fabricate shells of incredible diversity and complexity by localized secretions from the dorsal epithelium of the mantle. Although distantly related molluscs express remarkably different secreted gene products, it remains unclear if the evolution of shell structure and pattern is underpinned by the differential co-option of conserved genes or the integration of lineage-specific genes into the mantle regulatory program. To address this, we compare the mantle transcriptomes of 11 bivalves and gastropods of varying relatedness. We find that each species, including four Pinctada (pearl oyster) species that diverged within the last 20 Ma, expresses a unique mantle secretome. Lineage- or species-specific genes comprise a large proportion of each species’ mantle secretome. A majority of these secreted proteins have unique domain architectures that include repetitive, low complexity domains (RLCDs), which evolve rapidly, and have a proclivity to expand, contract and rearrange in the genome. There are also a large number of secretome genes expressed in the mantle that arose before the origin of gastropods and bivalves. Each species expresses a unique set of these more ancient genes consistent with their independent co-option into these mantle gene regulatory networks. From this analysis, we infer lineage-specific secretomes underlie shell diversity, and include both rapidly evolving RLCD-containing proteins, and the continual recruitment and loss of both ancient and recently evolved genes into the periphery of the regulatory network controlling gene expression in the mantle epithelium. PMID:28053006

  5. De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries.

    Science.gov (United States)

    Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

    2015-09-21

    Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable for assessing functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validated JigsawSeq on small sub-pools and observed high precision and recall at various experimental settings. With extensive simulations, we then applied JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants.

  6. De novo assembly, gene annotation and marker development using Illumina paired-end transcriptome sequences in celery (Apium graveolens L.).

    Directory of Open Access Journals (Sweden)

    Nan Fu

    Full Text Available BACKGROUND: Celery is an increasingly popular vegetable species, but limited transcriptome and genomic data hinder research on it. In addition, a lack of celery molecular markers limits the process of molecular genetic breeding. High-throughput transcriptome sequencing is an efficient method to generate a large transcriptome sequence dataset for gene discovery, molecular marker development and marker-assisted selection breeding. PRINCIPAL FINDINGS: Celery transcriptomes from four tissues were sequenced using Illumina paired-end sequencing technology. De novo assembly was performed to generate a collection of 42,280 unigenes (average length of 502.6 bp) that represent the first transcriptome of the species. 78.43% and 48.93% of the unigenes had significant similarity with proteins in the National Center for Biotechnology Information (NCBI) non-redundant protein database (Nr) and Swiss-Prot database, respectively, and 10,473 (24.77%) unigenes were assigned to Clusters of Orthologous Groups (COG). 21,126 (49.97%) unigenes harboring InterPro domains were annotated, of which 15,409 (36.45%) were assigned to Gene Ontology (GO) categories. Additionally, 7,478 unigenes were mapped onto 228 pathways using the Kyoto Encyclopedia of Genes and Genomes Pathway database (KEGG). Large numbers of simple sequence repeats (SSRs) were identified, and then the rates of successful amplification and polymorphism were investigated among 31 celery accessions. CONCLUSIONS: This study demonstrates the feasibility of generating a large scale of sequence information by Illumina paired-end sequencing and efficient assembly. Our results provide a valuable resource for celery research. The developed molecular markers are the foundation of further genetic linkage analysis and gene localization, and they will be essential to accelerate the process of breeding.
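
    The annotation percentages in the celery record can be checked directly from the counts it reports. The short sketch below does this under the assumption (suggested by the figures, but not stated outright in the abstract) that every percentage is taken relative to the 42,280 assembled unigenes; the dictionary keys and function name are illustrative only.

```python
TOTAL_UNIGENES = 42_280  # assembled unigenes reported in the abstract

# Counts reported in the abstract; the key names are illustrative labels.
annotated = {"COG": 10_473, "InterPro": 21_126, "GO": 15_409}


def percent_of_total(count, total=TOTAL_UNIGENES):
    """Percentage of the assembly covered by one annotation source, 2 decimals."""
    return round(100 * count / total, 2)


for source, count in annotated.items():
    print(f"{source}: {count} unigenes = {percent_of_total(count)}%")
```

    Under that assumption the computed values agree with the published 24.77%, 49.97% and 36.45%, which indicates that the GO percentage is expressed against the whole assembly rather than against the InterPro-annotated subset.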

  7. Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

    Science.gov (United States)

    Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

    2018-06-03

    Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.

  8. ELFN1-AS1: A Novel Primate Gene with Possible MicroRNA Function Expressed Predominantly in Human Tumors

    Directory of Open Access Journals (Sweden)

    Dmitrii E. Polev

    2014-01-01

    Full Text Available Human gene LOC100505644 uncharacterized LOC100505644 [Homo sapiens] (Entrez Gene ID 100505644) is abundantly expressed in tumors but weakly expressed in a few normal tissues. Until now, the function of this gene has remained unknown. Here we identified the chromosomal borders of the transcribed region and the major splice form of the LOC100505644-specific transcript. We characterised the major regulatory motifs of the gene and its splice sites. Analysis of the secondary structure of the major transcript variant revealed a hairpin-like structure characteristic of precursor microRNAs. Comparative genomic analysis of the locus showed that it originated in primates de novo. Taken together, our data indicate that human gene LOC100505644 encodes a non-protein-coding RNA, likely a microRNA. It was assigned the gene symbol ELFN1-AS1 (ELFN1 antisense RNA 1 (non-protein coding)). This gene combines features of evolutionary novelty and predominant expression in tumors.

  9. First report of a de novo germline mutation in the MLH1 gene

    NARCIS (Netherlands)

    Stulp, Rein P; Vos, Yvonne J; Mol, Bart; Karrenbeld, Arend; de Raad, Monique; van der Mijle, Huub J C; Sijmons, Rolf H

    2006-01-01

    Hereditary non-polyposis colorectal carcinoma (HNPCC) is an autosomal dominant disorder associated with colorectal and endometrial cancer and a range of other tumor types. Germline mutations in the DNA mismatch repair (MMR) genes, particularly MLH1, MSH2, and MSH6, underlie this disorder. The vast

  10. Gene expression patterns regulating embryogenesis based on the integrated de novo transcriptome assembly of the Japanese flounder.

    Science.gov (United States)

    Fu, Yuanshuai; Jia, Liang; Shi, Zhiyi; Zhang, Junling; Li, Wenjuan

    2017-06-01

    The Japanese flounder (Paralichthys olivaceus) is one of the most important commercial and biological marine fishes. However, the molecular biology involved during embryogenesis and early development of the Japanese flounder remains largely unknown due to a lack of genomic resources. A comprehensive and integrated transcriptome is necessary to study the molecular mechanisms of early development and to allow for the detailed characterization of gene expression patterns during embryogenesis; this approach is critical to understanding the processes that occur prior to mesectoderm formation during early embryonic development. In this study, more than 117.8 million 100bp PE reads were generated from pooled RNA extracted from unfertilized eggs to 41dph (days post-hatching) embryos and were sequenced using Illumina pair-end sequencing technology. In total, 121,513 transcripts (≥200bp) were obtained using de novo assembly. A sequence similarity search indicated that 52,338 transcripts show significant similarity to 22,462 known proteins from the NCBI non-redundant database and the Swiss-Prot protein database and were annotated using Blast2GO. GO terms were assigned to 44,627 transcripts with 12,006 functional terms, and 10,024 transcripts were assigned to 133 KEGG pathways. Furthermore, gene expression differences between the unfertilized egg and the gastrula embryo were analysed using Illumina RNA-Seq with single-read sequencing technology, and 24,837 differentially and specifically expressed transcripts were identified and included 5,286 annotated transcripts and 19,569 non-annotated transcripts. All of the expressed transcripts in the unfertilized egg and gastrula embryo were further classified as maternal, zygotic, or maternal-zygotic transcripts, which may help us to understand the roles of these transcripts during the embryonic development of the Japanese flounder. Thus, the results will contribute to an improved understanding of the gene expression patterns and

  11. Exploring the genes of yerba mate (Ilex paraguariensis A. St.-Hil.) by NGS and de novo transcriptome assembly.

    Directory of Open Access Journals (Sweden)

    Humberto J Debat

    Full Text Available Yerba mate (Ilex paraguariensis A. St.-Hil.) is an important subtropical tree crop cultivated on 326,000 ha in Argentina, Brazil and Paraguay, with a total yield production of more than 1,000,000 t. Yerba mate presents a strong limitation regarding sequence information. The NCBI GenBank lacks an EST database of yerba mate and depicts only 80 DNA sequences, mostly uncharacterized. In this scenario, in order to elucidate the yerba mate gene landscape by means of NGS, we explored and discovered a vast collection of I. paraguariensis transcripts. Total RNA from I. paraguariensis was sequenced by Illumina HiSeq-2000 obtaining 72,031,388 pair-end 100 bp sequences. High quality reads were de novo assembled into 44,907 transcripts encompassing 40 million bases with an estimated coverage of 180X. Multiple sequence analysis allowed us to predict that yerba mate contains ∼32,355 genes and 12,551 gene variants or isoforms. We identified and categorized members of more than 100 metabolic pathways. Overall, we have identified ∼1,000 putative transcription factors, genes involved in heat and oxidative stress, pathogen response, as well as disease resistance and hormone response. We have also identified, based on sequence homology searches, novel transcripts related to osmotic, drought, salinity and cold stress, senescence and early flowering. We have also pinpointed several members of the gene silencing pathway, and characterized the silencing effector Argonaute1. We predicted a diverse supply of putative microRNA precursors involved in developmental processes. We present here the first draft of the transcribed genomes of the yerba mate chloroplast and mitochondrion. The putative sequence and predicted structure of the caffeine synthase of yerba mate is presented. Moreover, we provide a collection of over 10,800 SSR accessible to the scientific community interested in yerba mate genetic improvement. This contribution broadly expands the limited knowledge

  12. Exploring the Genes of Yerba Mate (Ilex paraguariensis A. St.-Hil.) by NGS and De Novo Transcriptome Assembly

    Science.gov (United States)

    Aguilera, Patricia M.; Bubillo, Rosana E.; Otegui, Mónica B.; Ducasse, Daniel A.; Zapata, Pedro D.; Marti, Dardo A.

    2014-01-01

    Yerba mate (Ilex paraguariensis A. St.-Hil.) is an important subtropical tree crop cultivated on 326,000 ha in Argentina, Brazil and Paraguay, with a total yield production of more than 1,000,000 t. Yerba mate presents a strong limitation regarding sequence information. The NCBI GenBank lacks an EST database of yerba mate and depicts only 80 DNA sequences, mostly uncharacterized. In this scenario, in order to elucidate the yerba mate gene landscape by means of NGS, we explored and discovered a vast collection of I. paraguariensis transcripts. Total RNA from I. paraguariensis was sequenced by Illumina HiSeq-2000 obtaining 72,031,388 pair-end 100 bp sequences. High quality reads were de novo assembled into 44,907 transcripts encompassing 40 million bases with an estimated coverage of 180X. Multiple sequence analysis allowed us to predict that yerba mate contains ∼32,355 genes and 12,551 gene variants or isoforms. We identified and categorized members of more than 100 metabolic pathways. Overall, we have identified ∼1,000 putative transcription factors, genes involved in heat and oxidative stress, pathogen response, as well as disease resistance and hormone response. We have also identified, based on sequence homology searches, novel transcripts related to osmotic, drought, salinity and cold stress, senescence and early flowering. We have also pinpointed several members of the gene silencing pathway, and characterized the silencing effector Argonaute1. We predicted a diverse supply of putative microRNA precursors involved in developmental processes. We present here the first draft of the transcribed genomes of the yerba mate chloroplast and mitochondrion. The putative sequence and predicted structure of the caffeine synthase of yerba mate is presented. Moreover, we provide a collection of over 10,800 SSR accessible to the scientific community interested in yerba mate genetic improvement. This contribution broadly expands the limited knowledge of yerba mate genes
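
    The 180X coverage quoted in the yerba mate records is consistent with the usual back-of-the-envelope estimate, total sequenced bases divided by assembled bases. The sketch below reproduces that arithmetic; it assumes all 72,031,388 reads of 100 bp contributed to the estimate and treats "40 million bases" as exactly 40,000,000, neither of which the abstract confirms, so the authors' own calculation may have differed.

```python
def estimated_coverage(read_count, read_length_bp, assembly_size_bp):
    """Average sequencing depth: total sequenced bases over assembled bases."""
    return read_count * read_length_bp / assembly_size_bp


coverage = estimated_coverage(
    read_count=72_031_388,        # reads reported in the abstract
    read_length_bp=100,           # read length reported in the abstract
    assembly_size_bp=40_000_000,  # "40 million bases", taken as exact (assumption)
)
print(f"estimated coverage: {coverage:.0f}X")
```

    Any quality filtering before assembly would lower the numerator, so the figure is best read as an upper-bound style approximation.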

  13. De novo transcriptome analysis in Dendrobium and identification of critical genes associated with flowering.

    Science.gov (United States)

    Chen, Yue; Shen, Qi; Lin, Renan; Zhao, Zhuangliu; Shen, Chenjia; Sun, Chongbo

    2017-10-01

    Artificial control of flowering time is pivotal for the ornamental value of orchids including the genus Dendrobium. Although various flowering pathways have been revealed in model plants, little information is available on the genetic regulation of flowering in Dendrobium. To identify the critical genes associated with flowering, transcriptomes from four organs (leaf, root, stem and flower) of D. officinale were analyzed in our study. In total, 2645 flower-specific transcripts were identified. Functional annotation and classification suggested that several metabolic pathways, including four sugar-related pathways and two fatty acid-related pathways, were enriched. A total of 24 flowering-related transcripts were identified in D. officinale according to their similarity to homologous genes from Arabidopsis, suggesting that most classical flowering pathways exist in D. officinale. Furthermore, phylogenetic analysis suggested that the FLOWERING LOCUS T homologs in orchids have been highly conserved during evolution. In addition, expression changes in nine randomly-selected critical flowering-related transcripts between the vegetative stage and reproductive stage were quantified by qRT-PCR analysis. Our study provided a number of candidate genes and sequence resources for investigating the mechanisms underlying the flowering process of the Dendrobium genus. Copyright © 2017. Published by Elsevier Masson SAS.

  14. De novo transcriptome characterization and gene expression profiling of the desiccation tolerant moss Bryum argenteum following rehydration.

    Science.gov (United States)

    Gao, Bei; Zhang, Daoyuan; Li, Xiaoshuang; Yang, Honglan; Zhang, Yuanming; Wood, Andrew J

    2015-05-28

    The desiccation-tolerant moss Bryum argenteum is an important component of the Biological Soil Crusts (BSCs) found in the Gurbantunggut desert. Desiccation tolerance is defined as the ability to revive from the air-dried state. To elucidate the molecular mechanisms related to desiccation tolerance, we employed RNA-Seq and digital gene expression (DGE) technologies to study the genome-wide expression profiles of the dehydration and rehydration processes in this important desert plant. We applied a two-step approach to investigate the gene expression profile upon rehydration in the moss Bryum argenteum using the Illumina HiSeq2000 sequencing platform. First, a total of 57,247 transcript assembly contigs (TACs) were obtained from 54.79 million reads by de novo assembly, with an average length of 863 bp and N50 of 1,372 bp. Among the reconstructed TACs, 36,916 (64.5%) revealed similarity with existing protein sequences in the public databases. 23,509 and 21,607 TACs were assigned GO and KEGG annotation information, respectively. Second, samples were taken from 3 hydration stages: desiccated (Dry), rehydrated 2 h (R2) and rehydrated 24 h (R24), and DGE libraries were constructed for Differentially Expressed Genes (DEGs) discovery. 4,081 and 6,709 DEGs were identified in R2 and R24, compared with Dry, respectively. Compared to the desiccated sample, up-regulated genes after two hours of hydration are primarily related to stress responses. GO function enrichment network, KEGG metabolic pathway and MapMan analyses support the idea of the rapid recovery of photosynthesis after 24 h of rehydration. We identified 770 transcription factors (TFs) which were classified into 50 TF families. 142 TF transcripts were up-regulated upon rehydration including 23 members of the ERF family. In this study, we constructed a pioneering, high-quality reference transcriptome in B. argenteum and generated three DGE libraries to elucidate the changes of gene expression upon rehydration. 
Expression
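
    Several records in this listing report an N50 value for their assemblies (for example 1,372 bp for the Bryum argenteum contigs). N50 is the length L such that contigs of length L or longer together contain at least half of all assembled bases. A minimal sketch of the calculation follows; the contig lengths are a hypothetical toy example, not data from any of the studies listed.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # This contig is where the cumulative sum reaches half the assembly.
            return length


toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # hypothetical lengths, total = 32
print(n50(toy_contigs))  # the two 8-length contigs already cover 16 of 32 bases
```

    Because N50 is weighted by length, it is usually larger than the mean contig length, as in the record above (N50 of 1,372 bp against an average of 863 bp).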

  15. De novo Assembly of the Camellia nitidissima Transcriptome Reveals Key Genes of Flower Pigment Biosynthesis

    Directory of Open Access Journals (Sweden)

    Xingwen Zhou

    2017-09-01

    Full Text Available The golden camellia, Camellia nitidissima Chi., is a well-known ornamental plant that is known as “the queen of camellias” because of its golden yellow flowers. The principal pigments in the flowers are carotenoids and flavonol glycosides. Understanding the biosynthesis of the golden color and its regulation is important in camellia breeding. To obtain a comprehensive understanding of flower development in C. nitidissima, a number of cDNA libraries were independently constructed during flower development. Using the Illumina HiSeq 2500 platform, approximately 71.8 million raw reads (about 10.8 gigabase pairs) were obtained and assembled into 583,194 transcripts and 466,594 unigenes. A differentially expressed gene (DEG) and co-expression network was constructed to identify unigenes correlated with flower color. The analysis of DEGs and co-expressed network involved in the carotenoid pathway indicated that the biosynthesis of carotenoids is regulated mainly at the transcript level and that phytoene synthase (PSY), β-carotene 3-hydroxylase (CrtZ), and capsanthin synthase (CCS1) exert synergistic effects in carotenoid biosynthesis. The analysis of DEGs and co-expressed network involved in the flavonoid pathway indicated that chalcone synthase (CHS), naringenin 3-dioxygenase (F3H), leucoanthocyanidin dioxygenase (ANS), and flavonol synthase (FLS) play critical roles in regulating the formation of flavonols and anthocyanidin. Based on the gene expression analysis of the carotenoid and flavonoid pathways, and determinations of the pigments, we speculate that the high expression of PSY and CrtZ ensures the production of adequate levels of carotenoids, while the expression of CHS and FLS ensures the production of flavonols. The golden yellow color is then the result of the accumulation of carotenoids and flavonol glucosides in the petals. This study of the mechanism of color formation in golden camellia points the way to breeding strategies that exploit gene

  16. A De Novo Whole GCK Gene Deletion Not Detected by Gene Sequencing, in a Boy with Phenotypic GCK Insufficiency

    Directory of Open Access Journals (Sweden)

    N. H. Birkebæk

    2011-01-01

    Full Text Available We report on a boy with diabetes mellitus and a phenotype indicating glucokinase (GCK) insufficiency, but a normal GCK gene examination applying direct gene sequencing. The boy was referred for diabetes mellitus at 7.5 years old. His father, grandfather and great grandfather suffered from type 2 DM. Several blood glucose (BG) profiles showed values of 6.5–10 mmol/L. After three years on neutral insulin Hagedorn (NPH) in a dose of 0.3 IU/kg/day, haemoglobin A1c (HbA1c) was 6.8%. Treatment was changed to sulphonylurea 750 mg a day, and after 4 years HbA1c was 7%. At that time a multiplex ligation-dependent amplification gene dosage assay (MLPA) was done, revealing a whole GCK gene deletion. Medical treatment was ceased, and after one year HbA1c was 6.8%. This case underscores the importance of an MLPA examination if the phenotype of a patient is strongly indicative of GCK insufficiency and no mutation is identified using direct sequencing.

  17. Sequencing of sporadic Attention-Deficit Hyperactivity Disorder (ADHD) identifies novel and potentially pathogenic de novo variants and excludes overlap with genes associated with autism spectrum disorder.

    Science.gov (United States)

    Kim, Daniel Seung; Burt, Amber A; Ranchalis, Jane E; Wilmot, Beth; Smith, Joshua D; Patterson, Karynne E; Coe, Bradley P; Li, Yatong K; Bamshad, Michael J; Nikolas, Molly; Eichler, Evan E; Swanson, James M; Nigg, Joel T; Nickerson, Deborah A; Jarvik, Gail P

    2017-06-01

    Attention-Deficit Hyperactivity Disorder (ADHD) has high heritability; however, studies of common variation account for little ADHD variance. Using data from affected participants without a family history of ADHD, we sought to identify de novo variants that could account for sporadic ADHD. Considering a total of 128 families, two analyses were conducted in parallel: first, in 11 unaffected parent/affected proband trios (or quads with the addition of an unaffected sibling) we completed exome sequencing. Six de novo missense variants at highly conserved bases were identified and validated from four of the 11 families: the brain-expressed genes TBC1D9, DAGLA, QARS, CSMD2, TRPM2, and WDR83. Separately, in 117 unrelated probands with sporadic ADHD, we sequenced a panel of 26 genes implicated in intellectual disability (ID) and autism spectrum disorder (ASD) to evaluate whether variation in ASD/ID-associated genes was also present in participants with ADHD. Only one putative deleterious variant (Gln600STOP) in CHD1L was identified; this was found in a single proband. Notably, no other nonsense, splice, frameshift, or highly conserved missense variants in the 26 gene panel were identified and validated. These data suggest that de novo variant analysis in families with independently adjudicated sporadic ADHD diagnosis can identify novel genes implicated in ADHD pathogenesis. Moreover, that only one of the 128 cases (0.8%, 11 exome, and 117 MIP sequenced participants) had putative deleterious variants within our data in 26 genes related to ID and ASD suggests significant independence in the genetic pathogenesis of ADHD as compared to ASD and ID phenotypes. © 2017 Wiley Periodicals, Inc.

  18. De novo analysis of transcriptome dynamics in the migratory locust during the development of phase traits.

    Directory of Open Access Journals (Sweden)

    Shuang Chen

    Full Text Available Locusts exhibit remarkable density-dependent phenotype (phase) changes from the solitary to the gregarious, making them one of the most destructive agricultural pests. This phenotype polyphenism arises from a single genome and diverse transcriptomes in different conditions. Here we report a de novo transcriptome for the migratory locust and a comprehensive, representative core gene set. We carried out assembly of 21.5 Gb Illumina reads, generated 72,977 transcripts with an N50 of 2,275 bp and identified 11,490 locust protein-coding genes. Comparative genomics analysis with eight other sequenced insects was carried out to identify the genomic divergence between hemimetabolous and holometabolous insects for the first time and 18 genes relevant to development were found. We further utilized the quantitative feature of RNA-seq to measure and compare gene expression among libraries. We first discovered how divergence in gene expression between two phases progresses as locusts develop and identified 242 transcripts as candidates for phase marker genes. Together with the detailed analysis of deep sequencing data of the 4th instar, we discovered a phase-dependent divergence of biological investment at the molecular level. Solitary locusts have higher activity in biosynthetic pathways while gregarious locusts show higher activity in environmental interaction, in which genes and pathways associated with regulation of neurotransmitter activities, such as neurotransmitter receptors, synthetase, transporters, and GPCR signaling pathways, are strongly involved. Our study, as the largest de novo transcriptome to date, with optimization of sequencing and assembly strategy, can further facilitate the application of de novo transcriptome. The locust transcriptome enriches genetic resources for hemimetabolous insects and our understanding of the origin of insect metamorphosis. Most importantly, we identified genes and pathways that might be involved in locust development
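
    The N50 figure quoted for the locust assembly is a standard contiguity statistic: the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the example lengths are illustrative assumptions, not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical transcript lengths (bp), for illustration only.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
example_n50 = n50(example_lengths)
```

    Sorting from longest to shortest and stopping at the first sequence that pushes the running total past half of all bases is the usual convention; assemblers differ in details such as minimum-length filtering, which this sketch does not model.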

  19. Developmental gene discovery in a hemimetabolous insect: de novo assembly and annotation of a transcriptome for the cricket Gryllus bimaculatus.

    Directory of Open Access Journals (Sweden)

    Victor Zeng

    Full Text Available Most genomic resources available for insects represent the Holometabola, which are insects that undergo complete metamorphosis like beetles and flies. In contrast, the Hemimetabola (direct developing insects), representing the basal branches of the insect tree, have very few genomic resources. We have therefore created a large and publicly available transcriptome for the hemimetabolous insect Gryllus bimaculatus (cricket), a well-developed laboratory model organism whose potential for functional genetic experiments is currently limited by the absence of genomic resources. cDNA was prepared using mRNA obtained from adult ovaries containing all stages of oogenesis, and from embryo samples on each day of embryogenesis. Using 454 Titanium pyrosequencing, we sequenced over four million raw reads, and assembled them into 21,512 isotigs (predicted transcripts) and 120,805 singletons with an average coverage per base pair of 51.3. We annotated the transcriptome manually for over 400 conserved genes involved in embryonic patterning, gametogenesis, and signaling pathways. BLAST comparison of the transcriptome against the NCBI non-redundant protein database (nr) identified significant similarity to nr sequences for 55.5% of transcriptome sequences, and suggested that the transcriptome may contain 19,874 unique transcripts. For predicted transcripts without significant similarity to known sequences, we assessed their similarity to other orthopteran sequences, and determined that these transcripts contain recognizable protein domains, largely of unknown function. We created a searchable, web-based database to allow public access to all raw, assembled and annotated data. This database is to our knowledge the largest de novo assembled and annotated transcriptome resource available for any hemimetabolous insect. We therefore anticipate that these data will contribute significantly to more effective and higher-throughput deployment of molecular analysis tools in

  20. De novo transcriptome assembly and quantification reveal differentially expressed genes between soft-seed and hard-seed pomegranate (Punica granatum L.).

    Directory of Open Access Journals (Sweden)

    Hui Xue

    Full Text Available Pomegranate (Punica granatum L.) belongs to Punicaceae, and is valued for its social, ecological, economic, and aesthetic values, as well as more recently for its health benefits. The 'Tunisia' variety has softer seeds and big arils that are easily swallowed. It is a widely popular fruit; however, the molecular mechanisms of the formation of hard and soft seeds are not yet clear. We conducted a de novo assembly of the seed transcriptome in P. granatum L. and revealed differential gene expression between the soft-seed and hard-seed pomegranate varieties. A total of 35.1 Gb of data were acquired in this study, including 280,881,106 raw reads. Additionally, de novo transcriptome assembly generated 132,287 transcripts and 105,743 representative unigenes; approximately 13,805 unigenes (37.7%) were longer than 1,000 bp. Using bioinformatics annotation libraries, a total of 76,806 unigenes were annotated and, among the high-quality reads, 72.63% had at least one significant match to an existing gene model. Gene expression and differentially expressed genes were analyzed. The seed formation of the two pomegranate cultivars involves lignin biosynthesis and metabolism, including some genes encoding laccase and peroxidase, WRKY, MYB, and NAC transcription factors. In the hard-seed pomegranate, lignin-related genes and cellulose synthesis-related genes were highly expressed; in soft-seed pomegranates, expression of genes related to flavonoids and programmed cell death was slightly higher. We validated selection of the identified genes using qRT-PCR. This is the first transcriptome analysis of P. granatum L. This transcription sequencing greatly enriched the pomegranate molecular database, and the high-quality SSRs generated in this study will aid the gene cloning from pomegranate in the future. It provides important insights into the molecular mechanisms underlying the formation of soft seeds in pomegranate.

  1. De novo transcriptome assembly and quantification reveal differentially expressed genes between soft-seed and hard-seed pomegranate (Punica granatum L.).

    Science.gov (United States)

    Xue, Hui; Cao, Shangyin; Li, Haoxian; Zhang, Jie; Niu, Juan; Chen, Lina; Zhang, Fuhong; Zhao, Diguang

    2017-01-01

    Pomegranate (Punica granatum L.) belongs to Punicaceae, and is valued for its social, ecological, economic, and aesthetic values, as well as more recently for its health benefits. The 'Tunisia' variety has softer seeds and big arils that are easily swallowed. It is a widely popular fruit; however, the molecular mechanisms of the formation of hard and soft seeds are not yet clear. We conducted a de novo assembly of the seed transcriptome in P. granatum L. and revealed differential gene expression between the soft-seed and hard-seed pomegranate varieties. A total of 35.1 Gb of data were acquired in this study, including 280,881,106 raw reads. Additionally, de novo transcriptome assembly generated 132,287 transcripts and 105,743 representative unigenes; approximately 13,805 unigenes (37.7%) were longer than 1,000 bp. Using bioinformatics annotation libraries, a total of 76,806 unigenes were annotated and, among the high-quality reads, 72.63% had at least one significant match to an existing gene model. Gene expression and differentially expressed genes were analyzed. The seed formation of the two pomegranate cultivars involves lignin biosynthesis and metabolism, including some genes encoding laccase and peroxidase, WRKY, MYB, and NAC transcription factors. In the hard-seed pomegranate, lignin-related genes and cellulose synthesis-related genes were highly expressed; in soft-seed pomegranates, expression of genes related to flavonoids and programmed cell death was slightly higher. We validated selection of the identified genes using qRT-PCR. This is the first transcriptome analysis of P. granatum L. This transcription sequencing greatly enriched the pomegranate molecular database, and the high-quality SSRs generated in this study will aid the gene cloning from pomegranate in the future. It provides important insights into the molecular mechanisms underlying the formation of soft seeds in pomegranate.

  2. Heterozygous de novo and inherited mutations in the smooth muscle actin (ACTG2) gene underlie megacystis-microcolon-intestinal hypoperistalsis syndrome.

    Directory of Open Access Journals (Sweden)

    Michael F Wangler

    2014-03-01

    Full Text Available Megacystis-microcolon-intestinal hypoperistalsis syndrome (MMIHS) is a rare disorder of enteric smooth muscle function affecting the intestine and bladder. Patients with this severe phenotype are dependent on total parenteral nutrition and urinary catheterization. The cause of this syndrome has remained a mystery since Berdon's initial description in 1976. No genes have been clearly linked to MMIHS. We used whole-exome sequencing for gene discovery followed by targeted Sanger sequencing in a cohort of patients with MMIHS and intestinal pseudo-obstruction. We identified heterozygous ACTG2 missense variants in 15 unrelated subjects, ten being apparent de novo mutations. Ten unique variants were detected, of which six affected CpG dinucleotides and resulted in missense mutations at arginine residues, perhaps related to biased usage of CpG containing codons within actin genes. We also found some of the same heterozygous mutations that we observed as apparent de novo mutations in MMIHS segregating in families with intestinal pseudo-obstruction, suggesting that ACTG2 is responsible for a spectrum of smooth muscle disease. ACTG2 encodes γ2 enteric actin and is the first gene to be clearly associated with MMIHS, suggesting an important role for contractile proteins in enteric smooth muscle disease.

  3. De novo transcriptome assembly facilitates characterisation of fast-evolving gene families, MHC class I in the bank vole (Myodes glareolus).

    Science.gov (United States)

    Migalska, M; Sebastian, A; Konczal, M; Kotlík, P; Radwan, J

    2017-04-01

    The major histocompatibility complex (MHC) plays a central role in the adaptive immune response and is the most polymorphic gene family in vertebrates. Although high-throughput sequencing has increasingly been used for genotyping families of co-amplifying MHC genes, its potential to facilitate early steps in the characterisation of MHC variation in nonmodel organism has not been fully explored. In this study we evaluated the usefulness of de novo transcriptome assembly in characterisation of MHC sequence diversity. We found that although de novo transcriptome assembly of MHC I genes does not reconstruct sequences of individual alleles, it does allow the identification of conserved regions for PCR primer design. Using the newly designed primers, we characterised MHC I sequences in the bank vole. Phylogenetic analysis of the partial MHC I coding sequence (2-4 exons) of the bank vole revealed a lack of orthology to MHC I of other Cricetidae, consistent with the high gene turnover of this region. The diversity of expressed alleles was characterised using ultra-deep sequencing of the third exon that codes for the peptide-binding region of the MHC molecule. High allelic diversity was demonstrated, with 72 alleles found in 29 individuals. Interindividual variation in the number of expressed loci was found, with the number of alleles per individual ranging from 5 to 14. Strong signatures of positive selection were found for 8 amino acid sites, most of which are inferred to bind antigens in human MHC, indicating conservation of structure despite rapid sequence evolution.

  4. Potential hot spot for de novo mutations in PTCH1 gene in Gorlin syndrome patients: a case report of twins from Croatia.

    Science.gov (United States)

    Musani, Vesna; Ozretić, Petar; Trnski, Diana; Sabol, Maja; Poduje, Sanja; Tošić, Mateja; Šitum, Mirna; Levanat, Sonja

    2018-02-28

    We describe a case of twins with sporadic Gorlin syndrome. Both twins had common Gorlin syndrome features including calcification of the falx cerebri, multiple jaw keratocysts, and multiple basal cell carcinomas, but with different expressivity. One brother also had benign testicular mesothelioma. We propose this tumor type as a possible new feature of Gorlin syndrome. Gorlin syndrome is a rare autosomal dominant disorder characterized by both developmental abnormalities and cancer predisposition, with variable expression of various developmental abnormalities and different types of tumors. The syndrome is primarily caused by mutations in the Patched 1 (PTCH1) gene, although rare mutations of Patched 2 (PTCH2) or Suppressor of Fused (SUFU) genes have also been found. Neither founder mutations nor hot spot locations have been described for PTCH1 in Gorlin syndrome patients. Although de novo mutations of the PTCH1 gene occur in almost 50% of Gorlin syndrome cases, there are a few recurrent mutations. Our twin patients were carriers of a de novo mutation in the PTCH1 gene, c.3364_3365delAT (p.Met1122ValfsX22). This is, to our knowledge, the first Gorlin syndrome-causing mutation that has been reported four independent times in distant geographical locations. Therefore, we propose the location of the described mutation as a potential hot spot for mutations in PTCH1.

  5. De novo Transcriptome Assembly of Common Wild Rice (Oryza rufipogon Griff.) and Discovery of Drought-Response Genes in Root Tissue Based on Transcriptomic Data.

    Science.gov (United States)

    Tian, Xin-Jie; Long, Yan; Wang, Jiao; Zhang, Jing-Wen; Wang, Yan-Yan; Li, Wei-Min; Peng, Yu-Fa; Yuan, Qian-Hua; Pei, Xin-Wu

    2015-01-01

    The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.

  6. Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants.

    Science.gov (United States)

    Fu, Wenqing; O'Connor, Timothy D; Jun, Goo; Kang, Hyun Min; Abecasis, Goncalo; Leal, Suzanne M; Gabriel, Stacey; Rieder, Mark J; Altshuler, David; Shendure, Jay; Nickerson, Deborah A; Bamshad, Michael J; Akey, Joshua M

    2013-01-10

    Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.

  7. De novo transcriptome assembly, functional annotation and differential gene expression analysis of juvenile and adult E. fetida, a model oligochaete used in ecotoxicological studies

    Directory of Open Access Journals (Sweden)

    Michelle Thunders

    Full Text Available Background: Earthworms are sensitive to toxic chemicals present in the soil and so are useful indicator organisms for soil health. Eisenia fetida are commonly used in ecotoxicological studies; therefore the assembly of a baseline transcriptome is important for subsequent analyses exploring the impact of toxin exposure on genome wide gene expression. Results: This paper reports on the de novo transcriptome assembly of E. fetida using Trinity, a freely available software tool. Trinotate was used to carry out functional annotation of the Trinity generated transcriptome file and the transdecoder generated peptide sequence file, along with BLASTX, BLASTP and HMMER searches, and the results were loaded into a Sqlite3 database. To identify differentially expressed transcripts, each of the original sequence files was aligned to the de novo assembled transcriptome using Bowtie and then RSEM was used to estimate expression values based on the alignment. EdgeR was used to calculate differential expression between the two conditions, with an FDR corrected P value cut off of 0.001; this returned six significantly differentially expressed genes. Initial BLASTX hits of these putative genes included hits with annelid ferritin and lysozyme proteins, as well as fungal NADH cytochrome b5 reductase and senescence associated proteins. At a cut off of P = 0.01 there were a further 26 differentially expressed genes. Conclusion: These data have been made publicly available, and to our knowledge represent the most comprehensive available transcriptome for E. fetida assembled from RNA sequencing data. This provides important groundwork for subsequent ecotoxicogenomic studies exploring the impact of the environment on global gene expression in E. fetida and other earthworm species.
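
    The "FDR corrected P value" cut-offs in the record above refer to multiple-testing adjustment across all tested transcripts. The abstract does not state which correction was applied; the sketch below assumes the Benjamini-Hochberg procedure, which is the adjustment edgeR's topTags applies by default. The function name and the p-values are hypothetical and serve only to show how adjusted values are obtained and thresholded.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity so that
    # a smaller raw p-value never gets a larger adjusted value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four transcripts (illustration only).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
# Indices of transcripts passing an assumed FDR cut-off of 0.03.
significant = [i for i, q in enumerate(adj) if q <= 0.03]
```

    Tightening the cut-off (for example from 0.01 to 0.001, as in the abstract) can only shrink the set of transcripts called significant, which is consistent with the smaller gene count reported at the stricter threshold.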

  8. De novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome to identify putative genes involved in the aquatic adaptation and immune response.

    Science.gov (United States)

    Gui, Duan; Jia, Kuntong; Xia, Jia; Yang, Lili; Chen, Jialin; Wu, Yuping; Yi, Meisheng

    2013-01-01

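    The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at molecular level due to little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in the species with un-sequenced genomes. We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers.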

  9. De Novo Assembly and Characterization of the Transcriptome of the Parasitic Weed Dodder Identifies Genes Associated with Plant Parasitism

    Science.gov (United States)

    Ranjan, Aashish; Ichihashi, Yasunori; Farhi, Moran; Zumstein, Kristina; Townsley, Brad; David-Schwartz, Rakefet; Sinha, Neelima R.

    2014-01-01

    Parasitic flowering plants are one of the most destructive agricultural pests and have major impact on crop yields throughout the world. Being dependent on finding a host plant for growth, parasitic plants penetrate their host using specialized organs called haustoria. Haustoria establish vascular connections with the host, which enable the parasite to steal nutrients and water. The underlying molecular and developmental basis of parasitism by plants is largely unknown. In order to investigate the process of parasitism, RNAs from different stages (i.e. seed, seedling, vegetative strand, prehaustoria, haustoria, and flower) were used to de novo assemble and annotate the transcriptome of the obligate plant stem parasite dodder (Cuscuta pentagona). The assembled transcriptome was used to dissect transcriptional dynamics during dodder development and parasitism and identified key gene categories involved in the process of plant parasitism. Host plant infection is accompanied by increased expression of parasite genes underlying transport and transporter categories, response to stress and stimuli, as well as genes encoding enzymes involved in cell wall modifications. By contrast, expression of photosynthetic genes is decreased in the dodder infective stages compared with normal stem. In addition, genes relating to biosynthesis, transport, and response of phytohormones, such as auxin, gibberellins, and strigolactone, were differentially expressed in the dodder infective stages compared with stems and seedlings. This analysis sheds light on the transcriptional changes that accompany plant parasitism and will aid in identifying potential gene targets for use in controlling the infestation of crops by parasitic weeds. PMID:24399359

  10. De novo transcriptome assembly of drought tolerant CAM plants, Agave deserti and Agave tequilana.

    Science.gov (United States)

    Gross, Stephen M; Martin, Jeffrey A; Simpson, June; Abraham-Juarez, María Jazmín; Wang, Zhong; Visel, Axel

    2013-08-19

    Agaves are succulent monocotyledonous plants native to xeric environments of North America. Because of their adaptations to their environment, including crassulacean acid metabolism (CAM, a water-efficient form of photosynthesis), and existing technologies for ethanol production, agaves have gained attention both as potential lignocellulosic bioenergy feedstocks and models for exploring plant responses to abiotic stress. However, the lack of comprehensive Agave sequence datasets limits the scope of investigations into the molecular-genetic basis of Agave traits. Here, we present comprehensive, high quality de novo transcriptome assemblies of two Agave species, A. tequilana and A. deserti, built from short-read RNA-seq data. Our analyses support completeness and accuracy of the de novo transcriptome assemblies, with each species having a minimum of approximately 35,000 protein-coding genes. Comparison of agave proteomes to those of additional plant species identifies biological functions of gene families displaying sequence divergence in agave species. Additionally, a focus on the transcriptomics of the A. deserti juvenile leaf confirms evolutionary conservation of monocotyledonous leaf physiology and development along the proximal-distal axis. Our work presents a comprehensive transcriptome resource for two Agave species and provides insight into their biology and physiology. These resources are a foundation for further investigation of agave biology and their improvement for bioenergy development.

  11. Identification of a Novel De Novo Variant in the PAX3 Gene in Waardenburg Syndrome by Diagnostic Exome Sequencing: The First Molecular Diagnosis in Korea.

    Science.gov (United States)

    Jang, Mi-Ae; Lee, Taeheon; Lee, Junnam; Cho, Eun-Hae; Ki, Chang-Seok

    2015-05-01

    Waardenburg syndrome (WS) is a clinically and genetically heterogeneous hereditary auditory pigmentary disorder characterized by congenital sensorineural hearing loss and iris discoloration. Many genes have been linked to WS, including PAX3, MITF, SNAI2, EDNRB, EDN3, and SOX10, and many additional genes have been associated with disorders with phenotypic overlap with WS. To screen all possible genes associated with WS and congenital deafness simultaneously, we performed diagnostic exome sequencing (DES) in a male patient with clinical features consistent with WS. Using DES, we identified a novel missense variant (c.220C>G; p.Arg74Gly) in exon 2 of the PAX3 gene in the patient. Further analysis by Sanger sequencing of the patient and his parents revealed a de novo occurrence of the variant. Our findings show that DES can be a useful tool for the identification of pathogenic gene variants in WS patients and for differentiation between WS and similar disorders. To the best of our knowledge, this is the first report of genetically confirmed WS in Korea.

  12. De Novo Assembly and Genome Analyses of the Marine-Derived Scopulariopsis brevicaulis Strain LF580 Unravels Life-Style Traits and Anticancerous Scopularide Biosynthetic Gene Cluster.

    Science.gov (United States)

    Kumar, Abhishek; Henrissat, Bernard; Arvas, Mikko; Syed, Muhammad Fahad; Thieme, Nils; Benz, J Philipp; Sørensen, Jens Laurids; Record, Eric; Pöggeler, Stefanie; Kempken, Frank

    2015-01-01

    The marine-derived Scopulariopsis brevicaulis strain LF580 produces scopularides A and B, which have anticancerous properties. We carried out genome sequencing using three next-generation DNA sequencing methods. De novo hybrid assembly yielded 621 scaffolds with a total size of 32.2 Mb and 16298 putative gene models. We identified a large non-ribosomal peptide synthetase gene (nrps1) and supporting pks2 gene in the same biosynthetic gene cluster. This cluster and the genes within the cluster are functionally active as confirmed by RNA-Seq. Characterization of carbohydrate-active enzymes and major facilitator superfamily (MFS)-type transporters led us to postulate that S. brevicaulis originated from a soil fungus, which came into contact with the marine sponge Tethya aurantium. This marine sponge seems to provide shelter to this fungus and a micro-environment suitable for its survival in the ocean. This study also builds the platform for further investigations of the role of life-style and secondary metabolites from S. brevicaulis.

  13. Detection of a Usp-like gene in Calotropis procera plant from the de novo assembled genome contigs of the high-throughput sequencing dataset

    KAUST Repository

    Shokry, Ahmed M.

    2014-02-01

    The wild plant species Calotropis procera (C. procera) has many potential applications and beneficial uses in medicine, industry and ornamental field. It also represents an excellent source of genes for drought and salt tolerance. Genes encoding proteins that contain the conserved universal stress protein (USP) domain are known to provide organisms like bacteria, archaea, fungi, protozoa and plants with the ability to respond to a plethora of environmental stresses. However, information on the possible occurrence of Usp in C. procera is not available. In this study, we uncovered and characterized a one-class A Usp-like (UspA-like, NCBI accession No. KC954274) gene in this medicinal plant from the de novo assembled genome contigs of the high-throughput sequencing dataset. A number of GenBank accessions for Usp sequences were blasted with the recovered de novo assembled contigs. Homology modelling of the deduced amino acids (NCBI accession No. AGT02387) was further carried out using Swiss-Model, accessible via the EXPASY. Superimposition of C. procera USPA-like full sequence model on Thermus thermophilus USP UniProt protein (PDB accession No. Q5SJV7) was constructed using RasMol and Deep-View programs. The functional domains of the novel USPA-like amino acids sequence were identified from the NCBI conserved domain database (CDD) that provide insights into sequence structure/function relationships, as well as domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM). © 2014 Académie des sciences.

  14. De novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome to identify putative genes involved in the aquatic adaptation and immune response.

    Directory of Open Access Journals (Sweden)

    Duan Gui

    Full Text Available BACKGROUND: The Indo-Pacific humpback dolphin (Sousa chinensis), a marine mammal species inhabiting the waters of Southeast Asia, South Africa and Australia, has attracted much attention because of the dramatic decline in population size in the past decades, which raises the concern of extinction. So far, this species is poorly characterized at molecular level due to little sequence information available in public databases. Recent advances in large-scale RNA sequencing provide an efficient approach to generate abundant sequences for functional genomic analyses in the species with un-sequenced genomes. PRINCIPAL FINDINGS: We performed a de novo assembly of the Indo-Pacific humpback dolphin leucocyte transcriptome by Illumina sequencing. 108,751 high quality sequences from 47,840,388 paired-end reads were generated, and 48,868 and 46,587 unigenes were functionally annotated by BLAST search against the NCBI non-redundant and Swiss-Prot protein databases (E-value < 10^-5), respectively. In total, 16,467 unigenes were clustered into 25 functional categories by searching against the COG database, and BLAST2GO search assigned 37,976 unigenes to 61 GO terms. In addition, 36,345 unigenes were grouped into 258 KEGG pathways. We also identified 9,906 simple sequence repeats and 3,681 putative single nucleotide polymorphisms as potential molecular markers in our assembled sequences. A large number of unigenes were predicted to be involved in immune response, and many genes were predicted to be relevant to adaptive evolution and cetacean-specific traits. CONCLUSION: This study represented the first transcriptome analysis of the Indo-Pacific humpback dolphin, an endangered species. The de novo transcriptome analysis of the unique transcripts will provide valuable sequence information for discovery of new genes, characterization of gene expression, investigation of various pathways and adaptive evolution, as well as identification of genetic markers.

  15. De Novo Assembly of the Donkey White Blood Cell Transcriptome and a Comparative Analysis of Phenotype-Associated Genes between Donkeys and Horses.

    Science.gov (United States)

    Xie, Feng-Yun; Feng, Yu-Long; Wang, Hong-Hui; Ma, Yun-Feng; Yang, Yang; Wang, Yin-Chao; Shen, Wei; Pan, Qing-Jie; Yin, Shen; Sun, Yu-Jiang; Ma, Jun-Yu

    2015-01-01

    Prior to the mechanization of agriculture and labor-intensive tasks, humans used donkeys (Equus africanus asinus) for farm work and packing. However, as mechanization increased, donkeys have been increasingly raised for meat, milk, and fur in China. To maintain the development of the donkey industry, breeding programs should focus on traits related to these new uses. Compared to conventional marker-assisted breeding plans, genome- and transcriptome-based selection methods are more efficient and effective. To analyze the coding genes of the donkey genome, we assembled the transcriptome of donkey white blood cells de novo. Using transcriptomic deep-sequencing data, we identified 264,714 distinct donkey unigenes and predicted 38,949 protein fragments. We annotated the donkey unigenes by BLAST searches against the non-redundant (NR) protein database. We also compared the donkey protein sequences with those of the horse (E. caballus) and wild horse (E. przewalskii), and linked the donkey protein fragments with mammalian phenotypes. As the outer ear size of donkeys and horses are obviously different, we compared the outer ear size-associated proteins in donkeys and horses. We identified three ear size-associated proteins, HIC1, PRKRA, and KMT2A, with sequence differences among the donkey, horse, and wild horse loci. Since the donkey genome sequence has not been released, the de novo assembled donkey transcriptome is helpful for preliminary investigations of donkey cultivars and for genetic improvement.

  16. De Novo Assembly of the Donkey White Blood Cell Transcriptome and a Comparative Analysis of Phenotype-Associated Genes between Donkeys and Horses.

    Directory of Open Access Journals (Sweden)

    Feng-Yun Xie

    Full Text Available Prior to the mechanization of agriculture and labor-intensive tasks, humans used donkeys (Equus africanus asinus) for farm work and packing. However, as mechanization increased, donkeys have been increasingly raised for meat, milk, and fur in China. To maintain the development of the donkey industry, breeding programs should focus on traits related to these new uses. Compared to conventional marker-assisted breeding plans, genome- and transcriptome-based selection methods are more efficient and effective. To analyze the coding genes of the donkey genome, we assembled the transcriptome of donkey white blood cells de novo. Using transcriptomic deep-sequencing data, we identified 264,714 distinct donkey unigenes and predicted 38,949 protein fragments. We annotated the donkey unigenes by BLAST searches against the non-redundant (NR) protein database. We also compared the donkey protein sequences with those of the horse (E. caballus) and wild horse (E. przewalskii), and linked the donkey protein fragments with mammalian phenotypes. As the outer ear size of donkeys and horses are obviously different, we compared the outer ear size-associated proteins in donkeys and horses. We identified three ear size-associated proteins, HIC1, PRKRA, and KMT2A, with sequence differences among the donkey, horse, and wild horse loci. Since the donkey genome sequence has not been released, the de novo assembled donkey transcriptome is helpful for preliminary investigations of donkey cultivars and for genetic improvement.

  17. De novo transcriptome assembly of a sour cherry cultivar, Schattenmorelle

    Directory of Open Access Journals (Sweden)

    Yeonhwa Jo

    2015-12-01

    Full Text Available Sour cherry (Prunus cerasus), in the genus Prunus of the family Rosaceae, is one of the most popular stone fruit trees worldwide. Of the known sour cherry cultivars, the Schattenmorelle is a famous old sour cherry with a high level of fruit production. The Schattenmorelle was selected before 1650 and described in the 1800s. This cultivar was named after the gardens of the Chateau de Moreille, in which the cultivar was initially found. In order to identify new genes and to develop genetic markers for sour cherry, we performed a transcriptome analysis of a sour cherry. We selected the cultivar Schattenmorelle, which is among the commercially important cultivars in Europe and North America. We obtained 2.05 GB of raw data from the Schattenmorelle (NCBI accession number: SRX1187170). De novo transcriptome assembly using Trinity identified 61,053 transcripts, with an N50 of 611 bp. Next, we identified 25,585 protein coding sequences using TransDecoder. The identified proteins were searched with BLAST against NCBI's non-redundant database for annotation. Based on the BLAST search, we taxonomically classified the obtained sequences. As a result, we provide the transcriptome of sour cherry cultivar Schattenmorelle using next generation sequencing.

  18. Identification of a Novel De Novo Heterozygous Deletion in the SOX10 Gene in Waardenburg Syndrome Type II Using Next-Generation Sequencing.

    Science.gov (United States)

    Li, Haonan; Jin, Peng; Hao, Qian; Zhu, Wei; Chen, Xia; Wang, Ping

    2017-11-01

    Waardenburg syndrome (WS) is a rare autosomal dominant disorder associated with pigmentation abnormalities and sensorineural hearing loss. In this study, we investigated the genetic cause of WSII in a patient and evaluated the reliability of the targeted next-generation exome sequencing method for the genetic diagnosis of WS. Clinical evaluations were conducted on the patient and targeted next-generation sequencing (NGS) was used to identify the candidate genes responsible for WSII. Multiplex ligation-dependent probe amplification (MLPA) and real-time quantitative polymerase chain reaction (qPCR) were performed to confirm the targeted NGS results. Targeted NGS detected the entire deletion of the coding sequence (CDS) of the SOX10 gene in the WSII patient. MLPA results indicated that all exons of the SOX10 heterozygous deletion were detected; no aberrant copy number in the PAX3 and microphthalmia-associated transcription factor (MITF) genes was found. Real-time qPCR results identified the mutation as a de novo heterozygous deletion. This is the first report of using a targeted NGS method for WS candidate gene sequencing; its accuracy was verified by using the MLPA and qPCR methods. Our research provides a valuable method for the genetic diagnosis of WS.

  19. Analysis of insecticide resistance-related genes of the Carmine spider mite Tetranychus cinnabarinus based on a de novo assembled transcriptome.

    Science.gov (United States)

    Xu, Zhifeng; Zhu, Wenyi; Liu, Yanchao; Liu, Xing; Chen, Qiushuang; Peng, Miao; Wang, Xiangzun; Shen, Guangmao; He, Lin

    2014-01-01

    The carmine spider mite (CSM), Tetranychus cinnabarinus, is an important pest mite in agriculture, because it can develop insecticide resistance easily. To gain valuable gene information and molecular basis for the future insecticide resistance study of CSM, the first transcriptome analysis of CSM was conducted. A total of 45,016 contigs and 25,519 unigenes were generated from the de novo transcriptome assembly, and 15,167 unigenes were annotated via BLAST querying against current databases, including nr, SwissProt, the Clusters of Orthologous Groups (COGs), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO). When the transcripts were aligned to the Tetranychus urticae genome, 19,255 (75.45%) of the transcripts had significant (e-value < 10^-5) matches to the T. urticae genome, and 19,111 sequences matched the T. urticae proteome with an average protein length coverage of 42.55%. Core Eukaryotic Genes Mapping Approach (CEGMA) analysis identified 435 core eukaryotic genes (CEGs) in the CSM dataset, corresponding to 95% coverage. Ten gene categories that relate to insecticide resistance in arthropods were generated from the CSM transcriptome, including 53 P450-, 22 GSTs-, 23 CarEs-, 1 AChE-, 7 GluCls-, 9 nAChRs-, 8 GABA receptor-, 1 sodium channel-, 6 ATPase- and 12 Cyt b genes. We developed significant molecular resources for T. cinnabarinus putatively involved in insecticide resistance. The transcriptome assembly analysis will significantly facilitate our study on the mechanism of adapting to environmental stress (including insecticide) in CSM at the molecular level, and will be very important for developing new control strategies against this pest mite.

  20. De novo characterization of fall dormant and nondormant alfalfa (Medicago sativa L.) leaf transcriptome and identification of candidate genes related to fall dormancy.

    Science.gov (United States)

    Zhang, Senhao; Shi, Yinghua; Cheng, Ningning; Du, Hongqi; Fan, Wenna; Wang, Chengzhang

    2015-01-01

    Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression between dormant and non-dormant alfalfa at different time points were performed, and we identified several differentially expressed genes potentially related to fall dormancy. The Gene Ontology and pathways information were also identified. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa.
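    The simple sequence repeat (SSR) identification mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name find_ssrs, the motif lengths, and the minimum repeat counts are all assumptions chosen for illustration, and the abstract does not state which tool or thresholds were used.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, copies) for each tandem repeat found in seq.

    min_repeats maps motif length -> minimum number of tandem copies.
    The defaults are illustrative assumptions, not values from the study.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, reps in min_repeats.items():
        # One captured motif followed by at least (reps - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, reps - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    # Note: a long dinucleotide run can also match as a tetranucleotide
    # repeat; a real pipeline would merge such overlapping hits.
    return sorted(hits)

# Hypothetical test sequence: (AT)x7 at offset 2 and (GCA)x5 at offset 18.
example = "GG" + "AT" * 7 + "CC" + "GCA" * 5 + "T"
print(find_ssrs(example))
```

    Each hit is reported as a zero-based start position, the repeated motif, and the number of tandem copies.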

  1. Analysis of insecticide resistance-related genes of the Carmine spider mite Tetranychus cinnabarinus based on a de novo assembled transcriptome.

    Directory of Open Access Journals (Sweden)

    Zhifeng Xu

    Full Text Available The carmine spider mite (CSM), Tetranychus cinnabarinus, is an important pest mite in agriculture, because it can develop insecticide resistance easily. To gain valuable gene information and molecular basis for the future insecticide resistance study of CSM, the first transcriptome analysis of CSM was conducted. A total of 45,016 contigs and 25,519 unigenes were generated from the de novo transcriptome assembly, and 15,167 unigenes were annotated via BLAST querying against current databases, including nr, SwissProt, the Clusters of Orthologous Groups (COGs), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO). Aligning the transcript to Tetranychus urticae genome, the 19255 (75.45%) of the transcripts had significant (e-value < 10^-5) matches to T. urticae DNA genome, 19111 sequences matched to T. urticae proteome with an average protein length coverage of 42.55%. Core Eukaryotic Genes Mapping Approach (CEGMA) analysis identified 435 core eukaryotic genes (CEGs) in the CSM dataset corresponding to 95% coverage. Ten gene categories that relate to insecticide resistance in arthropod were generated from CSM transcriptome, including 53 P450-, 22 GSTs-, 23 CarEs-, 1 AChE-, 7 GluCls-, 9 nAChRs-, 8 GABA receptor-, 1 sodium channel-, 6 ATPase- and 12 Cyt b genes. We developed significant molecular resources for T. cinnabarinus putatively involved in insecticide resistance. The transcriptome assembly analysis will significantly facilitate our study on the mechanism of adapting environmental stress (including insecticide) in CSM at the molecular level, and will be very important for developing new control strategies against this pest mite.

  2. De novo synthesis and functional analysis of the phosphatase-encoding gene acI-B of uncultured Actinobacteria from Lake Stechlin (NE Germany).

    Science.gov (United States)

    Srivastava, Abhishek; McMahon, Katherine D; Stepanauskas, Ramunas; Grossart, Hans-Peter

    2015-12-01

    The National Center for Biotechnology Information [http://www.ncbi.nlm.nih.gov/guide/taxonomy/] database lists more than 15,500 bacterial species. But this also includes a plethora of uncultured bacterial representations. Owing to their metabolism, they directly influence biogeochemical cycles, which underscores the important status of bacteria on our planet. To study the function of a gene from an uncultured bacterium, we have undertaken a de novo gene synthesis approach. Actinobacteria of the acI-B subcluster are important but as yet uncultured members of the bacterioplankton in temperate lakes of the northern hemisphere such as oligotrophic Lake Stechlin (NE Germany). This lake is relatively poor in phosphate (P) and harbors on average ~1.3 × 10^6 bacterial cells/ml, whereby Actinobacteria of the acI lineage can contribute almost half of the entire bacterial community depending on seasonal variability. Single cell genome analysis of Actinobacterium SCGC AB141-P03, a member of the acI-B tribe in Lake Stechlin, has revealed several phosphate-metabolizing genes. The genome of acI-B Actinobacteria indicates potential to degrade polyphosphate compounds. To test for this genetic potential, we targeted the exoP-annotated gene potentially encoding polyphosphatase and synthesized it artificially to examine its biochemical role. Heterologous overexpression of the gene in Escherichia coli and protein purification revealed phosphatase activity. Comparative genome analysis suggested that homologs of this gene should also be present in other Actinobacteria of the acI lineages. This strategic retention of specialized genes in their genome provides a metabolic advantage over other members of the aquatic food web in a P-limited ecosystem. [Int Microbiol 2016; 19(1):39-47]. Copyright© by the Spanish Society for Microbiology and Institute for Catalan Studies.

  3. De Novo Mutations in CHD4, an ATP-Dependent Chromatin Remodeler Gene, Cause an Intellectual Disability Syndrome with Distinctive Dysmorphisms.

    Science.gov (United States)

    Weiss, Karin; Terhal, Paulien A; Cohen, Lior; Bruccoleri, Michael; Irving, Melita; Martinez, Ariel F; Rosenfeld, Jill A; Machol, Keren; Yang, Yaping; Liu, Pengfei; Walkiewicz, Magdalena; Beuten, Joke; Gomez-Ospina, Natalia; Haude, Katrina; Fong, Chin-To; Enns, Gregory M; Bernstein, Jonathan A; Fan, Judith; Gotway, Garrett; Ghorbani, Mohammad; van Gassen, Koen; Monroe, Glen R; van Haaften, Gijs; Basel-Vanagaite, Lina; Yang, Xiang-Jiao; Campeau, Philippe M; Muenke, Maximilian

    2016-10-06

    Chromodomain helicase DNA-binding protein 4 (CHD4) is an ATP-dependent chromatin remodeler involved in epigenetic regulation of gene transcription, DNA repair, and cell cycle progression. Also known as Mi2β, CHD4 is an integral subunit of a well-characterized histone deacetylase complex. Here we report five individuals with de novo missense substitutions in CHD4 identified through whole-exome sequencing and web-based gene matching. These individuals have overlapping phenotypes including developmental delay, intellectual disability, hearing loss, macrocephaly, distinct facial dysmorphisms, palatal abnormalities, ventriculomegaly, and hypogonadism as well as additional findings such as bone fusions. The variants, c.3380G>A (p.Arg1127Gln), c.3443G>T (p.Trp1148Leu), c.3518G>T (p.Arg1173Leu), and c.3008G>A, (p.Gly1003Asp) (GenBank: NM_001273.3), affect evolutionarily highly conserved residues and are predicted to be deleterious. Previous studies in yeast showed the equivalent Arg1127 and Trp1148 residues to be crucial for SNF2 function. Furthermore, mutations in the same positions were reported in malignant tumors, and a de novo missense substitution in an equivalent arginine residue in the C-terminal helicase domain of SMARCA4 is associated with Coffin Siris syndrome. Cell-based studies of the p.Arg1127Gln and p.Arg1173Leu mutants demonstrate normal localization to the nucleus and HDAC1 interaction. Based on these findings, the mutations potentially alter the complex activity but not its formation. This report provides evidence for the role of CHD4 in human development and expands an increasingly recognized group of Mendelian disorders involving chromatin remodeling and modification. Published by Elsevier Inc.

  4. De Novo Assembly of the Pea (Pisum sativum L.) Nodule Transcriptome

    Directory of Open Access Journals (Sweden)

    Vladimir A. Zhukov

    2015-01-01

    Full Text Available The large size and complexity of the garden pea (Pisum sativum L.) genome hamper its sequencing and the discovery of pea gene resources. Although transcriptome sequencing provides extensive information about expressed genes, some tissue-specific transcripts can only be identified from particular organs under appropriate conditions. In this study, we performed RNA sequencing of polyadenylated transcripts from young pea nodules and root tips on an Illumina GAIIx system, followed by de novo transcriptome assembly using the Trinity program. We obtained more than 58,000 and 37,000 contigs from “Nodules” and “Root Tips” assemblies, respectively. The quality of the assemblies was assessed by comparison with pea expressed sequence tags and transcriptome sequencing project data available from NCBI website. The “Nodules” assembly was compared with the “Root Tips” assembly and with pea transcriptome sequencing data from projects indicating tissue specificity. As a result, approximately 13,000 nodule-specific contigs were found and annotated by alignment to known plant protein-coding sequences and by Gene Ontology searching. Of these, 581 sequences were found to possess full CDSs and could thus be considered as novel nodule-specific transcripts of pea. The information about pea nodule-specific gene sequences can be applied for gene-based markers creation, polymorphism studies, and real-time PCR.

  5. Identification of lignin genes and regulatory sequences involved in secondary cell wall formation in Acacia auriculiformis and Acacia mangium via de novo transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Cannon Charles H

    2011-07-01

    Full Text Available BACKGROUND: Acacia auriculiformis × Acacia mangium hybrids are commercially important trees for the timber and pulp industry in Southeast Asia. Increasing pulp yield while reducing pulping costs are major objectives of tree breeding programs. The general monolignol biosynthesis and secondary cell wall formation pathways are well-characterized but genes in these pathways are poorly characterized in Acacia hybrids. RNA-seq on short-read platforms is a rapid approach for obtaining comprehensive transcriptomic data and to discover informative sequence variants. RESULTS: We sequenced transcriptomes of A. auriculiformis and A. mangium from non-normalized cDNA libraries synthesized from pooled young stem and inner bark tissues using paired-end libraries and a single lane of an Illumina GAII machine. De novo assembly produced a total of 42,217 and 35,759 contigs with an average length of 496 bp and 498 bp for A. auriculiformis and A. mangium respectively. The assemblies of A. auriculiformis and A. mangium had a total length of 21,022,649 bp and 17,838,260 bp, respectively, with the largest contig 15,262 bp long. We detected all ten monolignol biosynthetic genes using Blastx and further analysis revealed 18 lignin isoforms for each species. We also identified five contigs homologous to R2R3-MYB proteins in other plant species that are involved in transcriptional regulation of secondary cell wall formation and lignin deposition. We searched the contigs against public microRNA database and predicted the stem-loop structures of six highly conserved microRNA families (miR319, miR396, miR160, miR172, miR162 and miR168) and one legume-specific family (miR2086). Three microRNA target genes were predicted to be involved in wood formation and flavonoid biosynthesis. By using the assemblies as a reference, we discovered 16,648 and 9,335 high quality putative Single Nucleotide Polymorphisms (SNPs) in the transcriptomes of A. auriculiformis and A. mangium

  6. New Genes and Functional Innovation in Mammals.

    Science.gov (United States)

    Luis Villanueva-Cañas, José; Ruiz-Orera, Jorge; Agea, M Isabel; Gallo, Maria; Andreu, David; Albà, M Mar

    2017-07-01

    The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  7. CD48-deficient T-lymphocytes from DMBA-treated rats have de novo mutations in the endogenous Pig-a gene.

    Science.gov (United States)

    Dobrovolsky, Vasily N; Revollo, Javier; Pearce, Mason G; Pacheco-Martinez, M Monserrat; Lin, Haixia

    2015-10-01

    A major question concerning the scientific and regulatory acceptance of the rodent red blood cell-based Pig-a gene mutation assay is the extent to which mutants identified by their phenotype in the assay are caused by mutations in the Pig-a gene. In this study, we identified T-lymphocytes deficient for the glycosylphosphatidylinositol-anchored surface marker, CD48, in control and 7,12-dimethylbenz[a]anthracene (DMBA)-treated rats using a flow cytometric assay and determined the spectra of mutations in the endogenous Pig-a gene in these cells. CD48-deficient T-cells were seeded by sorting at one cell per well into 96-well plates, expanded into clones, and exons of their genomic Pig-a were sequenced. The majority (78%) of CD48-deficient T-cell clones from DMBA-treated rats had mutations in the Pig-a gene. The spectrum of DMBA-induced Pig-a mutations was dominated by mutations at A:T, with the mutated A being on the nontranscribed strand and A → T transversion being the most frequent change. The spectrum of Pig-a mutations in DMBA-treated rats was different from the spectrum of Pig-a mutations in N-ethyl-N-nitrosourea (ENU)-treated rats, but similar to the spectrum of DMBA mutations for another endogenous X-linked gene, Hprt. Only 15% of CD48-deficient mutants from control animals contained Pig-a mutations; T-cell biology may be responsible for a relatively large fraction of false Pig-a mutant lymphocytes in control animals. Among the verified mutants from control rats, the most common were frameshifts and deletions. The differences in the spectra of spontaneous, DMBA-, and ENU-induced Pig-a mutations suggest that the flow cytometric Pig-a assay detects de novo mutation in the endogenous Pig-a gene. © 2015 Wiley Periodicals, Inc.

  8. Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs

    Science.gov (United States)

    Yang, Wei; Chen, Huapu; Cui, Xuefan; Zhang, Kewei; Jiang, Dongneng; Deng, Siping; Zhu, Chunhua; Li, Guangli

    2017-09-01

    Spotted scat (Scatophagus argus) is an economically important farmed fish, particularly in East and Southeast Asia. Because there has been little research on reproductive development and regulation in this species, the lack of a mature artificial reproduction technology remains a barrier for the sustainable development of the aquaculture industry. More genetic and genomic background knowledge is urgently needed for an in-depth understanding of the molecular mechanism of reproductive process and identification of functional genes related to sexual differentiation, gonad maturation and gametogenesis. For these reasons, we performed transcriptomic analysis on spotted scat using a multiple tissue sample mixing strategy. The Illumina RNA sequencing generated 118 510 486 raw reads. After trimming, de novo assembly was performed and yielded 99 888 unigenes with an average length of 905.75 bp. A total of 45 015 unigenes were successfully annotated to the Nr, Swiss-Prot, KOG and KEGG databases. Additionally, 23 783 and 27 183 annotated unigenes were assigned to 56 Gene Ontology (GO) functional groups and 228 KEGG pathways, respectively. Subsequently, 2 474 transcripts associated with reproduction were selected using GO term and KEGG pathway assignments, and a number of reproduction-related genes involved in sex differentiation, gonad development and gametogenesis were identified. Furthermore, 22 279 simple sequence repeat (SSR) loci were discovered and characterized. The comprehensive transcript dataset described here greatly increases the genetic information available for spotted scat and contributes valuable sequence resources for functional gene mining and analysis. Candidate transcripts involved in reproduction would make good starting points for future studies on reproductive mechanisms, and the putative sex differentiation-related genes will be helpful for sex-determining gene identification and sex-specific marker isolation. Lastly, the SSRs can serve as marker

  9. De novo Transcriptome Assembly of Floral Buds of Pineapple and Identification of Differentially Expressed Genes in Response to Ethephon Induction

    Science.gov (United States)

    Liu, Chuan-He; Fan, Chao

    2016-01-01

    A remarkable characteristic of pineapple is its ability to undergo floral induction in response to external ethylene stimulation. However, little information is available regarding the molecular mechanism underlying this process. In this study, the differentially expressed genes (DEGs) in plants exposed to 1.80 mL·L−1 (T1) or 2.40 mL·L−1 ethephon (T2) compared with Ct plants (control, cleaning water) were identified using RNA-seq and gene expression profiling. Illumina sequencing generated 65,825,224 high-quality reads that were assembled into 129,594 unigenes with an average sequence length of 1173 bp. Of these unigenes, 24,775 were assigned to specific KEGG pathways, of which metabolic pathways and biosynthesis of secondary metabolites were the most highly represented. Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority were involved in metabolic and cellular processes, cell and cell part, catalytic activity and binding. Gene expression profiling analysis revealed 3788, 3062, and 758 DEGs in the comparisons of T1 with Ct, T2 with Ct, and T2 with T1, respectively. GO analysis indicated that these DEGs were predominantly annotated to metabolic and cellular processes, cell and cell part, catalytic activity, and binding. KEGG pathway analysis revealed the enrichment of several important pathways among the DEGs, including metabolic pathways, biosynthesis of secondary metabolites and plant hormone signal transduction. Thirteen DEGs were identified as candidate genes associated with the process of floral induction by ethephon, including three ERF-like genes, one ETR-like gene, one LTI-like gene, one FT-like gene, one VRN1-like gene, three FRI-like genes, one AP1-like gene, one CAL-like gene, and one AG-like gene. qPCR analysis indicated that the changes in the expression of these 13 candidate genes were consistent with the alterations in the corresponding RPKM values, confirming the accuracy and credibility of the RNA-seq and gene

  10. RNA-seq de novo Assembly Reveals Differential Gene Expression in Glossina palpalis gambiensis Infected with Trypanosoma brucei gambiense vs. Non-Infected and Self-Cured Flies.

    Science.gov (United States)

    Hamidou Soumana, Illiassou; Klopp, Christophe; Ravel, Sophie; Nabihoudine, Ibouniyamine; Tchicaya, Bernadette; Parrinello, Hugues; Abate, Luc; Rialle, Stéphanie; Geiger, Anne

    2015-01-01

    Trypanosoma brucei gambiense (Tbg), which causes the chronic form of sleeping sickness, completes its developmental cycle within the tsetse fly vector Glossina palpalis gambiensis (Gpg) before its transmission to humans. Within the framework of an anti-vector disease control strategy, global gene expression profiling of the midguts of trypanosome-infected (susceptible), non-infected, and self-cured (refractory) tsetse flies was performed to determine the differential gene expression resulting from in vivo interactions between trypanosomes and tsetse flies (and their microbiome). An RNA-seq de novo assembly was achieved. The assembled transcripts were mapped to reference sequences for functional annotation. Twenty-four percent of the 16,936 contigs could not be annotated, possibly representing untranslated mRNA regions, or Gpg- or Tbg-specific ORFs. The remaining contigs were classified into 65 functional groups. Only a few transposable elements were present in the Gpg midgut transcriptome, which may represent active transpositions and play regulatory roles. A total of 1,373 differentially expressed genes (DEGs) between stimulated and non-stimulated flies were identified at day 3 post-feeding; 52 and 1,025 DEGs were identified between infected and self-cured flies at 10 and 20 days post-feeding, respectively. The possible roles of several DEGs regarding fly susceptibility and refractoriness are discussed. The results provide new means to decipher fly infection mechanisms, which is crucial for developing anti-vector control strategies.

  11. De novo 454 sequencing of barcoded BAC pools for comprehensive gene survey and genome analysis in the complex genome of barley

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2009-11-01

    Full Text Available BACKGROUND: De novo sequencing of the entire genome of a large, complex plant genome like that of barley (Hordeum vulgare L.) is a major challenge both in terms of experimental feasibility and costs. The emergence and breathtaking progress of next generation sequencing technologies has put this goal into focus, and a clone based strategy combined with the 454/Roche technology is conceivable. RESULTS: To test the feasibility, we sequenced 91 barcoded, pooled, gene containing barley BACs using the GS FLX platform and assembled the sequences under iterative change of parameters. The BAC assemblies were characterized by an N50 of ~50 kb (N80 ~31 kb, N90 ~21 kb) and a Q40 of 94%. For ~80% of the clones, the best assemblies consisted of fewer than 10 contigs at 24-fold mean sequence coverage. Moreover, we show that gene containing regions seem to assemble completely and uninterrupted, thus making the approach suitable for detecting complete and positionally anchored genes. By comparing the assemblies of four clones to their complete reference sequences generated by the Sanger method, we evaluated the distribution, quality and representativeness of the 454 sequences as well as the consistency and reliability of the assemblies. CONCLUSION: The described multiplex 454 sequencing of barcoded BACs leads to consensus sequences highly representative of the clones. Assemblies are correct for the majority of contigs. Though the resolution of complex repetitive structures requires additional experimental efforts, our approach paves the way for a clone based strategy of sequencing the barley genome.
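    The N50, N80 and N90 values quoted in this abstract are contiguity statistics: Nx is the contig length at which contigs of that length or longer together cover at least x% of the total assembly length. The Python sketch below shows one common way to compute it; the function name nx and the contig lengths are hypothetical and are not data from the study.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of contig lengths.

    Nx is the length L such that contigs of length >= L together
    account for at least x percent of the summed assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for length in ordered:
        running += length
        if running >= threshold:
            return length

# Hypothetical contig lengths in bp (illustration only).
contigs = [80, 70, 50, 40, 30, 20, 10]
print(nx(contigs, 50), nx(contigs, 80), nx(contigs, 90))  # prints: 70 40 30
```

    Because a higher x requires covering more of the assembly with progressively shorter contigs, N90 can never exceed N80, which in turn can never exceed N50, matching the ordering of the reported values (~50 kb, ~31 kb, ~21 kb).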

  12. Challenging the dogma: the hidden layer of non-protein-coding RNAs in complex organisms.

    Science.gov (United States)

    Mattick, John S

    2003-10-01

    The central dogma of biology holds that genetic information normally flows from DNA to RNA to protein. As a consequence it has been generally assumed that genes generally code for proteins, and that proteins fulfil not only most structural and catalytic but also most regulatory functions, in all cells, from microbes to mammals. However, the latter may not be the case in complex organisms. A number of startling observations about the extent of non-protein-coding RNA (ncRNA) transcription in the higher eukaryotes and the range of genetic and epigenetic phenomena that are RNA-directed suggests that the traditional view of the structure of genetic regulatory systems in animals and plants may be incorrect. ncRNA dominates the genomic output of the higher organisms and has been shown to control chromosome architecture, mRNA turnover and the developmental timing of protein expression, and may also regulate transcription and alternative splicing. This paper re-examines the available evidence and suggests a new framework for considering and understanding the genomic programming of biological complexity, autopoietic development and phenotypic variation. Copyright 2003 Wiley Periodicals, Inc.

  13. The first de novo mutation of the connexin 32 gene associated with X linked Charcot-Marie-Tooth disease

    NARCIS (Netherlands)

    Meggouh, F.; Benomar, A.; Rouger, H.; Tardieu, S.; Birouk, N.; Tassin, J.; Barhoumi, C.; Yahyaoui, M.; Chkili, T.; Brice, A.; LeGuern, E.

    1998-01-01

    X linked Charcot-Marie-Tooth disease (CMTX) is a hereditary motor and sensory neuropathy caused by mutations in the connexin 32 gene (Cx32). Using the SSCP technique and direct sequencing of PCR amplified genomic DNA fragments of the Cx32 gene from a Moroccan patient and her relatives, we identified

  14. De novo analysis of Wolfiporia cocos transcriptome to reveal the differentially expressed carbohydrate-active enzymes (CAZymes) genes during the early stage of sclerotial growth

    Directory of Open Access Journals (Sweden)

    Shaopeng Zhang

    2016-02-01

    Full Text Available The sclerotium of Wolfiporia cocos has been used as an edible mushroom and/or a traditional herbal medicine for centuries. W. cocos sclerotial formation is dependent on parasitism of the wood of Pinus species. Currently, the sclerotial development mechanisms of W. cocos remain largely unknown, and the lack of pine resources limits commercial production. The CAZymes (carbohydrate-active enzymes) play important roles in degradation of the plant cell wall to provide carbohydrates for fungal growth, development and reproduction. In this study, the transcript profiles from W. cocos mycelium and two-month-old sclerotium, the early stage of sclerotial growth, were analyzed using de novo sequencing technology. A total of 142,428,180 high-quality reads of mycelium and 70,594,319 high-quality reads of two-month-old sclerotium were obtained. Additionally, differentially expressed genes from the W. cocos mycelium and two-month-old sclerotium stages were analyzed, resulting in identification of 69 CAZymes genes which were significantly up-regulated during the early stage of sclerotial growth compared with the mycelium stage. More than half of them belonged to the glycosyl hydrolases (GHs) family, indicating the importance of the W. cocos GHs family for degrading pine wood. qRT-PCR was further used to confirm the expression pattern of these up-regulated CAZymes genes. Our results will provide comprehensive CAZymes gene expression information during W. cocos sclerotial growth at the transcriptional level and will lay a foundation for functional gene studies in this fungus. In addition, our study will also facilitate the efficient use of limited pine resources, which is significant for promoting steady development of the Chinese W. cocos industry.

  15. The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads.

    Science.gov (United States)

    Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo; Zhu, Shilin; Shi, Daihu; McDill, Joshua; Yang, Linfeng; Hawkins, Simon; Neutelings, Godfrey; Datla, Raju; Lambert, Georgina; Galbraith, David W; Grassa, Christopher J; Geraldes, Armando; Cronk, Quentin C; Cullis, Christopher; Dash, Prasanta K; Kumar, Polumetla A; Cloutier, Sylvie; Sharpe, Andrew G; Wong, Gane K-S; Wang, Jun; Deyholos, Michael K

    2012-11-01

    Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N(50) = 694 kb, including contigs with N(50) = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43,384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (K(s)) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.

  16. A de novo 1q22q23.1 Interstitial Microdeletion in a Girl with Intellectual Disability and Multiple Congenital Anomalies Including Congenital Heart Defect.

    Science.gov (United States)

    Aleksiūnienė, Beata; Preiksaitiene, Egle; Morkūnienė, Aušra; Ambrozaitytė, Laima; Utkus, Algirdas

    2018-01-01

    Many studies have shown that molecular karyotyping is an effective diagnostic tool in individuals with developmental delay/intellectual disability. We report on a de novo interstitial 1q22q23.1 microdeletion, 1.6 Mb in size, detected in a patient with short stature, microcephaly, hypoplastic corpus callosum, cleft palate, minor facial anomalies, congenital heart defect, camptodactyly of the 4-5th fingers, and intellectual disability. Chromosomal microarray analysis revealed a 1.6-Mb deletion in the 1q22q23.1 region, arr[GRCh37] 1q22q23.1(155630752_157193893)×1. Real-time PCR analysis confirmed its de novo origin. The deleted region encompasses 50 protein-coding genes, including the morbid genes APOA1BP, ARHGEF2, LAMTOR2, LMNA, NTRK1, PRCC, RIT1, SEMA4A, and YY1AP1. Although the unique phenotype observed in our patient can arise from the haploinsufficiency of the dosage-sensitive LMNA gene, the dosage imbalance of other genes implicated in the rearrangement could also contribute to the phenotype. Further studies are required for the delineation of the phenotype associated with this rare chromosomal alteration and elucidation of the critical genes for manifestation of the specific clinical features. © 2018 S. Karger AG, Basel.
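    The stated deletion size can be checked against the array coordinates given in the ISCN-style notation above. The sketch below assumes that both coordinates are 1-based and inclusive, which the abstract does not state explicitly:

```python
def span_bp(start, end):
    """Length in base pairs of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates from arr[GRCh37] 1q22q23.1(155630752_157193893)x1
deletion_bp = span_bp(155630752, 157193893)
print(deletion_bp)                      # size in bp
print(round(deletion_bp / 1e6, 1))      # size in Mb, one decimal place
```

    Under that assumption the interval spans 1,563,142 bp, which rounds to the 1.6 Mb reported in the abstract.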

  17. Sequencing, De Novo Assembly, and Annotation of the Transcriptome of the Endangered Freshwater Pearl Bivalve, Cristaria plicata, Provides Novel Insights into Functional Genes and Marker Discovery.

    Directory of Open Access Journals (Sweden)

    Bharat Bhusan Patnaik

    Full Text Available The freshwater mussel Cristaria plicata (Bivalvia: Eulamellibranchia: Unionidae) is an economically important species in molluscan aquaculture due to its use in pearl farming. The species has been listed as endangered in South Korea due to the loss of natural habitats caused by anthropogenic activities. The decreasing population and a lack of genomic information on the species are concerning for environmentalists and conservationists. In this study, we conducted a de novo transcriptome sequencing and annotation analysis of C. plicata using Illumina HiSeq 2500 next-generation sequencing (NGS) technology, the Trinity assembler, and bioinformatics databases to prepare a sustainable resource for the identification of candidate genes involved in immunity, defense, and reproduction. The C. plicata transcriptome analysis included a total of 286,152,584 raw reads and 281,322,837 clean reads. The de novo assembly identified a total of 453,931 contigs and 374,794 non-redundant unigenes with average lengths of 731.2 and 737.1 bp, respectively. Furthermore, 100% coverage of C. plicata mitochondrial genes within two unigenes supported the quality of the assembler. In total, 84,274 unigenes showed homology to entries in at least one database, and 23,246 unigenes were allocated to one or more Gene Ontology (GO) terms. The most prominent GO biological process, cellular component, and molecular function categories (level 2) were cellular process, membrane, and binding, respectively. A total of 4,776 unigenes were mapped to 123 biological pathways in the KEGG database. Based on the GO terms and KEGG annotation, the unigenes were suggested to be involved in immunity, stress responses, sex-determination, and reproduction. A total of 17,251 cDNA simple sequence repeats (cSSRs) were identified from 61,141 unigenes (size of >1 kb), with the most abundant being dinucleotide repeats. This dataset represents the first transcriptome analysis of the endangered mollusc, C. plicata

  18. De novo Transcriptome Assembly of Chinese Kale and Global Expression Analysis of Genes Involved in Glucosinolate Metabolism in Multiple Tissues

    Science.gov (United States)

    Wu, Shuanghua; Lei, Jianjun; Chen, Guoju; Chen, Hancai; Cao, Bihao; Chen, Changming

    2017-01-01

    Chinese kale, a vegetable of the cruciferous family, is a popular crop in southern China and Southeast Asia due to its high glucosinolate content and nutritional qualities. However, there is little research on the molecular genetics and genes involved in glucosinolate metabolism and its regulation in Chinese kale. In this study, we sequenced and characterized the transcriptomes and expression profiles of genes expressed in 11 tissues of Chinese kale. A total of 216 million 150-bp clean reads were generated using RNA-sequencing technology. From the sequences, 98,180 unigenes were assembled for the whole plant, and 49,582~98,423 unigenes were assembled for each tissue. Blast analysis indicated that a total of 80,688 (82.18%) unigenes exhibited similarity to known proteins. The functional annotation and classification tools used in this study suggested that genes principally expressed in Chinese kale, were mostly involved in fundamental processes, such as cellular and molecular functions, the signal transduction, and biosynthesis of secondary metabolites. The expression levels of all unigenes were analyzed in various tissues of Chinese kale. A large number of candidate genes involved in glucosinolate metabolism and its regulation were identified, and the expression patterns of these genes were analyzed. We found that most of the genes involved in glucosinolate biosynthesis were highly expressed in the root, petiole, and in senescent leaves. The expression patterns of ten glucosinolate biosynthetic genes from RNA-seq were validated by quantitative RT-PCR in different tissues. These results provided an initial and global overview of Chinese kale gene functions and expression activities in different tissues. PMID:28228764

  19. Extracellular Hsp90 serves as a co-factor for MAPK activation and latent viral gene expression during de novo infection by KSHV

    International Nuclear Information System (INIS)

    Qin Zhiqiang; DeFee, Michael; Isaacs, Jennifer S.; Parsons, Chris

    2010-01-01

    The Kaposi's sarcoma-associated herpesvirus (KSHV) is the causative agent of Kaposi's sarcoma (KS), an important cause of morbidity and mortality in immunocompromised patients. KSHV interaction with the cell membrane triggers activation of specific intracellular signal transduction pathways to facilitate virus entry, nuclear trafficking, and ultimately viral oncogene expression. Extracellular heat shock protein 90 localizes to the cell surface (csHsp90) and facilitates signal transduction in cancer cell lines, but whether csHsp90 assists in the coordination of KSHV gene expression through these or other mechanisms is unknown. Using a recently characterized non-permeable inhibitor specifically targeting csHsp90 and Hsp90-specific antibodies, we show that csHsp90 inhibition suppresses KSHV gene expression during de novo infection, and that this effect is mediated largely through the inhibition of mitogen-activated protein kinase (MAPK) activation by KSHV. Moreover, we show that targeting csHsp90 reduces constitutive MAPK expression and the release of infectious viral particles by patient-derived, KSHV-infected primary effusion lymphoma cells. These data suggest that csHsp90 serves as an important co-factor for KSHV-initiated MAPK activation and provide proof-of-concept for the potential benefit of targeting csHsp90 for the treatment or prevention of KSHV-associated illnesses.

  20. A de novo whole gene deletion of XIAP detected by exome sequencing analysis in very early onset inflammatory bowel disease: a case report.

    Science.gov (United States)

    Kelsen, Judith R; Dawany, Noor; Martinez, Alejandro; Grochowski, Christopher M; Maurer, Kelly; Rappaport, Eric; Piccoli, David A; Baldassano, Robert N; Mamula, Petar; Sullivan, Kathleen E; Devoto, Marcella

    2015-11-18

    Children with very early-onset inflammatory bowel disease (VEO-IBD), those diagnosed at less than 5 years of age, are a unique population. A subset of these patients present with a distinct phenotype and more severe disease than older children and adults. Host genetics is thought to play a more prominent role in this young population, and monogenic defects in genes related to primary immunodeficiencies are responsible for the disease in a small subset of patients with VEO-IBD. We report a child who presented at 3 weeks of life with VEO-IBD. He had a complicated disease course and remained unresponsive to medical and surgical therapy. The refractory nature of his disease, together with his young age of presentation, prompted utilization of whole exome sequencing (WES) to detect an underlying monogenic primary immunodeficiency and potentially target therapy to the identified defect. Copy number variation (CNV) analysis was performed using the eXome-Hidden Markov Model. Whole exome sequencing revealed 1,380 nonsense and missense variants in the patient. Plausible candidate variants were not detected following analysis of filtered variants; therefore, we performed CNV analysis of the WES data, which led us to identify a de novo whole gene deletion in XIAP. This is the first reported whole gene deletion in XIAP, the causal gene responsible for XLP2 (X-linked lymphoproliferative disease 2). XLP2 is a syndrome resulting in VEO-IBD and can increase susceptibility to hemophagocytic lymphohistiocytosis (HLH). This identification allowed the patient to be referred for bone marrow transplantation, potentially curative for his disease and critical to prevent the catastrophic sequelae of HLH. This illustrates the unique etiology of VEO-IBD, and the subsequent effects on therapeutic options. This cohort requires careful and thorough evaluation for monogenic defects and primary immunodeficiencies.

  1. Characterization of the 'Xiangshui' lemon transcriptome by de novo assembly to discover genes associated with self-incompatibility.

    Science.gov (United States)

    Zhang, Shuwei; Ding, Feng; He, Xinhua; Luo, Cong; Huang, Guixiang; Hu, Ying

    2015-02-01

    Seedlessness is a desirable character in lemons and other citrus species. Seedless fruit can be induced in many ways, including through self-incompatibility (SI). SI is widely used as an intraspecific reproductive barrier that prevents self-fertilization in flowering plants. Although there have been many studies on SI, its mechanism remains unclear. The 'Xiangshui' lemon is an important seedless cultivar whose seedlessness has been caused by SI. It is essential to identify genes involved in SI in 'Xiangshui' lemon to clarify its molecular mechanism. In this study, candidate genes associated with SI were identified using high-throughput Illumina RNA sequencing (RNA-seq). A total of 61,224 unigenes were obtained (average, 948 bp; N50 of 1,457 bp), among which 47,260 unigenes were annotated by comparison to six public databases (Nr, Nt, Swiss-Prot, KEGG, COG, and GO). Differentially expressed genes were identified by comparing the transcriptomes of no-, self-, and cross-pollinated stigmas with styles of the 'Xiangshui' lemon. Several differentially expressed genes that might be associated with SI were identified, such as those involved in pollen tube growth, programmed cell death, signal transduction, and transcription. NADPH oxidase genes associated with apoptosis were highly upregulated in the self-pollinated transcriptome. The expression pattern of 12 genes was analyzed by quantitative real-time polymerase chain reaction. A putative S-RNase gene was identified that had not been previously associated with self-pollen rejection in lemon or citrus. This study provided a transcriptome dataset for further studies of SI and seedless lemon breeding.

  2. De novo assembly and analysis of the Artemisia argyi transcriptome and identification of genes involved in terpenoid biosynthesis.

    Science.gov (United States)

    Liu, Miaomiao; Zhu, Jinhang; Wu, Shengbing; Wang, Chenkai; Guo, Xingyi; Wu, Jiawen; Zhou, Meiqi

    2018-04-11

    Artemisia argyi Lev. et Vant. (A. argyi) is widely utilized for moxibustion in Chinese medicine, and the mechanism underlying terpenoid biosynthesis in its leaves is suggested to play an important role in its medicinal use. However, the A. argyi transcriptome has not been sequenced. Herein, we performed RNA sequencing for A. argyi leaf, root and stem tissues to identify as many as possible of the transcribed genes. In total, 99,807 unigenes were assembled by analysing the expression profiles generated from the three tissue types, and 67,446 of those unigenes were annotated in public databases. We further performed differential gene expression analysis to compare leaf tissue with the other two tissue types and identified numerous genes that were specifically expressed or up-regulated in leaf tissue. Specifically, we identified multiple genes encoding significant enzymes or transcription factors related to terpenoid synthesis. This study serves as a valuable resource for transcriptome information, as many transcribed genes related to terpenoid biosynthesis were identified in the A. argyi transcriptome, providing a functional genomic basis for additional studies on molecular mechanisms underlying the medicinal use of A. argyi.

  3. De novo deletion of HOXB gene cluster in a patient with failure to thrive, developmental delay, gastroesophageal reflux and bronchiectasis.

    Science.gov (United States)

    Pajusalu, Sander; Reimand, Tiia; Uibo, Oivi; Vasar, Maire; Talvik, Inga; Zilina, Olga; Tammur, Pille; Õunap, Katrin

    2015-01-01

    We report a female patient with a complex phenotype consisting of failure to thrive, developmental delay, congenital bronchiectasis, gastroesophageal reflux and bilateral inguinal hernias. Chromosomal microarray analysis revealed a 230 kilobase deletion in chromosomal region 17q21.32 (arr[hg19] 17q21.32(46 550 362-46 784 039)×1) encompassing only 9 genes - HOXB1 to HOXB9. The deletion was not found in her mother or father. This is the first report of a patient with a HOXB gene cluster deletion involving only HOXB1 to HOXB9 genes. By comparing our case to previously reported five patients with larger chromosomal aberrations involving the HOXB gene cluster, we can suppose that HOXB gene cluster deletions are responsible for growth retardation, developmental delay, and specific facial dysmorphic features. Also, we suppose that bilateral inguinal hernias, tracheo-esophageal abnormalities, and lung malformations represent features with incomplete penetrance. Interestingly, previously published knock-out mice with targeted heterozygous deletion comparable to our patient did not show phenotypic alterations. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  4. Severe neonatal marfan syndrome resulting from a De Novo 3-bp insertion into the fibrillin gene on chromosome 15

    Energy Technology Data Exchange (ETDEWEB)

    Milewicz, D.M.; Duvic, M. (Univ. of Texas Medical School, Houston, TX (United States))

    1994-03-01

    Severe neonatal Marfan syndrome has features of the Marfan syndrome and congenital contractural arachnodactyly present at birth, along with unique features such as loose, redundant skin and pulmonary emphysema. Since the Marfan syndrome and congenital contractural arachnodactyly are due to mutations in different genes, it has been uncertain whether neonatal Marfan syndrome is due to mutations in the fibrillin gene on chromosome 15 or in another gene. The authors studied an infant with severe neonatal Marfan syndrome. Dermal fibroblasts were metabolically labeled and found to secrete fibrillin inefficiently when compared with control cells. Reverse transcription and amplification of the proband's fibroblast RNA were used to identify a 3-bp insertion between nucleotides 480-481 or 481-482 of the fibrillin cDNA. The insertion maintains the reading frame of the protein and inserts a cysteine between amino acids 160 and 161 in an epidermal growth-factor-like motif of fibrillin. This 3-bp insertion was not found in the fibrillin gene in 70 unrelated, unaffected individuals and 11 unrelated individuals with the Marfan syndrome. The authors conclude that neonatal Marfan syndrome is the result of mutations in the fibrillin gene on chromosome 15 and is part of the Marfan syndrome spectrum. 32 refs., 3 figs.
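    The statement that a 3-bp insertion after nucleotide 480 preserves the reading frame and adds one residue between amino acids 160 and 161 follows from 480 = 3 × 160. The sketch below illustrates this on a made-up coding sequence, not on the fibrillin cDNA; it assumes that nucleotide numbering starts at the first coding base and that the inserted codon is TGC (one of the two cysteine codons), neither of which is stated in the abstract:

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds):
    """Translate a coding sequence codon by codon, ignoring any trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical 200-codon coding sequence (Met followed by 199 Ala codons).
wild_cds = "ATG" + "GCT" * 199

# Insert a cysteine codon after nucleotide 480, i.e. directly after codon 160.
insert_after = 480
mutant_cds = wild_cds[:insert_after] + "TGC" + wild_cds[insert_after:]

wild = translate(wild_cds)
mutant = translate(mutant_cds)

print(len(wild), len(mutant))      # the mutant protein is one residue longer
print(mutant[158:163])             # residues 159-163 of the mutant protein
```

    Residues 1-160 are unchanged, the new cysteine occupies position 161, and everything downstream is shifted by one residue but otherwise identical, which is what "maintains the reading frame" means here.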

  5. De Novo Transcriptome Analysis of Plant Pathogenic Fungus Myrothecium roridum and Identification of Genes Associated with Trichothecene Mycotoxin Biosynthesis

    Directory of Open Access Journals (Sweden)

    Wei Ye

    2017-02-01

    Full Text Available Myrothecium roridum is a plant pathogenic fungus that infects different crops and decreases the yield of economically important crops, including soybean, cotton, corn, pepper, and tomato. Until now, the pathogenic mechanism of M. roridum has remained unclear. Different types of trichothecene mycotoxins have been isolated from M. roridum, and trichothecene is considered a plant pathogenic factor of M. roridum. In this study, the transcriptome of M. roridum at different incubation durations was sequenced using an Illumina Hiseq 2000. A total of 35,485 transcripts and 25,996 unigenes for M. roridum were obtained from 8.0 Gb of clean reads. The protein–protein network of the M. roridum transcriptome indicated that the mitogen-activated protein kinase signal pathway also played an important role in the pathogenicity of M. roridum. The genes related to trichothecene biosynthesis were annotated. The expression levels of these genes were also predicted and validated through quantitative real-time polymerase chain reaction. The Tri5 gene encoding trichodiene synthase was cloned and expressed, and the purified trichodiene synthase was able to catalyze the conversion of farnesyl pyrophosphate into different kinds of sesquiterpenoids. Tri4 and Tri11 genes were expressed in Escherichia coli, and their corresponding enzymatic properties were characterized. The phylogenetic tree of trichodiene synthase showed a great discrepancy between the trichodiene synthase from M. roridum and those of other species. Our study on the genes related to trichothecene biosynthesis establishes a foundation for M. roridum hazard prevention, thus improving the yields of economically important crops.

  6. De Novo Transcriptomes of Forsythia koreana Using a Novel Assembly Method: Insight into Tissue- and Species-Specific Expression of Lignan Biosynthesis-Related Gene.

    Directory of Open Access Journals (Sweden)

    Akira Shiraishi

    Full Text Available Forsythia spp. are perennial woody plants that are among the most extensively used medicinal sources for Chinese medicines and functional diets owing to their lignan contents. Lignans have received widespread attention as leading compounds in the development of antitumor drugs and healthy diets for reducing the risks of lifestyle-related diseases. However, the molecular basis of Forsythia has yet to be established. In this study, we have verified the de novo deep transcriptome of Forsythia koreana leaf and callus using the Illumina HiSeq 1500 platform. A total of 89 million reads were assembled into 116,824 contigs using Trinity, and 1,576 of the contigs displayed sequence similarity to the enzymes responsible for plant specialized metabolism including lignan biosynthesis. Notably, gene ontology (GO) analysis indicated the remarkable enrichment of lignan-biosynthetic enzyme genes in the callus transcriptome. Nevertheless, precise annotation and molecular phylogenetic analyses were hindered by partial sequences of open reading frames (ORFs) of the Trinity-based contigs. To obtain more numerous contigs harboring a full-length ORF, we developed a novel overlapping layout consensus-based procedure, virtual primer-based sequence reassembly (VP-seq). VP-seq elucidated 709 full-length ORFs, whereas only 146 full-length ORFs were assembled by Trinity. The comparison of expression profiles of leaf and callus using VP-seq-based full-length ORFs revealed 50-fold upregulation of secoisolariciresinol dehydrogenase (SIRD) in callus. Expression and phylogenetic cluster analyses predicted candidates for matairesinol-glucosylating enzymes. We also performed VP-seq analysis of lignan-biosynthetic enzyme genes in the transcriptome data of other lignan-rich plants, Linum flavum, Linum usitatissimum and Podophyllum hexandrum. The comparative analysis indicated both common gene clusters involved in biosynthesis upstream of matairesinol such as SIRD and plant lineage

  7. De Novo Transcriptome Sequencing in Passiflora edulis Sims to Identify Genes and Signaling Pathways Involved in Cold Tolerance

    Directory of Open Access Journals (Sweden)

    Sian Liu

    2017-11-01

    Full Text Available The passion fruit (Passiflora edulis Sims), also known as the purple granadilla, is widely cultivated as the new darling of the fruit market throughout southern China. This exotic and perennial climber is adapted to warm and humid climates, and thus is generally intolerant of cold. There is limited information about gene regulation and signaling pathways related to the cold stress response in this species. In this study, two transcriptome libraries (KEDU_AP vs. GX_AP) were constructed from the aerial parts of cold-tolerant and cold-susceptible varieties of P. edulis, respectively. Overall, 126,284,018 clean reads were obtained, and 86,880 unigenes with a mean size of 1449 bp were assembled. Of these, there were 64,067 (73.74%) unigenes with significant similarity to publicly available plant protein sequences. Expression profiles were generated, and 3045 genes were found to be significantly differentially expressed between the KEDU_AP and GX_AP libraries, including 1075 (35.3%) up-regulated and 1970 (64.7%) down-regulated. These included 36 genes in enriched pathways of plant hormone signal transduction, and 56 genes encoding putative transcription factors. Six genes involved in the ICE1–CBF–COR pathway were induced in the cold-tolerant variety, and their expression levels were further verified using quantitative real-time PCR. This report is the first to identify genes and signaling pathways involved in cold tolerance using high-throughput transcriptome sequencing in P. edulis. These findings may provide useful insights into the molecular mechanisms regulating cold tolerance and genetic breeding in Passiflora spp.

  8. Identification of a repressor gene involved in the regulation of NAD de novo biosynthesis in Salmonella typhimurium.

    OpenAIRE

    Zhu, N; Olivera, B M; Roth, J R

    1988-01-01

    Mutations at the nadI locus affect expression of the first two genes of NAD synthesis, nadA and nadB, which are unlinked. Genetic data imply that the regulatory effects of nadI mutations are not due to indirect consequences of physiological alterations. Two types of mutations map in the nadI region. Common null mutations (nadI) show constitutive high-level expression of the nadB and nadA genes. Rare nadIs mutations cause constitutive low-level expression of nadB and nadA. Some nadIs mutations...

  9. Genome-Wide Analysis of Secondary Metabolite Gene Clusters in Ophiostoma ulmi and Ophiostoma novo-ulmi Reveals a Fujikurin-Like Gene Cluster with a Putative Role in Infection

    Directory of Open Access Journals (Sweden)

    Nicolau Sbaraini

    2017-06-01

    Full Text Available The emergence of new microbial pathogens can result in destructive outbreaks, since their hosts have limited resistance and pathogens may be excessively aggressive. Described as the major ecological incident of the twentieth century, Dutch elm disease, caused by ascomycete fungi from the Ophiostoma genus, has caused a significant decline in elm tree populations (Ulmus sp.) in North America and Europe. Genome sequencing of the two main causative agents of Dutch elm disease (Ophiostoma ulmi and Ophiostoma novo-ulmi), along with closely related species with different lifestyles, allows for unique comparisons to be made to identify how pathogens and virulence determinants have emerged. Among several established virulence determinants, secondary metabolites (SMs) have been suggested to play significant roles during phytopathogen infection. Interestingly, the secondary metabolism of Dutch elm pathogens remains almost unexplored, and little is known about how SM biosynthetic genes are organized in these species. To better understand the metabolic potential of O. ulmi and O. novo-ulmi, we performed a deep survey and description of SM biosynthetic gene clusters (BGCs) in these species and assessed their conservation among eight species from the Ophiostomataceae family. Among 19 identified BGCs, a fujikurin-like gene cluster (OpPKS8) was unique to Dutch elm pathogens. Phylogenetic analysis revealed that orthologs for this gene cluster are widespread among phytopathogens and plant-associated fungi, suggesting that OpPKS8 may have been horizontally acquired by the Ophiostoma genus. Moreover, the detailed identification of several BGCs paves the way for future in-depth research and supports the potential impact of secondary metabolism on the lifestyle of the Ophiostoma genus.

  10. De novo clustering methods outperform reference-based methods for assigning 16S rRNA gene sequences to operational taxonomic units

    Directory of Open Access Journals (Sweden)

    Sarah L. Westcott

    2015-12-01

    Full Text Available Background. 16S rRNA gene sequences are routinely assigned to operational taxonomic units (OTUs that are then used to analyze complex microbial communities. A number of methods have been employed to carry out the assignment of 16S rRNA gene sequences to OTUs leading to confusion over which method is optimal. A recent study suggested that a clustering method should be selected based on its ability to generate stable OTU assignments that do not change as additional sequences are added to the dataset. In contrast, we contend that the quality of the OTU assignments, the ability of the method to properly represent the distances between the sequences, is more important.Methods. Our analysis implemented six de novo clustering algorithms including the single linkage, complete linkage, average linkage, abundance-based greedy clustering, distance-based greedy clustering, and Swarm and the open and closed-reference methods. Using two previously published datasets we used the Matthew’s Correlation Coefficient (MCC to assess the stability and quality of OTU assignments.Results. The stability of OTU assignments did not reflect the quality of the assignments. Depending on the dataset being analyzed, the average linkage and the distance and abundance-based greedy clustering methods generated OTUs that were more likely to represent the actual distances between sequences than the open and closed-reference methods. We also demonstrated that for the greedy algorithms VSEARCH produced assignments that were comparable to those produced by USEARCH making VSEARCH a viable free and open source alternative to USEARCH. Further interrogation of the reference-based methods indicated that when USEARCH or VSEARCH were used to identify the closest reference, the OTU assignments were sensitive to the order of the reference sequences because the reference sequences can be identical over the region being considered. 
More troubling was the observation that while both USEARCH and
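
    The abstract above scores OTU assignments with the Matthews correlation coefficient. A minimal sketch of how such a score could be computed follows. It assumes (this is not stated in the abstract) that every pair of sequences is classified by whether its distance falls within a threshold and whether both sequences share an OTU; the function names, the toy distances, and the 0.03 threshold are illustrative only.

```python
from itertools import combinations
from math import sqrt


def pairwise_confusion(distances, otu_of, threshold=0.03):
    """Count sequence pairs by (within threshold?) x (same OTU?).

    Assumed convention: a pair that is close AND clustered together is a
    true positive; a pair that is far AND split apart is a true negative.
    """
    tp = tn = fp = fn = 0
    for a, b in combinations(sorted(otu_of), 2):
        close = distances[frozenset((a, b))] <= threshold
        same_otu = otu_of[a] == otu_of[b]
        if close and same_otu:
            tp += 1
        elif not close and not same_otu:
            tn += 1
        elif same_otu:          # far apart but clustered together
            fp += 1
        else:                   # close but split into different OTUs
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy data (hypothetical): four sequences, three placed in one OTU.
toy_distances = {
    frozenset(("A", "B")): 0.01, frozenset(("A", "C")): 0.02,
    frozenset(("B", "C")): 0.05, frozenset(("A", "D")): 0.10,
    frozenset(("B", "D")): 0.12, frozenset(("C", "D")): 0.11,
}
toy_otus = {"A": "otu1", "B": "otu1", "C": "otu1", "D": "otu2"}

counts = pairwise_confusion(toy_distances, toy_otus)
print(counts, round(mcc(*counts), 4))
```

    In the toy data the pair B/C sits in the same OTU despite exceeding the threshold, so it counts as a false positive and pulls the score below 1.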

  11. Widespread horizontal transfer of the cerato-ulmin gene between Ophiostoma novo-ulmi and Geosmithia species

    Czech Academy of Sciences Publication Activity Database

    Bettini, P.P.; Frascella, A.; Kolařík, Miroslav; Comparini, C.; Pepori, A.L.; Santini, A.; Scala, F.; Scala, A.

    2014-01-01

    Vol. 118, No. 8 (2014), pp. 663-674 ISSN 1878-6146 R&D Projects: GA ČR(CZ) GAP506/11/2302 Institutional support: RVO:61388971 Keywords: Ascomycetes * Entomochoric fungi * Gene expression Subject RIV: EE - Microbiology, Virology Impact factor: 2.342, year: 2014

  12. De novo characterization of Larimichthys crocea transcriptome for growth-/immune-related gene identification and massive microsatellite (SSR) marker development

    Science.gov (United States)

    Han, Zhaofang; Xiao, Shijun; Liu, Xiande; Liu, Yang; Li, Jiakai; Xie, Yangjie; Wang, Zhiyong

    2017-03-01

    The large yellow croaker, Larimichthys crocea, is an important marine fish in China with a high economic value. In the last decade, the stock conservation and aquaculture industry of this species have been facing severe challenges because of wild population collapse and degeneration of important economic traits. However, genes contributing to growth and immunity in L. crocea have not been thoroughly analyzed, and available molecular markers are still not sufficient for genetic resource management and molecular selection. In this work, we sequenced the transcriptome in L. crocea liver tissue with a Roche 454 sequencing platform and assembled the transcriptome into 93 801 transcripts. Of them, 38 856 transcripts were successfully annotated in nt, nr, Swiss-Prot, InterPro, COG, GO and KEGG databases. Based on the annotation information, 3 165 unigenes related to growth and immunity were identified. Additionally, a total of 6 391 simple sequence repeats (SSRs) were identified from the transcriptome, among which 4 498 SSRs had enough flanking regions to design primers for polymerase chain reactions (PCR). To assess the polymorphism of these markers, 30 primer pairs were randomly selected for PCR amplification and validation in 30 individuals, and 12 primer pairs (40.0%) exhibited obvious length polymorphisms. This work applied RNA-Seq to assemble and analyze a liver transcriptome in L. crocea. With gene annotation and sequence information, genes related to growth and immunity were identified and massive SSR markers were developed, providing valuable genetic resources for future gene functional analysis and selective breeding of L. crocea.
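
    The SSR mining step described above can be illustrated with a short regular-expression sketch. The abstract does not give the tool or the repeat thresholds used, so the minimum repeat counts below (6 for dinucleotides, 5 for trinucleotides, 4 for tetranucleotides) and the function names are assumptions chosen only for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (not from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ACAC')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical transcript fragment containing (AC)7 and (AGC)5.
example = "GGT" + "AC" * 7 + "TTG" + "AGC" * 5 + "T"
print(find_ssrs(example))
```

    A real pipeline would also check that enough flanking sequence exists on both sides of each hit before designing primers, as the abstract notes for 4 498 of the 6 391 SSRs.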

  13. De Novo assembly of the Japanese flounder (Paralichthys olivaceus) spleen transcriptome to identify putative genes involved in immunity.

    Directory of Open Access Journals (Sweden)

    Lin Huang

    Full Text Available Japanese flounder (Paralichthys olivaceus) is an economically important marine fish in Asia and has suffered from disease outbreaks caused by various pathogens, so more information on immune-relevant genes at the genome level is needed. However, genomic and transcriptomic data for Japanese flounder remain scarce, which limits studies on the immune system of this species. In this study, we characterized the Japanese flounder spleen transcriptome using an Illumina paired-end sequencing platform to identify putative genes involved in immunity. A cDNA library from the spleen of P. olivaceus was constructed and randomly sequenced using an Illumina technique. The removal of low quality reads generated 12,196,968 trimmed reads, which assembled into 96,627 unigenes. A total of 21,391 unigenes (22.14%) were annotated in the NCBI Nr database, and only 1.1% of the BLASTx top-hits matched P. olivaceus protein sequences. Approximately 12,503 (58.45%) unigenes were categorized into three Gene Ontology groups, 19,547 (91.38%) were classified into 26 Cluster of Orthologous Groups, and 10,649 (49.78%) were assigned to six Kyoto Encyclopedia of Genes and Genomes pathways. Furthermore, 40,928 putative simple sequence repeats and 47,362 putative single nucleotide polymorphisms were identified. Importantly, we identified 1,563 putative immune-associated unigenes that mapped to 15 immune signaling pathways. The P. olivaceus transcriptome data provide a rich source to discover and identify new genes, and the immune-relevant sequences identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the plentiful potential SSRs and SNPs found in this study are important resources with respect to future development of a linkage map or marker assisted breeding programs for the flounder.

  14. Optimizing Hybrid de Novo Transcriptome Assembly and Extending Genomic Resources for Giant Freshwater Prawns (Macrobrachium rosenbergii): The Identification of Genes and Markers Associated with Reproduction

    Directory of Open Access Journals (Sweden)

    Hyungtaek Jung

    2016-05-01

    Full Text Available The giant freshwater prawn, Macrobrachium rosenbergii, a sexually dimorphic decapod crustacean, is currently the world’s most economically important cultured freshwater crustacean species. Despite its economic importance, there is currently a lack of genomic resources available for this species, and this has limited exploration of the molecular mechanisms that control the M. rosenbergii sex-differentiation system more widely in freshwater prawns. Here, we present the first hybrid transcriptome from M. rosenbergii applying RNA-Seq technologies directed at identifying genes that have potential functional roles in reproductive-related traits. A total of 13,733,210 combined raw reads (1720 Mbp) were obtained from Ion-Torrent PGM and 454 FLX. Bioinformatic analyses based on three state-of-the-art assemblers, the CLC Genomic Workbench, Trans-ABySS, and Trinity, that use single and multiple k-mer methods respectively, were used to analyse the data. The influence of multiple k-mers on assembly performance was assessed to gain insight into transcriptome assembly from short reads. After optimisation, de novo assembly resulted in 44,407 contigs with a mean length of 437 bp, and the assembled transcripts were further functionally annotated to detect single nucleotide polymorphisms and simple sequence repeat motifs. Gene expression analysis was also used to compare expression patterns from ovary and testis tissue libraries to identify genes with potential roles in reproduction and sex differentiation. The large transcript set assembled here represents the most comprehensive set of transcriptomic resources ever developed for reproduction traits in M. rosenbergii, and the large number of genetic markers predicted should constitute an invaluable resource for future genetic research studies on M. rosenbergii and can be applied more widely on other freshwater prawn species in the genus Macrobrachium.

  15. De novo transcriptome sequencing and comparative analysis to discover genes involved in ovarian maturity in Strongylocentrotus nudus.

    Science.gov (United States)

    Jia, Zhiying; Wang, Qiai; Wu, Kaikai; Wei, Zhenlin; Zhou, Zunchun; Liu, Xiaolin

    2017-09-01

    Strongylocentrotus nudus is an edible sea urchin, mainly harvested in China. Correlation studies indicated that S. nudus with larger diameter have a prolonged marketing time and better palatability owing to their precocious gonads and extended maturation process. However, the molecular mechanism underlying this phenomenon is still unknown. Here, transcriptome sequencing was applied to study the ovaries of adult S. nudus with different shell diameters to explore the possible mechanism. In this study, four independent cDNA libraries were constructed, two from large urchins and two from small ones, using a HiSeq™2500 platform. A total of 88,581 unigenes were acquired with a mean length of 1354 bp, of which 66,331 (74.88%) unigenes could be annotated using six major publicly available databases. Comparative analysis revealed that 353 unigenes were differentially expressed (with log2(ratio)≥1, FDR≤0.001) between the two groups. Of these, 20 differentially expressed genes (DEGs) were selected to confirm the accuracy of RNA-seq data by quantitative real-time RT-PCR. Furthermore, gene ontology and KEGG pathway enrichment analyses were performed to find the putative genes and pathways related to ovarian maturity. Eight unigenes were identified as significant DEGs involved in reproduction related pathways; these included Mos, Cdc20, Rec8, YP30, cytochrome P450 2U1, ovoperoxidase, proteoliaisin, and rendezvin. Our research fills the gap in the studies on the S. nudus ovaries using transcriptome analysis. Copyright © 2017 Elsevier Inc. All rights reserved.
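
    The differential-expression cutoff quoted above (log2(ratio)≥1, FDR≤0.001) can be written as a simple filter. The sketch below assumes the log2 ratio is taken in absolute value so that both up- and down-regulated unigenes pass, and that expression values are strictly positive; the unigene names and numbers are made up for illustration.

```python
from math import log2


def is_deg(expr_small, expr_large, fdr, min_abs_log2=1.0, max_fdr=0.001):
    """Apply the two thresholds from the abstract to one unigene.

    Assumptions: both expression values are > 0, and the log2 ratio is
    compared in absolute value (a two-fold change in either direction).
    """
    log_ratio = log2(expr_large / expr_small)
    return abs(log_ratio) >= min_abs_log2 and fdr <= max_fdr


# Hypothetical records: (unigene id, expression in small, in large, FDR).
records = [
    ("unigene_1", 10.0, 25.0, 0.0005),   # up > 2-fold, significant
    ("unigene_2", 10.0, 15.0, 0.0001),   # significant but only 1.5-fold
    ("unigene_3", 40.0, 10.0, 0.0002),   # down 4-fold, significant
    ("unigene_4", 10.0, 40.0, 0.01),     # 4-fold but FDR too high
]

degs = [name for name, a, b, q in records if is_deg(a, b, q)]
print(degs)
```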

  16. Characterization of Liaoning cashmere goat transcriptome: sequencing, de novo assembly, functional annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Hongliang Liu

    Full Text Available Liaoning cashmere goat is a famous goat breed for cashmere wool. In order to increase the transcriptome data and accelerate genetic improvement for this breed, we performed de novo transcriptome sequencing to generate the first expressed sequence tag dataset for the Liaoning cashmere goat, using next-generation sequencing technology. Transcriptome sequencing of Liaoning cashmere goat on a Roche 454 platform yielded 804,601 high-quality reads. Clustering and assembly of these reads produced a non-redundant set of 117,854 unigenes, comprising 13,194 isotigs and 104,660 singletons. Based on similarity searches with known proteins, 17,356 unigenes were assigned to 6,700 GO categories, and the terms were summarized into three main GO categories and 59 sub-categories. 3,548 and 46,778 unigenes had significant similarity to existing sequences in the KEGG and COG databases, respectively. Comparative analysis revealed that 42,254 unigenes were aligned to 17,532 different sequences in NCBI non-redundant nucleotide databases. 97,236 (82.51%) unigenes were mapped to the 30 goat chromosomes. 35,551 (30.17%) unigenes were matched to 11,438 reported goat protein-coding genes. The remaining non-matched unigenes were further compared with cattle and human reference genes, and 67 putative new goat genes were discovered. Additionally, 2,781 potential simple sequence repeats were initially identified from all unigenes. The transcriptome of Liaoning cashmere goat was deeply sequenced, de novo assembled, and annotated, providing abundant data to better understand the Liaoning cashmere goat transcriptome. The potential simple sequence repeats provide a material basis for future genetic linkage and quantitative trait loci analyses.

  17. Alport Syndrome: De Novo Mutation in the COL4A5 Gene Converting Glycine 1205 to Valine

    Directory of Open Access Journals (Sweden)

    Pilar Antón-Martín

    2012-01-01

    Full Text Available Background. Alport syndrome is a primary basement membrane disorder arising from mutations in genes encoding the type IV collagen protein family. It is a genetically heterogeneous disease with different mutations and forms of inheritance that presents with renal involvement, hearing loss and eye defects. Several new mutations related to X-linked forms have been previously determined. Methods. We report the case of a 12-year-old male and his family diagnosed with Alport syndrome after genetic analysis was performed. Result. A new mutation determining a nucleotide change c.3614G>T (p.Gly1205Val) in hemizygosis in the COL4A5 gene was found. This molecular defect has not been previously described. Conclusion. Molecular biology has helped us to comprehend the mechanisms of pathophysiology in Alport syndrome. Genetic analysis provides the only conclusive diagnosis of the disorder at the moment. Our contribution with a new mutation further supports the need for more sophisticated molecular methods to increase mutation detection rates at lower cost and in less time.

  18. De novo characterization of the Iris lactea var. chinensis transcriptome and an analysis of genes under cadmium or lead exposure.

    Science.gov (United States)

    Gu, Chun-Sun; Liu, Liang-Qin; Deng, Yan-Ming; Zhang, Yong-Xia; Wang, Zhi-Quan; Yuan, Hai-Yan; Huang, Su-Zhen

    2017-10-01

    Iris lactea var. chinensis (I. lactea var. chinensis) is tolerant to accumulations of cadmium (Cd) and lead (Pb). In this study, the transcriptome of I. lactea var. chinensis was investigated under Cd or Pb stresses. Using the gene ontology database, 31,974 unigenes were classified into biological process, cellular component and molecular function. In total, 13,132 unigenes were involved in enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) metabolic pathways, and the expression levels of 5904 unigenes were significantly changed after exposure to Cd or Pb stresses. Of these, 974 were co-up-regulated and 1281 were co-down-regulated under the two stresses. The transcriptome expression profiles of I. lactea var. chinensis under Cd or Pb stresses obtained in this study provided a resource for identifying common mechanisms in the detoxification of different heavy metals. Furthermore, the identified unigenes may be used for the genetic breeding of heavy-metal tolerant plants. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium.

    Directory of Open Access Journals (Sweden)

    Fengxi Yang

    Full Text Available Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data is available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutants corresponding to different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis.
This dataset generated in our study provides new insights into the molecular mechanisms

  20. A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers

    Directory of Open Access Journals (Sweden)

    Solis Julio

    2010-10-01

    Full Text Available Background. Sweetpotato (Ipomoea batatas (L.) Lam.), a hexaploid outcrossing crop, is an important staple and food security crop in developing countries in Africa and Asia. The availability of genomic resources for sweetpotato is in striking contrast to its importance for human nutrition. Previously existing sequence data were restricted to around 22,000 expressed sequence tag (EST) sequences and ~1,500 GenBank sequences. We have used 454 pyrosequencing to augment the available gene sequence information to enhance functional genomics and marker design for this plant species. Results. Two quarter 454 pyrosequencing runs used two normalized cDNA collections from stems and leaves from drought-stressed sweetpotato clone Tanzania and yielded 524,209 reads, which were assembled together with 22,094 publicly available expressed sequence tags into 31,685 sets of overlapping DNA segments and 34,733 unassembled sequences. Blastx comparisons with the UniRef100 database allowed annotation of 23,957 contigs and 15,342 singletons, resulting in 24,657 putatively unique genes. Further, 27,119 sequences had no match to protein sequences of the UniRef100 database. On the basis of this gene index, we have identified 1,661 gene-based microsatellite sequences, of which 223 were selected for testing and 195 were successfully amplified in a test panel of 6 hexaploid (I. batatas) and 2 diploid (I. trifida) accessions. Conclusions. The sweetpotato gene index is a useful source for functionally annotated sweetpotato gene sequences that contains three times more gene sequence information for sweetpotato than previous EST assemblies. A searchable version of the gene index, including a blastn function, is available at http://www.cipotato.org/sweetpotato_gene_index.

  1. De Novo Transcriptome Assembly and Characterization of the Synthesis Genes of Bioactive Constituents in Abelmoschus esculentus (L.) Moench

    Science.gov (United States)

    Zhang, Chenghao; Dong, Wenqi; Gen, Wei; Xu, Baoyu; Shen, Chenjia

    2018-01-01

    Abelmoschus esculentus (okra or lady’s fingers) is a vegetable with high nutritional value, as well as having certain medicinal effects. It is widely used as food, in the food industry, and in herbal medicinal products, but also as an ornamental, in animal feed, and in other commercial sectors. Okra is rich in bioactive compounds, such as flavonoids, polysaccharides, polyphenols, caffeine, and pectin. In the present study, the concentrations of total flavonoids and polysaccharides in five organs of okra were determined and compared. Transcriptome sequencing was used to explore the biosynthesis pathways associated with the active constituents in okra. Transcriptome sequencing of five organs (roots, stem, leaves, flowers, and fruits) of okra enabled us to obtain 293,971 unigenes, of which 232,490 were annotated. Unigenes related to the enzymes involved in the flavonoid biosynthetic pathway or in fructose and mannose metabolism were identified, based on Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. All of the transcriptional datasets were uploaded to Sequence Read Archive (SRA). In summary, our comprehensive analysis provides important information at the molecular level about the flavonoid and polysaccharide biosynthesis pathways in okra. PMID:29495525

  2. De Novo Deep Transcriptome Analysis of Medicinal Plants for Gene Discovery in Biosynthesis of Plant Natural Products.

    Science.gov (United States)

    Han, R; Rai, A; Nakamura, M; Suzuki, H; Takahashi, H; Yamazaki, M; Saito, K

    2016-01-01

    The study of the transcriptome, the entire pool of transcripts in an organism or in single cells at a certain physiological or pathological stage, is indispensable in unraveling the connection and regulation between DNA and protein. Before the advent of deep sequencing, the microarray was the main approach to handling transcripts. Despite obvious shortcomings, including limited dynamic range and difficulties in comparing the results from distinct experiments, the microarray was widely applied. During the past decade, next-generation sequencing (NGS) has revolutionized our understanding of genomics in a fast, high-throughput, cost-effective, and tractable manner. The adoption of NGS has profoundly improved the efficiency and outcomes of efforts to elucidate genes responsible for producing active compounds in medicinal plants. The whole process involves several steps, from plant material sampling, to cDNA library preparation, to deep sequencing, after which bioinformatics takes over to assemble enormous yet fragmentary data from which to comb and extract information. The unprecedentedly rapid development of such technologies provides so many choices to facilitate the task that choosing a suitable methodology for a specific purpose can be confusing. Here, we review the general approaches for deep transcriptome analysis and then focus on their application in discovering biosynthetic pathways of medicinal plants that produce important secondary metabolites. © 2016 Elsevier Inc. All rights reserved.

  3. Polymorphisms in the leptin gene in divergent swine breeds

    Directory of Open Access Journals (Sweden)

    M.A.M. Soares

    2006-06-01

    Full Text Available Leptin gene (obese gene) polymorphism was investigated in Piau boars (a fat-type native breed) and crossbred sows from commercial strains (Landrace/Large White and Landrace/Large White by Pietrain) selected for weight and early maturity. Eight pairs of primers were designed from the sequence available in GenBank (accession no. U66254), which was used as the reference sequence in this work. DNA samples were extracted from white blood cells using phenol:chloroform solution after treatment with proteinase K. Fragments generated by polymerase chain reaction amplification were purified and sequenced in an automatic sequencer. Nucleotide sequences obtained from DNA of the commercial swine breeds were more similar to the reference sequence, whereas the sequences generated from DNA of the native animals diverged from both at some positions. Of the 28 polymorphisms found, eight were observed in only one of the three sequences generated from DNA of the native breed, twelve were present in two sequences, and the remaining eight were found in all three native animals.

  4. Transference of the genes SH2 and SH3 for resistance to Hemileia vastatrix to the Mundo Novo cultivar of C. arabica

    Directory of Open Access Journals (Sweden)

    A. Carvalho

    1977-01-01

    Full Text Available Coffee trees homozygous for the alleles SH2, or SH2 and SH3 together, which confer resistance to several physiological races of Hemileia vastatrix, were crossed to selected plants of the Mundo Novo cultivar of Coffea arabica, and the F2 generations were studied with the aim of developing new high-yielding and resistant coffee recombinations. A completely randomized field trial was established including 14 F2 populations segregating for SH2 only, eight populations segregating for the SH2 and SH3 genes, and three populations segregating for plants of the A group, which are resistant to all races of the pathogen known so far. Of the 22,356 coffee trees originally planted in the trial at two seedlings per hole, a first selection left only one tree per hole, reducing the plants under study to 11,178. Based on plant vigor, yield capacity, percentage of normally developed seeds and resistance reaction to H. vastatrix, three successive series of selection were undertaken, leaving only 100 coffee trees of the Mundo Novo type and resistant to H. vastatrix for development of F3 populations and continuation of the selection.

  5. Prenatal Diagnosis of a 2.5 Mb De Novo 17q24.1q24.2 Deletion Encompassing KPNA2 and PSMD12 Genes in a Fetus with Craniofacial Dysmorphism, Equinovarus Feet, and Syndactyly

    Directory of Open Access Journals (Sweden)

    Marie-Emmanuelle Naud

    2017-01-01

    Full Text Available Interstitial 17q24.1 or 17q24.2 deletions were reported after conventional cytogenetic analysis or chromosomal microarray analysis in patients presenting intellectual disability, facial dysmorphism, and/or malformations. We report on a fetus with craniofacial dysmorphism, talipes equinovarus, and syndactyly associated with a de novo 2.5 Mb 17q24.1q24.2 deletion. Among the deleted genes, KPNA2 and PSMD12 are discussed for the correlation with the fetal phenotype. This is the first case of prenatal diagnosis of 17q24.1q24.2 deletion.

  6. RNA editing differently affects protein-coding genes in D. melanogaster and H. sapiens.

    Science.gov (United States)

    Grassi, Luigi; Leoni, Guido; Tramontano, Anna

    2015-07-14

    When an RNA editing event occurs within a coding sequence it can lead to a different encoded amino acid. The biological significance of these events remains an open question: they can modulate protein functionality, increase the complexity of transcriptomes or arise from a loose specificity of the involved enzymes. We analysed the editing events in coding regions that do or do not produce a change in the encoded amino acid (nonsynonymous and synonymous events, respectively) in D. melanogaster and in H. sapiens and compared them with the appropriate random models. Interestingly, our results show that the phenomenon has rather different characteristics in the two organisms. For example, we confirm the observation that editing events occur more frequently in non-coding than in coding regions, and report that this effect is much more evident in H. sapiens. Additionally, in this latter organism, editing events tend to affect less conserved residues. The less frequently occurring editing events in Drosophila tend to avoid drastic amino acid changes. Interestingly, we find that, in Drosophila, changes from less frequently used codons to more frequently used ones are favoured, while this is not the case in H. sapiens.
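
    The distinction between synonymous and nonsynonymous events drawn above can be made concrete with the standard genetic code. The sketch below is illustrative only: it works on DNA-alphabet codons, uses 0-based positions within the codon, and its function name and example codons are not taken from the study.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_edit(codon, position, new_base):
    """Classify a single-base change within a codon.

    `position` is 0-based within the codon. Returns 'synonymous' if the
    encoded amino acid is unchanged, otherwise 'nonsynonymous'.
    """
    codon = codon.upper()
    edited = codon[:position] + new_base.upper() + codon[position + 1:]
    same = CODON_TABLE[codon] == CODON_TABLE[edited]
    return "synonymous" if same else "nonsynonymous"


# An A-to-G change at the third position of AAA (Lys) gives AAG (Lys).
print(classify_edit("AAA", 2, "G"))
# An A-to-G change at the second position of CAG (Gln) gives CGG (Arg).
print(classify_edit("CAG", 1, "G"))
```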

  7. Distinguishing the Transcription Regulation Patterns in Promoters of Human Genes with Different Function or Evolutionary Age

    KAUST Repository

    Alam, Tanvir

    2012-07-01

    Distinguishing transcription regulatory patterns of different gene groups is a common problem in various bioinformatics studies. In this work we developed a methodology to deal with such a problem based on machine learning techniques. We applied our method to two biologically important problems related to detecting a difference in transcription regulation of (a) protein-coding and long non-coding RNAs (lncRNAs) in human, and (b) primate-specific and non-primate-specific long non-coding RNAs. Our method is capable of classifying RNAs using various regulatory features of genes that transcribe into these RNAs, such as nucleotide frequencies, transcription factor binding sites, de novo sequence motifs, CpG islands, repetitive elements, histone modification marks, and others. Ten-fold cross-validation tests suggest that our model can distinguish protein-coding and non-coding RNAs with accuracy above 80%. Twenty-fold cross-validation tests suggest that our model can distinguish primate-specific from non-primate-specific promoters of lncRNAs with accuracy above 80%. Consequently, we can hypothesize that transcription of the groups of genes mentioned above is regulated by different mechanisms. Feature selection techniques allowed us to reduce the number of features significantly while keeping the accuracy around 80%. Consequently, we can conclude that the selected features play a significant role in transcription regulation of coding and non-coding genes, as well as primate-specific and non-primate-specific lncRNA genes.
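
    The ten-fold and twenty-fold cross-validation mentioned above follows a generic pattern that can be sketched in a few lines. The classifier here is a deliberately trivial one-feature threshold rule standing in for the study's model, which the abstract does not specify; all names and the toy data are hypothetical, and the samples are assumed to be shuffled (or, as here, arranged so that every training split contains both classes).

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def fit_threshold(xs, ys):
    """Toy 'model': midpoint between the two class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


def cross_validated_accuracy(xs, ys, k):
    """Train on k-1 folds, test on the held-out fold, pool the results."""
    correct = 0
    for train, test in k_fold_indices(len(xs), k):
        model = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        correct += sum(predict_threshold(model, xs[i]) == ys[i] for i in test)
    return correct / len(xs)


# Hypothetical, well-separated single-feature data for two gene classes.
toy_x = [0.1 * i for i in range(20)] + [5 + 0.1 * i for i in range(20)]
toy_y = [0] * 20 + [1] * 20
print(cross_validated_accuracy(toy_x, toy_y, k=10))
```

    Each sample is used for testing exactly once, so the pooled accuracy is an estimate of performance on data the model has not seen during fitting.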

  8. Discrete Ramanujan transform for distinguishing the protein coding regions from other regions.

    Science.gov (United States)

    Hua, Wei; Wang, Jiasong; Zhao, Jian

    2014-01-01

    Based on the study of Ramanujan sum and Ramanujan coefficient, this paper suggests the concepts of discrete Ramanujan transform and spectrum. Using Voss numerical representation, one maps a symbolic DNA strand as a numerical DNA sequence, and deduces the discrete Ramanujan spectrum of the numerical DNA sequence. It is well known that the discrete Fourier power spectrum of a protein coding sequence has an important feature of 3-base periodicity, which is widely used for DNA sequence analysis by the technique of discrete Fourier transform. It is performed by testing the signal-to-noise ratio at frequency N/3 as a criterion for the analysis, where N is the length of the sequence. The results presented in this paper show that the property of 3-base periodicity can be identified only as a prominent spike of the discrete Ramanujan spectrum at period 3 for the protein coding regions. The signal-to-noise ratio for discrete Ramanujan spectrum is defined for numerical measurement. Therefore, the discrete Ramanujan spectrum and the signal-to-noise ratio of a DNA sequence can be used for distinguishing the protein coding regions from the noncoding regions. All the exon and intron sequences in whole chromosomes 1, 2, 3 and 4 of Caenorhabditis elegans have been tested and the histograms and tables from the computational results illustrate the reliability of our method. In addition, we have analyzed the algorithm theoretically and concluded that the algorithm for calculating the discrete Ramanujan spectrum has lower computational complexity and higher computational accuracy. The computational experiments show that the technique of using the discrete Ramanujan spectrum for classifying different DNA sequences is a fast and effective method. Copyright © 2014 Elsevier Ltd. All rights reserved.
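
    Two building blocks named above, the Voss representation and the Fourier signal-to-noise ratio at frequency N/3, can be sketched as follows, together with the classical Ramanujan sum. This is not the paper's discrete Ramanujan transform, whose definition the abstract does not give. The signal-to-noise ratio is assumed here to be the power at N/3 divided by the mean power over the non-zero frequencies, and the sequence length is assumed to be a multiple of 3.

```python
import cmath
from math import cos, gcd, pi


def voss(seq):
    """Voss representation: one 0/1 indicator sequence per nucleotide."""
    seq = seq.upper()
    return {base: [1 if c == base else 0 for c in seq] for base in "ACGT"}


def total_power(seq, k):
    """Sum over the four indicator sequences of |DFT at frequency k|^2."""
    n = len(seq)
    power = 0.0
    for signal in voss(seq).values():
        s = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
        power += abs(s) ** 2
    return power


def snr_at_period3(seq):
    """Power at frequency N/3 relative to the mean power (assumed definition)."""
    n = len(seq)
    spectrum = [total_power(seq, k) for k in range(1, n)]
    return total_power(seq, n // 3) / (sum(spectrum) / len(spectrum))


def ramanujan_sum(q, n):
    """Classical Ramanujan sum c_q(n): sum of cos(2*pi*k*n/q) over k coprime to q."""
    return round(sum(cos(2 * pi * k * n / q)
                     for k in range(1, q + 1) if gcd(k, q) == 1))


# A perfectly 3-periodic toy sequence versus a 4-periodic one.
print(snr_at_period3("ATG" * 30))
print(snr_at_period3("ACGT" * 21))
print([ramanujan_sum(3, n) for n in range(6)])
```

    The 3-periodic toy sequence concentrates its power at N/3, whereas the 4-periodic one has essentially none there, which is the contrast the signal-to-noise criterion exploits.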

  9. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    Directory of Open Access Journals (Sweden)

    Ai-bing Zhang

    Full Text Available Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  10. A new method for species identification via protein-coding and non-coding DNA barcodes by combining machine learning with bioinformatic methods.

    Science.gov (United States)

    Zhang, Ai-bing; Feng, Jie; Ward, Robert D; Wan, Ping; Gao, Qiang; Wu, Jun; Zhao, Wei-zhong

    2012-01-01

    Species identification via DNA barcodes is contributing greatly to current bioinventory efforts. The initial, and widely accepted, proposal was to use the protein-coding cytochrome c oxidase subunit I (COI) region as the standard barcode for animals, but recently non-coding internal transcribed spacer (ITS) genes have been proposed as candidate barcodes for both animals and plants. However, achieving a robust alignment for non-coding regions can be problematic. Here we propose two new methods (DV-RBF and FJ-RBF) to address this issue for species assignment by both coding and non-coding sequences that take advantage of the power of machine learning and bioinformatics. We demonstrate the value of the new methods with four empirical datasets, two representing typical protein-coding COI barcode datasets (neotropical bats and marine fish) and two representing non-coding ITS barcodes (rust fungi and brown algae). Using two random sub-sampling approaches, we demonstrate that the new methods significantly outperformed existing Neighbor-joining (NJ) and Maximum likelihood (ML) methods for both coding and non-coding barcodes when there was complete species coverage in the reference dataset. The new methods also out-performed NJ and ML methods for non-coding sequences in circumstances of potentially incomplete species coverage, although then the NJ and ML methods performed slightly better than the new methods for protein-coding barcodes. A 100% success rate of species identification was achieved with the two new methods for 4,122 bat queries and 5,134 fish queries using COI barcodes, with 95% confidence intervals (CI) of 99.75-100%. The new methods also obtained a 96.29% success rate (95%CI: 91.62-98.40%) for 484 rust fungi queries and a 98.50% success rate (95%CI: 96.60-99.37%) for 1094 brown algae queries, both using ITS barcodes.

  11. Consensus coding sequence (CCDS) database: a standardized set of human and mouse protein-coding regions supported by expert curation.

    Science.gov (United States)

    Pujar, Shashikant; O'Leary, Nuala A; Farrell, Catherine M; Loveland, Jane E; Mudge, Jonathan M; Wallin, Craig; Girón, Carlos G; Diekhans, Mark; Barnes, If; Bennett, Ruth; Berry, Andrew E; Cox, Eric; Davidson, Claire; Goldfarb, Tamara; Gonzalez, Jose M; Hunt, Toby; Jackson, John; Joardar, Vinita; Kay, Mike P; Kodali, Vamsi K; Martin, Fergal J; McAndrews, Monica; McGarvey, Kelly M; Murphy, Michael; Rajput, Bhanu; Rangwala, Sanjida H; Riddick, Lillian D; Seal, Ruth L; Suner, Marie-Marthe; Webb, David; Zhu, Sophia; Aken, Bronwen L; Bruford, Elspeth A; Bult, Carol J; Frankish, Adam; Murphy, Terence; Pruitt, Kim D

    2018-01-04

    The Consensus Coding Sequence (CCDS) project provides a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assembly in genome annotations produced independently by NCBI and the Ensembl group at EMBL-EBI. This dataset is the product of an international collaboration that includes NCBI, Ensembl, HUGO Gene Nomenclature Committee, Mouse Genome Informatics and University of California, Santa Cruz. Identically annotated coding regions, which are generated using an automated pipeline and pass multiple quality assurance checks, are assigned a stable and tracked identifier (CCDS ID). Additionally, coordinated manual review by expert curators from the CCDS collaboration helps in maintaining the integrity and high quality of the dataset. The CCDS data are available through an interactive web page (https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi) and an FTP site (ftp://ftp.ncbi.nlm.nih.gov/pub/CCDS/). In this paper, we outline the ongoing work, growth and stability of the CCDS dataset and provide updates on new collaboration members and new features added to the CCDS user interface. We also present expert curation scenarios, with specific examples highlighting the importance of an accurate reference genome assembly and the crucial role played by input from the research community. Published by Oxford University Press on behalf of Nucleic Acids Research 2017.

  12. De novo transcriptome sequencing and digital gene expression analysis predict biosynthetic pathway of rhynchophylline and isorhynchophylline from Uncaria rhynchophylla, a non-model plant with potent anti-alzheimer's properties.

    Science.gov (United States)

    Guo, Qianqian; Ma, Xiaojun; Wei, Shugen; Qiu, Deyou; Wilson, Iain W; Wu, Peng; Tang, Qi; Liu, Lijun; Dong, Shoukun; Zu, Wei

    2014-08-12

    The major medicinal alkaloids isolated from Uncaria rhynchophylla (gouteng in Chinese) capsules are rhynchophylline (RIN) and isorhynchophylline (IRN). Extracts containing these terpene indole alkaloids (TIAs) can inhibit the formation and destabilize preformed fibrils of amyloid β protein (a pathological marker of Alzheimer's disease), and have been shown to improve the cognitive function of mice with Alzheimer-like symptoms. The biosynthetic pathways of RIN and IRN are largely unknown. In this study, RNA-sequencing was performed on pooled Uncaria capsule RNA samples taken at three developmental stages that accumulate different amounts of RIN and IRN. More than 50 million high-quality reads from a cDNA library were generated and de novo assembled. Sequences for all of the known enzymes involved in TIA synthesis were identified. Additionally, 193 cytochrome P450 (CYP450), 280 methyltransferase and 144 isomerase genes were identified that are potential candidates for enzymes involved in RIN and IRN synthesis. Digital gene expression profile (DGE) analysis was performed on the three capsule developmental stages, and based on genes possessing expression profiles consistent with RIN and IRN levels, four CYP450s, three methyltransferases and three isomerases were identified as the candidates most likely to be involved in the later steps of RIN and IRN biosynthesis. A combination of de novo transcriptome assembly and DGE analysis was shown to be a powerful method for identifying genes encoding enzymes potentially involved in the biosynthesis of important secondary metabolites in a non-model plant. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the capsule extract from Uncaria, and provide information that may aid in metabolic engineering to increase yields of these important alkaloids.

  13. De novo 14q24.2q24.3 microdeletion including IFT43 is associated with intellectual disability, skeletal anomalies, cardiac anomalies, and myopia.

    Science.gov (United States)

    Stokman, Marijn F; Oud, Machteld M; van Binsbergen, Ellen; Slaats, Gisela G; Nicolaou, Nayia; Renkema, Kirsten Y; Nijman, Isaac J; Roepman, Ronald; Giles, Rachel H; Arts, Heleen H; Knoers, Nine V A M; van Haelst, Mieke M

    2016-06-01

    We report an 11-year-old girl with mild intellectual disability, skeletal anomalies, congenital heart defect, myopia, and facial dysmorphisms including an extra incisor, cup-shaped ears, and a preauricular skin tag. Array comparative genomic hybridization analysis identified a de novo 4.5-Mb microdeletion on chromosome 14q24.2q24.3. The deleted region and phenotype partially overlap with previously reported patients. Here, we provide an overview of the literature on 14q24 microdeletions and further delineate the associated phenotype. We performed exome sequencing to examine other causes for the phenotype and queried genes present in the 14q24.2q24.3 microdeletion that are associated with recessive disease for variants in the non-deleted allele. The deleted region contains 65 protein-coding genes, including the ciliary gene IFT43. Although Sanger and exome sequencing did not identify variants in the second IFT43 allele or in other IFT complex A-protein-encoding genes, immunocytochemistry showed increased accumulation of IFT-B proteins at the ciliary tip in patient-derived fibroblasts compared to control cells, demonstrating defective retrograde ciliary transport. This could suggest a ciliary defect in the pathogenesis of this disorder. © 2016 Wiley Periodicals, Inc.

  14. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang; Yu, Jun

    2010-01-01

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution, and a large number of publicly available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge. Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences. Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms. Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.
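
The two compositional parameters named in this abstract, GC content and purine content at each of the three codon positions, can be measured directly from a coding sequence. The following is a minimal sketch of that measurement only; it is not the authors' models, and the function name `composition_by_codon_position` and the example sequence are illustrative assumptions.

```python
from collections import Counter


def composition_by_codon_position(cds):
    """Return GC and purine (A+G) fractions at codon positions 1, 2 and 3.

    Hypothetical helper for illustration. Assumes `cds` is an in-frame DNA
    coding sequence; trailing bases that do not complete a codon and any
    characters other than A, C, G, T are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    result = {}
    for pos in range(3):
        # Every third base starting at this offset belongs to one codon position.
        bases = [b for b in cds[pos:usable:3] if b in "ACGT"]
        counts = Counter(bases)
        n = len(bases)
        gc = (counts["G"] + counts["C"]) / n if n else 0.0
        purine = (counts["A"] + counts["G"]) / n if n else 0.0
        result[pos + 1] = {"gc": gc, "purine": purine}
    return result


if __name__ == "__main__":
    # Made-up four-codon sequence: ATG GCC AAG TAA
    for position, values in composition_by_codon_position("ATGGCCAAGTAA").items():
        print(position, values)
```

Reporting the three positions separately mirrors the abstract's point that mutation and selection act differently at each codon position.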

  15. Modeling compositional dynamics based on GC and purine contents of protein-coding sequences

    KAUST Repository

    Zhang, Zhang

    2010-11-08

    Background: Understanding the compositional dynamics of genomes and their coding sequences is of great significance in gaining clues into molecular evolution, and a large number of publicly available genome sequences have allowed us to quantitatively predict deviations of empirical data from their theoretical counterparts. However, the quantification of theoretical compositional variations for a wide diversity of genomes remains a major challenge. Results: To model the compositional dynamics of protein-coding sequences, we propose two simple models that take into account both mutation and selection effects, which act differently at the three codon positions, and use both GC and purine contents as compositional parameters. The two models concern the theoretical composition of nucleotides, codons, and amino acids, with no prerequisite of homologous sequences or their alignments. We evaluated the two models by quantifying theoretical compositions of a large collection of protein-coding sequences (including 46 of Archaea, 686 of Bacteria, and 826 of Eukarya), yielding consistent theoretical compositions across all the collected sequences. Conclusions: We show that the compositions of nucleotides, codons, and amino acids are largely determined by both GC and purine contents and suggest that deviations of the observed from the expected compositions may reflect compositional signatures that arise from a complex interplay between mutation and selection via DNA replication and repair mechanisms. Reviewers: This article was reviewed by Zhaolei Zhang (nominated by Mark Gerstein), Guruprasad Ananda (nominated by Kateryna Makova), and Daniel Haft. 2010 Zhang and Yu; licensee BioMed Central Ltd.

  16. The chaperonin-60 universal target is a barcode for bacteria that enables de novo assembly of metagenomic sequence data.

    Science.gov (United States)

    Links, Matthew G; Dumonceaux, Tim J; Hemmingsen, Sean M; Hill, Janet E

    2012-01-01

    Barcoding with molecular sequences is widely used to catalogue eukaryotic biodiversity. Studies investigating the community dynamics of microbes have relied heavily on gene-centric metagenomic profiling using two genes (16S rRNA and cpn60) to identify and track Bacteria. While there have been criteria formalized for barcoding of eukaryotes, these criteria have not been used to evaluate gene targets for other domains of life. Using the framework of the International Barcode of Life we evaluated DNA barcodes for Bacteria. Candidates from the 16S rRNA gene and the protein coding cpn60 gene were evaluated. Within complete bacterial genomes in the public domain representing 983 species from 21 phyla, the largest difference between median pairwise inter- and intra-specific distances ("barcode gap") was found from cpn60. Distribution of sequence diversity along the ∼555 bp cpn60 target region was remarkably uniform. The barcode gap of the cpn60 universal target facilitated the faithful de novo assembly of full-length operational taxonomic units from pyrosequencing data from a synthetic microbial community. Analysis supported the recognition of both 16S rRNA and cpn60 as DNA barcodes for Bacteria. The cpn60 universal target was found to have a much larger barcode gap than 16S rRNA suggesting cpn60 as a preferred barcode for Bacteria. A large barcode gap for cpn60 provided a robust target for species-level characterization of data. The assembly of consensus sequences for barcodes was shown to be a reliable method for the identification and tracking of novel microbes in metagenomic studies.
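
The "barcode gap" is defined in this abstract as the difference between the median pairwise inter-specific and intra-specific distances. A minimal sketch of that calculation follows. The distance measure is an assumption: the abstract does not say which one was used, so a simple p-distance (fraction of differing sites) over pre-aligned, equal-length sequences stands in for it, and the names `p_distance` and `barcode_gap` are illustrative.

```python
from itertools import combinations
from statistics import median


def p_distance(a, b):
    """Fraction of aligned sites that differ (assumed distance measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Median inter-specific distance minus median intra-specific distance.

    `records` is a list of (species_label, aligned_sequence) pairs.
    """
    intra, inter = [], []
    for (sp1, seq1), (sp2, seq2) in combinations(records, 2):
        # Same label: within-species pair; different label: between-species pair.
        target = intra if sp1 == sp2 else inter
        target.append(p_distance(seq1, seq2))
    if not intra or not inter:
        raise ValueError("need both within-species and between-species pairs")
    return median(inter) - median(intra)


if __name__ == "__main__":
    # Made-up toy alignment with two species and two sequences each.
    toy = [("A", "AAAA"), ("A", "AAAT"), ("B", "TTTT"), ("B", "TTTA")]
    print(barcode_gap(toy))
```

A larger positive value means that sequences from different species are separated much more than sequences within a species, which is the property the abstract reports for cpn60 relative to 16S rRNA.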

  17. Motif-Independent De Novo Detection of Secondary Metabolite Gene Clusters – Towards Identification of Novel Secondary Metabolisms from Filamentous Fungi -

    Directory of Open Access Journals (Sweden)

    Myco Umemura

    2015-05-01

    Secondary metabolites are produced mostly by clustered genes that are essential to their biosynthesis. The transcriptional expression of these genes is often cooperatively regulated by a transcription factor located inside or close to a cluster. Most of the secondary metabolism biosynthesis (SMB) gene clusters identified to date contain so-called core genes with distinctive sequence features, such as polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Recent efforts in sequencing fungal genomes have revealed far more SMB gene clusters than expected based on the number of core genes in the genomes. Several bioinformatics tools have been developed to survey SMB gene clusters using the sequence motif information of the core genes, including SMURF and antiSMASH. More recently, accompanied by the development of sequencing techniques that allow large-scale genomic and transcriptomic data to be obtained, motif-independent prediction methods for SMB gene clusters, including MIDDAS-M, have been developed. Most of these methods detect the clusters in which the genes are cooperatively regulated at the transcriptional level, thus allowing the identification of novel SMB gene clusters regardless of the presence of the core genes. Another type of method, MIPS-CG, uses the characteristics of SMB genes, which are highly enriched in non-syntenic blocks (NSBs), enabling prediction even without transcriptome data, although the results have not been evaluated in detail. Considering that a large portion of SMB gene clusters might be sufficiently expressed only in limited, uncommon conditions, it seems that prediction of SMB gene clusters by bioinformatics followed by experimental validation is the only way to efficiently uncover hidden SMB gene clusters. Here, we describe and discuss possible novel approaches for the determination of SMB gene clusters that have not been identified using conventional methods.

  18. Tree ferns: monophyletic groups and their relationships as revealed by four protein-coding plastid loci.

    Science.gov (United States)

    Korall, Petra; Pryer, Kathleen M; Metzgar, Jordan S; Schneider, Harald; Conant, David S

    2006-06-01

    Tree ferns are a well-established clade within leptosporangiate ferns. Most of the 600 species (in seven families and 13 genera) are arborescent, but considerable morphological variability exists, spanning the giant scaly tree ferns (Cyatheaceae), the low, erect plants (Plagiogyriaceae), and the diminutive endemics of the Guayana Highlands (Hymenophyllopsidaceae). In this study, we investigate phylogenetic relationships within tree ferns based on analyses of four protein-coding, plastid loci (atpA, atpB, rbcL, and rps4). Our results reveal four well-supported clades, with genera of Dicksoniaceae (sensu ) interspersed among them: (A) (Loxomataceae, (Culcita, Plagiogyriaceae)), (B) (Calochlaena, (Dicksonia, Lophosoriaceae)), (C) Cibotium, and (D) Cyatheaceae, with Hymenophyllopsidaceae nested within. How these four groups are related to one another, to Thyrsopteris, or to Metaxyaceae is weakly supported. Our results show that Dicksoniaceae and Cyatheaceae, as currently recognised, are not monophyletic and new circumscriptions for these families are needed.

  19. An evolutionary model for protein-coding regions with conserved RNA structure

    DEFF Research Database (Denmark)

    Pedersen, Jakob Skou; Forsberg, Roald; Meyer, Irmtraud Margret

    2004-01-01

    Here we present a model of nucleotide substitution in protein-coding regions that also encode the formation of conserved RNA structures. In such regions, apparent evolutionary context dependencies exist, both between nucleotides occupying the same codon and between nucleotides forming a base pair in the RNA structure. The overlap of these fundamental dependencies is sufficient to cause "contagious" context dependencies which cascade across many nucleotide sites. Such large-scale dependencies challenge the use of traditional phylogenetic models in evolutionary inference because they explicitly assume... components of traditional phylogenetic models. We applied this to a data set of full-genome sequences from the hepatitis C virus where five RNA structures are mapped within the coding region. This allowed us to partition the effects of selection on different structural elements and to test various hypotheses...

  20. Allele-Selective Transcriptome Recruitment to Polysomes Primed for Translation: Protein-Coding and Noncoding RNAs, and RNA Isoforms.

    Directory of Open Access Journals (Sweden)

    Roshan Mascarenhas

    mRNA translation into proteins is highly regulated, but the role of mRNA isoforms, noncoding RNAs (ncRNAs), and genetic variants remains poorly understood. mRNA levels on polysomes have been shown to correlate well with expressed protein levels, pointing to polysomal loading as a critical factor. To study regulation and genetic factors of protein translation we measured levels and allelic ratios of mRNAs and ncRNAs (including microRNAs) in lymphoblast cell lines (LCL) and in polysomal fractions. We first used targeted assays to measure polysomal loading of mRNA alleles, confirming reported genetic effects on translation of OPRM1 and NAT1, and detecting no effect of rs1045642 (3435C>T) in ABCB1 (MDR1) on polysomal loading while supporting previous results showing increased mRNA turnover of the 3435T allele. Use of high-throughput sequencing of complete transcript profiles (RNA-Seq) in three LCLs revealed significant differences in polysomal loading of individual RNA classes and isoforms. Correlated polysomal distribution between protein-coding and non-coding RNAs suggests interactions between them. Allele-selective polysome recruitment revealed strong genetic influence for multiple RNAs, attributable either to differential expression of RNA isoforms or to differential loading onto polysomes, the latter defining a direct genetic effect on translation. Genes identified by different allelic RNA ratios between cytosol and polysomes were enriched with published expression quantitative trait loci (eQTLs) affecting RNA functions, and associations with clinical phenotypes. Polysomal RNA-Seq combined with allelic ratio analysis provides a powerful approach to study polysomal RNA recruitment and regulatory variants affecting protein translation.

  1. Comparative De Novo Transcriptome Analysis of Fertilized Ovules in Xanthoceras sorbifolium Uncovered a Pool of Genes Expressed Specifically or Preferentially in the Selfed Ovule That Are Potentially Involved in Late-Acting Self-Incompatibility.

    Directory of Open Access Journals (Sweden)

    Qingyuan Zhou

    Xanthoceras sorbifolium, a tree species endemic to northern China, has high oil content in its seeds and is recognized as an important biodiesel crop. The plant is characterized by late-acting self-incompatibility (LSI). LSI was found to occur in many angiosperm species and plays an important role in reducing inbreeding and its harmful effects, as do gametophytic self-incompatibility (GSI) and sporophytic self-incompatibility (SSI). Molecular mechanisms of conventional GSI and SSI have been well characterized in several families, but no effort has been made to identify the genes involved in the LSI process. The present studies indicated that there were no significant differences in structural and histological features between the self- and cross-pollinated ovules during the early stages of ovule development until 5 days after pollination (DAP). This suggests that 5 DAP is likely to be a turning point for the development of the selfed ovules. Comparative de novo transcriptome analysis of the selfed and crossed ovules at 5 DAP identified 274 genes expressed specifically or preferentially in the selfed ovules. These genes contained a significant proportion of genes predicted to function in the biosynthesis of secondary metabolites, consistent with our histological observations in the fertilized ovules. The genes encoding signal transduction-related components, such as protein kinases and protein phosphatases, are overrepresented in the selfed ovules. X. sorbifolium selfed ovules also specifically or preferentially express many unique transcription factor (TF) genes that could potentially be involved in the novel mechanisms of LSI. We also identified 42 genes significantly up-regulated in the crossed ovules compared to the selfed ovules. The expression of all 16 genes selected from the RNA-seq data was validated using PCR in the selfed and crossed ovules. This study represents the first genome-wide identification of genes expressed in the fertilized

  2. De novo cloning and annotation of genes associated with immunity, detoxification and energy metabolism from the fat body of the oriental fruit fly, Bactrocera dorsalis.

    Directory of Open Access Journals (Sweden)

    Wen-Jia Yang

    The oriental fruit fly, Bactrocera dorsalis, is a destructive pest in tropical and subtropical areas. In this study, we performed transcriptome-wide analysis of the fat body of B. dorsalis and obtained more than 59 million sequencing reads, which were assembled into 27,787 unigenes with an average length of 591 bp. Among them, 17,442 (62.8%) unigenes matched known proteins in the NCBI database. The assembled sequences were further annotated with Gene Ontology, Clusters of Orthologous Groups terms, and the Kyoto Encyclopedia of Genes and Genomes. In-depth analysis was performed to identify genes putatively involved in immunity, detoxification, and energy metabolism. Many new genes were identified, including serpins, peptidoglycan recognition proteins and defensins, which were potentially linked to immune defense. Many detoxification genes were identified, including cytochrome P450s, glutathione S-transferases and ATP-binding cassette (ABC) transporters. Many new transcripts possibly involved in energy metabolism, including fatty acid desaturases, lipases, alpha amylases, and trehalose-6-phosphate synthases, were identified. Moreover, we randomly selected some genes to examine their expression patterns in different tissues by quantitative real-time PCR, which indicated that some genes exhibited fat body-specific expression in B. dorsalis. The identification of numerous transcripts in the fat body of B. dorsalis laid the foundation for future studies on the functions of these genes.

  3. De novo transcriptome sequencing of black pepper (Piper nigrum L.) and an analysis of genes involved in phenylpropanoid metabolism in response to Phytophthora capsici.

    Science.gov (United States)

    Hao, Chaoyun; Xia, Zhiqiang; Fan, Rui; Tan, Lehe; Hu, Lisong; Wu, Baoduo; Wu, Huasong

    2016-10-21

    Piper nigrum L., or "black pepper", is an economically important spice crop in tropical regions. Black pepper production is markedly affected by foot rot disease caused by Phytophthora capsici, and genetic improvement of black pepper is essential for combating foot rot diseases. However, little is known about the mechanism of anti- P. capsici in black pepper. The molecular mechanisms underlying foot rot susceptibility were studied by comparing transcriptome analysis between resistant (Piper flaviflorum) and susceptible (Piper nigrum cv. Reyin-1) black pepper species. 116,432 unigenes were acquired from six libraries (three replicates of resistant and susceptible black pepper samples), which were integrated by applying BLAST similarity searches and noted by adopting Kyoto Encyclopaedia of Genes and Gene Ontology (GO) genome orthology identifiers. The reference transcriptome was mapped using two sets of digital gene expression data. Using GO enrichment analysis for the differentially expressed genes, the majority of the genes associated with the phenylpropanoid biosynthesis pathway were identified in P. flaviflorum. In addition, the expression of genes revealed that after susceptible and resistant species were inoculated with P. capsici, the majority of genes incorporated in the phenylpropanoid metabolism pathway were up-regulated in both species. Among various treatments and organs, all the genes were up-regulated to a relatively high degree in resistant species. Phenylalanine ammonia lyase and peroxidase enzyme activity increased in susceptible and resistant species after inoculation with P. capsici, and the resistant species increased faster. The resistant plants retain their vascular structure in lignin revealed by histochemical analysis. Our data provide critical information regarding target genes and a technological basis for future studies of black pepper genetic improvements, including transgenic breeding.

  4. Novel de novo pathogenic variant in the NR2F2 gene in a boy with congenital heart defect and dysmorphic features.

    Science.gov (United States)

    Upadia, Jariya; Gonzales, Patrick R; Robin, Nathaniel H

    2018-04-16

    The NR2F2 gene plays an important role in angiogenesis and heart development. Moreover, this gene is involved in organogenesis in many other organs in mouse models. Variants in this gene have been reported in a number of patients with nonsyndromic atrioventricular septal defect, and in one patient with congenital heart defect and dysmorphic features. Here we report an 11-month-old Caucasian male with global developmental delay, dysmorphic features, coarctation of the aorta, and ventricular septal defect. He was later found to have a pathogenic mutation in the NR2F2 gene by whole exome sequencing. This is the second instance in which an NR2F2 mutation has been identified in a child with a congenital heart defect and other anomalies. This case suggests that some variants in NR2F2 may cause syndromic forms of congenital heart defect. © 2018 Wiley Periodicals, Inc.

  5. De novo characterization of the spleen transcriptome of the large yellow croaker (Pseudosciaena crocea) and analysis of the immune relevant genes and pathways involved in the antiviral response

    KAUST Repository

    Mu, Yinnan

    2014-05-12

    The large yellow croaker (Pseudosciaena crocea) is an economically important marine fish in China. To understand the molecular basis for antiviral defense in this species, we used Illumina paired-end sequencing to characterize the spleen transcriptome of polyriboinosinic:polyribocytidylic acid [poly(I:C)]-induced large yellow croakers. The library produced 56,355,728 reads and assembled into 108,237 contigs. As a result, 15,192 unigenes were found from this transcriptome. Gene ontology analysis showed that 4,759 genes were involved in three major functional categories: biological process, cellular component, and molecular function. We further ascertained that numerous consensus sequences were homologous to known immune-relevant genes. Kyoto Encyclopedia of Genes and Genomes orthology mapping annotated 5,389 unigenes and identified numerous immune-relevant pathways. These immune-relevant genes and pathways revealed major antiviral immunity effectors, including but not limited to: pattern recognition receptors, adaptors and signal transducers, the interferons and interferon-stimulated genes, inflammatory cytokines and receptors, complement components, and B-cell and T-cell antigen activation molecules. Moreover, the partial genes of Toll-like receptor signaling pathway, RIG-I-like receptors signaling pathway, Janus kinase-Signal Transducer and Activator of Transcription (JAK-STAT) signaling pathway, and T-cell receptor (TCR) signaling pathway were found to be changed after poly(I:C) induction by real-time polymerase chain reaction (PCR) analysis, suggesting that these signaling pathways may be regulated by poly(I:C), a viral mimic. Overall, the antivirus-related genes and signaling pathways that were identified in response to poly(I:C) challenge provide valuable leads for further investigation of the antiviral defense mechanism in the large yellow croaker. © 2014 Mu et al.

  6. De novo characterization of the spleen transcriptome of the large yellow croaker (Pseudosciaena crocea) and analysis of the immune relevant genes and pathways involved in the antiviral response.

    Directory of Open Access Journals (Sweden)

    Yinnan Mu

    The large yellow croaker (Pseudosciaena crocea) is an economically important marine fish in China. To understand the molecular basis for antiviral defense in this species, we used Illumina paired-end sequencing to characterize the spleen transcriptome of polyriboinosinic:polyribocytidylic acid [poly(I:C)]-induced large yellow croakers. The library produced 56,355,728 reads and assembled into 108,237 contigs. As a result, 15,192 unigenes were found from this transcriptome. Gene ontology analysis showed that 4,759 genes were involved in three major functional categories: biological process, cellular component, and molecular function. We further ascertained that numerous consensus sequences were homologous to known immune-relevant genes. Kyoto Encyclopedia of Genes and Genomes orthology mapping annotated 5,389 unigenes and identified numerous immune-relevant pathways. These immune-relevant genes and pathways revealed major antiviral immunity effectors, including but not limited to: pattern recognition receptors, adaptors and signal transducers, the interferons and interferon-stimulated genes, inflammatory cytokines and receptors, complement components, and B-cell and T-cell antigen activation molecules. Moreover, the partial genes of Toll-like receptor signaling pathway, RIG-I-like receptors signaling pathway, Janus kinase-Signal Transducer and Activator of Transcription (JAK-STAT) signaling pathway, and T-cell receptor (TCR) signaling pathway were found to be changed after poly(I:C) induction by real-time polymerase chain reaction (PCR) analysis, suggesting that these signaling pathways may be regulated by poly(I:C), a viral mimic. Overall, the antivirus-related genes and signaling pathways that were identified in response to poly(I:C) challenge provide valuable leads for further investigation of the antiviral defense mechanism in the large yellow croaker.

  7. De novo characterization of the spleen transcriptome of the large yellow croaker (Pseudosciaena crocea) and analysis of the immune relevant genes and pathways involved in the antiviral response

    KAUST Repository

    Mu, Yinnan; Li, Mingyu; Ding, Feng; Ding, Yang; Ao, Jingqun; Hu, Songnian; Chen, Xinhua

    2014-01-01

    The large yellow croaker (Pseudosciaena crocea) is an economically important marine fish in China. To understand the molecular basis for antiviral defense in this species, we used Illumina paired-end sequencing to characterize the spleen transcriptome of polyriboinosinic:polyribocytidylic acid [poly(I:C)]-induced large yellow croakers. The library produced 56,355,728 reads and assembled into 108,237 contigs. As a result, 15,192 unigenes were found from this transcriptome. Gene ontology analysis showed that 4,759 genes were involved in three major functional categories: biological process, cellular component, and molecular function. We further ascertained that numerous consensus sequences were homologous to known immune-relevant genes. Kyoto Encyclopedia of Genes and Genomes orthology mapping annotated 5,389 unigenes and identified numerous immune-relevant pathways. These immune-relevant genes and pathways revealed major antiviral immunity effectors, including but not limited to: pattern recognition receptors, adaptors and signal transducers, the interferons and interferon-stimulated genes, inflammatory cytokines and receptors, complement components, and B-cell and T-cell antigen activation molecules. Moreover, the partial genes of Toll-like receptor signaling pathway, RIG-I-like receptors signaling pathway, Janus kinase-Signal Transducer and Activator of Transcription (JAK-STAT) signaling pathway, and T-cell receptor (TCR) signaling pathway were found to be changed after poly(I:C) induction by real-time polymerase chain reaction (PCR) analysis, suggesting that these signaling pathways may be regulated by poly(I:C), a viral mimic. Overall, the antivirus-related genes and signaling pathways that were identified in response to poly(I:C) challenge provide valuable leads for further investigation of the antiviral defense mechanism in the large yellow croaker. © 2014 Mu et al.

  8. De novo transcriptome sequencing of Acer palmatum and comprehensive analysis of differentially expressed genes under salt stress in two contrasting genotypes.

    Science.gov (United States)

    Rong, Liping; Li, Qianzhong; Li, Shushun; Tang, Ling; Wen, Jing

    2016-04-01

    Maple (Acer palmatum) is an important species for landscape planting worldwide. Salt stress affects the normal growth of the Maple leaf directly, leading to loss of esthetic value. However, the limited availability of Maple genomic information has hindered research on the mechanisms underlying this tolerance. In this study, we performed comprehensive analyses of the salt tolerance in two genotypes of Maple using RNA-seq. Approximately 146.4 million paired-end reads, representing 181,769 unigenes, were obtained. The N50 length of the unigenes was 738 bp, and their total length was over 102.66 Mb. 14,090 simple sequence repeats and over 500,000 single nucleotide polymorphisms were identified, which represent useful resources for marker development. Importantly, 181,769 genes were detected in at least one library, and 303 differentially expressed genes (DEGs) were identified between salt-sensitive and salt-tolerant genotypes. Among these DEGs, 125 were upregulated and 178 were downregulated genes. Two MYB-related proteins and one LEA protein were detected among the first 10 most downregulated genes. Moreover, a methyltransferase-related gene was detected among the first 10 most upregulated genes. The three most significantly enriched pathways were plant hormone signal transduction, arginine and proline metabolism, and photosynthesis. The transcriptome analysis provided a rich genetic resource for gene discovery related to salt tolerance in Maple, and in closely related species. The data will serve as an important public information platform to further our understanding of the molecular mechanisms involved in salt tolerance in Maple.

  9. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis.

    Directory of Open Access Journals (Sweden)

    Linchuan Fang

    Full Text Available Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron's response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, "Yanzhimi" (R. obtusum) was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding.

  10. De-novo RNA sequencing and metabolite profiling to identify genes involved in anthocyanin biosynthesis in Korean black raspberry (Rubus coreanus Miquel).

    Directory of Open Access Journals (Sweden)

    Tae Kyung Hyun

    Full Text Available The Korean black raspberry (Rubus coreanus Miquel, KB) on ripening is usually consumed as fresh fruit, whereas the unripe KB has been widely used as a source of traditional herbal medicine. Such a stage specific utilization of KB has been assumed due to the changing metabolite profile during fruit ripening process, but so far molecular and biochemical changes during its fruit maturation are poorly understood. To analyze biochemical changes during fruit ripening process at molecular level, firstly, we have sequenced, assembled, and annotated the transcriptome of KB fruits. Over 4.86 Gb of normalized cDNA prepared from fruits was sequenced using Illumina HiSeq™ 2000, and assembled into 43,723 unigenes. Secondly, we have reported that alterations in anthocyanins and proanthocyanidins are the major factors facilitating variations in these stages of fruits. In addition, up-regulation of F3'H1, DFR4 and LDOX1 resulted in the accumulation of cyanidin derivatives during the ripening process of KB, indicating the positive relationship between the expression of anthocyanin biosynthetic genes and the anthocyanin accumulation. Furthermore, the ability of RcMCHI2 (R. coreanus Miquel chalcone flavanone isomerase 2) gene to complement Arabidopsis transparent testa 5 mutant supported the feasibility of our transcriptome library to provide the gene resources for improving plant nutrition and pigmentation. Taken together, these datasets obtained from transcriptome library and metabolic profiling would be helpful to define the gene-metabolite relationships in this non-model plant.

  11. De novo comparative transcriptome analysis of genes involved in fruit morphology of pumpkin cultivars with extreme size difference and development of EST-SSR markers.

    Science.gov (United States)

    Xanthopoulou, Aliki; Ganopoulos, Ioannis; Psomopoulos, Fotis; Manioudaki, Maria; Moysiadis, Theodoros; Kapazoglou, Aliki; Osathanunkul, Maslin; Michailidou, Sofia; Kalivas, Apostolos; Tsaftaris, Athanasios; Nianiou-Obeidat, Irini; Madesis, Panagiotis

    2017-07-30

    The genetic basis of fruit size and shape was investigated for the first time in Cucurbita species and genetic loci associated with fruit morphology have been identified. Although extensive genomic resources are available at present for tomato (Solanum lycopersicum), cucumber (Cucumis sativus), melon (Cucumis melo) and watermelon (Citrullus lanatus), genomic databases for Cucurbita species are limited. Recently, our group reported the generation of pumpkin (Cucurbita pepo) transcriptome databases from two contrasting cultivars with extreme fruit sizes. In the current study we used these databases to perform comparative transcriptome analysis in order to identify genes with potential roles in fruit morphology and fruit size. Differential Gene Expression (DGE) analysis between cv. 'Munchkin' (small-fruit) and cv. 'Big Moose' (large-fruit) revealed a variety of candidate genes associated with fruit morphology with significant differences in gene expression between the two cultivars. In addition, we have set the framework for generating EST-SSR markers, which discriminate different C. pepo cultivars and show transferability to related Cucurbitaceae species. The results of the present study will contribute to both further understanding the molecular mechanisms regulating fruit morphology and furthermore identifying the factors that determine fruit size. Moreover, they may lead to the development of molecular marker tools for selecting genotypes with desired morphological traits. Copyright © 2017. Published by Elsevier B.V.

  12. De Novo Mutations in CHD4, an ATP-Dependent Chromatin Remodeler Gene, Cause an Intellectual Disability Syndrome with Distinctive Dysmorphisms

    NARCIS (Netherlands)

    Weiss, Karin; Terhal, Paulien A; Cohen, Lior; Bruccoleri, Michael; Irving, Melita; Martinez, Ariel F; Rosenfeld, Jill A; Machol, Keren; Yang, Yaping; Liu, Pengfei; Walkiewicz, Magdalena; Beuten, Joke; Gomez-Ospina, Natalia; Haude, Katrina; Fong, Chin-To; Enns, Gregory M; Bernstein, Jonathan A; Fan, Judith; Gotway, Garrett; Ghorbani, Mohammad; van Gassen, Koen; Monroe, Glen R; van Haaften, Gijs; Basel-Vanagaite, Lina; Yang, Xiang-Jiao; Campeau, Philippe M; Muenke, Maximilian

    2016-01-01

    Chromodomain helicase DNA-binding protein 4 (CHD4) is an ATP-dependent chromatin remodeler involved in epigenetic regulation of gene transcription, DNA repair, and cell cycle progression. Also known as Mi2β, CHD4 is an integral subunit of a well-characterized histone deacetylase complex. Here we

  13. De novo RNA sequencing transcriptome of Rhododendron obtusum identified the early heat response genes involved in the transcriptional regulation of photosynthesis

    Science.gov (United States)

    Tong, Jun; Dong, Yanfang; Xu, Dongyun; Mao, Jing; Zhou, Yuan

    2017-01-01

    Rhododendron spp. is an important ornamental species that is widely cultivated for landscape worldwide. Heat stress is a major obstacle for its cultivation in south China. Previous studies on rhododendron principally focused on its physiological and biochemical processes, which are involved in a series of stress tolerance. However, molecular or genetic properties of rhododendron’s response to heat stress are still poorly understood. The phenotype and chlorophyll fluorescence kinetics parameters of four rhododendron cultivars were compared under normal or heat stress conditions, and a cultivar with highest heat tolerance, “Yanzhimi” (R. obtusum) was selected for transcriptome sequencing. A total of 325,429,240 high quality reads were obtained and assembled into 395,561 transcripts and 92,463 unigenes. Functional annotation showed that 38,724 unigenes had sequence similarity to known genes in at least one of the proteins or nucleotide databases used in this study. These 38,724 unigenes were categorized into 51 functional groups based on Gene Ontology classification and were blasted to 24 known cluster of orthologous groups. A total of 973 identified unigenes belonged to 57 transcription factor families, including the stress-related HSF, DREB, ZNF, and NAC genes. Photosynthesis was significantly enriched in the Kyoto Encyclopedia of Genes and Genomes pathway, and the changed expression pattern was illustrated. The key pathways and signaling components that contribute to heat tolerance in rhododendron were revealed. These results provide a potentially valuable resource that can be used for heat-tolerance breeding. PMID:29059200

  14. What does a worm want with 20,000 genes?

    OpenAIRE

    Hodgkin, Jonathan

    2001-01-01

    The number of genes predicted for the Caenorhabditis elegans genome is remarkably high: approximately 20,000, if both protein-coding and RNA-coding genes are counted. This article discusses possible explanations for such a high value.

  15. De novo analysis of the Adelphocoris suturalis Jakovlev metathoracic scent glands transcriptome and expression patterns of pheromone biosynthesis-related genes.

    Science.gov (United States)

    Luo, Jing; Liu, Xiangyang; Liu, Lang; Zhang, Poyao; Chen, Longjia; Gao, Qiao; Ma, Weihua; Chen, Lizhen; Lei, Chaoliang

    2014-11-10

    Adelphocoris suturalis Jakovlev is a major cotton pest in Southern China. Metathoracic scent glands (MTGs) produced pheromones that play an important role in survival and population propagation of this species, and also show great potential for pest control. Up to the present, there is little information that underlined the molecular basis of the pheromone biosynthesis of this bug. It is essential to clarify genes involved in the production of pheromone components, and also in the regulation of the variation of the blend ratio. We sequenced the transcriptome of metathoracic scent glands (MTGs) of A. suturalis. A total of 52 million 91-bp-long reads were obtained and assembled into 70,296 unigenes with a mean length of 691 bp. Of these unigenes, a total of 26,744 (38%) unigenes showed significant similarity to known proteins in the NCBI database (E-valuepheromone biosynthesis were selected, and the gene expression patterns were verified by qRT-PCR. The qRT-PCR results indicated that Asdelta9-DES, AsFAR, AsAOX, Ascarboxylesterase, AsNT-ES and AsATFs have a higher expression level in the period when female A. suturalis release sex pheromones. These data constitute the first transcriptomic analysis exploring the repertoire of genes expressed in insect MTGs. We identified a large number of potential pheromone biosynthetic pathway genes. In this context, our study provides an invaluable resource for future exploration of molecular mechanisms of pheromone biosynthesis in A. suturalis, as well as other hemipteran species. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. De novo transcriptome analysis of pneumatophores (modified roots) in the true mangrove species Avicennia marina and identification of the genes related to root gas exchange

    Directory of Open Access Journals (Sweden)

    Purushothaman Natarajan

    2017-10-01

    Full Text Available Mangrove plants, which grow in estuaries, naturally tolerate extreme conditions of high salinity (90 ppt) and high light intensity. Avicennia marina is a true mangrove tree species whose unique features include physiological adaptations such as a modified root system (pneumatophores) and salt excretion glands in the leaves. The pneumatophores are a special type of roots with negative geotropism that project above the water surface or the level of flooded soils [1]. In contact with air these roots develop lenticels, which improve gas exchange between roots and environment [2]. In swamps and wetlands the presence of pneumatophores facilitates oxygen diffusion through the tissues, maintaining levels adequate for cellular respiration [3]. The objective of this study was to perform whole transcriptome analysis of pneumatophore tissue of A. marina by Illumina sequencing and to identify putative genes involved in the process of root gas exchange. We generated 19.73 million paired-end reads and assembled them into 86,856 unigenes with an average length of 772 bp. Further, annotation, tissue specific gene expression and genes related to root gas exchange will be presented.

  17. De Novo Transcriptome Assembly and Identification of Gene Candidates for Rapid Evolution of Soil Al Tolerance in Anthoxanthum odoratum at the Long-Term Park Grass Experiment.

    Science.gov (United States)

    Gould, Billie; McCouch, Susan; Geber, Monica

    2015-01-01

    Studies of adaptation in the wild grass Anthoxanthum odoratum at the Park Grass Experiment (PGE) provided one of the earliest examples of rapid evolution in plants. Anthoxanthum has become locally adapted to differences in soil Al toxicity, which have developed there due to soil acidification from long-term experimental fertilizer treatments. In this study, we used transcriptome sequencing to identify Al stress responsive genes in Anthoxanthum and identify candidates among them for further molecular study of rapid Al tolerance evolution at the PGE. We examined the Al content of Anthoxanthum tissues and conducted RNA-sequencing of root tips, the primary site of Al induced damage. We found that despite its high tolerance Anthoxanthum is not an Al accumulating species. Genes similar to those involved in organic acid exudation (TaALMT1, ZmMATE), cell wall modification (OsSTAR1), and internal Al detoxification (OsNRAT1) in cultivated grasses were responsive to Al exposure. Expression of a large suite of novel loci was also triggered by early exposure to Al stress in roots. Three-hundred forty five transcripts were significantly more up- or down-regulated in tolerant vs. sensitive Anthoxanthum genotypes, providing important targets for future study of rapid evolution at the PGE.

  18. De Novo Transcriptome Assembly and Identification of Gene Candidates for Rapid Evolution of Soil Al Tolerance in Anthoxanthum odoratum at the Long-Term Park Grass Experiment.

    Directory of Open Access Journals (Sweden)

    Billie Gould

    Full Text Available Studies of adaptation in the wild grass Anthoxanthum odoratum at the Park Grass Experiment (PGE) provided one of the earliest examples of rapid evolution in plants. Anthoxanthum has become locally adapted to differences in soil Al toxicity, which have developed there due to soil acidification from long-term experimental fertilizer treatments. In this study, we used transcriptome sequencing to identify Al stress responsive genes in Anthoxanthum and identify candidates among them for further molecular study of rapid Al tolerance evolution at the PGE. We examined the Al content of Anthoxanthum tissues and conducted RNA-sequencing of root tips, the primary site of Al induced damage. We found that despite its high tolerance Anthoxanthum is not an Al accumulating species. Genes similar to those involved in organic acid exudation (TaALMT1, ZmMATE), cell wall modification (OsSTAR1), and internal Al detoxification (OsNRAT1) in cultivated grasses were responsive to Al exposure. Expression of a large suite of novel loci was also triggered by early exposure to Al stress in roots. Three-hundred forty five transcripts were significantly more up- or down-regulated in tolerant vs. sensitive Anthoxanthum genotypes, providing important targets for future study of rapid evolution at the PGE.

  19. A Novel de Novo Mutation in the CD40 Ligand Gene in a Patient With a Mild X-Linked Hyper-IgM Phenotype Initially Diagnosed as CVID: New Aspects of Old Diseases

    Directory of Open Access Journals (Sweden)

    Tábata T. França

    2018-05-01

    Full Text Available Mutations in the CD40 ligand (CD40L) gene (CD40LG) lead to X-linked hyper-IgM syndrome (X-HIGM), which is a primary immunodeficiency (PID) characterized by decreased serum levels of IgG and IgA and normal or elevated IgM levels. Although most X-HIGM patients become symptomatic during the first or second year of life, during which they exhibit recurrent infections, some patients exhibit mild phenotypes, which are usually associated with hypomorphic mutations that do not abrogate protein expression or function. Here, we describe a 28-year-old man who initially presented with recurrent infections since the age of 7 years, when he exhibited meningitis caused by Cryptococcus neoformans. The patient had no family history of immunodeficiency, and based on clinical and laboratory presentation, he was initially diagnosed with common variable immunodeficiency (CVID). In subsequent years, he displayed several sporadic episodes of infection, including pneumonia, pharyngotonsillitis, acute otitis media, rhinosinusitis, fungal dermatosis, and intestinal helminthiasis. The evaluation of CD40L expression on the surface of activated CD3+CD4+ T cells from the patient showed decreased expression of CD40L. Genetic analysis revealed a novel de novo mutation consisting of a 6-nucleotide insertion in exon 1 of CD40LG, which confirmed the diagnosis of X-HIGM. In this report, we describe a novel mutation in the CD40L gene and highlight the complexities of PID diagnosis in light of atypical phenotypes and hypomorphic mutations as well as the importance of the differential diagnosis of PIDs.

  20. Assembly of the Boechera retrofracta Genome and Evolutionary Analysis of Apomixis-Associated Genes

    Directory of Open Access Journals (Sweden)

    Sergei Kliver

    2018-03-01

    Full Text Available Closely related to the model plant Arabidopsis thaliana, the genus Boechera is known to contain both sexual and apomictic species or accessions. Boechera retrofracta is a diploid sexually reproducing species and is thought to be an ancestral parent species of apomictic species. Here we report the de novo assembly of the B. retrofracta genome using short Illumina and Roche reads from 1 paired-end and 3 mate pair libraries. The distribution of 23-mers from the paired-end library indicated a low level of heterozygosity and the presence of detectable duplications and triplications. The genome size was estimated to be 227 Mb. N50 of the assembled scaffolds was 2.3 Mb. Using a hybrid approach that combines homology-based and de novo methods, 27,048 protein-coding genes were predicted. Repeats, transfer RNA (tRNA) and ribosomal RNA (rRNA) genes were also annotated. Finally, genes of B. retrofracta and 6 other Brassicaceae species were used for phylogenetic tree reconstruction. In addition, we explored the histidine exonuclease APOLLO locus, related to apomixis in Boechera, and proposed a model of its evolution through a series of duplications. An assembled genome of B. retrofracta will help in the challenging assembly of the highly heterozygous genomes of hybrid apomictic species.

  1. De Novo Glutamine Synthesis

    Science.gov (United States)

    He, Qiao; Shi, Xinchong; Zhang, Linqi; Yi, Chang; Zhang, Xuezhen

    2016-01-01

    Purpose: The aim of this study was to investigate the role of de novo glutamine (Gln) synthesis in the proliferation of C6 glioma cells and its detection with 13N-ammonia. Methods: Chronic Gln-deprived C6 glioma (0.06C6) cells were established. The proliferation rates of C6 and 0.06C6 cells were measured under the conditions of Gln deprivation along with or without the addition of ammonia or glutamine synthetase (GS) inhibitor. 13N-ammonia uptake was assessed in C6 cells by gamma counting and in rats with C6 and 0.06C6 xenografts by micro–positron emission tomography (PET) scanning. The expression of GS in C6 cells and xenografts was assessed by Western blotting and immunohistochemistry, respectively. Results: The Gln-deprived C6 cells showed decreased proliferation ability but had a significant increase in GS expression. Furthermore, we found that low concentration of ammonia was sufficient to maintain the proliferation of Gln-deprived C6 cells, and 13N-ammonia uptake in C6 cells showed Gln-dependent decrease, whereas inhibition of GS markedly reduced the proliferation of C6 cells as well as the uptake of 13N-ammonia. Additionally, microPET/computed tomography exhibited that subcutaneous 0.06C6 xenografts had higher 13N-ammonia uptake and GS expression in contrast to C6 xenografts. Conclusion: De novo Gln synthesis through ammonia–glutamate reaction plays an important role in the proliferation of C6 cells. 13N-ammonia can be a potential metabolic PET tracer for Gln-dependent tumors. PMID:27118759

  2. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    Directory of Open Access Journals (Sweden)

    Nathan Gould

    2014-10-01

    Full Text Available Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de-novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.

  3. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    Energy Technology Data Exchange (ETDEWEB)

    Gould, Nathan [Department of Computer Science, The College of New Jersey, Ewing, NJ (United States); Hendy, Oliver [Department of Biology, The College of New Jersey, Ewing, NJ (United States); Papamichail, Dimitris, E-mail: papamicd@tcnj.edu [Department of Computer Science, The College of New Jersey, Ewing, NJ (United States)

    2014-10-06

    Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein-coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review, we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.

  4. Computational Tools and Algorithms for Designing Customized Synthetic Genes

    International Nuclear Information System (INIS)

    Gould, Nathan; Hendy, Oliver; Papamichail, Dimitris

    2014-01-01

    Advances in DNA synthesis have enabled the construction of artificial genes, gene circuits, and genomes of bacterial scale. Freedom in de novo design of synthetic constructs provides significant power in studying the impact of mutations in sequence features, and verifying hypotheses on the functional information that is encoded in nucleic and amino acids. To aid this goal, a large number of software tools of variable sophistication have been implemented, enabling the design of synthetic genes for sequence optimization based on rationally defined properties. The first generation of tools dealt predominantly with singular objectives such as codon usage optimization and unique restriction site incorporation. Recent years have seen the emergence of sequence design tools that aim to evolve sequences toward combinations of objectives. The design of optimal protein-coding sequences adhering to multiple objectives is computationally hard, and most tools rely on heuristics to sample the vast sequence design space. In this review, we study some of the algorithmic issues behind gene optimization and the approaches that different tools have adopted to redesign genes and optimize desired coding features. We utilize test cases to demonstrate the efficiency of each approach, as well as identify their strengths and limitations.

  5. Vertebrate gene predictions and the problem of large genes

    DEFF Research Database (Denmark)

    Wang, Jun; Li, ShengTing; Zhang, Yong

    2003-01-01

    To find unknown protein-coding genes, annotation pipelines use a combination of ab initio gene prediction and similarity to experimentally confirmed genes or proteins. Here, we show that although the ab initio predictions have an intrinsically high false-positive rate, they also have a consistent...

  6. The clinical impact of chromosomal rearrangements with breakpoints upstream of the SOX9 gene: two novel de novo balanced translocations associated with acampomelic campomelic dysplasia.

    Science.gov (United States)

    Fonseca, Ana Carolina S; Bonaldi, Adriano; Bertola, Débora R; Kim, Chong A; Otto, Paulo A; Vianna-Morgante, Angela M

    2013-05-07

    The association of balanced rearrangements with breakpoints near SOX9 [SRY (sex determining region Y)-box 9] with skeletal abnormalities has been ascribed to the presumptive altering of SOX9 expression by the direct disruption of regulatory elements, their separation from SOX9 or the effect of juxtaposed sequences. We report on two sporadic apparently balanced translocations, t(7;17)(p13;q24) and t(17;20)(q24.3;q11.2), whose carriers have skeletal abnormalities that led to the diagnosis of acampomelic campomelic dysplasia (ACD; MIM 114290). No pathogenic chromosomal imbalances were detected by a-CGH. The chromosome 17 breakpoints were mapped, respectively, 917-855 kb and 601-585 kb upstream of the SOX9 gene. A distal cluster of balanced rearrangements breakpoints on chromosome 17 associated with SOX9-related skeletal disorders has been mapped to a segment 932-789 kb upstream of SOX9. In this cluster, the breakpoint of the herein described t(17;20) is the most telomeric to SOX9, thus allowing the redefining of the telomeric boundary of the distal breakpoint cluster region related to skeletal disorders to 601-585 kb upstream of SOX9. Although both patients have skeletal abnormalities, the t(7;17) carrier presents with relatively mild clinical features, whereas the t(17;20) was detected in a boy with severe bronchomalacia, depending on mechanical ventilation. Balanced and unbalanced rearrangements associated with disorders of sex determination led to the mapping of a regulatory region of SOX9 function on testicular differentiation to a 517-595 kb interval upstream of SOX9, in addition to TESCO (Testis-specific enhancer of SOX9 core). As the carrier of t(17;20) has an XY sex-chromosome constitution and normal male development for his age, the segment of chromosome 17 distal to the translocation breakpoint should contain the regulatory elements for normal testis development. These two novel translocations illustrate the clinical variability in carriers of balanced

  7. De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

    Science.gov (United States)

    2012-01-01

    Background Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeño and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes. Results Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80–120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were

  8. De novo transcriptome assembly of a Chinese locoweed (Oxytropis ochrocephala) species provides insights into genes associated with drought, salinity and cold tolerance

    Directory of Open Access Journals (Sweden)

    Wei He

    2015-12-01

    Full Text Available Background: Locoweeds (toxic Oxytropis and Astragalus species), containing the toxic agent swainsonine, pose serious threats to animal husbandry on grasslands in both China and the US. Some locoweeds have evolved adaptations in order to resist various stress conditions such as drought, salt and cold. As a result they replace other plants in their communities and become an ecological problem. Currently very limited genetic information on locoweeds is available, and this hinders our understanding of the molecular basis of their environmental plasticity and of the interaction between locoweeds and their symbiotic swainsonine-producing endophytes. Next-generation sequencing provides a means of obtaining transcriptomic sequences in a timely manner, which is particularly useful for non-model plants. In this study, we performed transcriptome sequencing of Oxytropis ochrocephala plants followed by a de novo assembly. Our primary aim was to provide an enriched pool of genetic sequences of an Oxytropis sp. for further locoweed research. Results: Transcriptomes of four different O. ochrocephala samples, from control (CK) plants and those that had experienced either drought (20% PEG), salt (150 mM NaCl) or cold (4 °C) stress, were sequenced using an Illumina Hiseq 2000 platform. From 232,209,506 clean reads (23,220,950,600 nucleotides, ~23 G), 182,430 transcripts and 88,942 unigenes were retrieved, with an N50 value of 1,237. Differential expression analysis revealed putative genes encoding heat shock proteins (HSPs) and late embryogenesis abundant (LEA) proteins, enzymes in secondary metabolite and plant hormone biosyntheses, and transcription factors which are involved in stress tolerance in O. ochrocephala. In order to validate our sequencing results, we further analyzed the expression profiles of nine genes by quantitative real-time PCR. Finally, we discuss the possible mechanism of O. ochrocephala’s adaptations to stress environment. Conclusion: Our

  9. Whole-Exome Sequencing Identifies One De Novo Variant in the FGD6 Gene in a Thai Family with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Chuphong Thongnak

    2018-01-01

    Full Text Available Autism spectrum disorder (ASD) has a strong genetic basis, although the genetics of autism is complex and remains unclear. Genetic tests such as microarray analysis or sequencing have been widely used to identify autism markers, but they are unsuccessful in several cases. The objective of this study is to identify causative variants of autism in two Thai families by using the whole-exome sequencing technique. Whole-exome sequencing was performed with autism-affected children from two unrelated families. Each sample was sequenced on the SOLiD 5500xl Genetic Analyzer system, followed by a combined bioinformatics pipeline including annotation and filtering processes to identify candidate variants. Candidate variants were validated, and the segregation study with other family members was performed using Sanger sequencing. This study identified a possible causative variant for ASD, c.2951G>A, in the FGD6 gene. We demonstrated the potential for identifying genetic variants associated with ASD using whole-exome sequencing and a bioinformatics filtering procedure. These techniques could be useful in identifying possible causative ASD variants, especially in cases in which variants cannot be identified by other techniques.

  10. An in vivo genetic screen for genes involved in spliced leader trans-splicing indicates a crucial role for continuous de novo spliced leader RNP assembly.

    Science.gov (United States)

    Philippe, Lucas; Pandarakalam, George C; Fasimoye, Rotimi; Harrison, Neale; Connolly, Bernadette; Pettitt, Jonathan; Müller, Berndt

    2017-08-21

    Spliced leader (SL) trans-splicing is a critical element of gene expression in a number of eukaryotic groups. This process is arguably best understood in nematodes, where biochemical and molecular studies in Caenorhabditis elegans and Ascaris suum have identified key steps and factors involved. Despite this, the precise details of SL trans-splicing have yet to be elucidated. In part, this is because the systematic identification of the molecules involved has not previously been possible due to the lack of a specific phenotype associated with defects in this process. We present here a novel GFP-based reporter assay that can monitor SL1 trans-splicing in living C. elegans. Using this assay, we have identified mutants in sna-1 that are defective in SL trans-splicing, and demonstrate that reducing function of SNA-1, SNA-2 and SUT-1, proteins that associate with SL1 RNA and related SmY RNAs, impairs SL trans-splicing. We further demonstrate that the Sm proteins and pICln, SMN and Gemin5, which are involved in small nuclear ribonucleoprotein assembly, have an important role in SL trans-splicing. Taken together these results provide the first in vivo evidence for proteins involved in SL trans-splicing, and indicate that continuous replacement of SL ribonucleoproteins consumed during trans-splicing reactions is essential for effective trans-splicing. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. De novo assembly of mud loach (Misgurnus anguillicaudatus) skin transcriptome to identify putative genes involved in immunity and epidermal mucus secretion.

    Directory of Open Access Journals (Sweden)

    Yong Long

    Full Text Available Fish skin serves as the first line of defense against a wide variety of chemical, physical and biological stressors. Secretion of mucus is among the most prominent characteristics of fish skin and numerous innate immune factors have been identified in the epidermal mucus. However, molecular mechanisms underlying the mucus secretion and immune activities of fish skin remain largely unclear due to the lack of genomic and transcriptomic data for most economically important fish species. In this study, we characterized the skin transcriptome of mud loach using Illumina paired-end sequencing. A total of 40364 unigenes were assembled from 86.6 million (3.07 gigabases) filtered reads. The mean length, N50 size and maximum length of assembled transcripts were 387, 611 and 8670 bp, respectively. A total of 17336 (43.76%) unigenes were annotated by blast searches against the NCBI non-redundant protein database. Gene ontology mapping assigned a total of 108513 GO terms to 15369 (38.08%) unigenes. KEGG orthology mapping annotated 9337 (23.23%) unigenes. Among the identified KO categories, immune system is the largest category that contains various components of multiple immune pathways such as chemokine signaling, leukocyte transendothelial migration and T cell receptor signaling, suggesting the complexity of immune mechanisms in fish skin. As for mucin biosynthesis, 37 unigenes were mapped to 7 enzymes of the mucin type O-glycan biosynthesis pathway and 8 members of the polypeptide N-acetylgalactosaminyltransferase family were identified. Additionally, 38 unigenes were mapped to 23 factors of the SNARE interactions in vesicular transport pathway, indicating that the activity of this pathway is required for the processes of epidermal mucus storage and release. Moreover, 1754 simple sequence repeats (SSRs) were detected in 1564 unigenes and dinucleotide repeats represented the most abundant type. These findings have laid the foundation for further understanding

  12. De novo assembly of mud loach (Misgurnus anguillicaudatus) skin transcriptome to identify putative genes involved in immunity and epidermal mucus secretion.

    Science.gov (United States)

    Long, Yong; Li, Qing; Zhou, Bolan; Song, Guili; Li, Tao; Cui, Zongbin

    2013-01-01

    Fish skin serves as the first line of defense against a wide variety of chemical, physical and biological stressors. Secretion of mucus is among the most prominent characteristics of fish skin and numerous innate immune factors have been identified in the epidermal mucus. However, molecular mechanisms underlying the mucus secretion and immune activities of fish skin remain largely unclear due to the lack of genomic and transcriptomic data for most economically important fish species. In this study, we characterized the skin transcriptome of mud loach using Illumina paired-end sequencing. A total of 40364 unigenes were assembled from 86.6 million (3.07 gigabases) filtered reads. The mean length, N50 size and maximum length of assembled transcripts were 387, 611 and 8670 bp, respectively. A total of 17336 (43.76%) unigenes were annotated by blast searches against the NCBI non-redundant protein database. Gene ontology mapping assigned a total of 108513 GO terms to 15369 (38.08%) unigenes. KEGG orthology mapping annotated 9337 (23.23%) unigenes. Among the identified KO categories, immune system is the largest category that contains various components of multiple immune pathways such as chemokine signaling, leukocyte transendothelial migration and T cell receptor signaling, suggesting the complexity of immune mechanisms in fish skin. As for mucin biosynthesis, 37 unigenes were mapped to 7 enzymes of the mucin type O-glycan biosynthesis pathway and 8 members of the polypeptide N-acetylgalactosaminyltransferase family were identified. Additionally, 38 unigenes were mapped to 23 factors of the SNARE interactions in vesicular transport pathway, indicating that the activity of this pathway is required for the processes of epidermal mucus storage and release. Moreover, 1754 simple sequence repeats (SSRs) were detected in 1564 unigenes and dinucleotide repeats represented the most abundant type. These findings have laid the foundation for further understanding the secretary
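
    As an illustration of the SSR mining step mentioned in this abstract, the sketch below scans a sequence for perfect tandem repeats using regular expressions. The abstract does not name the tool or the thresholds that were used, so the function name `find_ssrs`, the minimum repeat counts and the toy sequence are all assumptions made for this example only.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
# These are illustrative values, not the settings used in the study.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def _is_compound(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=None):
    """Return a sorted list of (start, motif, copies) for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if _is_compound(motif):
                continue  # already reported under the shorter motif length
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
    for start, motif, copies in find_ssrs(toy):
        print(start, motif, copies)
```

    Counting the hits by motif length (2 for dinucleotide, 3 for trinucleotide, and so on) would give the kind of repeat-type breakdown the abstract summarizes; dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.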

  13. Mutations of NPM1 gene in de novo acute myeloid leukaemia: determination of incidence, distribution pattern and identification of two novel mutations in Indian population.

    Science.gov (United States)

    Ahmad, Firoz; Mandava, Swarna; Das, Bibhu Ranjan

    2009-06-01

    Mutations in the nucleophosmin (NPM1) gene have been recently described to occur in about one-third of acute myeloid leukaemias (AMLs) and represent the most frequent genetic alteration currently known in this subset, especially in those with normal karyotype. This study explored the prevalence and clinical profile of NPM1 mutations in a cohort of 200 Indian adults and children with AML. NPM1 mutations were observed in 19.5% of the whole cohort and 34.2% of those with normal karyotype. Adults had a significantly higher incidence of NPM1 mutations than children [38 of 161 (23.6%) vs. 1 of 39 (2.5%), p = 0.002]. NPM1 mutations were significantly associated with normal karyotype (p = 0.001), high WBC count (p = 0.034), AML-M4 subtype (p = 0.039) and a gradient increase of mutation rate with increasing age group. Sequence analysis of 39 mutated cases revealed typical mutations (types A, B, D, Nm and H*) as well as two novel variations (types F1 and F2). The majority of the patients had mutation type A (69.2%), followed by B (5.1%), D (15.3%), H* (2.5%) and Nm (2.5%), all involving the COOH terminal of the NPM1 protein. In conclusion, this study represents the first report of NPM1 mutation from the Indian population and confirms that the incidence of NPM1 mutations varies considerably globally, with slightly lower incidence in the Indian population compared to western countries. The current study also served to identify two novel NPM1 mutants that add new insights into the heterogeneity of genomic insertions at exon 12. More ongoing larger studies are warranted to elucidate the molecular pathogenesis of AML that arises in this part of the world. Furthermore, we believe that in light of its high prevalence worldwide, inclusion of NPM1 mutation detection assay in diagnostic evaluations of AML may improve the efficacy of routine genetic characterization and allow assignment of patients to better-defined risk categories.
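
    The adult versus child comparison reported above (38 of 161 vs. 1 of 39) is a 2 x 2 contingency table. The abstract does not state which test produced p = 0.002, so the sketch below uses a two-sided Fisher's exact test purely as an assumed stand-in, built on the standard library only; the function name is hypothetical and the resulting p-value need not match the published one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the first row contains x of the col1 "mutated" cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )


if __name__ == "__main__":
    adults_mutated, adults_total = 38, 161      # counts taken from the abstract
    children_mutated, children_total = 1, 39    # counts taken from the abstract
    p = fisher_exact_two_sided(
        adults_mutated, adults_total - adults_mutated,
        children_mutated, children_total - children_mutated,
    )
    print(f"adults:   {adults_mutated / adults_total:.1%}")
    print(f"children: {children_mutated / children_total:.1%}")
    print(f"two-sided Fisher p = {p:.4f}")
```

    A chi-square test on the same table would be an equally plausible reading of the abstract; either way the counts, not the test choice, come from the source.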

  14. De Novo Coding Variants Are Strongly Associated with Tourette Disorder

    DEFF Research Database (Denmark)

    Willsey, A Jeremy; Fernandez, Thomas V; Yu, Dongmei

    2017-01-01

    Whole-exome sequencing (WES) and de novo variant detection have proven a powerful approach to gene discovery in complex neurodevelopmental disorders. We have completed WES of 325 Tourette disorder trios from the Tourette International Collaborative Genetics cohort and a replication sample of 186 ...

  15. Primate-specific spliced PMCHL RNAs are non-protein coding in human and macaque tissues

    Directory of Open Access Journals (Sweden)

    Delerue-Audegond Audrey

    2008-12-01

    Full Text Available Abstract Background Brain-expressed genes that were created in primate lineage represent obvious candidates to investigate molecular mechanisms that contributed to neural reorganization and emergence of new behavioural functions in Homo sapiens. PMCHL1 arose from retroposition of a pro-melanin-concentrating hormone (PMCH) antisense mRNA on the ancestral human chromosome 5p14 when platyrrhines and catarrhines diverged. Mutations before divergence of hylobatidae led to creation of new exons and finally PMCHL1 duplicated in an ancestor of hominids to generate PMCHL2 at the human chromosome 5q13. A complex pattern of spliced and unspliced PMCHL RNAs were found in human brain and testis. Results Several novel spliced PMCHL transcripts have been characterized in human testis and fetal brain, identifying an additional exon and novel splice sites. Sequencing of PMCHL genes in several non-human primates allowed us to carry out phylogenetic analyses revealing that the initial retroposition event took place within an intron of the brain cadherin (CDH12) gene, soon after platyrrhine/catarrhine divergence, i.e. 30–35 Mya, and was concomitant with the insertion of an AluSg element. Sequence analysis of the spliced PMCHL transcripts identified only short ORFs of less than 300 bp, with low (VMCH-p8 and protein variants) or no evolutionary conservation. Western blot analyses of human and macaque tissues expressing PMCHL RNA failed to reveal any protein corresponding to VMCH-p8 and protein variants encoded by spliced transcripts. Conclusion Our present results improve our knowledge of the gene structure and the evolutionary history of the primate-specific chimeric PMCHL genes. These genes produce multiple spliced transcripts, bearing short, non-conserved and apparently non-translated ORFs that may function as mRNA-like non-coding RNAs.

  16. CpG + CpNpG Analysis of Protein-Coding Sequences from Tomato

    DEFF Research Database (Denmark)

    Hobolth, Asger; Nielsen, Rasmus; Wang, Ying

    2006-01-01

    We develop codon-based models for simultaneously inferring the mutational effects of CpG and CpNpG methylation in coding regions. In a data set of 369 tomato genes, we show that there is very little effect of CpNpG methylation but a strong effect of CpG methylation affecting almost all genes. We further show that the CpNpG and CpG effects are largely uncorrelated. Our results suggest different roles of CpG and CpNpG methylation, with CpNpG methylation possibly playing a specialized role in defense against transposons and RNA viruses.
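
    To make the two sequence contexts in this abstract concrete, the sketch below simply counts CpG and CpNpG sites in a coding sequence. It is not the codon-based model the authors describe; the function name and the toy sequence are assumptions, and N is taken here to mean any of A, C, G or T.

```python
import re


def count_methylation_contexts(seq):
    """Count overlapping CpG ('CG') and CpNpG ('C', any base, 'G') sites."""
    seq = seq.upper()
    # Zero-width lookaheads let overlapping sites each be counted once.
    cpg_sites = [m.start() for m in re.finditer(r"(?=CG)", seq)]
    cpnpg_sites = [m.start() for m in re.finditer(r"(?=C[ACGT]G)", seq)]
    return {"CpG": len(cpg_sites), "CpNpG": len(cpnpg_sites)}


if __name__ == "__main__":
    toy = "ACGTCAGCCGG"  # made-up sequence for illustration
    print(count_methylation_contexts(toy))
```

    In the toy sequence the CpG sites start at positions 1 and 8, and the CpNpG sites (CAG, CCG, CGG) start at positions 4, 7 and 8, so a single cytosine can sit in both contexts at once.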

  17. PROTEIN-CODING GENES AS MOLECULAR MARKERS FOR ECOLOGICALLY DISTINCT POPULATIONS: THE CASE OF TWO BACILLUS SPECIES. (R825348)

    Science.gov (United States)

    The perspectives, information and conclusions conveyed in research project abstracts, progress reports, final reports, journal abstracts and journal publications convey the viewpoints of the principal investigator and may not represent the views and policies of ORD and EPA. Concl...

  18. De novo pathway-based biomarker identification

    DEFF Research Database (Denmark)

    Alcaraz, Nicolas; List, Markus; Batra, Richa

    2017-01-01

    Gene expression profiles have been extensively discussed as an aid to guide the therapy by predicting disease outcome for the patients suffering from complex diseases, such as cancer. However, prediction models built upon single-gene (SG) features show poor stability and performance on independent ... on their molecular subtypes can provide a detailed view of the disease and lead to more personalized therapies. We propose and discuss a novel MG approach based on de novo pathways, which for the first time have been used as features in a multi-class setting to predict cancer subtypes. Comprehensive evaluation in a large cohort of breast cancer samples from The Cancer Genome Atlas (TCGA) revealed that MGs are considerably more stable than SG models, while also providing valuable insight into the cancer hallmarks that drive them. In addition, when tested on an independent benchmark non-TCGA dataset, MG features ...

  19. The Development of Three Long Universal Nuclear Protein-Coding Locus Markers and Their Application to Osteichthyan Phylogenetics with Nested PCR

    Science.gov (United States)

    Zhang, Peng

    2012-01-01

    Background Universal nuclear protein-coding locus (NPCL) markers that are applicable across diverse taxa and show good phylogenetic discrimination have broad applications in molecular phylogenetic studies. For example, RAG1, a representative NPCL marker, has been successfully used to make phylogenetic inferences within all major osteichthyan groups. However, such markers with broad working range and high phylogenetic performance are still scarce. It is necessary to develop more universal NPCL markers comparable to RAG1 for osteichthyan phylogenetics. Methodology/Principal Findings We developed three long universal NPCL markers (>1.6 kb each) based on single-copy nuclear genes (KIAA1239, SACS and TTN) that possess large exons and exhibit the appropriate evolutionary rates. We then compared their phylogenetic utilities with that of the reference marker RAG1 in 47 jawed vertebrate species. In comparison with RAG1, each of the three long universal markers yielded similar topologies and branch supports, all in congruence with the currently accepted osteichthyan phylogeny. To compare their phylogenetic performance visually, we also estimated the phylogenetic informativeness (PI) profile for each of the four long universal NPCL markers. The PI curves indicated that SACS performed best over the whole timescale, while RAG1, KIAA1239 and TTN exhibited similar phylogenetic performances. In addition, we compared the success of nested PCR and standard PCR when amplifying NPCL marker fragments. The amplification success rate and efficiency of the nested PCR were overwhelmingly higher than those of standard PCR. Conclusions/Significance Our work clearly demonstrates the superiority of nested PCR over the conventional PCR in phylogenetic studies and develops three long universal NPCL markers (KIAA1239, SACS and TTN) with the nested PCR strategy. The three markers exhibit high phylogenetic utilities in osteichthyan phylogenetics and can be widely used as pilot genes for

  20. Influence of the Leader protein coding region of foot-and-mouth disease virus on virus replication

    DEFF Research Database (Denmark)

    Belsham, Graham

    2013-01-01

    The foot-and-mouth disease virus (FMDV) Leader (L) protein is produced in two forms, Lab and Lb, differing only at their amino-termini, due to the use of separate initiation codons, usually 84 nt apart. It has been shown previously, and confirmed here, that precise deletion of the Lab coding ... , in the context of the virus lacking the Lb coding region, was also tolerated by the virus within BHK cells. However, precise loss of the Lb coding sequence alone blocked FMDV replication in primary bovine thyroid cells. Thus, the requirement for the Leader protein coding sequences is highly dependent on the nature and extent of the residual Leader protein sequences and on the host cell system used. FMDVs precisely lacking Lb and with the Lab initiation codon modified may represent safer seed viruses for vaccine production.

  1. TYK2 protein-coding variants protect against rheumatoid arthritis and autoimmunity, with no evidence of major pleiotropic effects on non-autoimmune complex traits.

    Directory of Open Access Journals (Sweden)

    Dorothée Diogo

    Full Text Available Despite the success of genome-wide association studies (GWAS) in detecting a large number of loci for complex phenotypes such as rheumatoid arthritis (RA) susceptibility, the lack of information on the causal genes leaves important challenges to interpret GWAS results in the context of the disease biology. Here, we genetically fine-map the RA risk locus at 19p13 to define causal variants, and explore the pleiotropic effects of these same variants in other complex traits. First, we combined Immunochip dense genotyping (n = 23,092 case/control samples), Exomechip genotyping (n = 18,409 case/control samples) and targeted exon-sequencing (n = 2,236 case/control samples) to demonstrate that three protein-coding variants in TYK2 (tyrosine kinase 2) independently protect against RA: P1104A (rs34536443, OR = 0.66, P = 2.3 x 10^-21), A928V (rs35018800, OR = 0.53, P = 1.2 x 10^-9), and I684S (rs12720356, OR = 0.86, P = 4.6 x 10^-7). Second, we show that the same three TYK2 variants protect against systemic lupus erythematosus (SLE, P(omnibus) = 6 x 10^-18), and provide suggestive evidence that two of the TYK2 variants (P1104A and A928V) may also protect against inflammatory bowel disease (IBD; P(omnibus) = 0.005). Finally, in a phenome-wide association study (PheWAS) assessing >500 phenotypes using electronic medical records (EMR) in >29,000 subjects, we found no convincing evidence for association of P1104A and A928V with complex phenotypes other than autoimmune diseases such as RA, SLE and IBD. Together, our results demonstrate the role of TYK2 in the pathogenesis of RA, SLE and IBD, and provide supporting evidence for TYK2 as a promising drug target for the treatment of autoimmune diseases.

  2. A novel missense mutation pattern of the GCH1 gene in dopa-responsive dystonia Novo padrão de mutação missense no gene GCH1 na distonia dopa-responsiva

    Directory of Open Access Journals (Sweden)

    Rosana H. Scola

    2007-12-01

    Full Text Available Dopa-responsive dystonia (DRD) is an inherited metabolic disorder now classified as DYT5 with two different biochemical defects: autosomal dominant GTP cyclohydrolase 1 (GCH1) deficiency or autosomal recessive tyrosine hydroxylase deficiency. We report the case of a 10-year-old girl with progressive generalized dystonia and gait disorder who presented a dramatic response to levodopa. The phenylalanine to tyrosine ratio was significantly higher after a phenylalanine loading test. The patient had two different heterozygous mutations in the GCH1 gene: the previously reported P23L mutation and a new Q182E mutation. The characteristics of DRD and the molecular genetic findings are discussed.

  3. Pesquisa de novos elementos

    Directory of Open Access Journals (Sweden)

    Gil Mário de Macedo Grassi

    1978-11-01

    Full Text Available The present study deals with the discovery of new elements synthesized by man. The introduction gives a general overview of the theories of nuclear transmutation, which is the method employed in these syntheses. The study then shows the importance of the Periodic Table, since it is through this table that one can predict new elements and their properties. The discoveries of the transuranic elements, together with the data on their first preparations, are also tabulated. The stability of these elements is also discussed, and future speculations are presented.

  4. Selecting Superior De Novo Transcriptome Assemblies: Lessons Learned by Leveraging the Best Plant Genome.

    Directory of Open Access Journals (Sweden)

    Loren A Honaas

    Full Text Available Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the instance of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack 1000s of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: (1) proportion of reads mapping to an assembly, (2) recovery of conserved, widely expressed genes, (3) N50 length statistics, and (4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.
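
    Of the four metrics listed in this abstract, the N50 length statistic is the least self-explanatory, so a minimal sketch follows. It uses the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases); the function name and the toy lengths are assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


if __name__ == "__main__":
    toy_lengths = [100, 200, 300, 400, 500]  # made-up contig lengths in bp
    print("total bases:", sum(toy_lengths))
    print("N50:", n50(toy_lengths))
```

    For the toy lengths the total is 1500 bp; the two longest contigs (500 + 400 = 900 bp) already exceed half of that, so the N50 is 400. Because N50 rewards long contigs regardless of correctness, it is usually read together with the other metrics rather than on its own, which is the point the abstract makes.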

  5. De novo nonsense mutations in ASXL1 cause Bohring-Opitz syndrome

    DEFF Research Database (Denmark)

    Hoischen, Alexander; van Bon, Bregje W M; Rodríguez-Santiago, Benjamín

    2011-01-01

    Bohring-Opitz syndrome is characterized by severe intellectual disability, distinctive facial features and multiple congenital malformations. We sequenced the exomes of three individuals with Bohring-Opitz syndrome and in each identified heterozygous de novo nonsense mutations in ASXL1, which is required for maintenance of both activation and silencing of Hox genes. In total, 7 out of 13 subjects with a Bohring-Opitz phenotype had de novo ASXL1 mutations, suggesting that the syndrome is genetically heterogeneous.

  6. Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae).

    Science.gov (United States)

    Nock, Catherine J; Baten, Abdul; Barkla, Bronwyn J; Furtado, Agnelo; Henry, Robert J; King, Graham J

    2016-11-17

    The large Gondwanan plant family Proteaceae is an early-diverging eudicot lineage renowned for its morphological, taxonomic and ecological diversity. Macadamia is the most economically important Proteaceae crop and represents an ancient rainforest-restricted lineage. The family is a focus for studies of adaptive radiation due to remarkable species diversification in Mediterranean-climate biodiversity hotspots, and numerous evolutionary transitions between biomes. Despite a long history of research, comparative analyses in the Proteaceae and macadamia breeding programs are restricted by a paucity of genetic information. To address this, we sequenced the genome and transcriptome of the widely grown Macadamia integrifolia cultivar 741. Over 95 gigabases of DNA and RNA-seq sequence data were de novo assembled and annotated. The draft assembly has a total length of 518 Mb and spans approximately 79% of the estimated genome size. Following annotation, 35,337 protein-coding genes were predicted, of which over 90% were expressed in at least one of the leaf, shoot or flower tissues examined. Gene family comparisons with five other eudicot species revealed 13,689 clusters containing macadamia genes and 1005 macadamia-specific clusters, and provided evidence for lineage-specific expansion of gene families involved in pathogen recognition, plant defense and monoterpene synthesis. Cyanogenesis is an important defense strategy in the Proteaceae, and a detailed analysis of macadamia gene homologues potentially involved in cyanogenic glycoside biosynthesis revealed several highly expressed candidate genes. The gene space of macadamia provides a foundation for comparative genomics, gene discovery and the acceleration of molecular-assisted breeding. This study presents the first available genomic resources for the large basal eudicot family Proteaceae, access to most macadamia genes and opportunities to uncover the genetic basis of traits of importance for adaptation and crop

  7. A Public Trial De Novo

    DEFF Research Database (Denmark)

    Vedel, Jane Bjørn; Gad, Christopher

    2011-01-01

    This article addresses the concept of “industrial interests” and examines its role in a topical controversy about a large research grant from a private foundation, the Novo Nordisk Foundation, to the University of Copenhagen. The authors suggest that the debate took the form of a “public trial” where the grant and close(r) intermingling between industry and public research was prosecuted and defended. First, the authors address how the grant was framed in the media. Second, they redescribe the case by introducing new “evidence” that, because of this framing, did not reach “the court...” The article ends with a discussion of some implications of the analysis, including that policy making, academic research, and public debates might benefit from more detailed accounts of interests and stakes.

  8. de novo computational enzyme design.

    Science.gov (United States)

    Zanghellini, Alexandre

    2014-10-01

    Recent advances in systems and synthetic biology as well as metabolic engineering are poised to transform industrial biotechnology by allowing us to design cell factories for the sustainable production of valuable fuels and chemicals. To deliver on their promises, such cell factories, as much as their brick-and-mortar counterparts, will require appropriate catalysts, especially for classes of reactions that are not known to be catalyzed by enzymes in natural organisms. A recently developed methodology, de novo computational enzyme design, can be used to create enzymes catalyzing novel reactions. Here we review the different classes of chemical reactions for which active protein catalysts have been designed as well as the results of detailed biochemical and structural characterization studies. We also discuss how combining de novo computational enzyme design with more traditional protein engineering techniques can alleviate the shortcomings of state-of-the-art computational design techniques and create novel enzymes with catalytic proficiencies on par with natural enzymes. Copyright © 2014 Elsevier Ltd. All rights reserved.

  9. Integrative Analyses of De Novo Mutations Provide Deeper Biological Insights into Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Atsushi Takata

    2018-01-01

    Full Text Available Recent studies have established important roles of de novo mutations (DNMs) in autism spectrum disorders (ASDs). Here, we analyze DNMs in 262 ASD probands of Japanese origin and confirm the “de novo paradigm” of ASDs across ethnicities. Based on this consistency, we combine the lists of damaging DNMs in our and published ASD cohorts (total number of trios, 4,244) and perform integrative bioinformatics analyses. Besides replicating the findings of previous studies, our analyses highlight ATP-binding genes and fetal cerebellar/striatal circuits. Analysis of individual genes identified 61 genes enriched for damaging DNMs, including ten genes for which our dataset now contributes to statistical significance. Screening of compounds altering the expression of genes hit by damaging DNMs reveals a global downregulating effect of valproic acid, a known risk factor for ASDs, whereas cardiac glycosides upregulate these genes. Collectively, our integrative approach provides deeper biological and potential medical insights into ASDs.

  10. Novel insights into the functional metabolic impact of an apparent de novo m.8993T>G variant in the MT-ATP6 gene associated with maternally inherited form of Leigh Syndrome.

    Science.gov (United States)

    Uittenbogaard, Martine; Brantner, Christine A; Fang, ZiShui; Wong, Lee-Jun C; Gropman, Andrea; Chiaramello, Anne

    2018-03-27

    In this study, we report a novel perspective of metabolic consequences for the m.8993T>G variant using fibroblasts from a proband with clinical symptoms compatible with Maternally Inherited Leigh Syndrome (MILS). Definitive diagnosis was corroborated by mitochondrial DNA testing for the pathogenic variant m.8993T>G in MT-ATP6 subunit by Sanger sequencing. The long-range PCR followed by massively parallel sequencing method detected the near homoplasmic m.8993T>G variant at 83% in the proband's fibroblasts and at 0.4% in the mother's fibroblasts. Our results are compatible with very low levels of germline heteroplasmy or an apparent de novo mutation. Our mitochondrial morphometric analysis reveals severe defects in mitochondrial cristae structure in the proband's fibroblasts. Our live-cell mitochondrial respiratory analyses show impaired oxidative phosphorylation with decreased spare respiratory capacity in response to energy stress in the proband's fibroblasts. We detected a diminished glycolysis with a lessened glycolytic capacity and reserve, revealing a stunted ability to switch to glycolysis upon full inhibition of OXPHOS activities. This dysregulated energy reprogramming results in a defective interplay between OXPHOS and glycolysis during an energy crisis. Our study sheds light on the potential pathophysiologic mechanism leading to chronic energy crisis in this MILS patient harboring the m.8993T>G variant. Copyright © 2018 Elsevier Inc. All rights reserved.

  11. UniNovo: a universal tool for de novo peptide sequencing.

    Science.gov (United States)

    Jeong, Kyowon; Kim, Sangtae; Pevzner, Pavel A

    2013-08-15

    Mass spectrometry (MS) instruments and experimental protocols are rapidly advancing, but de novo peptide sequencing algorithms to analyze tandem mass (MS/MS) spectra are lagging behind. Although existing de novo sequencing tools perform well on certain types of spectra [e.g. Collision Induced Dissociation (CID) spectra of tryptic peptides], their performance often deteriorates on other types of spectra, such as Electron Transfer Dissociation (ETD), Higher-energy Collisional Dissociation (HCD) spectra or spectra of non-tryptic digests. Thus, rather than developing a new algorithm for each type of spectra, we develop a universal de novo sequencing algorithm called UniNovo that works well for all types of spectra or even for spectral pairs (e.g. CID/ETD spectral pairs). UniNovo uses an improved scoring function that captures the dependences between different ion types, where such dependencies are learned automatically using a modified offset frequency function. The performance of UniNovo is compared with PepNovo+, PEAKS and pNovo using various types of spectra. The results show that the performance of UniNovo is superior to other tools for ETD spectra and superior or comparable with others for CID and HCD spectra. UniNovo also estimates the probability that each reported reconstruction is correct, using simple statistics that are readily obtained from a small training dataset. We demonstrate that the estimation is accurate for all tested types of spectra (including CID, HCD, ETD, CID/ETD and HCD/ETD spectra of trypsin, LysC or AspN digested peptides). UniNovo is implemented in JAVA and tested on Windows, Ubuntu and OS X machines. UniNovo is available at http://proteomics.ucsd.edu/Software/UniNovo.html along with the manual.

  12. De novo FBXO11 mutations are associated with intellectual disability and behavioural anomalies.

    Science.gov (United States)

    Fritzen, Daniel; Kuechler, Alma; Grimmel, Mona; Becker, Jessica; Peters, Sophia; Sturm, Marc; Hundertmark, Hela; Schmidt, Axel; Kreiß, Martina; Strom, Tim M; Wieczorek, Dagmar; Haack, Tobias B; Beck-Wödl, Stefanie; Cremer, Kirsten; Engels, Hartmut

    2018-05-01

    Intellectual disability (ID) has an estimated prevalence of 1.5-2%. In most affected individuals, its genetic basis remains unclear. Whole exome sequencing (WES) studies have identified a multitude of novel causative gene defects and have shown that a large proportion of sporadic ID cases results from de novo mutations. Here, we present two unrelated individuals with similar clinical features and deleterious de novo variants in FBXO11 detected by WES. Individual 1, a 14-year-old boy, has mild ID as well as mild microcephaly, corrected cleft lip and alveolus, hyperkinetic disorder, mild brain atrophy and minor facial dysmorphism. WES detected a heterozygous de novo 1 bp insertion in the splice donor site of exon 3. Individual 2, a 3-year-old boy, showed ID and pre- and postnatal growth retardation, postnatal mild microcephaly, hyperkinetic and restless behaviour, as well as mild dysmorphism. WES detected a heterozygous de novo frameshift mutation. While ten individuals with ID and de novo variants in FBXO11 have been reported as part of larger studies, only one of the reports has some additional clinical data. Interestingly, the latter individual carries the identical mutation as our individual 2 and also displays ID, intrauterine growth retardation, microcephaly, behavioural anomalies, and dysmorphisms. Thus, we confirm deleterious de novo mutations in FBXO11 as a cause of ID and start the delineation of the associated clinical picture which may also comprise postnatal microcephaly or borderline small head size and behavioural anomalies.

  13. Uncovering Clinical Features of De Novo Philadelphia Positive Myelodysplasia.

    Science.gov (United States)

    Armas, Aristides; Chen, Chen; Mims, Martha; Rivero, Gustavo

    2017-01-01

    Myelodysplastic syndrome (MDS) is cytogenetically heterogeneous and retains variable risk for acute myeloid leukemia transformation. Though not yet fully understood, there is an association between genetic abnormalities and defects in gene expression. The functional role for infrequent cytogenetic alteration remains unclear. An uncommon chromosomal abnormality is the presence of the Philadelphia (Ph) chromosome. Here, we report a patient with Ph+ MDS treated with low dose Dasatinib who achieved hematologic response for 7 months. In addition, we also examined the English literature on all de novo Ph+ MDS cases between 1996 and 2015 to gain insight into clinical features and outcome.

  14. A De novo Mutation in Dystrophin Causing Muscular Dystrophy in a Female Patient

    Directory of Open Access Journals (Sweden)

    Hao Yu

    2017-01-01

    Conclusions: We identified two novel de novo mutations of DMD gene in two Chinese pedigrees, one of which caused a female patient with muscular dystrophy. The mutational analysis is important for DMD patients and carriers in the absence of a family history. The NGS can help detect the mutations in MLPA-negative patients.

  15. De novo mutations of KIAA2022 in females cause intellectual disability and intractable epilepsy

    NARCIS (Netherlands)

    de Lange, Iris M; Helbig, Katherine L; Weckhuysen, Sarah; Møller, Rikke S; Velinov, Milen; Dolzhanskaya, Natalia; Marsh, Eric; Helbig, Ingo; Devinsky, Orrin; Tang, Sha; Mefford, Heather C; Myers, Candace T; van Paesschen, Wim; Striano, Pasquale; van Gassen, Koen; van Kempen, Marjan; de Kovel, Carolien G F; Piard, Juliette; Minassian, Berge A; Nezarati, Marjan M; Pessoa, André; Jacquette, Aurelia; Maher, Bridget; Balestrini, Simona; Sisodiya, Sanjay; Warde, Marie Therese Abi; De St Martin, Anne; Chelly, Jamel; van 't Slot, Ruben; Van Maldergem, Lionel; Brilstra, Eva H; Koeleman, Bobby P C

    2016-01-01

    BACKGROUND: Mutations in the KIAA2022 gene have been reported in male patients with X-linked intellectual disability, and related female carriers were unaffected. Here, we report 14 female patients who carry a heterozygous de novo KIAA2022 mutation and share a phenotype characterised by intellectual

  16. Whole-genome sequencing in autism identifies hot spots for de novo germline mutation

    DEFF Research Database (Denmark)

    Michaelson, Jacob J.; Shi, Yujian; Gujral, Madhusudan

    2012-01-01

    De novo mutation plays an important role in autism spectrum disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes and may also include nucleotide-substitution hot spots. We...

  17. De novo mutations of KIAA2022 in females cause intellectual disability and intractable epilepsy

    DEFF Research Database (Denmark)

    de Lange, Iris M; Helbig, Katherine L; Weckhuysen, Sarah

    2016-01-01

    BACKGROUND: Mutations in the KIAA2022 gene have been reported in male patients with X-linked intellectual disability, and related female carriers were unaffected. Here, we report 14 female patients who carry a heterozygous de novo KIAA2022 mutation and share a phenotype characterised by intellect...

  18. LncRNAs: emerging players in gene regulation and disease ...

    Indian Academy of Sciences (India)

    and Glavac 2013), accounting for about 20,000 protein coding ... general information on lncRNAs' feature (Da Sacco et al. 2012). ..... mal cells, stabilized Zeb2 intron encompasses an internal ..... cially growth-control genes and cell mobility-induced genes ..... RNAs in development and disease of the central nervous system.

  19. Novel Accurate Bacterial Discrimination by MALDI-Time-of-Flight MS Based on Ribosomal Proteins Coding in S10-spc-alpha Operon at Strain Level S10-GERMS

    Science.gov (United States)

    Tamura, Hiroto; Hotta, Yudai; Sato, Hiroaki

    2013-08-01

    Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is one of the most widely used mass-based approaches for bacterial identification and classification because of the simple sample preparation and extremely rapid analysis within a few minutes. To establish the accurate MALDI-TOF MS bacterial discrimination method at strain level, the ribosomal subunit proteins coded in the S 10-spc-alpha operon, which encodes half of the ribosomal subunit protein and is highly conserved in eubacterial genomes, were selected as reliable biomarkers. This method, named the S10-GERMS method, revealed that the strains of genus Pseudomonas were successfully identified and discriminated at species and strain levels, respectively; therefore, the S10-GERMS method was further applied to discriminate the pathovar of P. syringae. The eight selected biomarkers (L24, L30, S10, S12, S14, S16, S17, and S19) suggested the rapid discrimination of P. syringae at the strain (pathovar) level. The S10-GERMS method appears to be a powerful tool for rapid and reliable bacterial discrimination and successful phylogenetic characterization. In this article, an overview of the utilization of results from the S10-GERMS method is presented, highlighting the characterization of the Lactobacillus casei group and discrimination of the bacteria of genera Bacillus and Sphingopyxis despite only two and one base difference in the 16S rRNA gene sequence, respectively.

  20. ''De novo'' aneurysms following endovascular procedures

    International Nuclear Information System (INIS)

    Briganti, F.; Cirillo, S.; Caranci, F.; Esposito, F.; Maiuri, F.

    2002-01-01

    Two personal cases of ''de novo'' aneurysms of the anterior communicating artery (ACoA) occurring 9 and 4 years, respectively, after endovascular carotid occlusion are described. A review of the 30 reported cases (including our own two) of ''de novo'' aneurysms after occlusion of the major cerebral vessels has shown some features, including a rather long time interval after the endovascular procedure of up to 20-25 years (average 9.6 years), a preferential ACoA (36.3%) and internal carotid artery-posterior communicating artery (ICA-PCoA) (33.3%) location of the ''de novo'' aneurysms, and a 10% rate of multiple aneurysms. These data are compared with those of the group of reported spontaneous ''de novo'' aneurysms after SAH or previous aneurysm clipping. We agree that the frequency of ''de novo'' aneurysms after major-vessel occlusion (two among ten procedures in our series, or 20%) is higher than commonly reported (0 to 11%). For this reason, we suggest that patients who have been submitted to endovascular major-vessel occlusion be followed up for up to 20-25 years after the procedure, using non-invasive imaging studies such as MR angiography and high-resolution CT angiography. On the other hand, periodic digital angiography has a questionable risk-benefit ratio; it may be used when a ''de novo'' aneurysm is detected or suspected on non-invasive studies. The progressive enlargement of the ACoA after carotid occlusion, as described in our case 1, must be considered a radiological finding of risk for ''de novo'' aneurysm formation. (orig.)

  1. The limits of de novo DNA motif discovery.

    Directory of Open Access Journals (Sweden)

    David Simcha

    Full Text Available A major challenge in molecular biology is reverse-engineering the cis-regulatory logic that plays a major role in the control of gene expression. This program includes searching through DNA sequences to identify "motifs" that serve as the binding sites for transcription factors or, more generally, are predictive of gene expression across cellular conditions. Several approaches have been proposed for de novo motif discovery, that is, searching sequences without prior knowledge of binding sites or nucleotide patterns. However, unbiased validation is not straightforward. We consider two approaches to unbiased validation of discovered motifs: testing the statistical significance of a motif using a DNA "background" sequence model to represent the null hypothesis and measuring performance in predicting membership in gene clusters. We demonstrate that the background models typically used are "too null," resulting in overly optimistic assessments of significance, and argue that performance in predicting TF binding or expression patterns from DNA motifs should be assessed by held-out data, as in predictive learning. Applying this criterion to common motif discovery methods resulted in universally poor performance, although there is a marked improvement when motifs are statistically significant against real background sequences. Moreover, on synthetic data where "ground truth" is known, discriminative performance of all algorithms is far below the theoretical upper bound, with pronounced "over-fitting" in training. A key conclusion from this work is that the failure of de novo discovery approaches to accurately identify motifs is basically due to statistical intractability resulting from the fixed size of co-regulated gene clusters, and thus such failures do not necessarily provide evidence that unfound motifs are not active biologically. Consequently, the use of prior knowledge to enhance motif discovery is not just advantageous but necessary.
An implementation of
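
    The held-out evaluation that the abstract above argues for can be illustrated with a toy sketch. This is not the authors' implementation: the function names (kmers, best_kmer, accuracy, held_out_eval) are hypothetical, and exact k-mer presence stands in for a real motif model. The only point is that the motif is chosen on a training split and scored on sequences it never saw.

```python
import random
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers occurring in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_kmer(pos, neg, k):
    """Pick the k-mer whose presence best separates pos from neg.

    Score = fraction of positive sequences containing the k-mer minus the
    fraction of negative sequences containing it (ties broken alphabetically).
    Assumes both lists are non-empty.
    """
    pos_counts = Counter(m for s in pos for m in kmers(s, k))
    neg_counts = Counter(m for s in neg for m in kmers(s, k))
    return max(
        pos_counts,
        key=lambda m: (pos_counts[m] / len(pos) - neg_counts[m] / len(neg), m),
    )


def accuracy(motif, pos, neg):
    """Fraction of sequences classified correctly by 'motif present => positive'."""
    correct = sum(motif in s for s in pos) + sum(motif not in s for s in neg)
    return correct / (len(pos) + len(neg))


def held_out_eval(pos, neg, k, train_frac=0.5, seed=0):
    """Discover a motif on a training split, then score it on held-out data."""
    rng = random.Random(seed)
    pos, neg = pos[:], neg[:]
    rng.shuffle(pos)
    rng.shuffle(neg)
    cut_p = int(len(pos) * train_frac)
    cut_n = int(len(neg) * train_frac)
    motif = best_kmer(pos[:cut_p], neg[:cut_n], k)
    train_acc = accuracy(motif, pos[:cut_p], neg[:cut_n])
    test_acc = accuracy(motif, pos[cut_p:], neg[cut_n:])
    return motif, train_acc, test_acc
```

    A large gap between the training and held-out accuracies returned by held_out_eval would be the kind of over-fitting the abstract describes; with small gene clusters the training split is small, so such gaps are expected to be common.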

  2. Defining the maize transcriptome de novo using deep RNA-Seq

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Gross, Stephen; Choi, Cindy; Zhang, Tao; Lindquist, Erika; Wei, Chia-Lin; Wang, Zhong

    2011-06-01

    De novo assembly of the transcriptome is crucial for functional genomics studies in bioenergy research, since many of the organisms lack high quality reference genomes. In a previous study we successfully de novo assembled simple eukaryote transcriptomes exclusively from short Illumina RNA-Seq reads [1]. However, extensive alternative splicing, present in most of the higher eukaryotes, poses a significant challenge for current short read assembly processes. Furthermore, the size of next-generation datasets, often large for plant genomes, presents an informatics challenge. To tackle these challenges we present a combined experimental and informatics strategy for de novo assembly in higher eukaryotes. Using maize as a test case, preliminary results suggest our approach can resolve transcript variants and improve gene annotations.

  3. Defining the maize transcriptome de novo using deep RNA-Seq

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Gross, Stephen; Choi, Cindy; Zhang, Tao; Lindquist, Erika; Wei, Chia-Lin; Wang, Zhong

    2011-06-02

    De novo assembly of the transcriptome is crucial for functional genomics studies in bioenergy research, since many of the organisms lack high quality reference genomes. In a previous study we successfully de novo assembled simple eukaryote transcriptomes exclusively from short Illumina RNA-Seq reads [1]. However, extensive alternative splicing, present in most of the higher eukaryotes, poses a significant challenge for current short read assembly processes. Furthermore, the size of next-generation datasets, often large for plant genomes, presents an informatics challenge. To tackle these challenges we present a combined experimental and informatics strategy for de novo assembly in higher eukaryotes. Using maize as a test case, preliminary results suggest our approach can resolve transcript variants and improve gene annotations.

  4. Critical importance of the de novo pyrimidine biosynthesis pathway for Trypanosoma cruzi growth in the mammalian host cell cytoplasm

    International Nuclear Information System (INIS)

    Hashimoto, Muneaki; Morales, Jorge; Fukai, Yoshihisa; Suzuki, Shigeo; Takamiya, Shinzaburo; Tsubouchi, Akiko; Inoue, Syou; Inoue, Masayuki; Kita, Kiyoshi; Harada, Shigeharu; Tanaka, Akiko; Aoki, Takashi; Nara, Takeshi

    2012-01-01

    Highlights: ► We established Trypanosoma cruzi lacking the gene for carbamoyl phosphate synthetase II. ► Disruption of the cpsII gene significantly reduced the growth of epimastigotes. ► In particular, the CPSII-null mutant severely retarded intracellular growth. ► The de novo pyrimidine pathway is critical for the parasite growth in the host cell. -- Abstract: The intracellular parasitic protist Trypanosoma cruzi is the causative agent of Chagas disease in Latin America. In general, pyrimidine nucleotides are supplied by both de novo biosynthesis and salvage pathways. While epimastigotes—an insect form—possess both activities, amastigotes—an intracellular replicating form of T. cruzi—are unable to mediate the uptake of pyrimidine. However, the requirement of de novo pyrimidine biosynthesis for parasite growth and survival has not yet been elucidated. Carbamoyl-phosphate synthetase II (CPSII) is the first and rate-limiting enzyme of the de novo biosynthetic pathway, and increased CPSII activity is associated with the rapid proliferation of tumor cells. In the present study, we showed that disruption of the T. cruzi cpsII gene significantly reduced parasite growth. In particular, the growth of amastigotes lacking the cpsII gene was severely suppressed. Thus, the de novo pyrimidine pathway is important for proliferation of T. cruzi in the host cell cytoplasm and represents a promising target for chemotherapy against Chagas disease.

  5. Whole Exome Sequencing for a Patient with Rubinstein-Taybi Syndrome Reveals de Novo Variants besides an Overt CREBBP Mutation

    Directory of Open Access Journals (Sweden)

    Hee Jeong Yoo

    2015-03-01

    Full Text Available Rubinstein-Taybi syndrome (RSTS) is a rare condition with a prevalence of 1 in 125,000–720,000 births and characterized by clinical features that include facial, dental, and limb dysmorphology and growth retardation. Most cases of RSTS occur sporadically and are caused by de novo mutations. Cytogenetic or molecular abnormalities are detected in only 55% of RSTS cases. Previous genetic studies have yielded inconsistent results due to the variety of methods used for genetic analysis. The purpose of this study was to use whole exome sequencing (WES) to evaluate the genetic causes of RSTS in a young girl presenting with an Autism phenotype. We used the Autism diagnostic observation schedule (ADOS) and Autism diagnostic interview revised (ADI-R) to confirm her diagnosis of Autism. In addition, various questionnaires were used to evaluate other psychiatric features. We used WES to analyze the DNA sequences of the patient and her parents and to search for de novo variants. The patient showed all the typical features of Autism. WES revealed a de novo frameshift mutation in CREBBP and de novo sequence variants in TNC and IGFALS genes. Mutations in the CREBBP gene have been extensively reported in RSTS patients, while potential missense mutations in TNC and IGFALS genes have not previously been associated with RSTS. The TNC and IGFALS genes are involved in central nervous system development and growth. It is possible for patients with RSTS to have additional de novo variants that could account for previously unexplained phenotypes.

  6. A multiplexed miRNA and transgene expression platform for simultaneous repression and expression of protein coding sequences.

    Science.gov (United States)

    Seyhan, Attila A

    2016-01-01

    Knockdown of single or multiple gene targets by RNA interference (RNAi) is necessary to overcome escape mutants or isoform redundancy. It is also necessary to use multiple RNAi reagents to knockdown multiple targets. It is also desirable to express a transgene or positive regulatory elements and inhibit a target gene in a coordinated fashion. This study reports a flexible multiplexed RNAi and transgene platform using endogenous intronic primary microRNAs (pri-miRNAs) as a scaffold located in the green fluorescent protein (GFP) as a model for any functional transgene. The multiplexed intronic miRNA - GFP transgene platform was designed to co-express multiple small RNAs within the polycistronic cluster from a Pol II promoter at more moderate levels to reduce potential vector toxicity. The native intronic miRNAs are co-transcribed with a precursor GFP mRNA as a single transcript and presumably cleaved out of the precursor-(pre) mRNA by the RNA splicing machinery, the spliceosome. The spliced intron with miRNA hairpins will be further processed into mature miRNAs or small interfering RNAs (siRNAs) capable of triggering RNAi effects, while the ligated exons become a mature messenger RNA for the translation of the functional GFP protein. Data show that this approach led to robust RNAi-mediated silencing of multiple Renilla Luciferase (R-Luc)-tagged target genes and coordinated expression of functional GFP from a single transcript in transiently transfected HeLa cells. The results demonstrated that this design facilitates the coordinated expression of all mature miRNAs either as individual miRNAs or as multiple miRNAs and the associated protein. The data suggest that it is possible to simultaneously deliver multiple negative (miRNA or shRNA) and positive (transgene) regulatory elements.
Because many cellular processes require simultaneous repression and activation of downstream pathways, this approach offers a platform technology to achieve that dual manipulation efficiently

  7. DeNovoGUI: an open source graphical user interface for de novo sequencing of tandem mass spectra.

    Science.gov (United States)

    Muth, Thilo; Weilnböck, Lisa; Rapp, Erdmann; Huber, Christian G; Martens, Lennart; Vaudel, Marc; Barsnes, Harald

    2014-02-07

    De novo sequencing is a popular technique in proteomics for identifying peptides from tandem mass spectra without having to rely on a protein sequence database. Despite the strong potential of de novo sequencing algorithms, their adoption threshold remains quite high. We here present a user-friendly and lightweight graphical user interface called DeNovoGUI for running parallelized versions of the freely available de novo sequencing software PepNovo+, greatly simplifying the use of de novo sequencing in proteomics. Our platform-independent software is freely available under the permissive Apache2 open source license. Source code, binaries, and additional documentation are available at http://denovogui.googlecode.com.

  8. A novel genetic technique in Plasmodium berghei allows liver stage analysis of genes required for mosquito stage development and demonstrates that de novo heme synthesis is essential for liver stage development in the malaria parasite.

    Directory of Open Access Journals (Sweden)

    Upeksha L Rathnapala

    2017-06-01

    Full Text Available The combination of drug resistance, lack of an effective vaccine, and ongoing conflict and poverty means that malaria remains a major global health crisis. Understanding metabolic pathways at all parasite life stages is important in prioritising and targeting novel anti-parasitic compounds. The unusual heme synthesis pathway of the rodent malaria parasite, Plasmodium berghei, requires eight enzymes distributed across the mitochondrion, apicoplast and cytoplasm. Deletion of the ferrochelatase (FC) gene, the final enzyme in the pathway, confirms that heme synthesis is not essential in the red blood cell stages of the life cycle but is required to complete oocyst development in mosquitoes. The lethality of FC deletions in the mosquito stage makes it difficult to study the impact of these mutations in the subsequent liver stage. To overcome this, we combined locus-specific fluorophore expression with a genetic complementation approach to generate viable, heterozygous oocysts able to produce a mix of FC expressing and FC deficient sporozoites. These sporozoites show normal motility and can invade liver cells, where FC deficient parasites can be distinguished by fluorescence and phenotyped. Parasites lacking FC exhibit a severe growth defect within liver cells, with development failure detectable in the early to mid stages of liver development in vitro. FC deficient parasites could not complete liver stage development in vitro nor infect naïve mice, confirming liver stage arrest. These results validate the heme pathway as a potential target for prophylactic drugs targeting liver stage parasites. In addition, we demonstrate that our simple genetic approach can extend the phenotyping window beyond the insect stages, opening considerable scope for straightforward reverse genetic analysis of genes that are dispensable in blood stages but essential for completing mosquito development.

  9. A novel genetic technique in Plasmodium berghei allows liver stage analysis of genes required for mosquito stage development and demonstrates that de novo heme synthesis is essential for liver stage development in the malaria parasite.

    Science.gov (United States)

    Rathnapala, Upeksha L; Goodman, Christopher D; McFadden, Geoffrey I

    2017-06-01

    The combination of drug resistance, lack of an effective vaccine, and ongoing conflict and poverty means that malaria remains a major global health crisis. Understanding metabolic pathways at all parasite life stages is important in prioritising and targeting novel anti-parasitic compounds. The unusual heme synthesis pathway of the rodent malaria parasite, Plasmodium berghei, requires eight enzymes distributed across the mitochondrion, apicoplast and cytoplasm. Deletion of the ferrochelatase (FC) gene, the final enzyme in the pathway, confirms that heme synthesis is not essential in the red blood cell stages of the life cycle but is required to complete oocyst development in mosquitoes. The lethality of FC deletions in the mosquito stage makes it difficult to study the impact of these mutations in the subsequent liver stage. To overcome this, we combined locus-specific fluorophore expression with a genetic complementation approach to generate viable, heterozygous oocysts able to produce a mix of FC expressing and FC deficient sporozoites. These sporozoites show normal motility and can invade liver cells, where FC deficient parasites can be distinguished by fluorescence and phenotyped. Parasites lacking FC exhibit a severe growth defect within liver cells, with development failure detectable in the early to mid stages of liver development in vitro. FC deficient parasites could not complete liver stage development in vitro nor infect naïve mice, confirming liver stage arrest. These results validate the heme pathway as a potential target for prophylactic drugs targeting liver stage parasites. In addition, we demonstrate that our simple genetic approach can extend the phenotyping window beyond the insect stages, opening considerable scope for straightforward reverse genetic analysis of genes that are dispensable in blood stages but essential for completing mosquito development.

  10. A de novo missense mutation of FGFR2 causes facial dysplasia syndrome in Holstein cattle

    DEFF Research Database (Denmark)

    Agerholm, Jørgen Steen; McEvoy, Fintan; Heegaard, Steffen

    2017-01-01

    was suspected as all recorded cases were progeny of the same sire. Detailed investigations were performed to characterize the syndrome and to reveal its cause. Results Seven malformed calves were submitted for examination. All cases shared a common morphology with the most striking lesions being severe facial...... chromosome 26 where whole genome sequencing of a case-parent trio revealed two de novo variants perfectly associated with the disease: an intronic SNP in the DMBT1 gene and a single non-synonymous variant in the FGFR2 gene. This FGFR2 missense variant (c.927G>T) affects a gene encoding a member...... of the fibroblast growth factor receptor family, where amino acid sequence is highly conserved between members and across species. It is predicted to change an evolutionarily conserved tryptophan into a cysteine residue (p.Trp309Cys). Both variant alleles were proven to result from de novo mutation events...
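    The variant notation in this record can be checked by hand: coding position 927 is the third base of codon 309, and a G>T change there turns the tryptophan codon TGG into the cysteine codon TGT. The Python sketch below illustrates that arithmetic. It is not taken from the paper; the helper names are hypothetical, the codon table is deliberately partial, and the reference codon is assumed to be TGG because that is the only tryptophan codon in the standard genetic code.

```python
# Illustrative check of c.927G>T / p.Trp309Cys (helper names are hypothetical).
# Partial codon table: only the standard-code entries needed for this example.
CODON_TABLE = {"TGG": "Trp", "TGT": "Cys", "TGC": "Cys"}


def codon_index(cds_pos):
    """Map a 1-based coding position to (1-based codon number, 0-based offset)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3


def apply_snv(codon, offset, ref, alt):
    """Substitute one base in a codon after checking the reference base."""
    if codon[offset] != ref:
        raise ValueError("reference base does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


codon_no, offset = codon_index(927)
ref_codon = "TGG"  # assumption: Trp has a single codon in the standard code
alt_codon = apply_snv(ref_codon, offset, "G", "T")
print(codon_no, CODON_TABLE[ref_codon], "->", CODON_TABLE[alt_codon])
```

    The same two helpers would work for any single-base substitution in a coding sequence, provided the codon table is extended to all 64 codons.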

  11. Modular Engineering Concept at Novo Nordisk Engineering

    DEFF Research Database (Denmark)

    Moelgaard, Gert; Miller, Thomas Dedenroth

    1997-01-01

    This report describes the concept of a new engineering method at Novo Nordisk Engineering: Modular Engineering (ME). Three tools are designed to support project phases with different levels of detailing and abstraction. ME supports a standard, cross-functional breakdown of projects that facilitates...

  12. De novo insertions and deletions of predominantly paternal origin are associated with autism spectrum disorder

    Science.gov (United States)

    Dong, Shan; Walker, Michael F.; Carriero, Nicholas J.; DiCola, Michael; Willsey, A. Jeremy; Ye, Adam Y.; Waqar, Zainulabedin; Gonzalez, Luis E.; Overton, John D.; Frahm, Stephanie; Keaney, John F.; Teran, Nicole A.; Dea, Jeanselle; Mandell, Jeffrey D.; Bal, Vanessa Hus; Sullivan, Catherine A.; DiLullo, Nicholas M.; Khalil, Rehab O.; Gockley, Jake; Yuksel, Zafer; Sertel, Sinem M.; Ercan-Sencicek, A. Gulhan; Gupta, Abha R.; Mane, Shrikant M.; Sheldon, Michael; Brooks, Andrew I.; Roeder, Kathryn; Devlin, Bernie; State, Matthew W.; Wei, Liping; Sanders, Stephan J.

    2014-01-01

    SUMMARY Whole-exome sequencing (WES) studies have demonstrated the contribution of de novo loss-of-function single nucleotide variants to autism spectrum disorders (ASD). However, challenges in the reliable detection of de novo insertions and deletions (indels) have limited inclusion of these variants in prior analyses. Through the application of a robust indel detection method to WES data from 787 ASD families (2,963 individuals), we demonstrate that de novo frameshift indels contribute to ASD risk (OR=1.6; 95%CI=1.0-2.7; p=0.03), are more common in female probands (p=0.02), are enriched among genes encoding FMRP targets (p=6×10⁻⁹), and arise predominantly on the paternal chromosome (p<0.001). Based on mutation rates in probands versus unaffected siblings, de novo frameshift indels contribute to risk in approximately 3.0% of individuals with ASD. Finally, through observing clustering of mutations in unrelated probands, we report two novel ASD-associated genes: KMT2E (MLL5), a chromatin regulator, and RIMS1, a regulator of synaptic vesicle release. PMID:25284784

  13. Associations between Familial Rates of Psychiatric Disorders and De Novo Genetic Mutations in Autism

    Directory of Open Access Journals (Sweden)

    Kyleen Luhrs

    2017-01-01

    Full Text Available The purpose of this study was to examine the confluence of genetic and familial risk factors in children with Autism Spectrum Disorder (ASD) with distinct de novo genetic events. We hypothesized that gene-disrupting mutations would be associated with reduced rates of familial psychiatric disorders relative to structural mutations. Participants included families of children with ASD in four groups: de novo duplication copy number variations (DUP, n=62), de novo deletion copy number variations (DEL, n=74), de novo likely gene-disrupting mutations (LGDM, n=267), and children without a known genetic etiology (NON, n=2111). Familial rates of psychiatric disorders were calculated from semistructured interviews. Results indicated overall increased rates of psychiatric disorders in DUP families compared to DEL and LGDM families, specific to paternal psychiatric histories, and particularly evident for depressive disorders. Higher rates of depressive disorders were observed overall in maternal psychiatric histories compared to paternal histories, and higher rates of anxiety disorders were observed in paternal histories for LGDM families compared to DUP families. These findings support the notion that genetic etiology and familial factors make additive contributions to ASD risk and highlight the critical need for continued work targeting these relationships.

  14. De Novo Insertions and Deletions of Predominantly Paternal Origin Are Associated with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Shan Dong

    2014-10-01

    Full Text Available Summary: Whole-exome sequencing (WES) studies have demonstrated the contribution of de novo loss-of-function single-nucleotide variants (SNVs) to autism spectrum disorder (ASD). However, challenges in the reliable detection of de novo insertions and deletions (indels) have limited inclusion of these variants in prior analyses. By applying a robust indel detection method to WES data from 787 ASD families (2,963 individuals), we demonstrate that de novo frameshift indels contribute to ASD risk (OR = 1.6; 95% CI = 1.0–2.7; p = 0.03), are more common in female probands (p = 0.02), are enriched among genes encoding FMRP targets (p = 6 × 10⁻⁹), and arise predominantly on the paternal chromosome (p < 0.001). On the basis of mutation rates in probands versus unaffected siblings, we conclude that de novo frameshift indels contribute to risk in approximately 3% of individuals with ASD. Finally, by observing clustering of mutations in unrelated probands, we uncover two ASD-associated genes: KMT2E (MLL5), a chromatin regulator, and RIMS1, a regulator of synaptic vesicle release. Insertions and deletions (indels) have proven especially difficult to detect in exome sequencing data. Dong et al. now identify indels in exome data for 787 autism spectrum disorder (ASD) families. They demonstrate association between de novo indels that alter the reading frame and ASD. Furthermore, by observing clustering of indels in unrelated probands, they uncover two additional ASD-associated genes: KMT2E (MLL5), a chromatin regulator, and RIMS1, a regulator of synaptic vesicle release.

  15. Efficient assembly of de novo human artificial chromosomes from large genomic loci

    Directory of Open Access Journals (Sweden)

    Stromberg Gregory

    2005-07-01

    Full Text Available Abstract Background Human Artificial Chromosomes (HACs) are potentially useful vectors for gene transfer studies and for functional annotation of the genome because of their suitability for cloning, manipulating and transferring large segments of the genome. However, development of HACs for the transfer of large genomic loci into mammalian cells has been limited by difficulties in manipulating high-molecular weight DNA, as well as by the low overall frequencies of de novo HAC formation. Indeed, to date, only a small number of large (>100 kb) genomic loci have been reported to be successfully packaged into de novo HACs. Results We have developed novel methodologies to enable efficient assembly of HAC vectors containing any genomic locus of interest. We report here the creation of a novel, bimolecular system based on bacterial artificial chromosomes (BACs) for the construction of HACs incorporating any defined genomic region. We have utilized this vector system to rapidly design, construct and validate multiple de novo HACs containing large (100–200 kb) genomic loci including therapeutically significant genes for human growth hormone (HGH), polycystic kidney disease (PKD1) and β-globin. We report significant differences in the ability of different genomic loci to support de novo HAC formation, suggesting possible effects of cis-acting genomic elements. Finally, as a proof of principle, we have observed sustained β-globin gene expression from HACs incorporating the entire 200 kb β-globin genomic locus for over 90 days in the absence of selection. Conclusion Taken together, these results are significant for the development of HAC vector technology, as they enable high-throughput assembly and functional validation of HACs containing any large genomic locus. We have evaluated the impact of different genomic loci on the frequency of HAC formation and identified segments of genomic DNA that appear to facilitate de novo HAC formation. These genomic loci

  16. Phylogenetic classification of Pleurothecium and Pleurotheciella gen. nov. and its dactylaria-like anamorph (Sordariomycetes) based on nuclear ribosomal and protein-coding genes

    Czech Academy of Sciences Publication Activity Database

    Réblová, Martina; Seifert, K. A.; Fournier, J.; Štěpánek, Václav

    2012-01-01

    Vol. 104, No. 6 (2012), pp. 1299-1314 ISSN 0027-5514 R&D Projects: GA ČR GAP506/12/0038 Institutional support: RVO:67985939 ; RVO:61388971 Keywords : holoblastic denticulate conidiogenesis * life cycles * Sterigmatobotrys Subject RIV: EF - Botanics; EE - Microbiology, Virology (MBU-M) Impact factor: 2.110, year: 2012

  17. Methylation of miRNA genes and oncogenesis.

    Science.gov (United States)

    Loginov, V I; Rykov, S V; Fridman, M V; Braga, E A

    2015-02-01

    Interaction between microRNA (miRNA) and messenger RNA of target genes at the posttranscriptional level provides fine-tuned dynamic regulation of cell signaling pathways. Each miRNA can be involved in regulating hundreds of protein-coding genes, and, conversely, a number of different miRNAs usually target a structural gene. Epigenetic gene inactivation associated with methylation of promoter CpG-islands is common to both protein-coding genes and miRNA genes. Here, data on functions of miRNAs in development of tumor-cell phenotype are reviewed. Genomic organization of promoter CpG-islands of the miRNA genes located in inter- and intragenic areas is discussed. The literature and our own results on frequency of CpG-island methylation in miRNA genes from tumors are summarized, and data regarding a link between such modification and changed activity of miRNA genes and, consequently, protein-coding target genes are presented. Moreover, the impact of miRNA gene methylation on key oncogenetic processes as well as affected signaling pathways is discussed.

  18. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  19. De novo SOX11 mutations cause Coffin-Siris syndrome.

    Science.gov (United States)

    Tsurusaki, Yoshinori; Koshimizu, Eriko; Ohashi, Hirofumi; Phadke, Shubha; Kou, Ikuyo; Shiina, Masaaki; Suzuki, Toshifumi; Okamoto, Nobuhiko; Imamura, Shintaro; Yamashita, Michiaki; Watanabe, Satoshi; Yoshiura, Koh-ichiro; Kodera, Hirofumi; Miyatake, Satoko; Nakashima, Mitsuko; Saitsu, Hirotomo; Ogata, Kazuhiro; Ikegawa, Shiro; Miyake, Noriko; Matsumoto, Naomichi

    2014-06-02

    Coffin-Siris syndrome (CSS) is a congenital disorder characterized by growth deficiency, intellectual disability, microcephaly, characteristic facial features and hypoplastic nails of the fifth fingers and/or toes. We previously identified mutations in five genes encoding subunits of the BAF complex, in 55% of CSS patients. Here we perform whole-exome sequencing in additional CSS patients, identifying de novo SOX11 mutations in two patients with a mild CSS phenotype. sox11a/b knockdown in zebrafish causes brain abnormalities, potentially explaining the brain phenotype of CSS. SOX11 is the downstream transcriptional factor of the PAX6-BAF complex, highlighting the importance of the BAF complex and SOX11 transcriptional network in brain development.

  20. De novo transcriptome assembly of shrimp Palaemon serratus

    Directory of Open Access Journals (Sweden)

    Alejandra Perina

    2017-03-01

    Full Text Available The shrimp Palaemon serratus is a coastal decapod crustacean with a high commercial value. It is harvested for human consumption. In this study, we used Illumina sequencing technology (HiSeq 2000) to sequence, assemble and annotate the transcriptome of P. serratus. RNA was isolated from the muscle of adult individuals and from a pool of larvae. A total of 4 cDNA libraries were constructed using the TruSeq RNA Sample Preparation Kit v2. The raw data from this study were deposited in the NCBI SRA database under study accession number SRP090769. The obtained data were subjected to de novo transcriptome assembly using Trinity software, and coding regions were predicted by TransDecoder. We used Blastp and Sma3s to annotate the identified proteins. The transcriptome data could provide some insight into the genes involved in larval development and metamorphosis.

  1. De novo assembly of a haplotype-resolved human genome.

    Science.gov (United States)

    Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

    2015-06-01

    The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
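    The "haplotype N50 of 484 kb" reported above is a length-weighted summary statistic: it is the block length L such that blocks of length L or longer together cover at least half of the total phased length. The Python sketch below shows the usual way to compute an N50. It is an illustration only, not the authors' pipeline, and the block lengths in the example are made-up values.

```python
def n50(lengths):
    """Return the N50 of a collection of block or contig lengths.

    Blocks are visited from longest to shortest; the N50 is the length of the
    block at which the running total first reaches half of the overall total.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one length")


# Hypothetical block lengths in kb (not data from the paper).
example_blocks = [100, 200, 300, 400, 500]
print(n50(example_blocks))  # prints 400
```

    Because the statistic is weighted by length, a few long blocks raise the N50 far more than many short ones, which is why it is preferred over the mean block length when comparing assemblies.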

  2. Annotation of a hybrid partial genome of the Coffee Rust (Hemileia vastatrix) contributes to the gene repertoire catalogue of the Pucciniales

    Directory of Open Access Journals (Sweden)

    Marco Aurelio Cristancho

    2014-10-01

    Full Text Available Coffee leaf rust caused by the fungus Hemileia vastatrix is the most damaging disease to coffee worldwide. The pathogen has recently appeared in multiple outbreaks in coffee producing countries resulting in significant yield losses and increases in costs related to its control. New races/isolates are constantly emerging as evidenced by the presence of the fungus in plants that were previously resistant. Genomic studies are opening new avenues for the study of the evolution of pathogens, the detailed description of plant-pathogen interactions and the development of molecular techniques for the identification of individual isolates. For this purpose we sequenced 8 different H. vastatrix isolates using NGS technologies and gathered partial genome assemblies due to the large repetitive content in the coffee rust hybrid genome; 74.4% of the assembled contigs harbor repetitive sequences. A hybrid assembly of 333 Mb was built based on the 8 isolates; this assembly was used for subsequent analyses. Analysis of the conserved gene space showed that the hybrid H. vastatrix genome, though highly fragmented, had a satisfactory level of completion with 91.94% of core protein-coding orthologous genes present. RNA-Seq from urediniospores was used to guide the de novo annotation of the H. vastatrix gene complement. In total, 14,445 genes organized in 3,921 families were uncovered; a considerable proportion of the predicted proteins (73.8%) were homologous to other Pucciniales species genomes. Several gene families related to the fungal lifestyle were identified, particularly 483 predicted secreted proteins that represent candidate effector genes and will provide interesting hints to decipher virulence in the coffee rust fungus. The genome sequence of Hva will serve as a template to understand the molecular mechanisms used by this fungus to attack the coffee plant, to study the diversity of this species and for the development of molecular markers to distinguish

  3. De novo and inherited private variants in MAP1B in periventricular nodular heterotopia.

    Science.gov (United States)

    Heinzen, Erin L; O'Neill, Adam C; Zhu, Xiaolin; Allen, Andrew S; Bahlo, Melanie; Chelly, Jamel; Dobyns, William B; Freytag, Saskia; Guerrini, Renzo; Leventer, Richard J; Poduri, Annapurna; Robertson, Stephen P; Walsh, Christopher A; Zhang, Mengqi

    2018-05-08

    Periventricular nodular heterotopia (PVNH) is a malformation of cortical development commonly associated with epilepsy. We exome sequenced 202 individuals with sporadic PVNH to identify novel genetic risk loci. We first performed a trio-based analysis and identified 219 de novo variants. Although no novel genes were implicated in this initial analysis, PVNH cases were found overall to have a significant excess of nonsynonymous de novo variants in intolerant genes (p = 3.27×10⁻⁷), suggesting a role for rare new alleles in genes yet to be associated with the condition. Using a gene-level collapsing analysis comparing cases and controls, we identified a genome-wide significant signal driven by four ultra-rare loss-of-function heterozygous variants in MAP1B, including one de novo variant. In at least one instance, the MAP1B variant was inherited from a parent with previously undiagnosed PVNH. The PVNH was frontally predominant and associated with perisylvian polymicrogyria. These results implicate MAP1B in PVNH. More broadly, our findings suggest that detrimental mutations likely arising in immediately preceding generations with incomplete penetrance may also be responsible for some apparently sporadic diseases.

  4. The completion of the Mammalian Gene Collection (MGC)

    Science.gov (United States)

    Temple, Gary; Gerhard, Daniela S.; Rasooly, Rebekah; Feingold, Elise A.; Good, Peter J.; Robinson, Cristen; Mandich, Allison; Derge, Jeffrey G.; Lewis, Jeanne; Shoaf, Debonny; Collins, Francis S.; Jang, Wonhee; Wagner, Lukas; Shenmen, Carolyn M.; Misquitta, Leonie; Schaefer, Carl F.; Buetow, Kenneth H.; Bonner, Tom I.; Yankie, Linda; Ward, Ming; Phan, Lon; Astashyn, Alex; Brown, Garth; Farrell, Catherine; Hart, Jennifer; Landrum, Melissa; Maidak, Bonnie L.; Murphy, Michael; Murphy, Terence; Rajput, Bhanu; Riddick, Lillian; Webb, David; Weber, Janet; Wu, Wendy; Pruitt, Kim D.; Maglott, Donna; Siepel, Adam; Brejova, Brona; Diekhans, Mark; Harte, Rachel; Baertsch, Robert; Kent, Jim; Haussler, David; Brent, Michael; Langton, Laura; Comstock, Charles L.G.; Stevens, Michael; Wei, Chaochun; van Baren, Marijke J.; Salehi-Ashtiani, Kourosh; Murray, Ryan R.; Ghamsari, Lila; Mello, Elizabeth; Lin, Chenwei; Pennacchio, Christa; Schreiber, Kirsten; Shapiro, Nicole; Marsh, Amber; Pardes, Elizabeth; Moore, Troy; Lebeau, Anita; Muratet, Mike; Simmons, Blake; Kloske, David; Sieja, Stephanie; Hudson, James; Sethupathy, Praveen; Brownstein, Michael; Bhat, Narayan; Lazar, Joseph; Jacob, Howard; Gruber, Chris E.; Smith, Mark R.; McPherson, John; Garcia, Angela M.; Gunaratne, Preethi H.; Wu, Jiaqian; Muzny, Donna; Gibbs, Richard A.; Young, Alice C.; Bouffard, Gerard G.; Blakesley, Robert W.; Mullikin, Jim; Green, Eric D.; Dickson, Mark C.; Rodriguez, Alex C.; Grimwood, Jane; Schmutz, Jeremy; Myers, Richard M.; Hirst, Martin; Zeng, Thomas; Tse, Kane; Moksa, Michelle; Deng, Merinda; Ma, Kevin; Mah, Diana; Pang, Johnson; Taylor, Greg; Chuah, Eric; Deng, Athena; Fichter, Keith; Go, Anne; Lee, Stephanie; Wang, Jing; Griffith, Malachi; Morin, Ryan; Moore, Richard A.; Mayo, Michael; Munro, Sarah; Wagner, Susan; Jones, Steven J.M.; Holt, Robert A.; Marra, Marco A.; Lu, Sun; Yang, Shuwei; Hartigan, James; Graf, Marcus; Wagner, Ralf; Letovksy, Stanley; Pulido, Jacqueline C.; Robison, Keith; 
    Esposito, Dominic; Hartley, James; Wall, Vanessa E.; Hopkins, Ralph F.; Ohara, Osamu; Wiemann, Stefan

    2009-01-01

    Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide. PMID:19767417

  5. Do prion protein gene polymorphisms induce apoptosis in non ...

    Indian Academy of Sciences (India)

    2016-08-26

    Genetic variations such as single nucleotide polymorphisms (SNPs) in the prion protein coding gene, Prnp, greatly affect susceptibility to prion diseases in mammals. Here, the coding region of Prnp was screened for polymorphisms in the red-eared turtle, Trachemys scripta. Four polymorphisms, L203V, N205I, ...

  6. De novo mutations in the genome organizer CTCF cause intellectual disability

    DEFF Research Database (Denmark)

    Gregor, Anne; Oti, Martin; Kouwenhoven, Evelyn N

    2013-01-01

    An increasing number of genes involved in chromatin structure and epigenetic regulation has been implicated in a variety of developmental disorders, often including intellectual disability. By trio exome sequencing and subsequent mutational screening we now identified two de novo frameshift...... mutations and one de novo missense mutation in CTCF in individuals with intellectual disability, microcephaly, and growth retardation. Furthermore, an individual with a larger deletion including CTCF was identified. CTCF (CCCTC-binding factor) is one of the most important chromatin organizers in vertebrates...... and is involved in various chromatin regulation processes such as higher order of chromatin organization, enhancer function, and maintenance of three-dimensional chromatin structure. Transcriptome analyses in all three individuals with point mutations revealed deregulation of genes involved in signal transduction...

  7. Evaluating bacterial gene-finding HMM structures as probabilistic logic programs

    DEFF Research Database (Denmark)

    Mørk, Søren; Holmes, Ian

    2012-01-01

    , a probabilistic dialect of Prolog. Results: We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length...

  8. Natural variation of rice blast resistance gene Pi-d2

    Science.gov (United States)

    Studying natural variation of rice resistance (R) genes in cultivated and wild rice relatives can predict resistance stability to rice blast fungus. In the present study, the protein coding regions of rice R gene Pi-d2 in 35 rice accessions of subgroups, aus (AUS), indica (IND), temperate japonica (...

  9. SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate

    Science.gov (United States)

    Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart

    2016-01-01

    Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...

  10. From essential to persistent genes: a functional approach to constructing synthetic life

    DEFF Research Database (Denmark)

    Acevedo-Rocha, Carlos G.; Fang, Gang; Schmidt, Markus

    2013-01-01

    A central undertaking in synthetic biology (SB) is the quest for the ‘minimal genome’. However, ‘minimal sets’ of essential genes are strongly context-dependent and, in all prokaryotic genomes sequenced to date, not a single protein-coding gene is entirely conserved. Furthermore, a lack...

  11. Extreme-Scale De Novo Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

    2017-09-26

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers with up to tens of thousands of cores.

  12. Wegener's granulomatosis occurring de novo during pregnancy.

    Science.gov (United States)

    Alfhaily, F; Watts, R; Leather, A

    2009-01-01

    Wegener's granulomatosis (WG) is rarely diagnosed during the reproductive years and uncommonly manifests for the first time during pregnancy. We report a case of de novo WG presenting at 30 weeks gestation with classical symptoms of WG (ENT, pulmonary). The diagnosis was confirmed by radiological, laboratory, and histological investigations. With a multidisciplinary approach, she had a successful vaginal delivery of a healthy baby. She was treated successfully by a combination of steroids, azathioprine and intravenous immunoglobulin in the active phase of disease for induction of remission and by azathioprine and steroids for maintenance of remission. The significant improvement in her symptoms allowed us to continue her pregnancy to 37 weeks when delivery was electively induced. Transplacental transmission of PR3-ANCA occurred but the neonate remained well. This case of de novo WG during pregnancy highlights the seriousness of this disease and the challenge in management of such patients.

  13. The non-protein coding breast cancer susceptibility locus Mcs5a acts in a non-mammary cell-autonomous fashion through the immune system and modulates T-cell homeostasis and functions.

    Science.gov (United States)

    Smits, Bart M G; Sharma, Deepak; Samuelson, David J; Woditschka, Stephan; Mau, Bob; Haag, Jill D; Gould, Michael N

    2011-08-16

    Mechanisms underlying low-penetrance, common, non-protein coding variants in breast cancer risk loci are largely undefined. We showed previously that the non-protein coding mammary carcinoma susceptibility locus Mcs5a/MCS5A modulates breast cancer risk in rats and women. The Mcs5a allele from the Wistar-Kyoto (WKy) rat strain consists of two genetically interacting elements that have to be present on the same chromosome to confer mammary carcinoma resistance. We also found that the two interacting elements of the resistant allele are required for the downregulation of transcript levels of the Fbxo10 gene specifically in T-cells. Here we describe mechanisms through which Mcs5a may reduce mammary carcinoma susceptibility. We performed mammary carcinoma multiplicity studies with three mammary carcinoma-inducing treatments, namely 7,12-dimethylbenz(a)anthracene (DMBA) and N-nitroso-N-methylurea (NMU) carcinogenesis, and mammary ductal infusion of retrovirus expressing the activated HER2/neu oncogene. We used mammary gland and bone marrow transplantation assays to assess the target tissue of Mcs5a activity. We used immunophenotyping assays on well-defined congenic rat lines carrying susceptible and resistant Mcs5a alleles to identify changes in T-cell homeostasis and function associated with resistance. We show that Mcs5a acts beyond the initial step of mammary epithelial cell transformation, during early cancer progression. We show that Mcs5a controls susceptibility in a non-mammary cell-autonomous manner through the immune system. The resistant Mcs5a allele was found to be associated with an overabundance of γδ T-cell receptor (TCR)+ T-cells as well as a CD62L (L-selectin)-high population of all T-cell classes. Unlike in mammary carcinoma, γδTCR+ T-cells are the predominant T-cell type in the mammary gland and were found to be overabundant in the mammary epithelium of Mcs5a resistant congenic rats. Most of them simultaneously expressed the CD4, CD8, and CD161

  14. De novo Biosynthesis of "Non-Natural" Thaxtomin Phytotoxins.

    Science.gov (United States)

    Winn, Michael; Francis, Daniel; Micklefield, Jason

    2018-03-30

    Thaxtomins are diketopiperazine phytotoxins produced by Streptomyces scabies and other actinobacterial plant pathogens that inhibit cellulose biosynthesis in plants. Due to their potent bioactivity and novel mode of action there has been considerable interest in developing thaxtomins as herbicides for crop protection. To address the need for more stable derivatives, we have developed a new approach for structural diversification of thaxtomins. Genes encoding the thaxtomin NRPS from S. scabies, along with genes encoding a promiscuous tryptophan synthase (TrpS) from Salmonella typhimurium, were assembled in a heterologous host, Streptomyces albus. Upon feeding indole derivatives to the engineered S. albus strain, tryptophan intermediates with alternative substituents are biosynthesized and incorporated by the NRPS to deliver a series of thaxtomins with different functionalities in place of the nitro group. The approach described herein demonstrates how genes from different pathways and different bacterial origins can be combined in a heterologous host to create a de novo biosynthetic pathway to "non-natural" product target compounds.

  15. MRUniNovo: an efficient tool for de novo peptide sequencing utilizing the Hadoop distributed computing framework.

    Science.gov (United States)

    Li, Chuang; Chen, Tao; He, Qiang; Zhu, Yunping; Li, Kenli

    2017-03-15

    Tandem mass spectrometry-based de novo peptide sequencing is a complex and time-consuming process. The current algorithms for de novo peptide sequencing cannot rapidly and thoroughly process large mass spectrometry datasets. In this paper, we propose MRUniNovo, a novel tool for parallel de novo peptide sequencing. MRUniNovo parallelizes UniNovo based on the Hadoop compute platform. Our experimental results demonstrate that MRUniNovo significantly reduces the computation time of de novo peptide sequencing without sacrificing the correctness and accuracy of the results, and thus can process very large datasets that UniNovo cannot. MRUniNovo is an open source software tool implemented in Java. The source code and the parameter settings are available at http://bioinfo.hupo.org.cn/MRUniNovo/index.php. Contact: s131020002@hnu.edu.cn; taochen1019@163.com. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  16. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, Marie; Jensen, L.J.; Brunak, Søren

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.

  17. KeyPathwayMiner - De-novo network enrichment by combining multiple OMICS data and biological networks

    DEFF Research Database (Denmark)

    Baumbach, Jan; Alcaraz, Nicolas; Pauling, Josch K.

    We tackle the problem of de-novo pathway extraction. Given a biological network and a set of case-control studies, KeyPathwayMiner efficiently extracts and visualizes all maximal connected sub-networks that contain mainly genes that are dysregulated, e.g., differentially expressed, in most cases ...

  18. Optic atrophy, cataracts, lipodystrophy/lipoatrophy, and peripheral neuropathy caused by a de novo OPA3 mutation

    OpenAIRE

    Bourne, Stephanie C.; Townsend, Katelin N.; Shyr, Casper; Matthews, Allison; Lear, Scott A.; Attariwala, Raj; Lehman, Anna; Wasserman, Wyeth W.; van Karnebeek, Clara; Sinclair, Graham; Vallance, Hilary; Gibson, William T.

    2017-01-01

    We describe a woman who presented with cataracts, optic atrophy, lipodystrophy/lipoatrophy, and peripheral neuropathy. Exome sequencing identified a c.235C > G p.(Leu79Val) variant in the optic atrophy 3 (OPA3) gene that was confirmed to be de novo. This report expands the severity of the phenotypic spectrum of autosomal dominant OPA3 mutations.

  19. De Novo whole genome sequence of Xylella fastidiosa subsp. multiplex strain BB01 from blueberry in Georgia, USA

    Science.gov (United States)

    This study reports a de novo assembled draft genome sequence of Xylella fastidiosa subsp. multiplex strain BB01, which causes blueberry bacterial leaf scorch in Georgia, USA. The BB01 genome is 2,517,579 bp long, has a G+C content of 51.8%, and contains 2,943 open reading frames (ORFs) and 48 RNA genes....
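
    The G+C content quoted above is a simple base-composition statistic. As a minimal sketch of how such a figure is typically computed from an assembled sequence (the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions for illustration, not details taken from the record):

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a nucleotide sequence.

    Assumption for this sketch: only unambiguous bases (A, C, G, T) are
    counted, so characters such as N do not affect the result.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence only; the BB01 assembly itself is not reproduced here.
    print(f"{gc_content('ATGCGCGCNNTA'):.1f}% G+C")
```

    For a multi-contig draft genome, the same calculation would be applied to the concatenation of all contigs.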

  20. Effects of Stress, Reactive Oxygen Species, and the SOS Response on De Novo Acquisition of Antibiotic Resistance in Escherichia coli

    NARCIS (Netherlands)

    Händel, N.; Hoeksema, M.; Freijo Mata, M.; Brul, S.; ter Kuile, B.H.

    2015-01-01

    Strategies to prevent the development of antibiotic resistance in bacteria are needed to reduce the threat of infectious diseases to human health. The de novo acquisition of resistance due to mutations and/or phenotypic adaptation occurs rapidly as a result of interactions of gene expression and

  1. Identification and characterization of wheat long non-protein coding RNAs responsive to powdery mildew infection and heat stress by using microarray analysis and SBS sequencing

    Directory of Open Access Journals (Sweden)

    Peng Huiru

    2011-04-01

    Full Text Available Abstract Background: Biotic and abiotic stresses, such as powdery mildew infection and high temperature, are important limiting factors for yield and grain quality in wheat production. Emerging evidence suggests that long non-protein coding RNAs (npcRNAs) are developmentally regulated and play roles in development and stress responses of plants. However, identification of long npcRNAs is limited to a few plant species, such as Arabidopsis, rice and maize; no systematic identification of long npcRNAs and their responses to abiotic and biotic stresses has been reported in wheat. Results: In this study, by using computational analysis and an experimental approach, we identified 125 putative wheat stress-responsive long npcRNAs, which are not conserved among plant species. Among them, some were precursors of small RNAs such as microRNAs and siRNAs, two long npcRNAs were identified as signal recognition particle (SRP) 7S RNA variants, and three were characterized as U3 snoRNAs. We found that wheat long npcRNAs showed tissue-dependent expression patterns and were responsive to powdery mildew infection and heat stress. Conclusion: Our results indicated that diverse sets of wheat long npcRNAs were responsive to powdery mildew infection and heat stress, and could function in wheat responses to both biotic and abiotic stresses, which provides a starting point for understanding their functions and regulatory mechanisms in the future.

  2. De novo malignancy after pancreas transplantation in Japan.

    Science.gov (United States)

    Tomimaru, Y; Ito, T; Marubashi, S; Kawamoto, K; Tomokuni, A; Asaoka, T; Wada, H; Eguchi, H; Mori, M; Doki, Y; Nagano, H

    2015-04-01

    Long-term immunosuppression is associated with an increased risk of cancer. In particular, immunosuppression in pancreas transplantation is more intensive than in other organ transplantation because of the strong immunogenicity of the pancreas. This suggests that the risk of post-transplant de novo malignancy might be increased in pancreas transplantation. However, there have been few studies of de novo malignancy after pancreas transplantation. The aim of this study was to analyze the incidence of de novo malignancy after pancreas transplantation in Japan. Post-transplant patients with de novo malignancy were surveyed and characterized in Japan. Among 107 cases receiving pancreas transplantation in Japan between 2001 and 2010, de novo malignancy developed in 9 cases (8.4%): post-transplant lymphoproliferative disorders in 6 cases, colon cancer in 1 case, renal cancer in 1 case, and brain tumor in 1 case. We clarified the incidence of de novo malignancy after pancreas transplantation in Japan. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation.

    Science.gov (United States)

    Hara, Yuichiro; Tatsumi, Kaori; Yoshida, Michio; Kajikawa, Eriko; Kiyonari, Hiroshi; Kuraku, Shigehiro

    2015-11-18

    RNA-seq enables gene expression profiling in selected spatiotemporal windows and yields massive sequence information with relatively low cost and time investment, even for non-model species. However, there remains considerable room for optimizing its workflow, in order to take full advantage of continuously developing sequencing capacity. Transcriptome sequencing for three embryonic stages of Madagascar ground gecko (Paroedura picta) was performed with the Illumina platform. The output reads were assembled de novo for reconstructing transcript sequences. In order to evaluate the completeness of transcriptome assemblies, we prepared a reference gene set consisting of vertebrate one-to-one orthologs. To take advantage of increased read length of >150 nt, we demonstrated shortened RNA fragmentation time, which resulted in a dramatic shift of insert size distribution. To evaluate products of multiple de novo assembly runs incorporating reads with different RNA sources, read lengths, and insert sizes, we introduce a new reference gene set, core vertebrate genes (CVG), consisting of 233 genes that are shared as one-to-one orthologs by all vertebrate genomes examined (29 species). The completeness assessment performed by the computational pipelines CEGMA and BUSCO with reference to CVG demonstrated higher accuracy and resolution than with the gene set previously established for this purpose. As a result of the assessment with CVG, we have derived the most comprehensive transcript sequence set of the Madagascar ground gecko by means of assembling individual libraries followed by clustering the assembled sequences based on their overall similarities. Our results provide several insights into optimizing de novo RNA-seq workflow, including the coordination between library insert size and read length, which manifested in improved connectivity of assemblies. 
The approach and assembly assessment with CVG demonstrated here would be applicable to transcriptome analysis of other species as

  4. Novos paradigmas literários

    Directory of Open Access Journals (Sweden)

    Denise Azevedo Duarte Guimarães

    2005-12-01

    Full Text Available This article investigates the emergence of new literary paradigms as it tries to understand new contemporary textualities. It analyses computerized hypertexts and multimedia poetry, trying to trace how new expressive procedures are being created. How can these new languages be identified, and how do they relate to previous theories that dealt with the printed literary text? The study addresses questions linked to the reading of different types of signs and to the ways in which they combine to form these new hybrid languages on new media.

  5. Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools.

    Science.gov (United States)

    Kisand, Veljo; Lettieri, Teresa

    2013-04-01

    De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (tools for processing NGS data are increasingly free and open source and are often adopted for both their high quality and role in promoting academic freedom. The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite a high coverage (~30 fold), it did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defect of the reference mapping was the introduction of artificial indels into contigs through lower than 100% consensus and distracting gene calling due to artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes. Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not a high priority, and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize

  6. De novo autoimmune hepatitis after liver transplantation.

    Science.gov (United States)

    Lohse, Ansgar W; Weiler-Norman, Christina; Burdelski, Martin

    2007-10-01

    The King's College group was the first to describe a clinical syndrome similar to autoimmune hepatitis in children and young adults transplanted for non-immune mediated liver diseases. They coined the term "de novo autoimmune hepatitis". Several other liver transplant centres confirmed this observation. Even though the condition is uncommon, patients with de novo AIH are now seen in most of the major transplant centres. The disease is usually characterized by features of acute hepatitis in otherwise stable transplant recipients. The most characteristic laboratory hallmark is a marked hypergammaglobulinaemia. Autoantibodies are common, mostly ANA. We also described a case of LKM1-positivity in a patient transplanted for Wilson's disease; however, this patient did not develop clinical or histological features of AIH. Development of SLA/LP-autoantibodies has also not been described. Therefore, serologically de novo AIH appears to correspond to type 1 AIH. As in classical AIH, patients respond promptly to treatment with increased doses of prednisolone and azathioprine, while the calcineurin inhibitors cyclosporine or tacrolimus are of very limited value - which is not surprising, as almost all patients develop de novo AIH while receiving these drugs. Despite the good response to treatment, most patients remain a clinical challenge, as complete stable remissions are uncommon and flares, relapses and chronic disease activity can often occur. Pathogenetically this syndrome is intriguing. It is not clear whether the immune response is directed against allo-antigens, neo-antigens in the liver, or self-antigens, possibly shared by donor and host cells. It is very likely that the inflammatory milieu due to alloreactive cells in the transplanted organ contributes to the disease process, either by leading to aberrant antigen presentation or by providing co-stimulatory signals leading to the breaking of self-tolerance. The development of this disease in the presence of treatment with calcineurin

  7. De novo transcriptome assembly associated with fumonisin production by the rice pathogen Fusarium fujikuroi

    Directory of Open Access Journals (Sweden)

    Keerthi S. Guruge

    2018-06-01

    Full Text Available The present study employed a next-generation sequencing method to assemble a de novo transcriptome database designed to distinguish gene expression changes exhibited by the fumonisin-producing fungus Fusarium fujikuroi when grown under ‘fumonisin-producing’ compared to ‘non-fumonisin-producing’ conditions. The raw data of this study have been deposited at the DNA Data Bank of Japan (DDBJ) under the accession ID DRA006146. Keywords: Fusarium fujikuroi, Fumonisin, Next-generation sequencing, Transcriptome, Gene-expression

  8. De Novo Mutations in SLC1A2 and CACNA1A Are Important Causes of Epileptic Encephalopathies

    DEFF Research Database (Denmark)

    Myers, Candace T; McMahon, Jacinta M; Schneider, Amy L

    2016-01-01

    whole-exome sequencing study of 264 parent-child trios revealed more than 290 candidate genes in which only a single individual had a de novo variant. We sought to identify additional pathogenic variants in a subset (n = 27) of these genes via targeted sequencing in an unsolved cohort of 531 individuals...

  9. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, M; Jensen, L J; Brunak, S

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.

  10. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

  11. De novo and rare inherited copy-number variations in the hemiplegic form of cerebral palsy.

    Science.gov (United States)

    Zarrei, Mehdi; Fehlings, Darcy L; Mawjee, Karizma; Switzer, Lauren; Thiruvahindrapuram, Bhooma; Walker, Susan; Merico, Daniele; Casallo, Guillermo; Uddin, Mohammed; MacDonald, Jeffrey R; Gazzellone, Matthew J; Higginbotham, Edward J; Campbell, Craig; deVeber, Gabrielle; Frid, Pam; Gorter, Jan Willem; Hunt, Carolyn; Kawamura, Anne; Kim, Marie; McCormick, Anna; Mesterman, Ronit; Samdup, Dawa; Marshall, Christian R; Stavropoulos, Dimitri J; Wintle, Richard F; Scherer, Stephen W

    2018-02-01

    Purpose: Hemiplegia is a subtype of cerebral palsy (CP) in which one side of the body is affected. Our earlier study of unselected children with CP demonstrated de novo and clinically relevant rare inherited genomic copy-number variations (CNVs) in 9.6% of participants. Here, we examined the prevalence and types of CNVs specifically in hemiplegic CP. Methods: We genotyped 97 unrelated probands with hemiplegic CP and their parents. We compared their CNVs to those of 10,851 population controls, in order to identify rare CNVs (<0.1% frequency) that might be relevant to CP. We also sequenced exomes of "CNV-positive" trios. Results: We detected de novo CNVs and/or sex chromosome abnormalities in 7/97 (7.2%) of probands, impacting important developmental genes such as GRIK2, LAMA1, DMD, PTPRM, and DIP2C. In 18/97 individuals (18.6%), rare inherited CNVs were found, affecting loci associated with known genomic disorders (17p12, 22q11.21) or involving genes linked to neurodevelopmental disorders. Conclusion: We found an increased rate of de novo CNVs in the hemiplegic CP subtype (7.2%) compared to controls (1%). This result is similar to that for an unselected CP group. Combined with rare inherited CNVs, the genomic data impacts the understanding of the potential etiology of hemiplegic CP in 23/97 (23.7%) of participants.
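
    The percentages quoted in this abstract follow directly from the stated counts out of 97 probands. A small sketch of that arithmetic, assuming one-decimal rounding (the helper name `percent` and the dictionary labels are hypothetical):

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Counts as stated in the abstract (all out of 97 probands).
reported_counts = {
    "de novo CNVs and/or sex chromosome abnormalities": 7,
    "rare inherited CNVs": 18,
    "probands with a potentially relevant genomic finding": 23,
}

if __name__ == "__main__":
    for label, count in reported_counts.items():
        print(f"{label}: {count}/97 = {percent(count, 97)}%")
```

    Note that 7 + 18 exceeds 23; the abstract does not explain the difference, which would be consistent with some probands carrying both a de novo and a rare inherited CNV.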

  12. De novo-based transcriptome profiling of male-sterile and fertile watermelon lines.

    Science.gov (United States)

    Rhee, Sun-Ju; Kwon, Taehyung; Seo, Minseok; Jang, Yoon Jeong; Sim, Tae Yong; Cho, Seoae; Han, Sang-Wook; Lee, Gung Pyo

    2017-01-01

    The whole-genome sequence of watermelon (Citrullus lanatus (Thunb.) Matsum. & Nakai), a valuable horticultural crop worldwide, was released in 2013. Here, we compared a de novo-based approach (DBA) to a reference-based approach (RBA) using RNA-seq data, to aid in efforts to improve the annotation of the watermelon reference genome and to obtain biological insight into male sterility in watermelon. We applied these techniques to available data from two watermelon lines: the male-sterile line DAH3615-MS and the male-fertile line DAH3615. Using DBA, we newly annotated 855 watermelon transcripts, and found gene functional clusters predicted to be related to stimulus responses, nucleic acid binding, transmembrane transport, homeostasis, and Golgi/vesicles. Among the DBA-annotated transcripts, 138 de novo-exclusive differentially-expressed genes (DEDEGs) related to male sterility were detected. Out of 33 randomly selected newly annotated transcripts and DEDEGs, 32 were validated by RT-qPCR. This study demonstrates the usefulness and reliability of de novo transcriptome assembly in watermelon, and provides new insights for researchers exploring transcriptional blueprints with regard to male sterility.

  13. Integration of CpG-free DNA induces de novo methylation of CpG islands in pluripotent stem cells

    KAUST Repository

    Takahashi, Yuta

    2017-05-05

    CpG islands (CGIs) are primarily promoter-associated genomic regions and are mostly unmethylated within highly methylated mammalian genomes. The mechanisms by which CGIs are protected from de novo methylation remain elusive. Here we show that insertion of CpG-free DNA into targeted CGIs induces de novo methylation of the entire CGI in human pluripotent stem cells (PSCs). The methylation status is stably maintained even after CpG-free DNA removal, extensive passaging, and differentiation. By targeting the DNA mismatch repair gene MLH1 CGI, we could generate a PSC model of a cancer-related epimutation. Furthermore, we successfully corrected aberrant imprinting in induced PSCs derived from an Angelman syndrome patient. Our results provide insights into how CpG-free DNA induces de novo CGI methylation and broaden the application of targeted epigenome editing for a better understanding of human development and disease.

  14. Autosomal dominant cutis laxa with progeroid features due to a novel, de novo mutation in ALDH18A1.

    Science.gov (United States)

    Bhola, Priya T; Hartley, Taila; Bareke, Eric; Boycott, Kym M; Nikkel, Sarah M; Dyment, David A

    2017-06-01

    De novo dominant mutations in the aldehyde dehydrogenase 18 family member A1 (ALDH18A1) gene have recently been shown to cause autosomal dominant cutis laxa with progeroid features (MIM 616603). To date, all de novo dominant mutations have been found in a single highly conserved amino acid residue at position p.Arg138. We report an 8-year-old male with a clinical diagnosis of autosomal dominant cutis laxa (ADCL) with progeroid features and a novel de novo missense mutation in ALDH18A1 (NM_002860.3: c.377G>A (p.Arg126His)). This is the first report of an individual with ALDH18A1-ADCL due to a substitution at a residue other than p.Arg138. Knowledge of the complete spectrum of dominant-acting mutations that cause this rare syndrome will have implications for molecular diagnosis and genetic counselling of these families.

  15. Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes.

    Directory of Open Access Journals (Sweden)

    Yubo Hou

    Full Text Available The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log10-transformed protein-coding gene number (Y') versus log10-transformed genome size (X', genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200+22.678X'), whereas non-eukaryotes a linear model, Y' = 0.045+0.977X', both with high significance (p0.91). Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3x10^6 kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245x10^6 kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.
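
    The two regression models in this abstract can be evaluated directly. The sketch below assumes that "ln" denotes the natural logarithm, that X' and Y' are base-10 logarithms of genome size in kbp and of protein-coding gene number, and that the coefficients are exactly as printed; the function names are hypothetical. Because the abstract prints rounded coefficients, the output is not guaranteed to reproduce the published estimates exactly.

```python
import math


def eukaryote_protein_coding_genes(genome_size_kbp: float) -> float:
    """Evaluate Y' = ln(-46.200 + 22.678 X') and back-transform to a gene count."""
    x = math.log10(genome_size_kbp)      # X' = log10(genome size in kbp)
    inner = -46.200 + 22.678 * x
    if inner <= 0:
        raise ValueError("genome size is outside the domain of the logarithmic model")
    y = math.log(inner)                  # Y' = log10(gene number); ln assumed natural log
    return 10 ** y


def non_eukaryote_protein_coding_genes(genome_size_kbp: float) -> float:
    """Evaluate Y' = 0.045 + 0.977 X' and back-transform to a gene count."""
    x = math.log10(genome_size_kbp)
    y = 0.045 + 0.977 * x
    return 10 ** y


if __name__ == "__main__":
    # Genome sizes quoted in the abstract for the smallest and largest dinoflagellates.
    for size_kbp in (3e6, 245e6):
        genes = eukaryote_protein_coding_genes(size_kbp)
        print(f"{size_kbp:.3g} kbp -> about {genes:,.0f} protein-coding genes")
```

    The domain check matters because the logarithmic model is undefined for small genomes, where -46.200 + 22.678 X' is not positive.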

  16. High abundance of Serine/Threonine-rich regions predicted to be hyper-O-glycosylated in the secretory proteins coded by eight fungal genomes

    Directory of Open Access Journals (Sweden)

    González Mario

    2012-09-01

    Full Text Available Abstract Background: O-glycosylation of secretory proteins has been found to be an important factor in fungal biology and virulence. It consists of the addition of short glycosidic chains to Ser or Thr residues in the protein backbone via O-glycosidic bonds. Secretory proteins in fungi frequently display Ser/Thr-rich regions that could be sites of extensive O-glycosylation. We have analyzed in silico the complete sets of putatively secretory proteins coded by eight fungal genomes (Botrytis cinerea, Magnaporthe grisea, Sclerotinia sclerotiorum, Ustilago maydis, Aspergillus nidulans, Neurospora crassa, Trichoderma reesei, and Saccharomyces cerevisiae) in search of Ser/Thr-rich regions as well as regions predicted to be highly O-glycosylated by NetOGlyc (http://www.cbs.dtu.dk). Results: By comparison with experimental data, NetOGlyc was found to overestimate the number of O-glycosylation sites in fungi by a factor of 1.5, but to be quite reliable in the prediction of highly O-glycosylated regions. About half of secretory proteins have at least one Ser/Thr-rich region, with a Ser/Thr content of at least 40% over an average length of 40 amino acids. Most secretory proteins in filamentous fungi were predicted to be O-glycosylated, sometimes in dozens or even hundreds of sites. Residues predicted to be O-glycosylated have a tendency to be grouped together, forming hyper-O-glycosylated regions of varying length. Conclusions: About one fourth of secretory fungal proteins were predicted to have at least one hyper-O-glycosylated region, which consists of 45 amino acids on average and displays at least one O-glycosylated Ser or Thr every four residues. These putative highly O-glycosylated regions can be found anywhere along the proteins but have a slight tendency to be at either one of the two ends.
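
    A search for Ser/Thr-rich regions of the kind described above can be illustrated with a sliding-window scan. This is only a sketch under stated assumptions: the abstract does not give the study's actual detection procedure, so the fixed window of 40 residues, the 40% threshold applied per window, and the function name `ser_thr_rich_windows` are all hypothetical choices loosely modeled on the figures quoted in the abstract.

```python
from typing import List, Tuple


def ser_thr_rich_windows(
    protein: str, window: int = 40, min_fraction: float = 0.40
) -> List[Tuple[int, float]]:
    """Return (start index, Ser/Thr fraction) for every window meeting the threshold.

    Assumptions for this sketch: fixed-length windows, one-letter amino acid
    codes, and a threshold on the combined S+T fraction within each window.
    """
    seq = protein.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        fraction = (segment.count("S") + segment.count("T")) / window
        if fraction >= min_fraction:
            hits.append((start, fraction))
    return hits


if __name__ == "__main__":
    # Toy sequence: a Ser/Thr repeat flanked by alanine stretches.
    toy = "A" * 40 + "ST" * 20 + "A" * 40
    hits = ser_thr_rich_windows(toy)
    print(f"{len(hits)} qualifying windows, first starting at residue index {hits[0][0]}")
```

    Overlapping hits would normally be merged into contiguous regions before reporting region lengths; that step is omitted here for brevity.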

  17. Prader-Willi region non-protein coding RNA 1 suppressed gastric cancer growth as a competing endogenous RNA of microRNA-425-5p.

    Science.gov (United States)

    Chen, Zihao; Ju, Hongping; Yu, Shan; Zhao, Ting; Jing, Xiaojie; Li, Ping; Jia, Jing; Li, Nan; Tan, Bibo; Li, Yong

    2018-03-13

    Gastric cancer (GC) is a major global health problem, especially in Asia. Long non-coding RNAs have recently gained significant attention in research areas such as carcinogenesis. This study aimed to explore the mechanism by which Prader-Willi region non-protein coding RNA 1 (PWRN1) regulates GC progression. Differentially expressed lncRNAs in GC tissues were screened out through microarray analysis. RNA and protein expression levels were detected by qRT-PCR and western blot. Cell proliferation, apoptosis rate, and metastatic ability were determined by CCK8, flow cytometry, and wound healing and transwell assays, respectively. The luciferase reporter system was used to verify the targeting relationships between PWRN1, miR-425-5p and PTEN. A RIP assay was performed to test whether PWRN1 acted as a competitive endogenous RNA (ceRNA) of miR-425-5p. A tumor xenograft model and immunohistochemistry were used to study the influence of PWRN1 on tumor growth in vivo. Microarray analysis determined that PWRN1 was differentially expressed between GC tissues and adjacent tissues. qRT-PCR revealed low PWRN1 expression in GC tissues and cells. Up-regulation of PWRN1 reduced proliferation and metastasis and increased apoptosis in GC cells, while miR-425-5p had the opposite effects. The RIP assay indicated that PWRN1 may target the oncogenic miR-425-5p. The tumor xenograft assay found that up-regulated PWRN1 suppressed tumor growth. The bioinformatic analysis, luciferase assay and western blot indicated that PWRN1 affected the PTEN/Akt/MDM2/p53 axis via suppressing miR-425-5p. Our findings suggested that PWRN1 functioned as a ceRNA targeting miR-425-5p and suppressed GC development via the p53 signaling pathway. ©2018 The Author(s).

  18. De Novo Construction of Redox Active Proteins.

    Science.gov (United States)

    Moser, C C; Sheehan, M M; Ennist, N M; Kodali, G; Bialas, C; Englander, M T; Discher, B M; Dutton, P L

    2016-01-01

    Relatively simple principles can be used to plan and construct de novo proteins that bind redox cofactors and participate in a range of electron-transfer reactions analogous to those seen in natural oxidoreductase proteins. These designed redox proteins are called maquettes. Heptad repeats of amino acids with hydrophobic/hydrophilic binary patterning, linked together in a single chain, self-assemble into 4-alpha-helix bundles. These bundles form a robust and adaptable frame for uncovering the default properties of protein-embedded cofactors independent of the complexities introduced by generations of natural selection, and allow us to better understand what factors can be exploited by man or nature to manipulate the physical chemical properties of these cofactors. Anchoring of redox cofactors such as hemes, light-active tetrapyrroles, FeS clusters, and flavins by His and Cys residues allows cofactors to be placed at positions in which electron-tunneling rates between cofactors within or between proteins can be predicted in advance. The modularity of heptad repeat designs facilitates the construction of electron-transfer chains, novel combinations of redox cofactors, and new redox cofactor-assisted functions. Developing de novo designs that can support cofactor incorporation upon expression in a cell is needed to support a synthetic biology advance that integrates with natural bioenergetic pathways. © 2016 Elsevier Inc. All rights reserved.

  19. De novo assembly and phasing of a Korean human genome.

    Science.gov (United States)

    Seo, Jeong-Sun; Rhie, Arang; Kim, Junsoo; Lee, Sangjin; Sohn, Min-Hwan; Kim, Chang-Uk; Hastie, Alex; Cao, Han; Yun, Ji-Young; Kim, Jihye; Kuk, Junho; Park, Gun Hwa; Kim, Juhyeok; Ryu, Hanna; Kim, Jongbum; Roh, Mira; Baek, Jeonghun; Hunkapiller, Michael W; Korlach, Jonas; Shin, Jong-Yeon; Kim, Changhoon

    2016-10-13

    Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation across population groups. Here we report the de novo assembly and haplotype phasing of the Korean individual AK1 (ref. 1) using single-molecule real-time sequencing, next-generation mapping, microfluidics-based linked reads, and bacterial artificial chromosome (BAC) sequencing approaches. Single-molecule sequencing coupled with next-generation mapping generated a highly contiguous assembly, with a contig N50 size of 17.9 Mb and a scaffold N50 size of 44.8 Mb, resolving 8 chromosomal arms into single scaffolds. The de novo assembly, along with local assemblies and spanning long reads, closes 105 and extends into 72 out of 190 euchromatic gaps in the reference genome, adding 1.03 Mb of previously intractable sequence. High concordance between the assembly and paired-end sequences from 62,758 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variants by direct comparison of the assembly with the human reference, identifying thousands of breakpoints that, to our knowledge, have not been reported before. Many of the insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly with short reads, long reads and linked reads from whole-genome sequencing and with short reads from 31,719 BAC clones, thereby achieving phased blocks with an N50 size of 11.6 Mb. Haplotigs assembled from single-molecule real-time reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable major histocompatibility complex region as well as demonstrating allele configuration in clinically relevant genes such as CYP2D6. This work presents the most contiguous diploid human genome assembly so far, with extensive investigation of

  20. Uridine monophosphate synthetase enables eukaryotic de novo NAD+ biosynthesis from quinolinic acid.

    Science.gov (United States)

    McReynolds, Melanie R; Wang, Wenqing; Holleran, Lauren M; Hanna-Rose, Wendy

    2017-07-07

    NAD+ biosynthesis is an attractive and promising therapeutic target for influencing health span and obesity-related phenotypes as well as tumor growth. Full and effective use of this target for therapeutic benefit requires a complete understanding of NAD+ biosynthetic pathways. Here, we report a previously unrecognized role for a conserved phosphoribosyltransferase in NAD+ biosynthesis. Because a required quinolinic acid phosphoribosyltransferase (QPRTase) is not encoded in its genome, Caenorhabditis elegans are reported to lack a de novo NAD+ biosynthetic pathway. However, all the genes of the kynurenine pathway required for quinolinic acid (QA) production from tryptophan are present. Thus, we investigated the presence of de novo NAD+ biosynthesis in this organism. By combining isotope-tracing and genetic experiments, we have demonstrated the presence of an intact de novo biosynthesis pathway for NAD+ from tryptophan via QA, highlighting the functional conservation of this important biosynthetic activity. Supplementation with kynurenine pathway intermediates also boosted NAD+ levels and partially reversed NAD+-dependent phenotypes caused by mutation of pnc-1, which encodes a nicotinamidase required for NAD+ salvage biosynthesis, demonstrating contribution of de novo synthesis to NAD+ homeostasis. By investigating candidate phosphoribosyltransferase genes in the genome, we determined that the conserved uridine monophosphate phosphoribosyltransferase (UMPS), which acts in pyrimidine biosynthesis, is required for NAD+ biosynthesis in place of the missing QPRTase. We suggest that similar underground metabolic activity of UMPS may function in other organisms. This mechanism for NAD+ biosynthesis creates novel possibilities for manipulating NAD+ biosynthetic pathways, which is key for the future of therapeutics. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  1. The Asian Rice Gall Midge (Orseolia oryzae) Mitogenome Has Evolved Novel Gene Boundaries and Tandem Repeats That Distinguish Its Biotypes.

    Directory of Open Access Journals (Sweden)

    Isha Atray

    Full Text Available The complete mitochondrial genome of the Asian rice gall midge, Orseolia oryzae (Diptera; Cecidomyiidae), was sequenced, annotated and analysed in the present study. The circular genome is 15,286 bp with 13 protein-coding genes, 22 tRNAs and 2 ribosomal RNA genes, and a 578 bp non-coding control region. All protein-coding genes used conventional start codons and terminated with a complete stop codon. The genome presented many unusual features: (1) rearrangement in the order of tRNAs as well as protein-coding genes; (2) truncation and unusual secondary structures of tRNAs; (3) presence of two different repeat elements in separate non-coding regions; (4) presence of one pseudo-tRNA gene; (5) inversion of the rRNA genes; (6) higher percentage of non-coding regions when compared with other insect mitogenomes. Rearrangements of the tRNAs and protein-coding genes are explained on the basis of the tandem duplication and random loss model, and we discuss why intramitochondrial recombination is a better model for explaining rearrangements in the O. oryzae mitochondrial genome. Furthermore, we evaluated the number of iterations of the tandem repeat elements found in the mitogenome. This led to the identification of genetic markers capable of differentiating rice gall midge biotypes and the two Orseolia species investigated.
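
    As an illustration of what "number of iterations of the tandem repeat elements" means in practice, the sketch below counts the longest run of back-to-back copies of a repeat unit in a sequence. It is a minimal example under stated assumptions, not the method used in the study: the function name max_tandem_copies and the toy sequences are hypothetical, and only exact, uninterrupted copies are counted.

```python
def max_tandem_copies(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive exact copies of `unit` in `sequence`.

    Hypothetical helper for illustration only; real repeat finders also
    tolerate mismatches and indels, which this sketch does not.
    """
    if not unit:
        raise ValueError("repeat unit must be a non-empty string")
    best = 0
    step = len(unit)
    # Try every start position so that overlapping candidate runs are not missed.
    for start in range(len(sequence) - step + 1):
        copies = 0
        pos = start
        while sequence.startswith(unit, pos):
            copies += 1
            pos += step
        best = max(best, copies)
    return best


# Toy sequences (made up): two "biotypes" differing only in repeat copy number.
biotype_a = "GGATATATCC"    # three copies of "AT"
biotype_b = "GGATATATATCC"  # four copies of "AT"
print(max_tandem_copies(biotype_a, "AT"), max_tandem_copies(biotype_b, "AT"))
```

    A difference in copy number between samples, as in the two toy strings, is the kind of signal that a repeat-based marker relies on.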

  2. De-novo discovery of differentially abundant transcription factor binding sites including their positional preference.

    Science.gov (United States)

    Keilwagen, Jens; Grau, Jan; Paponov, Ivan A; Posch, Stefan; Strickert, Marc; Grosse, Ivo

    2011-02-10

    Transcription factors are a main component of gene regulation as they activate or repress gene expression by binding to specific binding sites in promoters. The de-novo discovery of transcription factor binding sites in target regions obtained by wet-lab experiments is a challenging problem in computational biology, which has not been fully solved yet. Here, we present a de-novo motif discovery tool called Dispom for finding differentially abundant transcription factor binding sites that models existing positional preferences of binding sites and adjusts the length of the motif in the learning process. Evaluating Dispom, we find that its prediction performance is superior to existing tools for de-novo motif discovery for 18 benchmark data sets with planted binding sites, and for a metazoan compendium based on experimental data from micro-array, ChIP-chip, ChIP-DSL, and DamID as well as Gene Ontology data. Finally, we apply Dispom to find binding sites differentially abundant in promoters of auxin-responsive genes extracted from Arabidopsis thaliana microarray data, and we find a motif that can be interpreted as a refined auxin responsive element predominately positioned in the 250-bp region upstream of the transcription start site. Using an independent data set of auxin-responsive genes, we find in genome-wide predictions that the refined motif is more specific for auxin-responsive genes than the canonical auxin-responsive element. In general, Dispom can be used to find differentially abundant motifs in sequences of any origin. However, the positional distribution learned by Dispom is especially beneficial if all sequences are aligned to some anchor point like the transcription start site in case of promoter sequences. We demonstrate that the combination of searching for differentially abundant motifs and inferring a position distribution from the data is beneficial for de-novo motif discovery. Hence, we make the tool freely available as a component of the open

  3. Extensive gene rearrangements in the mitochondrial genomes of two egg parasitoids, Trichogramma japonicum and Trichogramma ostriniae (Hymenoptera: Chalcidoidea: Trichogrammatidae).

    Science.gov (United States)

    Chen, Long; Chen, Peng-Yan; Xue, Xiao-Feng; Hua, Hai-Qing; Li, Yuan-Xi; Zhang, Fan; Wei, Shu-Jun

    2018-05-04

    Animal mitochondrial genomes usually exhibit conserved gene arrangement across major lineages, while those in the Hymenoptera are known to possess frequent rearrangements, as are those of several other orders of insects. Here, we sequenced two complete mitochondrial genomes of Trichogramma japonicum and Trichogramma ostriniae (Hymenoptera: Chalcidoidea: Trichogrammatidae). In total, 37 mitochondrial genes were identified in both species. The same gene arrangement pattern was found in the two species, with extensive gene rearrangement compared with the ancestral insect mitochondrial genome. Most tRNA genes and all protein-coding genes were encoded on the minority strand. In total, 15 tRNA genes and seven protein-coding genes were rearranged. The rearrangements of cox1 and nad2 as well as most tRNA genes were novel. Phylogenetic analysis based on nucleotide sequences of protein-coding genes and on gene arrangement patterns produced identical topologies that support the relationship of (Agaonidae + Pteromalidae) + Trichogrammatidae in Chalcidoidea. CREx analysis revealed eight rearrangement operations occurred from presumed ancestral gene order of Chalcidoidea to form the derived gene order of Trichogramma. Our study shows that gene rearrangement information in Chalcidoidea can potentially contribute to the phylogeny of Chalcidoidea when more mitochondrial genome sequences are available.

  4. De novo assembly and annotation of the Antarctic copepod (Tigriopus kingsejongensis) transcriptome.

    Science.gov (United States)

    Kim, Hui-Su; Lee, Bo-Young; Han, Jeonghoon; Lee, Young Hwan; Min, Gi-Sik; Kim, Sanghee; Lee, Jae-Seong

    2016-08-01

    The whole transcriptome of the Antarctic copepod (Tigriopus kingsejongensis) was sequenced using Illumina RNA-seq. De novo assembly was performed with 64,785,098 raw reads using Trinity, which assembled into 81,653 contigs. TransDecoder found 38,250 candidate coding contigs which showed homology to other species by BLAST analysis. Functional gene annotation was performed by Gene Ontology (GO), InterProScan, and KEGG pathway analyses. Finally, we identified a catalog of expressed genes for T. kingsejongensis, a useful model animal for gene information-based polar research to uncover molecular mechanisms of adaptation to harsh environments. In particular, we observed highly developed lipid metabolism in T. kingsejongensis compared directly with that of the Far East Pacific coast copepod Tigriopus japonicus at the transcriptome level. Copyright © 2016 Elsevier B.V. All rights reserved.

  5. De Novo Assembly and Characterization of the Transcriptome of Grasshopper Shirakiacris shirakii

    Directory of Open Access Journals (Sweden)

    Zhongying Qiu

    2016-07-01

    Full Text Available Background: The grasshopper Shirakiacris shirakii is an important agricultural pest and feeds mainly on gramineous plants, thereby causing economic damage to a wide range of crops. However, genomic information on this species is extremely limited thus far, and transcriptome data relevant to insecticide resistance and pest control are also not available. Methods: The transcriptome of S. shirakii was sequenced using the Illumina HiSeq platform, and we de novo assembled the transcriptome. Results: Sequencing produced a total of 105,408,878 clean reads, and the de novo assembly revealed 74,657 unigenes with an average length of 680 bp and N50 of 1057 bp. A total of 28,173 unigenes were annotated against the NCBI non-redundant protein sequences (Nr), NCBI non-redundant nucleotide sequences (Nt), a manually-annotated and reviewed protein sequence database (Swiss-Prot), Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Based on the Nr annotation results, we manually identified 79 unigenes encoding cytochrome P450 monooxygenases (P450s), 36 unigenes encoding carboxylesterases (CarEs) and 36 unigenes encoding glutathione S-transferases (GSTs) in S. shirakii. Core RNAi components relevant to microRNA, siRNA and piRNA pathways, including Pasha, Loquacious, Argonaute-1, Argonaute-2, Argonaute-3, Zucchini, Aubergine, enhanced RNAi-1 and Piwi, were expressed in S. shirakii. We also identified five unigenes that were homologous to the Sid-1 gene. In addition, the analysis of differential gene expression revealed that a total of 19,764 unigenes were up-regulated and 4185 unigenes were down-regulated in larvae. In total, we predicted 7504 simple sequence repeats (SSRs) from 74,657 unigenes. Conclusions: The comprehensive de novo transcriptomic data of S. shirakii will offer a series of valuable molecular resources for better studying insecticide resistance, RNAi and molecular marker discovery in the transcriptome.
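
    Several records in this listing report an N50 alongside a mean length (here, 680 bp average and an N50 of 1057 bp). For readers unfamiliar with the statistic, the following is a minimal sketch of the usual definition: the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name n50 and the example lengths are illustrative assumptions and are not taken from any of the studies.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig or unigene lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:  # integer-safe form of running >= total / 2
            return length
    raise AssertionError("unreachable for non-empty positive lengths")


# Made-up lengths: a few long sequences dominate the base count,
# so the N50 sits well above the arithmetic mean.
example = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example), sum(example) / len(example))
```

    Because N50 is weighted by bases rather than by sequence count, it is normally larger than the mean length when many short sequences are present, which matches the pattern in the reported figures.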

  6. Developing de novo human artificial chromosomes in embryonic stem cells using HSV-1 amplicon technology.

    Science.gov (United States)

    Moralli, Daniela; Monaco, Zoia L

    2015-02-01

    De novo artificial chromosomes expressing genes have been generated in human embryonic stem cells (hESc) and are maintained following differentiation into other cell types. Human artificial chromosomes (HAC) are small, functional, extrachromosomal elements, which behave as normal chromosomes in human cells. De novo HAC are generated following delivery of alpha satellite DNA into target cells. HAC are characterized by high levels of mitotic stability and are used as models to study centromere formation and chromosome organisation. They are successful and effective as gene expression vectors since they remain autonomous and can accommodate larger genes and regulatory regions for long-term expression studies in cells unlike other viral gene delivery vectors currently used. Transferring the essential DNA sequences for HAC formation intact across the cell membrane has been challenging for a number of years. A highly efficient delivery system based on HSV-1 amplicons has been used to target DNA directly to the ES cell nucleus and HAC stably generated in human embryonic stem cells (hESc) at high frequency. HAC were detected using an improved protocol for hESc chromosome harvesting, which consistently produced high-quality metaphase spreads that could routinely detect HAC in hESc. In tumour cells, the input DNA often integrated in the host chromosomes, but in the host ES genome, it remained intact. The hESc containing the HAC formed embryoid bodies, generated teratoma in mice, and differentiated into neuronal cells where the HAC were maintained. The HAC structure and chromatin composition was similar to the endogenous hESc chromosomes. This review will discuss the technological advances in HAC vector delivery using HSV-1 amplicons and the improvements in the identification of de novo HAC in hESc.

  7. Antimicrobial peptide capsids of de novo design.

    Science.gov (United States)

    De Santis, Emiliana; Alkassem, Hasan; Lamarre, Baptiste; Faruqui, Nilofar; Bella, Angelo; Noble, James E; Micale, Nicola; Ray, Santanu; Burns, Jonathan R; Yon, Alexander R; Hoogenboom, Bart W; Ryadnov, Maxim G

    2017-12-22

    The spread of bacterial resistance to antibiotics poses the need for antimicrobial discovery. With traditional search paradigms being exhausted, approaches that are altogether different from antibiotics may offer promising and creative solutions. Here, we introduce a de novo peptide topology that, by emulating the virus architecture, assembles into discrete antimicrobial capsids. Using the combination of high-resolution and real-time imaging, we demonstrate that these artificial capsids assemble as 20-nm hollow shells that attack bacterial membranes and upon landing on phospholipid bilayers instantaneously (seconds) convert into rapidly expanding pores causing membrane lysis (minutes). The designed capsids show broad antimicrobial activities, thus executing one primary function: they destroy bacteria on contact.

  8. Sequencing and De Novo Transcriptome Assembly of Brachypodium sylvaticum (Poaceae)

    Directory of Open Access Journals (Sweden)

    Samuel E. Fox

    2013-03-01

    Full Text Available Premise of the study: We report the de novo assembly and characterization of the transcriptomes of Brachypodium sylvaticum (slender false-brome) accessions from native populations of Spain and Greece, and an invasive population west of Corvallis, Oregon, USA. Methods and Results: More than 350 million sequence reads from the mRNA libraries prepared from three B. sylvaticum genotypes were assembled into 120,091 (Corvallis), 104,950 (Spain), and 177,682 (Greece) transcript contigs. In comparison with the B. distachyon Bd21 reference genome and GenBank protein sequences, we estimate >90% exome coverage for B. sylvaticum. The transcripts were assigned Gene Ontology and InterPro annotations. Brachypodium sylvaticum sequence reads aligned against the Bd21 genome revealed 394,654 single-nucleotide polymorphisms (SNPs) and >20,000 simple sequence repeat (SSR) DNA sites. Conclusions: To our knowledge, this is the first report of transcriptome sequencing of an invasive plant species with a closely related sequenced reference genome. The sequences and identified SNP variant and SSR sites will provide tools for developing novel genetic markers for use in genotyping and characterization of invasive behavior of B. sylvaticum.

  9. De novo DNA methylation during monkey pre-implantation embryogenesis.

    Science.gov (United States)

    Gao, Fei; Niu, Yuyu; Sun, Yi Eve; Lu, Hanlin; Chen, Yongchang; Li, Siguang; Kang, Yu; Luo, Yuping; Si, Chenyang; Yu, Juehua; Li, Chang; Sun, Nianqin; Si, Wei; Wang, Hong; Ji, Weizhi; Tan, Tao

    2017-04-01

    Critical epigenetic regulation of primate embryogenesis entails DNA methylome changes. Here we report genome-wide composition, patterning, and stage-specific dynamics of DNA methylation in pre-implantation rhesus monkey embryos as well as male and female gametes studied using an optimized tagmentation-based whole-genome bisulfite sequencing method. We show that upon fertilization, both paternal and maternal genomes undergo active DNA demethylation, and genome-wide de novo DNA methylation is also initiated in the same period. By the 8-cell stage, remethylation becomes more pronounced than demethylation, resulting in an increase in global DNA methylation. Promoters of genes associated with oxidative phosphorylation are preferentially remethylated at the 8-cell stage, suggesting that this mode of energy metabolism may not be favored. Unlike in rodents, X chromosome inactivation is not observed during monkey pre-implantation development. Our study provides the first comprehensive illustration of the 'wax and wane' phases of DNA methylation dynamics. Most importantly, our DNA methyltransferase loss-of-function analysis indicates that DNA methylation influences early monkey embryogenesis.

  10. Single gene microdeletions and microduplication of 3p26.3 in three unrelated families

    DEFF Research Database (Denmark)

    Kashevarova, Anna A; Nazarenko, Lyudmila P; Schultz-Pedersen, Soren

    2014-01-01

    contain several protein-coding genes and regulatory elements, complicating the understanding of genotype-phenotype correlations. We report two siblings with ID and an unrelated patient with atypical autism who had 3p26.3 microdeletions and one intellectually disabled patient with a 3p26.3 microduplication...

  11. Selection of Highly Expressed Gene Variants in Escherichia coli Using Translationally Coupled Antibiotic Selection Markers

    DEFF Research Database (Denmark)

    Rennig, Maja; Daley, Daniel O.; Nørholm, Morten H. H.

    2018-01-01

    Strategies to select highly expressed variants of a protein coding sequence are usually based on trial-and-error approaches, which are time-consuming and expensive. We address this problem using translationally coupled antibiotic resistance markers. The system requires that the target gene can...

  12. Dynamic gene expression response to altered gravity in human T cells.

    Science.gov (United States)

    Thiel, Cora S; Hauschild, Swantje; Huge, Andreas; Tauber, Svantje; Lauber, Beatrice A; Polzer, Jennifer; Paulsen, Katrin; Lier, Hartwin; Engelmann, Frank; Schmitz, Burkhard; Schütte, Andreas; Layer, Liliana E; Ullrich, Oliver

    2017-07-12

    We investigated the dynamics of immediate and initial gene expression response to different gravitational environments in human Jurkat T lymphocytic cells and compared expression profiles to identify potential gravity-regulated genes and adaptation processes. We used the Affymetrix GeneChip® Human Transcriptome Array 2.0 containing 44,699 protein coding genes and 22,829 non-protein coding genes and performed the experiments during a parabolic flight and a suborbital ballistic rocket mission to cross-validate gravity-regulated gene expression through independent research platforms and different sets of control experiments to exclude other factors than alteration of gravity. We found that gene expression in human T cells rapidly responded to altered gravity in the time frame of 20 s and 5 min. The initial response to microgravity involved mostly regulatory RNAs. We identified three gravity-regulated genes which could be cross-validated in both completely independent experiment missions: ATP6V1A/D, a vacuolar H+-ATPase (V-ATPase) responsible for acidification during bone resorption, IGHD3-3/IGHD3-10, diversity genes of the immunoglobulin heavy-chain locus participating in V(D)J recombination, and LINC00837, a long intergenic non-protein coding RNA. Due to the extensive and rapid alteration of gene expression associated with regulatory RNAs, we conclude that human cells are equipped with a robust and efficient adaptation potential when challenged with altered gravitational environments.

  13. De novo point mutations in patients diagnosed with ataxic cerebral palsy.

    Science.gov (United States)

    Parolin Schnekenberg, Ricardo; Perkins, Emma M; Miller, Jack W; Davies, Wayne I L; D'Adamo, Maria Cristina; Pessia, Mauro; Fawcett, Katherine A; Sims, David; Gillard, Elodie; Hudspith, Karl; Skehel, Paul; Williams, Jonathan; O'Regan, Mary; Jayawant, Sandeep; Jefferson, Rosalind; Hughes, Sarah; Lustenberger, Andrea; Ragoussis, Jiannis; Jackson, Mandy; Tucker, Stephen J; Németh, Andrea H

    2015-07-01

    Cerebral palsy is a sporadic disorder with multiple likely aetiologies, but frequently considered to be caused by birth asphyxia. Genetic investigations are rarely performed in patients with cerebral palsy and there is little proven evidence of genetic causes. As part of a large project investigating children with ataxia, we identified four patients in our cohort with a diagnosis of ataxic cerebral palsy. They were investigated using either targeted next generation sequencing or trio-based exome sequencing and were found to have mutations in three different genes, KCNC3, ITPR1 and SPTBN2. All the mutations were de novo and associated with increased paternal age. The mutations were shown to be pathogenic using a combination of bioinformatics analysis and in vitro model systems. This work is the first to report that the ataxic subtype of cerebral palsy can be caused by de novo dominant point mutations, which explains the sporadic nature of these cases. We conclude that at least some subtypes of cerebral palsy may be caused by de novo genetic mutations and patients with a clinical diagnosis of cerebral palsy should be genetically investigated before causation is ascribed to perinatal asphyxia or other aetiologies. © The Author (2015). Published by Oxford University Press on behalf of the Guarantors of Brain.

  14. Autism Spectrum Disorder in a Girl with a De Novo X;19 Balanced Translocation

    Science.gov (United States)

    Baruffi, Marcelo Razera; de Souza, Deise Helena; Bicudo da Silva, Rosana Aparecida; Ramos, Ester Silveira; Moretti-Ferreira, Danilo

    2012-01-01

    Balanced X-autosome translocations are rare, and female carriers are a clinically heterogeneous group of patients, with phenotypically normal women, history of recurrent miscarriage, gonadal dysfunction, X-linked disorders or congenital abnormalities, and/or developmental delay. We investigated a patient with a de novo X;19 translocation. The six-year-old girl has been evaluated due to hyperactivity, social interaction impairment, stereotypic and repetitive use of language with echolalia, failure to follow parents/caretakers orders, inconsolable outbursts, and persistent preoccupation with parts of objects. The girl has normal cognitive function. Her measurements are within normal range, and no other abnormalities were found during physical, neurological, or dysmorphological examinations. Conventional cytogenetic analysis showed a de novo balanced translocation, with the karyotype 46,X,t(X;19)(p21.2;q13.4). Replication banding showed a clear preference for inactivation of the normal X chromosome. The translocation was confirmed by FISH and Spectral Karyotyping (SKY). Although abnormal phenotypes associated with de novo balanced chromosomal rearrangements may be the result of disruption of a gene at one of the breakpoints, submicroscopic deletion or duplication, or a position effect, X; autosomal translocations are associated with additional unique risk factors including X-linked disorders, functional autosomal monosomy, or functional X chromosome disomy resulting from the complex X-inactivation process. PMID:23074688

  15. Autism Spectrum Disorder in a Girl with a De Novo X;19 Balanced Translocation

    Directory of Open Access Journals (Sweden)

    Marcelo Razera Baruffi

    2012-01-01

    Full Text Available Balanced X-autosome translocations are rare, and female carriers are a clinically heterogeneous group of patients, with phenotypically normal women, history of recurrent miscarriage, gonadal dysfunction, X-linked disorders or congenital abnormalities, and/or developmental delay. We investigated a patient with a de novo X;19 translocation. The six-year-old girl has been evaluated due to hyperactivity, social interaction impairment, stereotypic and repetitive use of language with echolalia, failure to follow parents/caretakers orders, inconsolable outbursts, and persistent preoccupation with parts of objects. The girl has normal cognitive function. Her measurements are within normal range, and no other abnormalities were found during physical, neurological, or dysmorphological examinations. Conventional cytogenetic analysis showed a de novo balanced translocation, with the karyotype 46,X,t(X;19)(p21.2;q13.4). Replication banding showed a clear preference for inactivation of the normal X chromosome. The translocation was confirmed by FISH and Spectral Karyotyping (SKY). Although abnormal phenotypes associated with de novo balanced chromosomal rearrangements may be the result of disruption of a gene at one of the breakpoints, submicroscopic deletion or duplication, or a position effect, X; autosomal translocations are associated with additional unique risk factors including X-linked disorders, functional autosomal monosomy, or functional X chromosome disomy resulting from the complex X-inactivation process.

  16. De novo transcriptome assembly and positive selection analysis of an individual deep-sea fish.

    Science.gov (United States)

    Lan, Yi; Sun, Jin; Xu, Ting; Chen, Chong; Tian, Renmao; Qiu, Jian-Wen; Qian, Pei-Yuan

    2018-05-24

    High hydrostatic pressure and low temperatures make the deep sea a harsh environment for life forms. Actin organization and microtubules assembly, which are essential for intracellular transport and cell motility, can be disrupted by high hydrostatic pressure. High hydrostatic pressure can also damage DNA. Nucleic acids exposed to low temperatures can form secondary structures that hinder genetic information processing. To study how deep-sea creatures adapt to such a hostile environment, one of the most straightforward ways is to sequence and compare their genes with those of their shallow-water relatives. We captured an individual of the fish species Aldrovandia affinis, which is a typical deep-sea inhabitant, from the Okinawa Trough at a depth of 1550 m using a remotely operated vehicle (ROV). We sequenced its transcriptome and analyzed its molecular adaptation. We obtained 27,633 protein coding sequences using an Illumina platform and compared them with those of several shallow-water fish species. Analysis of 4918 single-copy orthologs identified 138 positively selected genes in A. affinis, including genes involved in microtubule regulation. Particularly, functional domains related to cold shock as well as DNA repair are exposed to positive selection pressure in both deep-sea fish and hadal amphipod. Overall, we have identified a set of positively selected genes related to cytoskeleton structures, DNA repair and genetic information processing, which shed light on molecular adaptation to the deep sea. These results suggest that amino acid substitutions of these positively selected genes may contribute crucially to the adaptation of deep-sea animals. Additionally, we provide a high-quality transcriptome of a deep-sea fish for future deep-sea studies.

  17. Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data

    Directory of Open Access Journals (Sweden)

    Duan Jialei

    2012-08-01

    Full Text Available Background: Rapid advances in next-generation sequencing methods have provided new opportunities for transcriptome sequencing (RNA-Seq). The unprecedented sequencing depth provided by RNA-Seq makes it a powerful and cost-efficient method for transcriptome study, and it has been widely used in model organisms and non-model organisms to identify and quantify RNA. For non-model organisms lacking well-defined genomes, de novo assembly is typically required for downstream RNA-Seq analyses, including SNP discovery and identification of genes differentially expressed by phenotypes. Although RNA-Seq has been successfully used to sequence many non-model organisms, the results of de novo assembly from short reads can still be improved by using recent bioinformatic developments. Results: In this study, we used 212.6 million paired-end reads, which accounted for 16.2 Gb, to assemble the hexaploid wheat transcriptome. Two state-of-the-art assemblers, Trinity and Trans-ABySS, which use the single and multiple k-mer methods, respectively, were used, and the whole de novo assembly process was divided into the following four steps: pre-assembly, merging different samples, removal of redundancy and scaffolding. We documented every detail of these steps and how these steps influenced assembly performance to gain insight into transcriptome assembly from short reads. After optimization, the assembled transcripts were comparable to Sanger-derived ESTs in terms of both continuity and accuracy. We also provided considerable new wheat transcript data to the community. Conclusions: It is feasible to assemble the hexaploid wheat transcriptome from short reads. Special attention should be paid to dealing with multiple samples to balance the spectrum of expression levels and redundancy. To obtain an accurate overview of RNA profiling, removal of redundancy may be crucial in de novo assembly.

  18. De novo complex intra chromosomal rearrangement after ICSI: characterisation by BACs micro array-CGH

    Directory of Open Access Journals (Sweden)

    Quimsiyeh Mazin

    2008-12-01

    Full Text Available Abstract. Background: In routine Assisted Reproductive Technology (ART), men with severe oligozoospermia or azoospermia should be informed about the risk of de novo congenital or chromosomal abnormalities in an ICSI program. The benefits of preimplantation or prenatal genetic diagnosis also need to be explained to the couple. Methods: Following a routine ICSI attempt using ejaculated sperm from a male with severe oligozoospermia and a normal karyotype, a 30-year-old pregnant woman was referred for prenatal diagnosis in the 17th week of a bichorionic biamniotic twin gestation. Amniocentesis was performed because sonographic examination during the 12th week of gestation (WG) detected an increased foetal nuchal translucency in one of the fetuses. Chromosome and DNA studies of the fetus were performed on cultured amniocytes. Results: Conventional, molecular cytogenetic and microarray CGH experiments allowed us to conclude that the fetus had a de novo pericentromeric inversion associated with a duplication of the 9p22.1-p24 chromosomal region, 46,XY,invdup(9)(p22.1p24) [arrCGH 9p22.1p24 (RP11-130C19 → RP11-87O1)x3]. Because it contains the critical 9p22 region, our case is consistent with the general phenotypic features of the partial trisomy 9p syndrome, with major growth retardation, microcephaly and microretrognathia. Conclusion: This de novo complex chromosome rearrangement illustrates the possible risk of chromosome or gene defects in an ICSI program and the contribution of array-CGH to the rapid mapping of de novo chromosomal imbalance.

  19. Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

    Science.gov (United States)

    Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

    2017-10-03

    Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors, one is composed of 45 elements, characterizing the information entropies of the disease genes distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements with the first one to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
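
    The abstract above describes a sub-vector built from information entropies of how disease genes are distributed over chromosome substructures, but it does not give the exact formula. As a minimal sketch only, assuming plain Shannon entropy over a list of per-substructure gene counts (the function name and the toy counts are hypothetical, not taken from the paper):

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts.

    Assumption: this is the generic textbook definition, not necessarily
    the exact quantity used by the authors of the abstract above.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    # Zero counts contribute nothing (the limit of p*log(p) as p -> 0 is 0).
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


# Hypothetical counts of disease genes over four substructures.
evenly_spread = [5, 5, 5, 5]
concentrated = [20, 0, 0, 0]
print(shannon_entropy(evenly_spread))  # higher: genes are spread out
print(shannon_entropy(concentrated))   # zero: genes sit in one substructure
```

    Under this reading, a disease whose genes concentrate on a few substructures yields a low entropy, while an even spread over all 45 substructures would approach log2(45).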

  20. Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica.

    Science.gov (United States)

    Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M

    2015-05-15

    The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar to unicellular opisthokont genomes than to other animal genomes.

  1. PlantTribes: a gene and gene family resource for comparative genomics in plants

    OpenAIRE

    Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.

    2007-01-01

    The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, ca...

  2. The "intellectuals" and the Estado Novo (Los "intelectuales" y el Estado Novo)

    Directory of Open Access Journals (Sweden)

    Luís REIS TORGAL

    2010-02-01

    Full Text Available Keywords: Estado Novo; Authoritarianism; Salazar, António de Oliveira; Intellectuals; Ferro, António; Brochado, Idalino da Costa; Ameal, João. ABSTRACT: The concept of the “intellectual” is difficult to define and undoubtedly constantly debated. It is nevertheless important to reflect on it in order to understand its meaning and the problems involved with it. Be that as it may, a “modern” authoritarian State presumes a single ideology which has to be diffused by means of well-organised propaganda, in which process “intellectuals” play a significant role. Salazar’s “New State” fits this category and, undoubtedly, knowledge about its “intellectuals” is fundamental. The objective in this article is to provide some interesting examples of “intellectuals” or simple “political functionaries” with an intellectual bent so as to indicate the sense and complexity of a study of a

  3. Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster

    Science.gov (United States)

    Wang, Wen; Brunet, Frédéric G.; Nevo, Eviatar; Long, Manyuan

    2002-01-01

    Non-protein-coding RNA genes play an important role in various biological processes. How new RNA genes originated and whether this process is controlled by similar evolutionary mechanisms for the origin of protein-coding genes remains unclear. A young chimeric RNA gene that we term sphinx (spx) provides the first insight into the early stage of evolution of RNA genes. spx originated as an insertion of a retroposed sequence of the ATP synthase chain F gene at the cytological region 60DB since the divergence of Drosophila melanogaster from its sibling species 2–3 million years ago. This retrosequence, which is located at 102F on the fourth chromosome, recruited a nearby exon and intron, thereby evolving a chimeric gene structure. This molecular process suggests that the mechanism of exon shuffling, which can generate protein-coding genes, also plays a role in the origin of RNA genes. The subsequent evolutionary process of spx has been associated with a high nucleotide substitution rate, possibly driven by a continuous positive Darwinian selection for a novel function, as is shown in its sex- and development-specific alternative splicing. To test whether spx has adapted to different environments, we investigated its population genetic structure in the unique “Evolution Canyon” in Israel, revealing a similar haplotype structure in spx, and thus similar evolutionary forces operating on spx between environments. PMID:11904380

  4. A de novo SOX10 mutation causing severe type 4 Waardenburg syndrome without Hirschsprung disease.

    Science.gov (United States)

    Sznajer, Yves; Coldéa, Cristina; Meire, Françoise; Delpierre, Isabelle; Sekhara, Tayeb; Touraine, Renaud L

    2008-04-15

    Type 4 Waardenburg syndrome represents a well-defined entity caused by anomalies of neural crest derivatives (melanocytes, intrinsic ganglion cells, central, autonomous and peripheral nervous systems) leading, with variable expressivity, to pigmentary anomalies, deafness, mental retardation, peripheral neuropathy, and Hirschsprung disease. An autosomal dominant mode of inheritance is prevalent when a SOX10 gene mutation is identified. We report the natural history of a child who presented with synophrys, vivid blue eye, deafness, bilateral complete semicircular canals agenesis with mental retardation, subtle signs of peripheral neuropathy and lack of Hirschsprung disease. SOX10 gene sequencing identified a "de novo" splice site mutation (c.698-2A > C). The present phenotype and genotype findings underline the wide spectrum of SOX10 gene involvement in an unusual type 4 Waardenburg syndrome patient. Copyright 2008 Wiley-Liss, Inc.

  5. Molecular analysis of "de novo" purine biosynthesis in solanaceous species and in Arabidopsis thaliana

    DEFF Research Database (Denmark)

    van der Graaff, Eric; Hooykaas, Paul; Lein, Wolfgang

    2004-01-01

    Purine nucleotides are essential components to sustain plant growth and development. In plants they are either synthesized "de novo" during the process of purine biosynthesis or are recycled from purine bases and purine nucleosides through the salvage pathway. Comparison between animals...... biosynthesis pathway in plants, and the in planta functional analysis of PRPP (5-phosphoribosyl-1-pyrophosphate) amidotransferase (ATase), catalyzing the first committed step of the "de novo" purine biosynthesis. The cloning of the genes involved in the purine biosynthesis pathway was attained by a screening...... strategy with heterologous cDNA probes and by using S. cerevisiae mutants for complementation. Southern hybridization showed a complex genomic organization for these genes in solanaceous species and their organ- and developmental specific expression was analyzed by Northern hybridization. The specific role...

  6. Organ-Specific Alterations in Fatty Acid De Novo Synthesis and Desaturation in a Rat Model of Programmed Obesity

    Directory of Open Access Journals (Sweden)

    Desai Mina

    2011-05-01

    Full Text Available Abstract. Background: Small for gestational age (SGA) leads to increased risk of adult obesity and metabolic syndrome. Offspring exposed to 50% maternal food restriction in utero are born smaller than Controls (FR), catch up in growth by the end of the nursing period, and become obese adults. The objective of the study was to determine stearoyl-CoA desaturase activity (SCD1) and rates of de novo fatty acid synthesis in young FR and Control offspring tissues at the end of the nursing period, as possible contributors to catch-up growth. Methods: From gestational day 10 to term, dams were fed ad libitum (Control) or were 50% food-restricted to produce small FR pups. Control dams nursed all pups. At postnatal day 1 (p1) and p21, offspring body tissues were analyzed by GC/MS, and desaturation indices of palmitoleate/palmitate and oleate/stearate were calculated. SCD1 gene expression was determined by real-time PCR on adipose and liver. Offspring were enriched with deuterium that was given to dams in drinking water during lactation and de novo synthesis of offspring body tissues was determined at p21. Primary adipocyte cell cultures were established at p21 and exposed to U13C-glucose. Results: FR offspring exhibited higher desaturation index in p1 and p21 adipose tissue, but decreased desaturation index in liver at p21. SCD1 gene expression at p21 was correspondingly increased in adipose and decreased in liver. FR subcutaneous fat demonstrated increased de novo synthesis at p21. Primary cell cultures exhibited increased de novo synthesis in FR. Conclusions: Adipose tissue is the first site to exhibit increased de novo synthesis and desaturase activity in FR. Therefore, abnormal lipogenesis is already present prior to onset of obesity during the period of catch-up growth. These abnormalities may contribute to future obesity development.
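
    The desaturation indices named in the abstract above are product-to-precursor ratios (palmitoleate/palmitate and oleate/stearate). A minimal sketch of that arithmetic follows; the function name and the fatty acid abundances are hypothetical illustration values, not data from the study:

```python
def desaturation_index(product, precursor):
    """Ratio of a monounsaturated product to its saturated precursor."""
    if precursor <= 0:
        raise ValueError("precursor abundance must be positive")
    return product / precursor


# Hypothetical abundances (arbitrary units) for one tissue sample.
sample = {"palmitate": 20.0, "palmitoleate": 2.0, "stearate": 10.0, "oleate": 25.0}

indices = {
    "16:1/16:0": desaturation_index(sample["palmitoleate"], sample["palmitate"]),
    "18:1/18:0": desaturation_index(sample["oleate"], sample["stearate"]),
}
print(indices)
```

    A higher ratio is commonly read as a proxy for higher desaturase activity in that tissue, which is how the abstract uses it.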

  7. Icarus: visualizer for de novo assembly evaluation.

    Science.gov (United States)

    Mikheenko, Alla; Valin, Gleb; Prjibelski, Andrey; Saveliev, Vladislav; Gurevich, Alexey

    2016-11-01

    Data visualization plays an increasingly important role in NGS data analysis. With advances in both sequencing and computational technologies, it has become a new bottleneck in genomics studies. Indeed, evaluation of de novo genome assemblies is one of the areas that can benefit from the visualization. However, even though multiple quality assessment methods are now available, existing visualization tools are hardly suitable for this purpose. Here, we present Icarus, a novel genome visualizer for accurate assessment and analysis of genomic draft assemblies, which is based on the tool QUAST. Icarus can be used in studies where a related reference genome is available, as well as for non-model organisms. The tool is available online and as a standalone application: http://cab.spbu.ru/software/icarus Contact: aleksey.gurevich@spbu.ru Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  8. [Diagnosis and management of de novo epilepsy].

    Science.gov (United States)

    Louise, Tyvaert

    2018-03-01

    The diagnosis of de novo epilepsy is complex. An accurate diagnostic approach must be followed, based on specific key steps. Distinguishing an epileptic seizure from a non-epileptic malaise carries a risk of diagnostic error of around 20%. Faced with a first unprovoked seizure, the practitioner has to know the risk factors specifically linked to an increased risk of seizure recurrence. In the presence of these factors, an antiepileptic drug would be indicated. The first antiepileptic drug has to be carefully selected according to the epilepsy type and causes but also to the patient's characteristics (sex, age, comorbidities, associated drugs, profession, and way of life…). Exhaustive patient education needs to support the first antiepileptic drug prescription (sleep and nutritional advice, benefit of adherence, antiepileptic drug features and side effects, follow-up, prognosis…). Regular follow-up is essential to monitor adherence to, and the tolerability and efficacy of, the antiepileptic drug, and also to check that the disease is well accepted. A systematic search for common comorbidities may also be performed. Electroencephalogram and antiepileptic drug levels are unnecessary in the standard follow-up of known epileptic patients (except in specific cases). Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  9. Identification of putative cis-regulatory elements in Cryptosporidium parvum by de novo pattern finding

    Directory of Open Access Journals (Sweden)

    Kissinger Jessica C

    2007-01-01

    Full Text Available Abstract. Background: Cryptosporidium parvum is a unicellular eukaryote in the phylum Apicomplexa. It is an obligate intracellular parasite that causes diarrhea and is a significant AIDS-related pathogen. Cryptosporidium parvum is not amenable to long-term laboratory cultivation or classical molecular genetic analysis. The parasite exhibits a complex life cycle and a broad host range, and fundamental mechanisms of gene regulation remain unknown. We have used data from the recently sequenced genome of this organism to uncover clues about gene regulation in C. parvum. We have applied two pattern finding algorithms, MEME and AlignACE, to identify conserved, over-represented motifs in the 5' upstream regions of genes in C. parvum. To support our findings, we have established comparative real-time PCR expression profiles for the groups of genes examined computationally. Results: We find that groups of genes that share a function or belong to a common pathway share upstream motifs. Different motifs are conserved upstream of different groups of genes. Comparative real-time PCR studies show co-expression of genes within each group (in sub-sets) during the life cycle of the parasite, suggesting co-regulation of these genes may be driven by the use of conserved upstream motifs. Conclusion: This is one of the first attempts to characterize cis-regulatory elements in the absence of any previously characterized elements and with very limited expression data (seven genes only). Using de novo pattern finding algorithms, we have identified specific DNA motifs that are conserved upstream of genes belonging to the same metabolic pathway or gene family. We have demonstrated the co-expression of these genes (often in subsets) using comparative real-time PCR experiments, thus establishing evidence for these conserved motifs as putative cis-regulatory elements. Given the lack of prior information concerning expression patterns and organization of promoters in C. parvum we

  10. Identifying wrong assemblies in de novo short read primary ...

    Indian Academy of Sciences (India)

    2016-08-05

    Aug 5, 2016 ... Most of these assemblies are done using some de novo short read assemblers and other related approaches. .... benchmarking projects like Assemblathon 1, Assemblathon ... from a large insert library (at least 1000 bases).

  11. Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models.

    Science.gov (United States)

    Mahony, Shaun; McInerney, James O; Smith, Terry J; Golden, Aaron

    2004-03-05

    Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. While its raw accuracy rate can be less than other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
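
    The abstract above names relative synonymous codon usage (RSCU) as the indicator of protein-coding potential. As an illustrative sketch only (this is the textbook RSCU definition under the standard genetic code, not RescueNet's implementation; the function name and the toy sequence are hypothetical), RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally:

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return {codon: RSCU} for amino acids that occur in an in-frame sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by amino acid, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent; RSCU undefined
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


# Toy sequence: four alanine codons, three GCT and one GCC.
print(rscu("GCTGCTGCTGCC"))
```

    An RSCU of 1.0 means no bias; values above 1.0 mark codons used more often than expected, which is the kind of compositional signal a gene finder can exploit.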

  12. Arginine de novo and nitric oxide production in disease states

    OpenAIRE

    Luiking, Yvette C.; Ten Have, Gabriella A. M.; Wolfe, Robert R.; Deutz, Nicolaas E. P.

    2012-01-01

    Arginine is derived from dietary protein intake, body protein breakdown, or endogenous de novo arginine production. The latter may be linked to the availability of citrulline, which is the immediate precursor of arginine and limiting factor for de novo arginine production. Arginine metabolism is highly compartmentalized due to the expression of the enzymes involved in arginine metabolism in various organs. A small fraction of arginine enters the NO synthase (NOS) pathway. Tetrahydrobiopterin ...

  13. De Novo Collapsing Glomerulopathy in a Renal Allograft Recipient

    Directory of Open Access Journals (Sweden)

    Kanodia K

    2008-01-01

    Full Text Available Collapsing glomerulopathy (CG, characterized histologically by segmental/global glomerular capillary collapse, podocyte hypertrophy and hypercellularity and tubulo-interstitial injury; is characterized clinically by massive proteinuria and rapid progressive renal failure. CG is known to recur in renal allograft and rarely de novo. We report de novo CG 3 years post-transplant in a patient who received renal allograft from haplo-identical type donor.

  14. Language and national identity in Novo Cinema Galego

    Directory of Open Access Journals (Sweden)

    Brais ROMERO SUÁREZ

    2015-12-01

    Full Text Available The talk of the town since its inception in 2010, the Novo Cinema Galego has been successful in all the competitions and festivals in which it has been present. From the FIPRESCI prize in Cannes to Best Emerging Director at Locarno, this new wave of cinema places Galicia on the world film stage. But is Novo Cinema Galego an accurate representation of Galicia? What is the role of Galicia in this movement?

  15. RNAi mediates post-transcriptional repression of gene expression in fission yeast Schizosaccharomyces pombe

    International Nuclear Information System (INIS)

    Smialowska, Agata; Djupedal, Ingela; Wang, Jingwen; Kylsten, Per; Swoboda, Peter; Ekwall, Karl

    2014-01-01

    Highlights: • Protein coding genes accumulate anti-sense sRNAs in fission yeast S. pombe. • RNAi represses protein-coding genes in S. pombe. • RNAi-mediated gene repression is post-transcriptional. - Abstract: RNA interference (RNAi) is a gene silencing mechanism conserved from fungi to mammals. Small interfering RNAs are products and mediators of the RNAi pathway and act as specificity factors in recruiting effector complexes. The Schizosaccharomyces pombe genome encodes one of each of the core RNAi proteins, Dicer, Argonaute and RNA-dependent RNA polymerase (dcr1, ago1, rdp1). Even though the function of RNAi in heterochromatin assembly in S. pombe is established, its role in controlling gene expression is elusive. Here, we report the identification of small RNAs mapped anti-sense to protein coding genes in fission yeast. We demonstrate that these genes are up-regulated at the protein level in RNAi mutants, while their mRNA levels are not significantly changed. We show that the repression by RNAi is not a result of heterochromatin formation. Thus, we conclude that RNAi is involved in post-transcriptional gene silencing in S. pombe

  16. Identification of de novo mutations of Duchenne/Becker muscular dystrophies in southern Spain.

    Science.gov (United States)

    Garcia, Susana; de Haro, Tomás; Zafra-Ceres, Mercedes; Poyatos, Antonio; Gomez-Capilla, Jose A; Gomez-Llorente, Carolina

    2014-01-01

    Duchenne/Becker muscular dystrophies (DMD/BMD) are X-linked diseases, which are caused by a de novo gene mutation in one-third of affected males. The study objectives were to determine the incidence of DMD/BMD in Andalusia (Spain) and to establish the percentage of affected males in whom a de novo gene mutation was responsible. Multiplex ligation-dependent probe amplification (MLPA) technology was applied to determine the incidence of DMD/BMD in 84 males with suspicion of the disease and 106 female relatives. Dystrophin gene exon deletion (89.5%) or duplication (10.5%) was detected in 38 of the 84 males by MLPA technology; de novo mutations accounted for 4 (16.7%) of the 24 mother-son pairs studied. MLPA technology is adequate for the molecular diagnosis of DMD/BMD and establishes whether the mother carries the molecular alteration responsible for the disease, a highly relevant issue for genetic counseling.

  17. Combining Shigella Tn-seq data with gold-standard E. coli gene deletion data suggests rare transitions between essential and non-essential gene functionality.

    Science.gov (United States)

    Freed, Nikki E; Bumann, Dirk; Silander, Olin K

    2016-09-06

    Gene essentiality - whether or not a gene is necessary for cell growth - is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. Here we present the results of a Tn-seq experiment designed to detect essential protein coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. Superficial analysis of this data suggested that 481 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 revealed that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, and in converse to this comparison, we found no genes that were essential in E. coli and non-essential in Shigella, implying that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. The data presented here highlight the small number of protein coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.

  18. FunGene: the functional gene pipeline and repository.

    Science.gov (United States)

    Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

    2013-01-01

    Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.

  19. FunGene: the Functional Gene Pipeline and Repository

    Directory of Open Access Journals (Sweden)

    Jordan A. Fish

    2013-10-01

    Full Text Available Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.

  20. Reactivity of some mammalian sera with the bovine leukaemia virus env gene polypeptide expressed in Escherichia coli

    International Nuclear Information System (INIS)

    Slavikova, K.; Zajac, V.

    1989-01-01

    Sera from bovine leukaemia virus (BLV)-infected cattle and sheep were tested by radioimmunoassay and Western blot for their reactivity with 60,000 protein coded by the env gene of BLV and expressed in Escherichia coli. This protein, antigenically similar to BLV protein, reacted with antibodies against BLV antigens in the sera tested. (author). 3 figs., 1 tab., 13 refs

  1. Characterization and analysis of a de novo transcriptome from the pygmy grasshopper Tetrix japonica.

    Science.gov (United States)

    Qiu, Zhongying; Liu, Fei; Lu, Huimeng; Huang, Yuan

    2017-05-01

    The pygmy grasshopper Tetrix japonica is a common insect distributed throughout the world, and it has the potential for use in studies of body colour polymorphism, genomics and the biology of Tetrigoidea (Insecta: Orthoptera). However, limited biological information is available for this insect. Here, we conducted a de novo transcriptome study of adult and larval T. japonica to provide a better understanding of its gene expression and develop genomic resources for future work. We sequenced and explored the characteristics of the de novo transcriptome of T. japonica using the Illumina HiSeq 2000 platform. A total of 107 608 206 paired-end clean reads were assembled into 61 141 unigenes using the Trinity software; the mean unigene size was 771 bp, and the N50 length was 1238 bp. A total of 29 225 unigenes were functionally annotated to the NCBI nonredundant protein sequences (Nr), NCBI nonredundant nucleotide sequences (Nt), a manually annotated and reviewed protein sequence database (Swiss-Prot), Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of putative genes that are potentially involved in pigment pathways, juvenile hormone (JH) metabolism and signalling pathways were identified in the T. japonica transcriptome. Additionally, 165 769 and 156 796 putative single nucleotide polymorphisms occurred in the adult and larval transcriptomes, respectively, and a total of 3162 simple sequence repeats were detected in this assembly. These comprehensive transcriptomic data for T. japonica will provide a usable resource for gene predictions, signalling pathway investigations and molecular marker development for this species and other pygmy grasshoppers. © 2016 John Wiley & Sons Ltd.
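
    The abstract above reports both a mean unigene size and an N50 length. As a rough illustration of how the two statistics differ, the sketch below computes N50 under its usual definition (the length L such that sequences of length L or longer cover at least half of the assembled bases). The function name and the toy lengths are hypothetical and are not taken from the T. japonica assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths in bp, NOT the T. japonica unigenes.
toy_lengths = [2000, 1500, 1200, 800, 500, 300, 200]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
```

    Because N50 is weighted by sequence length, it is typically larger than the mean when an assembly contains many short sequences, which is consistent with the reported 1238 bp N50 against a 771 bp mean.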

  2. Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales

    OpenAIRE

    Makarova, Kira; Wolf, Yuri; Koonin, Eugene

    2015-01-01

    With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for...

  3. De novo mutations of GCK, HNF1A and HNF4A may be more frequent in MODY than previously assumed.

    Science.gov (United States)

    Stanik, Juraj; Dusatkova, Petra; Cinek, Ondrej; Valentinova, Lucia; Huckova, Miroslava; Skopkova, Martina; Dusatkova, Lenka; Stanikova, Daniela; Pura, Mikulas; Klimes, Iwar; Lebl, Jan; Gasperikova, Daniela; Pruhova, Stepanka

    2014-03-01

    MODY is mainly characterised by an early onset of diabetes and a positive family history of diabetes with an autosomal dominant mode of inheritance. However, de novo mutations have been reported anecdotally. The aim of this study was to systematically revisit a large collection of MODY patients to determine the minimum prevalence of de novo mutations in the most prevalent MODY genes (i.e. GCK, HNF1A, HNF4A). Analysis of 922 patients from two national MODY centres (Slovakia and the Czech Republic) identified 150 probands (16%) who came from pedigrees that did not fulfil the criterion of two generations with diabetes but did fulfil the remaining criteria. The GCK, HNF1A and HNF4A genes were analysed by direct sequencing. Mutations in GCK, HNF1A or HNF4A genes were detected in 58 of 150 individuals. Parents of 28 probands were unavailable for further analysis, and in 19 probands the mutation was inherited from an asymptomatic parent. In 11 probands the mutations arose de novo. In our cohort of MODY patients from two national centres the de novo mutations in GCK, HNF1A and HNF4A were present in 7.3% of the 150 families without a history of diabetes and 1.2% of all of the referrals for MODY testing. This is the largest collection of de novo MODY mutations to date, and our findings indicate a much higher frequency of de novo mutations than previously assumed. Therefore, genetic testing of MODY could be considered for carefully selected individuals without a family history of diabetes.
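
    The percentages in the abstract above can be checked directly from the counts it reports. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and every number is copied from the abstract.

```python
# Counts copied from the abstract; variable names are illustrative.
referrals = 922            # all patients referred for MODY testing
probands = 150             # probands lacking two generations with diabetes
with_mutation = 58         # mutation found in GCK, HNF1A or HNF4A
parents_unavailable = 28   # parents could not be tested
inherited = 19             # mutation inherited from an asymptomatic parent
de_novo = 11               # mutation confirmed as de novo

# The three outcomes partition the mutation-positive probands.
assert parents_unavailable + inherited + de_novo == with_mutation


def percent(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


share_probands = percent(probands, referrals)        # reported as 16%
de_novo_in_probands = percent(de_novo, probands)     # reported as 7.3%
de_novo_in_referrals = percent(de_novo, referrals)   # reported as 1.2%
```

    Since the parents of 28 mutation-positive probands were unavailable, some of those mutations may also have arisen de novo, which is why the abstract describes the figures as a minimum prevalence.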

  4. Finding Nemo's Genes: A chromosome-scale reference assembly of the genome of the orange clownfish Amphiprion percula

    KAUST Repository

    Lehmann, Robert; Lightfoot, Damien J; Schunter, Celia Marei; Michell, Craig T; Ohyanagi, Hajime; Mineta, Katsuhiko; Foret, Sylvain; Berumen, Michael L.; Miller, David J; Aranda, Manuel; Gojobori, Takashi; Munday, Philip L; Ravasi, Timothy

    2018-01-01

    The iconic orange clownfish, Amphiprion percula, is a model organism for studying the ecology and evolution of reef fishes, including patterns of population connectivity, sex change, social organization, habitat selection and adaptation to climate change. Notably, the orange clownfish is the only reef fish for which a complete larval dispersal kernel has been established and was the first fish species for which it was demonstrated that anti-predator responses of reef fishes could be impaired by ocean acidification. Despite its importance, molecular resources for this species remain scarce and until now it lacked a reference genome assembly. Here we present a de novo chromosome-scale assembly of the genome of the orange clownfish Amphiprion percula. We utilized single-molecule real-time sequencing technology from Pacific Biosciences to produce an initial polished assembly comprising 1,414 contigs, with a contig N50 length of 1.86 Mb. Using Hi-C based chromatin contact maps, 98% of the genome assembly was placed into 24 chromosomes, resulting in a final assembly of 908.8 Mb in length with contig and scaffold N50s of 3.12 and 38.4 Mb, respectively. This makes it one of the most contiguous and complete fish genome assemblies currently available. The genome was annotated with 26,597 protein-coding genes and contains 96% of the core set of conserved actinopterygian orthologs. The availability of this reference genome assembly as a community resource will further strengthen the role of the orange clownfish as a model species for research on the ecology and evolution of reef fishes.

  5. Finding Nemo's Genes: A chromosome-scale reference assembly of the genome of the orange clownfish Amphiprion percula

    KAUST Repository

    Lehmann, Robert

    2018-03-08

    The iconic orange clownfish, Amphiprion percula, is a model organism for studying the ecology and evolution of reef fishes, including patterns of population connectivity, sex change, social organization, habitat selection and adaptation to climate change. Notably, the orange clownfish is the only reef fish for which a complete larval dispersal kernel has been established and was the first fish species for which it was demonstrated that anti-predator responses of reef fishes could be impaired by ocean acidification. Despite its importance, molecular resources for this species remain scarce and until now it lacked a reference genome assembly. Here we present a de novo chromosome-scale assembly of the genome of the orange clownfish Amphiprion percula. We utilized single-molecule real-time sequencing technology from Pacific Biosciences to produce an initial polished assembly comprising 1,414 contigs, with a contig N50 length of 1.86 Mb. Using Hi-C based chromatin contact maps, 98% of the genome assembly was placed into 24 chromosomes, resulting in a final assembly of 908.8 Mb in length with contig and scaffold N50s of 3.12 and 38.4 Mb, respectively. This makes it one of the most contiguous and complete fish genome assemblies currently available. The genome was annotated with 26,597 protein-coding genes and contains 96% of the core set of conserved actinopterygian orthologs. The availability of this reference genome assembly as a community resource will further strengthen the role of the orange clownfish as a model species for research on the ecology and evolution of reef fishes.

  6. The red deer Cervus elaphus genome CerEla1.0: sequencing, annotating, genes, and chromosomes.

    Science.gov (United States)

    Bana, Nóra Á; Nyiri, Anna; Nagy, János; Frank, Krisztián; Nagy, Tibor; Stéger, Viktor; Schiller, Mátyás; Lakatos, Péter; Sugár, László; Horn, Péter; Barta, Endre; Orosz, László

    2018-01-02

    We present here the de novo genome assembly CerEla1.0 for the red deer, Cervus elaphus, an emblematic member of the natural megafauna of the Northern Hemisphere. Humans spread the species in the South. Today, the red deer is also a farm-bred animal and is becoming a model animal in biomedical and population studies. Stag DNA was sequenced at 74× coverage by Illumina technology. The ALLPATHS-LG assembly of the reads resulted in 34.7 × 10^3 scaffolds, 26.1 × 10^3 of which were utilized in CerEla1.0. The assembly spans 3.4 Gbp. For building the red deer pseudochromosomes, a pre-established genetic map was used for main anchor points. A nearly complete co-linearity was found between the map marker sequences of the deer genetic map and the order and orientation of the orthologous sequences in the syntenic bovine regions. Syntenies were also conserved at the in-scaffold level. The cM distances corresponded to 1.34 Mbp uniformly along the deer genome. Chromosomal rearrangements between deer and cattle were demonstrated. 2.8 × 10^6 SNPs, 365 × 10^3 indels and 19368 protein-coding genes were identified in CerEla1.0, along with positions for centromerons. CerEla1.0 demonstrates the utilization of dual references, i.e., when a target genome (here C. elaphus) already has a pre-established genetic map, and is combined with the well-established whole genome sequence of a closely related species (here Bos taurus). Genome-wide association studies (GWAS) for which CerEla1.0 (NCBI, MKHE00000000) could serve are discussed.

  7. A 380-kb Duplication in 7p22.3 Encompassing the LFNG Gene in a Boy with Asperger Syndrome

    NARCIS (Netherlands)

    Vulto-van Silfhout, A.T.; de Brouwer, A.F.; de Leeuw, N.; Obihara, C.C.; Brunner, H.G.; Vries, L.B.A. de

    2012-01-01

    De novo genomic aberrations are considered an important cause of autism spectrum disorders. We describe a de novo 380-kb gain in band p22.3 of chromosome 7 in a patient with Asperger syndrome. This duplicated region contains 9 genes including the LFNG gene that is an important regulator of NOTCH

  8. Developing a de novo targeted knock-in method based on in utero electroporation into the mammalian brain.

    Science.gov (United States)

    Tsunekawa, Yuji; Terhune, Raymond Kunikane; Fujita, Ikumi; Shitamukai, Atsunori; Suetsugu, Taeko; Matsuzaki, Fumio

    2016-09-01

    Genome-editing technology has revolutionized the field of biology. Here, we report a novel de novo gene-targeting method mediated by in utero electroporation into the developing mammalian brain. Electroporation of donor DNA with the CRISPR/Cas9 system vectors successfully leads to knock-in of the donor sequence, such as EGFP, to the target site via the homology-directed repair mechanism. We developed a targeting vector system optimized to prevent anomalous leaky expression of the donor gene from the plasmid, which otherwise often occurs depending on the donor sequence. The knock-in efficiency of the electroporated progenitors reached up to 40% in the early stage and 20% in the late stage of the developing mouse brain. Furthermore, we inserted different fluorescent markers into the target gene in each homologous chromosome, successfully distinguishing homozygous knock-in cells by color. We also applied this de novo gene targeting to the ferret model for the study of complex mammalian brains. Our results demonstrate that this technique is widely applicable for monitoring gene expression, visualizing protein localization, lineage analysis and gene knockout, all at the single-cell level, in developmental tissues. © 2016. Published by The Company of Biologists Ltd.

  9. Clinicopathologic factors associated with de novo metastatic breast cancer.

    Science.gov (United States)

    Shen, Tiansheng; Siegal, Gene P; Wei, Shi

    2016-12-01

    While breast cancers with distant metastasis at presentation (de novo metastasis) harbor significantly inferior clinical outcomes, there have been limited studies analyzing the clinicopathologic characteristics in this subset of patients. In this study, we analyzed 6126 breast cancers diagnosed between 1998 and 2013 to identify factors associated with de novo metastatic breast cancer. When compared to patients without metastasis at presentation, race, histologic grade, estrogen/progesterone receptor (ER/PR) and HER2 statuses were significantly associated with de novo metastasis in the entire cohort, whereas age, histologic grade, PR and HER2 status were the significant parameters in the subset of patients with locally advanced breast cancer (Stage IIB/III). The patients with de novo metastatic breast cancer had a significantly older mean age and a lower proportion of HER2-positive tumors when compared to those with metastatic recurrence. Further, the HER2-rich subtype demonstrated a drastically higher incidence of de novo metastasis when compared to the luminal and triple-negative breast cancers in the entire cohort [odds ratio (OR)=5.68 and 2.27, respectively] and in the patients with locally advanced disease (OR=4.02 and 2.12, respectively), whereas no significant difference was seen between de novo metastatic cancers and those with metastatic recurrence. Moreover, the luminal and HER2-rich subtypes showed bone-seeking (OR=1.92) and liver-homing (OR=2.99) characteristics, respectively, for the sites of de novo metastasis, while the latter was not observed in those with metastatic recurrence. Our data suggest that an algorithm incorporating clinicopathologic factors, especially histologic grade and receptor profile, remains of significant benefit during decision making in newly diagnosed breast cancer in the pursuit of precision medicine. Copyright © 2016 Elsevier GmbH. All rights reserved.

  10. NovoPen Echo® insulin delivery device

    Directory of Open Access Journals (Sweden)

    Hyllested-Winge J

    2016-01-01

    Full Text Available Jacob Hyllested-Winge,1 Thomas Sparre,2 Line Kynemund Pedersen2 1Novo Nordisk Pharma Ltd, Tokyo, Japan; 2Novo Nordisk A/S, Søborg, Denmark Abstract: The introduction of insulin pen devices has provided easier, well-tolerated, and more convenient treatment regimens for patients with diabetes mellitus. When compared with vial and syringe regimens, insulin pens offer a greater clinical efficacy, improved quality of life, and increased dosing accuracy, particularly at low doses. The portable and discreet nature of pen devices reduces the burden on the patient, facilitates adherence, and subsequently contributes to the improvement in glycemic control. NovoPen Echo® is one of the latest members of the NovoPen® family that has been specifically designed for the pediatric population and is the first to combine half-unit increment (=0.5 U of insulin) dosing with a simple memory function. The half-unit increment dosing amendments and accurate injection of 0.5 U of insulin are particularly beneficial for children (and insulin-sensitive adults/elders), who often require small insulin doses. The memory function can be used to record the time and amount of the last dose, reducing the fear of double dosing or missing a dose. The memory function also provides parents with extra confidence and security that their child is taking insulin at the correct doses and times. NovoPen Echo is a lightweight, durable insulin delivery pen; it is available in two different colors, which may help to distinguish between different types of insulin, providing more confidence for both users and caregivers. Studies have demonstrated a high level of patient satisfaction, with 80% of users preferring NovoPen Echo to other pediatric insulin pens. Keywords: NovoPen Echo®, memory function, half-unit increment dosing, adherence, children, adolescents

  11. TYK2 protein-coding variants protect against rheumatoid arthritis and autoimmunity, with no evidence of major pleiotropic effects on non-autoimmune complex traits

    NARCIS (Netherlands)

    Diogo, Dorothée; Bastarache, Lisa; Liao, Katherine P.; Graham, Robert R.; Fulton, Robert S.; Greenberg, Jeffrey D.; Eyre, Steve; Bowes, John; Cui, Jing; Lee, Annette; Pappas, Dimitrios A.; Kremer, Joel M.; Barton, Anne; Coenen, Marieke J. H.; Franke, Barbara; Kiemeney, Lambertus A.; Mariette, Xavier; Richard-Miceli, Corrine; Canhão, Helena; Fonseca, João E.; de Vries, Niek; Tak, Paul P.; Crusius, J. Bart A.; Nurmohamed, Michael T.; Kurreeman, Fina; Mikuls, Ted R.; Okada, Yukinori; Stahl, Eli A.; Larson, David E.; Deluca, Tracie L.; O'Laughlin, Michelle; Fronick, Catrina C.; Fulton, Lucinda L.; Kosoy, Roman; Ransom, Michael; Bhangale, Tushar R.; Ortmann, Ward; Cagan, Andrew; Gainer, Vivian; Karlson, Elizabeth W.; Kohane, Isaac; Murphy, Shawn N.; Martin, Javier; Zhernakova, Alexandra; Klareskog, Lars; Padyukov, Leonid; Worthington, Jane; Mardis, Elaine R.; Seldin, Michael F.; Gregersen, Peter K.; Behrens, Timothy; Raychaudhuri, Soumya; Denny, Joshua C.; Plenge, Robert M.

    2015-01-01

    Despite the success of genome-wide association studies (GWAS) in detecting a large number of loci for complex phenotypes such as rheumatoid arthritis (RA) susceptibility, the lack of information on the causal genes leaves important challenges to interpret GWAS results in the context of the disease

  12. De novo status epilepticus with isolated aphasia.

    Science.gov (United States)

    Flügel, Dominique; Kim, Olaf Chan-Hi; Felbecker, Ansgar; Tettenborn, Barbara

    2015-08-01

    Sudden onset of aphasia is usually due to stroke. Rapid diagnostic workup is necessary if reperfusion therapy is considered. Ictal aphasia is a rare condition but has to be excluded. Perfusion imaging may differentiate acute ischemia from other causes. In dubious cases, EEG is required but is time-consuming and laborious. We report a case where we considered de novo status epilepticus as a cause of aphasia without any lesion even at follow-up. A 62-year-old right-handed woman presented to the emergency department after nurses found her aphasic. She had undergone operative treatment of varicosis 3 days earlier. Apart from hypertension and obesity, no cardiovascular risk factors and no intake of medication other than paracetamol were reported. Neurological examination revealed global aphasia and right pronation in the upper extremity position test. Computed tomography with angiography and perfusion showed no abnormalities. Electroencephalogram performed after the CT scan showed left-sided slowing with high-voltage rhythmic 2/s delta waves but no clear ictal pattern. Intravenous lorazepam did improve EEG slightly, while aphasia did not change. Lumbar puncture was performed which likely excluded encephalitis. Magnetic resonance imaging showed cortical pathological diffusion imaging (restriction) and cortical hyperperfusion in the left parietal region. Intravenous anticonvulsant therapy under continuous EEG resolved neurological symptoms. The patient was kept on anticonvulsant therapy. Magnetic resonance imaging after 6 months showed no abnormalities along with no clinical abnormalities. Magnetic resonance imaging findings were only subtle, and EEG was without clear ictal pattern, so the diagnosis of aphasic status remains with some uncertainty. However, status epilepticus can mimic stroke symptoms and has to be considered in patients with aphasia even when no previous stroke or structural lesions are detectable and EEG shows no epileptic discharges. 
Epileptic origin is

  13. Biophysical characterization of a de novo elastin

    Science.gov (United States)

    Greenland, Kelly Nicole

    Natural human elastin is found in tissue such as the lungs, arteries, and skin. This protein is formed at birth with no mechanism present to repair or supplement the initial quantity formed. As a result, the functionality and durability of elastin's elasticity are critically important. To date, the mechanics of this ability to stretch and recoil are not fully understood. This study utilizes de novo protein design to create a small library of simplistic versions of elastin-like proteins, demonstrate that the elastin-like proteins maintain elastin's functionality, and inquire into its structure using solution nuclear magnetic resonance (NMR). Elastin is formed from cross-linked tropoelastin. Therefore, the first generation of designed proteins consisted of one protein that utilized the homology of interspecies tropoelastin by using three common domains, two hydrophobic and one cross-linking. Basic modifications were made to open the hydrophobic region and also to make the protein easier to purify and characterize. The designed protein maintained its functionality, self-aggregating as the temperature increased. Uniquely, the protein remained self-aggregated as the temperature returned below the critical transition temperature. Self-aggregation was additionally induced by increasing salt concentrations and by modifying the pH. The protein appeared to have little secondary structure when studied with solution NMR. These results fueled a second generation of designed elastin-like proteins. This generation contained variations designed to study the cross-linking domain, one specific hydrophobic domain, and the effect of the length of the elastin-like protein. The cross-linking domain in one variation has been significantly modified while the flanking hydrophobic domains have remained unchanged. The characterization of this protein will answer questions regarding the specificity of the homologous nature of the cross-linking domain of tropoelastin across species. A second

  14. Molecular phylogenetics of the family Cyprinidae (Actinopterygii: Cypriniformes) as evidenced by sequence variation in the first intron of S7 ribosomal protein-coding gene: further evidence from a nuclear gene of the systematic chaos in the family.

    Science.gov (United States)

    He, Shunping; Mayden, Richard L; Wang, Xuzheng; Wang, Wei; Tang, Kevin L; Chen, Wei-Jen; Chen, Yiyu

    2008-03-01

    The family Cyprinidae is the largest freshwater fish group in the world, including over 200 genera and 2100 species. The phylogenetic relationships of major clades within this family are poorly understood, largely because of the overwhelming diversity of the group; however, several investigators have advanced different hypotheses of relationships that pre- and post-date the use of shared-derived characters as advocated through phylogenetic systematics. As expected, most previous investigations used morphological characters. Recently, mitochondrial DNA (mtDNA) sequences and combined morphological and mtDNA investigations have been used to explore and advance our understanding of species relationships and test monophyletic groupings. Limitations of these studies include limited taxon sampling and a strict reliance upon maternally inherited mtDNA variation. The present study is the first endeavor to recover the phylogenetic relationships of the 12 previously recognized monophyletic subfamilies within the Cyprinidae using newly sequenced nuclear DNA (nDNA) for over 50 species representing members of the different previously hypothesized subfamily and family groupings within the Cyprinidae and from other cypriniform families as outgroup taxa. Hypothesized phylogenetic relationships are constructed using maximum parsimony and Bayesian analyses of 1042 sites, of which 971 sites were variable and 790 were phylogenetically informative. Using other appropriate cypriniform taxa of the families Catostomidae (Myxocyprinus asiaticus), Gyrinocheilidae (Gyrinocheilus aymonieri), and Balitoridae (Nemacheilus sp. and Beaufortia kweichowensis) as outgroups, the Cyprinidae is resolved as a monophyletic group. Within the family the genera Raiamas, Barilius, Danio, and Rasbora, representing many of the tropical cyprinids, represent basal members of the family. All other species can be classified into variably supported and resolved monophyletic lineages, depending upon analysis, that are consistent with or correspond to Barbini and Leuciscini. The Barbini includes taxa traditionally aligned with the subfamily Cyprininae sensu previous morphological revisionary studies by Howes (Barbinae, Labeoninae, Cyprininae and Schizothoracinae). The Leuciscini includes six other subfamilies that are mainly divided into three separate lineages. The relationships among genera and subfamilies are discussed as well as the possible origins of major lineages.

  15. A Tissue-Mapped Axolotl De Novo Transcriptome Enables Identification of Limb Regeneration Factors

    Directory of Open Access Journals (Sweden)

    Donald M. Bryant

    2017-01-01

    Full Text Available Mammals have extremely limited regenerative capabilities; however, axolotls are profoundly regenerative and can replace entire limbs. The mechanisms underlying limb regeneration remain poorly understood, partly because the enormous and incompletely sequenced genomes of axolotls have hindered the study of genes facilitating regeneration. We assembled and annotated a de novo transcriptome using RNA-sequencing profiles for a broad spectrum of tissues that is estimated to have near-complete sequence information for 88% of axolotl genes. We devised expression analyses that identified the axolotl orthologs of cirbp and kazald1 as highly expressed and enriched in blastemas. Using morpholino anti-sense oligonucleotides, we find evidence that cirbp plays a cytoprotective role during limb regeneration whereas manipulation of kazald1 expression disrupts regeneration. Our transcriptome and annotation resources greatly complement previous transcriptomic studies and will be a valuable resource for future research in regenerative biology.

  16. Chromosome aberrations and oncogene alterations in atomic bomb related leukemias - different mechanisms from de novo leukemias

    International Nuclear Information System (INIS)

    Tanaka, K.; Tanaka, H.; Kamada, N.

    2003-01-01

    It is well known that leukemia occurred more frequently among atomic bomb survivors. Among 132 atomic bomb-related (AB-related) leukemia patients seen during 1978-1999, 33 acute myeloid leukemia (AML)/myelodysplastic syndrome (MDS) patients had exposure doses of more than 1 Gy (DS86). Chromosome aberrations of the 33 patients were compared with those from 588 de novo AML/MDS patients who had been born before August 1945 as control. No FAB M3 patient was observed in the exposed group. Most AB-related AML was preceded by a long MDS stage. Twenty-seven of the 33 patients showed complex types of chromosome aberrations with more than three chromosomes involving chromosomes 5, 7 and 11. The number of chromosome abnormalities per cell was 3.78 in AB-related leukemia versus 0.92 in de novo leukemia. Only one of the 33 patients had a normal karyotype, compared with 44.1% of de novo leukemia patients. Translocations of chromosome 11 at 11q13 to 11q23 and deletion/loss of chromosome 20 were frequently observed in AB-related leukemia. No leukemia-type specific translocations such as t(8;21), t(15;17) and 11q23 were found in the 33 AB-related leukemia patients. Furthermore, molecular analyses using FISH and PCR-SSCP revealed the presence of a breakpoint located outside the MLL gene in the patients with translocations at 11q22-23, and DNA base derangements of the RUNT domain of the AML1 (CBF β 2) gene in AML/MDS patients without t(8;21) and with a high dose of exposure. These results suggest that AB-related leukemia derives from an exposed pluripotent hematopoietic stem cell which has been preserved for a long time in the bone marrow, expressing high genetic instability such as microsatellite instability. On the other hand, de novo leukemia develops from a committed hematopoietic stem cell and shows simple and leukemia-type specific chromosome aberrations. These findings are important for understanding mechanisms for radiation-induced leukemia.

  17. Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.

    Directory of Open Access Journals (Sweden)

    Juan Ning

    Full Text Available BACKGROUND: Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. RESULTS: Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved with growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. CONCLUSION: Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.

  18. Monozygotic twins with a de novo 0.32 Mb 16q24.3 deletion, including TUBB3 presenting with developmental delay and mild facial dysmorphism but without overt brain malformation

    DEFF Research Database (Denmark)

    Grønborg, Sabine; Kjaergaard, Susanne; Hove, Hanne

    2015-01-01

    been associated with missense mutations in this group of genes. Here, we report two patients, monozygotic twins, carrying a de novo 0.32 Mb deletion of chromosome 16q24.3 including the TUBB3 gene. The patients presented with global developmental delay, mild facial dysmorphism, secondary microcephaly...

  19. LATERAL GENE TRANSFER AND THE HISTORY OF BACTERIAL GENOMES

    Energy Technology Data Exchange (ETDEWEB)

    Howard Ochman

    2006-02-22

    The aims of this research were to elucidate the role and extent of lateral transfer in the differentiation of bacterial strains and species, and to assess the impact of gene transfer on the evolution of bacterial genomes. The ultimate goal of the project is to examine the dynamics of a core set of protein-coding genes (i.e., those that are distributed universally among Bacteria) by developing conserved primers that would allow their amplification and sequencing in any bacterial taxon. In addition, we adopted a bioinformatic approach to elucidate the extent of lateral gene transfer in sequenced genomes.

  20. Purine biosynthesis de novo by lymphocytes in gout

    International Nuclear Information System (INIS)

    Kamoun, P.; Chanard, J.; Brami, M.; Funck-Brentano, J.L.

    1978-01-01

    A method of measurement in vitro of purine biosynthesis de novo in human circulating blood lymphocytes is proposed. The rate of early reactions of purine biosynthesis de novo was determined by the incorporation of [14C]formate into N-formyl glycinamide ribonucleotide when the subsequent reactions of the metabolic pathway were completely inhibited by the antibiotic azaserine. Synthesis of 14C-labelled N-formyl glycinamide ribonucleotide by lymphocytes was measured in healthy control subjects and patients with primary gout or hyperuricaemia secondary to renal failure, with or without allopurinol therapy. The average synthesis was higher in gouty patients without therapy than in control subjects, but the values overlapped the normal range. In secondary hyperuricaemia the synthesis was at the same value as in control subjects. These results are in agreement with the inconstant acceleration of purine biosynthesis de novo in gouty patients as seen by others with measurement of [14C]glycine incorporation into urinary uric acid. (author)

  1. Functional characterization of a rice de novo DNA methyltransferase, OsDRM2, expressed in Escherichia coli and yeast

    Energy Technology Data Exchange (ETDEWEB)

    Pang, Jinsong, E-mail: pangjs542@nenu.edu.cn [Key Laboratory of Molecular Epigenetics of the Ministry of Education, Northeast Normal University, Changchun, Jilin 130024 (China); Dong, Mingyue; Li, Ning; Zhao, Yanli [Key Laboratory of Molecular Epigenetics of the Ministry of Education, Northeast Normal University, Changchun, Jilin 130024 (China); Liu, Bao, E-mail: baoliu@nenu.edu.cn [Key Laboratory of Molecular Epigenetics of the Ministry of Education, Northeast Normal University, Changchun, Jilin 130024 (China)

    2013-03-01

    Highlights: ► A rice de novo DNA methyltransferase OsDRM2 was cloned. ► In vitro methylation activity of OsDRM2 was characterized with Escherichia coli. ► Assays of OsDRM2 in vivo methylation were done with Saccharomyces cerevisiae. ► OsDRM2 methylation activity is not preferential to any type of cytosine context. ► The activity of OsDRM2 is independent of RdDM pathway. - Abstract: DNA methylation of cytosine nucleotides is an important epigenetic modification that occurs in most eukaryotic organisms and is established and maintained by various DNA methyltransferases together with their co-factors. There are two major categories of DNA methyltransferases: de novo and maintenance. Here, we report the isolation and functional characterization of a de novo methyltransferase, named OsDRM2, from rice (Oryza sativa L.). The full-length coding region of OsDRM2 was cloned and transformed into Escherichia coli and Saccharomyces cerevisiae. Both of these organisms expressed the OsDRM2 protein, which exhibited stochastic de novo methylation activity in vitro at CG, CHG, and CHH di- and tri-nucleotide patterns. Two lines of evidence demonstrated the de novo activity of OsDRM2: (1) a 5′-CCGG-3′ containing DNA fragment that had been pre-treated with OsDRM2 protein expressed in E. coli was protected from digestion by the CG-methylation-sensitive isoschizomer HpaII; (2) methylation-sensitive amplified polymorphism (MSAP) analysis of S. cerevisiae genomic DNA from transformants that had been introduced with OsDRM2 revealed CG and CHG methylation levels of 3.92–9.12%, and 2.88–6.93%, respectively, whereas the mock control S. cerevisiae DNA did not exhibit cytosine methylation. These results were further supported by bisulfite sequencing of the 18S rRNA and EAF5 genes of the transformed S. cerevisiae, which exhibited different DNA methylation patterns, which were observed in the genomic DNA. Our findings establish that OsDRM2 is an active de novo DNA

  2. Functional characterization of a rice de novo DNA methyltransferase, OsDRM2, expressed in Escherichia coli and yeast

    International Nuclear Information System (INIS)

    Pang, Jinsong; Dong, Mingyue; Li, Ning; Zhao, Yanli; Liu, Bao

    2013-01-01

    Highlights: ► A rice de novo DNA methyltransferase OsDRM2 was cloned. ► In vitro methylation activity of OsDRM2 was characterized with Escherichia coli. ► Assays of OsDRM2 in vivo methylation were done with Saccharomyces cerevisiae. ► OsDRM2 methylation activity is not preferential to any type of cytosine context. ► The activity of OsDRM2 is independent of RdDM pathway. - Abstract: DNA methylation of cytosine nucleotides is an important epigenetic modification that occurs in most eukaryotic organisms and is established and maintained by various DNA methyltransferases together with their co-factors. There are two major categories of DNA methyltransferases: de novo and maintenance. Here, we report the isolation and functional characterization of a de novo methyltransferase, named OsDRM2, from rice (Oryza sativa L.). The full-length coding region of OsDRM2 was cloned and transformed into Escherichia coli and Saccharomyces cerevisiae. Both of these organisms expressed the OsDRM2 protein, which exhibited stochastic de novo methylation activity in vitro at CG, CHG, and CHH di- and tri-nucleotide patterns. Two lines of evidence demonstrated the de novo activity of OsDRM2: (1) a 5′-CCGG-3′ containing DNA fragment that had been pre-treated with OsDRM2 protein expressed in E. coli was protected from digestion by the CG-methylation-sensitive isoschizomer HpaII; (2) methylation-sensitive amplified polymorphism (MSAP) analysis of S. cerevisiae genomic DNA from transformants that had been introduced with OsDRM2 revealed CG and CHG methylation levels of 3.92–9.12%, and 2.88–6.93%, respectively, whereas the mock control S. cerevisiae DNA did not exhibit cytosine methylation. These results were further supported by bisulfite sequencing of the 18S rRNA and EAF5 genes of the transformed S. cerevisiae, which exhibited different DNA methylation patterns, which were observed in the genomic DNA. Our findings establish that OsDRM2 is an active de novo DNA

  3. New progress in snake mitochondrial gene rearrangement.

    Science.gov (United States)

    Chen, Nian; Zhao, Shujin

    2009-08-01

    To further understand the evolution of snake mitochondrial genomes, the complete mitochondrial DNA (mtDNA) sequences were determined for representative species from two snake families: the Many-banded krait, the Banded krait, the Chinese cobra, the King cobra, the Hundred-pace viper, the Short-tailed mamushi, and the Chain viper. Thirteen protein-coding genes, 22-23 tRNA genes, 2 rRNA genes, and 2 control regions were identified in these mtDNAs. Duplication of the control region and translocation of the tRNAPro gene were two notable features of the snake mtDNAs. These results from the gene rearrangement comparisons confirm the correctness of traditional classification schemes and validate the utility of comparing complete mtDNA sequences for snake phylogeny reconstruction.

  4. Airline Maintenance Manpower Optimization from the De Novo Perspective

    Science.gov (United States)

    Liou, James J. H.; Tzeng, Gwo-Hshiung

    Human resource management (HRM) is an important issue for today’s competitive airline marketing. In this paper, we discuss a multi-objective model designed from the De Novo perspective to help airlines optimize their maintenance manpower portfolio. The effectiveness of the model and solution algorithm is demonstrated in an empirical study of the optimization of the human resources needed for airline line maintenance. Both De Novo and traditional multiple objective programming (MOP) methods are analyzed. A comparison of the results with those of traditional MOP indicates that the proposed model and solution algorithm does provide better performance and an improved human resource portfolio.

  5. Persistent hyperthyroidism and de novo Graves' ophthalmopathy after total thyroidectomy.

    Science.gov (United States)

    Tay, Wei Lin; Loh, Wann Jia; Lee, Lianne Ai Ling; Chng, Chiaw Ling

    2017-01-01

    We report a patient with Graves' disease who remained persistently hyperthyroid after a total thyroidectomy and also developed de novo Graves' ophthalmopathy 5 months after surgery. She was subsequently found to have a mature cystic teratoma containing struma ovarii after undergoing a total hysterectomy and salpingo-oophorectomy for an incidental ovarian lesion. It is important to investigate for other causes of primary hyperthyroidism when thyrotoxicosis persists after total thyroidectomy. TSH receptor antibody may persist after total thyroidectomy and may potentially contribute to the development of de novo Graves' ophthalmopathy.

  6. Demanda dos principais metais e novos materiais : analise de tendencias

    OpenAIRE

    Wilson Trigueiro de Sousa

    1990-01-01

    Abstract: This work analyzes some trends in the area of new materials in an attempt to better understand the repercussions of current technological innovations for the mineral sector. First, the main studies on the changes that occurred around 1972/74 in the behavior of demand for the most important metals are reviewed. Among the possible causes is technical progress, which made possible the emergence of new materials and the improvement of others in us...

  7. De Novo Assembly and Characterization of Sophora japonica Transcriptome Using RNA-seq

    Directory of Open Access Journals (Sweden)

    Liucun Zhu

    2014-01-01

    Full Text Available Sophora japonica Linn (Chinese Scholar Tree) is a shrub species belonging to the subfamily Faboideae of the pea family Fabaceae. In this study, RNA sequencing of S. japonica transcriptome was performed to produce large expression datasets for functional genomic analysis. Approximately 86.1 million high-quality clean reads were generated and assembled de novo into 143010 unique transcripts and 57614 unigenes. The average length of unigenes was 901 bps with an N50 of 545 bps. Four public databases, including the NCBI nonredundant protein (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Cluster of Orthologous Groups (COG), were used to annotate unigenes through NCBI BLAST procedure. A total of 27541 of 57614 unigenes (47.8%) were annotated for gene descriptions, conserved protein domains, or gene ontology. Moreover, an interaction network of unigenes in S. japonica was predicted based on known protein-protein interactions of putative orthologs of well-studied plant genomes. The transcriptome data of S. japonica reported here represents the first genome-scale investigation of gene expressions in Faboideae plants. We expect that our study will provide a useful resource for further studies on gene expression, genomics, functional genomics, and protein-protein interaction in S. japonica.
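
    The assembly statistics quoted in this abstract (N50 and the annotated fraction) can be illustrated with a minimal sketch. The function names and the toy length list below are assumptions made for illustration only; they are not taken from the study, whose actual unigene lengths are not given in the record. N50 is computed here in the usual way, as the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly length.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted longest first; N50 is the length of the
    sequence at which the cumulative sum first reaches half of the
    total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def annotated_fraction(annotated, total):
    """Fraction of sequences that received at least one annotation."""
    return annotated / total


# Hypothetical toy lengths (not data from the study).
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))  # 1500

# Counts quoted in the abstract: 27541 of 57614 unigenes annotated.
print(round(100 * annotated_fraction(27541, 57614), 1))  # 47.8
```

    The second call only re-derives the 47.8% figure from the two counts stated in the abstract.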

  8. De novo transcriptome sequencing and assembly from apomictic and sexual Eragrostis curvula genotypes.

    Directory of Open Access Journals (Sweden)

    Ingrid Garbus

    Full Text Available A long-standing goal in plant breeding has been the ability to confer apomixis to agriculturally relevant species, which would require a deeper comprehension of the molecular basis of apomictic regulatory mechanisms. Eragrostis curvula (Schrad.) Nees is a perennial grass that includes both sexual and apomictic cytotypes. The availability of a reference transcriptome for this species would constitute a very important tool toward the identification of genes controlling key steps of the apomictic pathway. Here, we used Roche/454 sequencing technologies to generate reads from inflorescences of E. curvula apomictic and sexual genotypes that were de novo assembled into a reference transcriptome. Nearly 90% of the 49568 assembled isotigs showed sequence similarity to sequences deposited in the public databases. A gene ontology analysis categorized 27448 isotigs into at least one of the three main GO categories. We identified 11475 SSRs, and several of them were assayed in E. curvula germplasm using SSR-based primers, providing a valuable set of molecular markers that could allow direct allele selection. The differential contribution to each library of the spliced forms of several transcripts revealed the existence of several isotigs produced via alternative splicing of single genes. The reference transcriptome presented and validated in this work will be useful for the identification of a wide range of gene(s) related to agronomic traits of E. curvula, including those controlling key steps of the apomictic pathway in this species, allowing the extrapolation of the findings to other plant species.

  9. Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene

    Directory of Open Access Journals (Sweden)

    Shomron Noam

    2007-11-01

    Full Text Available Abstract Background: Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results: Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion: The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single copy genes, with duplicate genes free from such constraints.

  10. Isolation and characterization of an atypical LEA protein coding cDNA and its promoter from drought-tolerant plant Prosopis juliflora.

    Science.gov (United States)

    George, Suja; Usha, B; Parida, Ajay

    2009-05-01

    Plant growth and productivity are adversely affected by various abiotic and biotic stress factors. Despite the wealth of information on abiotic stress and stress tolerance in plants, many aspects still remain unclear. Prosopis juliflora is a hardy plant reported to be tolerant to drought, salinity, extremes of soil pH, and heavy metal stress. In this paper, we report the isolation and characterization of the complementary DNA clone for an atypical late embryogenesis abundant (LEA) protein (Pj LEA3) and its putative promoter sequence from P. juliflora. Unlike typical LEA proteins, rich in glycine, Pj LEA3 has alanine as the most abundant amino acid followed by serine and shows an average negative hydropathy. Pj LEA3 is significantly different from other LEA proteins in the NCBI database and shows high similarity to indole-3 acetic-acid-induced protein ARG2 from Vigna radiata. Northern analysis for Pj LEA3 in P. juliflora leaves under 90 mM H2O2 stress revealed up-regulation of transcript at 24 and 48 h. A 1.5-kb fragment upstream of the 5' UTR of this gene (putative promoter) was isolated and analyzed in silico. The possible reasons for changes in gene expression during stress in relation to the host plant's stress tolerance mechanisms are discussed.

  11. Analysis of t(9;17)(q33.2;q25.3) chromosomal breakpoint regions and genetic association reveals novel candidate genes for bipolar disorder

    DEFF Research Database (Denmark)

    Rajkumar, Anto P; Christensen, Jane H; Mattheisen, Manuel

    2015-01-01

    ,856) data. Genetic associations between these disorders and single nucleotide polymorphisms within these breakpoint regions were analysed by BioQ, FORGE, and RegulomeDB programmes. RESULTS: Four protein-coding genes [coding for (endonuclease V (ENDOV), neuronal pentraxin I (NPTX1), ring finger protein 213...

  12. De novo biosynthesis of vanillin in fission yeast (Schizosaccharomyces pombe) and baker's yeast (Saccharomyces cerevisiae).

    Science.gov (United States)

    Hansen, Esben H; Møller, Birger Lindberg; Kock, Gertrud R; Bünner, Camilla M; Kristensen, Charlotte; Jensen, Ole R; Okkels, Finn T; Olsen, Carl E; Motawia, Mohammed S; Hansen, Jørgen

    2009-05-01

    Vanillin is one of the world's most important flavor compounds, with a global market of 180 million dollars. Natural vanillin is derived from the cured seed pods of the vanilla orchid (Vanilla planifolia), but most of the world's vanillin is synthesized from petrochemicals or wood pulp lignins. We have established a true de novo biosynthetic pathway for vanillin production from glucose in Schizosaccharomyces pombe, also known as fission yeast or African beer yeast, as well as in baker's yeast, Saccharomyces cerevisiae. Productivities were 65 and 45 mg/liter, after introduction of three and four heterologous genes, respectively. The engineered pathways involve incorporation of 3-dehydroshikimate dehydratase from the dung mold Podospora pauciseta, an aromatic carboxylic acid reductase (ACAR) from a bacterium of the Nocardia genus, and an O-methyltransferase from Homo sapiens. In S. cerevisiae, the ACAR enzyme required activation by phosphopantetheinylation, and this was achieved by coexpression of a Corynebacterium glutamicum phosphopantetheinyl transferase. Prevention of reduction of vanillin to vanillyl alcohol was achieved by knockout of the host alcohol dehydrogenase ADH6. In S. pombe, the biosynthesis was further improved by introduction of an Arabidopsis thaliana family 1 UDP-glycosyltransferase, converting vanillin into vanillin beta-D-glucoside, which is not toxic to the yeast cells and thus may be accumulated in larger amounts. These de novo pathways represent the first examples of one-cell microbial generation of these valuable compounds from glucose. S. pombe yeast has not previously been metabolically engineered to produce any valuable, industrially scalable, white biotech commodity.

  13. De novo mutations in ATP1A3 cause alternating hemiplegia of childhood

    Science.gov (United States)

    Heinzen, Erin L.; Swoboda, Kathryn J.; Hitomi, Yuki; Gurrieri, Fiorella; Nicole, Sophie; de Vries, Boukje; Tiziano, F. Danilo; Fontaine, Bertrand; Walley, Nicole M.; Heavin, Sinéad; Panagiotakaki, Eleni; Fiori, Stefania; Abiusi, Emanuela; Di Pietro, Lorena; Sweney, Matthew T.; Newcomb, Tara M.; Viollet, Louis; Huff, Chad; Jorde, Lynn B.; Reyna, Sandra P.; Murphy, Kelley J.; Shianna, Kevin V.; Gumbs, Curtis E.; Little, Latasha; Silver, Kenneth; Ptác̆ek, Louis J.; Haan, Joost; Ferrari, Michel D.; Bye, Ann M.; Herkes, Geoffrey K.; Whitelaw, Charlotte M.; Webb, David; Lynch, Bryan J.; Uldall, Peter; King, Mary D.; Scheffer, Ingrid E.; Neri, Giovanni; Arzimanoglou, Alexis; van den Maagdenberg, Arn M.J.M.; Sisodiya, Sanjay M.; Mikati, Mohamad A.; Goldstein, David B.; Nicole, Sophie; Gurrieri, Fiorella; Neri, Giovanni; de Vries, Boukje; Koelewijn, Stephany; Kamphorst, Jessica; Geilenkirchen, Marije; Pelzer, Nadine; Laan, Laura; Haan, Joost; Ferrari, Michel; van den Maagdenberg, Arn; Zucca, Claudio; Bassi, Maria Teresa; Franchini, Filippo; Vavassori, Rosaria; Giannotta, Melania; Gobbi, Giuseppe; Granata, Tiziana; Nardocci, Nardo; De Grandis, Elisa; Veneselli, Edvige; Stagnaro, Michela; Gurrieri, Fiorella; Neri, Giovanni; Vigevano, Federico; Panagiotakaki, Eleni; Oechsler, Claudia; Arzimanoglou, Alexis; Nicole, Sophie; Giannotta, Melania; Gobbi, Giuseppe; Ninan, Miriam; Neville, Brian; Ebinger, Friedrich; Fons, Carmen; Campistol, Jaume; Kemlink, David; Nevsimalova, Sona; Laan, Laura; Peeters-Scholte, Cacha; van den Maagdenberg, Arn; Casaer, Paul; Casari, Giorgio; Sange, Guenter; Spiel, Georg; Boneschi, Filippo Martinelli; Zucca, Claudio; Bassi, Maria Teresa; Schyns, Tsveta; Crawley, Francis; Poncelin, Dominique; Vavassori, Rosaria

    2012-01-01

    Alternating hemiplegia of childhood (AHC) is a rare, severe neurodevelopmental syndrome characterized by recurrent hemiplegic episodes and distinct neurologic manifestations. AHC is usually a sporadic disorder with unknown etiology. Using exome sequencing of seven patients with AHC, and their unaffected parents, we identified de novo nonsynonymous mutations in ATP1A3 in all seven AHC patients. Subsequent sequence analysis of ATP1A3 in 98 additional patients revealed that 78% of AHC cases have a likely causal ATP1A3 mutation, including one inherited mutation in a familial case of AHC. Remarkably, six ATP1A3 mutations explain the majority of patients, including one observed in 36 patients. Unlike ATP1A3 mutations that cause rapid-onset-dystonia-parkinsonism, AHC-causing mutations revealed consistent reductions in ATPase activity without effects on protein expression. This work identifies de novo ATP1A3 mutations as the primary cause of AHC, and offers insight into disease pathophysiology by expanding the spectrum of phenotypes associated with mutations in this gene. PMID:22842232
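
    The trio design described in this abstract (affected child plus unaffected parents) rests on a simple filter: a candidate de novo variant is one the child carries and neither parent carries. The sketch below illustrates only that logic. The function name, the dictionary layout, and the variant coordinates are hypothetical and are not drawn from the study; treating a variant that is missing from a parent's calls as "not carried" is a simplifying assumption, since a real analysis would also need to confirm adequate sequencing coverage in both parents.

```python
def candidate_de_novo(child, mother, father):
    """Return variants the child carries that neither parent carries.

    Each argument maps a variant key (chrom, pos, ref, alt) to the
    number of alternate alleles called in that individual. A variant
    missing from a parent's dict is treated as zero alternate alleles,
    which is a simplifying assumption of this sketch.
    """
    return sorted(
        variant
        for variant, alt_count in child.items()
        if alt_count > 0
        and mother.get(variant, 0) == 0
        and father.get(variant, 0) == 0
    )


# Hypothetical toy genotype calls (coordinates are made up).
child = {
    ("chrA", 100, "G", "A"): 1,  # present only in the child
    ("chrA", 250, "C", "T"): 1,  # inherited from the mother
    ("chrB", 75, "T", "C"): 2,   # inherited from both parents
}
mother = {("chrA", 250, "C", "T"): 1, ("chrB", 75, "T", "C"): 1}
father = {("chrB", 75, "T", "C"): 1}

print(candidate_de_novo(child, mother, father))
# [('chrA', 100, 'G', 'A')]
```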

  14. A de novo nonsense PDGFB mutation causing idiopathic basal ganglia calcification with laryngeal dystonia.

    Science.gov (United States)

    Nicolas, Gaël; Jacquin, Agnès; Thauvin-Robinet, Christel; Rovelet-Lecrux, Anne; Rouaud, Olivier; Pottier, Cyril; Aubriot-Lorton, Marie-Hélène; Rousseau, Stéphane; Wallon, David; Duvillard, Christian; Béjot, Yannick; Frébourg, Thierry; Giroud, Maurice; Campion, Dominique; Hannequin, Didier

    2014-10-01

    Idiopathic basal ganglia calcification (IBGC) is characterized by brain calcification and a wide variety of neurologic and psychiatric symptoms. In families with autosomal dominant inheritance, three causative genes have been identified: SLC20A2, PDGFRB, and, very recently, PDGFB. Whereas in clinical practice sporadic presentation of IBGC is frequent, well-documented reports of true sporadic occurrence are rare. We report the case of a 20-year-old woman who presented laryngeal dystonia revealing IBGC. Her healthy parents' CT scans were both normal. We identified in the proband a new nonsense mutation in exon 4 of PDGFB, c.439C>T (p.Gln147*), which was absent from the parents' DNA. This mutation may result in a loss-of-function of PDGF-B, which has been shown to cause IBGC in humans and to disrupt the blood-brain barrier in mice, resulting in brain calcification. The c.439C>T mutation is located between two previously reported nonsense mutations, c.433C>T (p.Gln145*) and c.445C>T (p.Arg149*), on a region that could be a hot spot for de novo mutations. We present the first full demonstration of the de novo occurrence of an IBGC-causative mutation in a sporadic case.
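
    The relationship between the coding-DNA positions and the protein positions quoted in this abstract can be checked with a short worked example. The helper names below are illustrative assumptions. The reference codons are not given in the record, so the sketch relies only on the standard genetic code: glutamine is encoded by CAA or CAG, and a C>T change at the first position of either codon yields a stop codon (TAA or TAG); for arginine, CGA is the only codon that a first-position C>T change turns into a stop (TGA), so CGA is assumed here for codon 149.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_number(cds_position):
    """1-based codon index for a 1-based coding-DNA position."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position):
    """1-based position (1, 2 or 3) of the base within its codon."""
    return (cds_position - 1) % 3 + 1


def apply_substitution(codon, cds_position, alt):
    """Return the codon after substituting `alt` at the given position."""
    index = position_in_codon(cds_position) - 1
    return codon[:index] + alt + codon[index + 1:]


# Positions quoted in the abstract map to the quoted codon numbers.
for pos in (433, 439, 445):
    print(pos, codon_number(pos), position_in_codon(pos))
# 433 145 1
# 439 147 1
# 445 149 1

# Both glutamine codons become stop codons after C>T at position 1.
print(apply_substitution("CAG", 439, "T"))  # TAG
print(apply_substitution("CAA", 439, "T"))  # TAA

# Assumed arginine codon CGA becomes the stop codon TGA.
print(apply_substitution("CGA", 445, "T"))  # TGA
```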

  15. Visualizing the origins of selfish de novo mutations in individual seminiferous tubules of human testes.

    Science.gov (United States)

    Maher, Geoffrey J; McGowan, Simon J; Giannoulatou, Eleni; Verrill, Clare; Goriely, Anne; Wilkie, Andrew O M

    2016-03-01

    De novo point mutations arise predominantly in the male germline and increase in frequency with age, but it has not previously been possible to locate specific, identifiable mutations directly within the seminiferous tubules of human testes. Using microdissection of tubules exhibiting altered expression of the spermatogonial markers MAGEA4, FGFR3, and phospho-AKT, whole genome amplification, and DNA sequencing, we establish an in situ strategy for discovery and analysis of pathogenic de novo mutations. In 14 testes from men aged 39-90 y, we identified 11 distinct gain-of-function mutations in five genes (fibroblast growth factor receptors FGFR2 and FGFR3, tyrosine phosphatase PTPN11, and RAS oncogene homologs HRAS and KRAS) from 16 of 22 tubules analyzed; all mutations have known associations with severe diseases, ranging from congenital or perinatal lethal disorders to somatically acquired cancers. These results support proposed selfish selection of spermatogonial mutations affecting growth factor receptor-RAS signaling, highlight its prevalence in older men, and enable direct visualization of the microscopic anatomy of elongated mutant clones.

  16. Damaging de novo mutations diminish motor skills in children on the autism spectrum.

    Science.gov (United States)

    Buja, Andreas; Volfovsky, Natalia; Krieger, Abba M; Lord, Catherine; Lash, Alex E; Wigler, Michael; Iossifov, Ivan

    2018-02-20

    In individuals with autism spectrum disorder (ASD), de novo mutations have previously been shown to be significantly correlated with lower IQ but not with the core characteristics of ASD: deficits in social communication and interaction and restricted interests and repetitive patterns of behavior. We extend these findings by demonstrating in the Simons Simplex Collection that damaging de novo mutations in ASD individuals are also significantly and convincingly correlated with measures of impaired motor skills. This correlation is not explained by a correlation between IQ and motor skills. We find that IQ and motor skills are distinctly associated with damaging mutations and, in particular, that motor skills are a more sensitive indicator of mutational severity than is IQ, as judged by mutational type and target gene. We use this finding to propose a combined classification of phenotypic severity: mild (little impairment of either), moderate (impairment mainly to motor skills), and severe (impairment of both IQ and motor skills). Copyright © 2018 the Author(s). Published by PNAS.

  17. De novo mutations in HCN1 cause early infantile epileptic encephalopathy.

    Science.gov (United States)

    Nava, Caroline; Dalle, Carine; Rastetter, Agnès; Striano, Pasquale; de Kovel, Carolien G F; Nabbout, Rima; Cancès, Claude; Ville, Dorothée; Brilstra, Eva H; Gobbi, Giuseppe; Raffo, Emmanuel; Bouteiller, Delphine; Marie, Yannick; Trouillard, Oriane; Robbiano, Angela; Keren, Boris; Agher, Dahbia; Roze, Emmanuel; Lesage, Suzanne; Nicolas, Aude; Brice, Alexis; Baulac, Michel; Vogt, Cornelia; El Hajj, Nady; Schneider, Eberhard; Suls, Arvid; Weckhuysen, Sarah; Gormley, Padhraig; Lehesjoki, Anna-Elina; De Jonghe, Peter; Helbig, Ingo; Baulac, Stéphanie; Zara, Federico; Koeleman, Bobby P C; Haaf, Thomas; LeGuern, Eric; Depienne, Christel

    2014-06-01

    Hyperpolarization-activated, cyclic nucleotide-gated (HCN) channels contribute to cationic Ih current in neurons and regulate the excitability of neuronal networks. Studies in rat models have shown that the Hcn1 gene has a key role in epilepsy, but clinical evidence implicating HCN1 mutations in human epilepsy is lacking. We carried out exome sequencing for parent-offspring trios with fever-sensitive, intractable epileptic encephalopathy, leading to the discovery of two de novo missense HCN1 mutations. Screening of follow-up cohorts comprising 157 cases in total identified 4 additional amino acid substitutions. Patch-clamp recordings of Ih currents in cells expressing wild-type or mutant human HCN1 channels showed that the mutations had striking but divergent effects on homomeric channels. Individuals with mutations had clinical features resembling those of Dravet syndrome with progression toward atypical absences, intellectual disability and autistic traits. These findings provide clear evidence that de novo HCN1 point mutations cause a recognizable early-onset epileptic encephalopathy in humans.

  18. Evaluation of the impact of RNA preservation methods of spiders for de novo transcriptome assembly.

    Science.gov (United States)

    Kono, Nobuaki; Nakamura, Hiroyuki; Ito, Yusuke; Tomita, Masaru; Arakawa, Kazuharu

    2016-05-01

    With advances in high-throughput sequencing technologies, de novo transcriptome sequencing and assembly has become a cost-effective method to obtain comprehensive genetic information of a species of interest, especially in nonmodel species with large genomes such as spiders. However, high-quality RNA is essential for successful sequencing, and sample preservation conditions require careful consideration for the effective storage of field-collected samples. To this end, we report a streamlined feasibility study of various storage conditions and their effects on de novo transcriptome assembly results. The storage parameters considered include temperatures ranging from room temperature to -80°C; preservatives, including ethanol, RNAlater, TRIzol and RNAlater-ICE; and sample submersion states. As a result, intact RNA was extracted and assembly was successful when samples were preserved at low temperatures regardless of the type of preservative used. The assemblies as well as the gene expression profiles were shown to be robust to RNA degradation, when 30 million 150-bp paired-end reads are obtained. The parameters for sample storage, RNA extraction, library preparation, sequencing and in silico assembly considered in this work provide a guideline for the study of field-collected samples of spiders. © 2015 John Wiley & Sons Ltd.

  19. De Novo Nodal Diffuse Large B-Cell Lymphoma: Identification of Biologic Prognostic Factors

    International Nuclear Information System (INIS)

    Abd El-Hameed, A.

    2005-01-01

    Diffuse large B-cell lymphoma (DLBCL) represents the most frequent type of non-Hodgkin lymphoma (NHL). Although combination chemotherapy has improved the outcome, long-term cure is now possible for approximately 50% of all patients, making the search for parameters identifying patients at high risk particularly needed. The presence of bcl-2 gene rearrangement in de novo DLBCL suggests a possible follicle center cell origin and perhaps a distinct clinical behavior. This study investigated the frequency and prognostic significance of t(14;18) translocation and bcl-2 protein overexpression in a cohort of patients with de novo nodal DLBCL who were uniformly evaluated and treated. Material and Methods: A total of 40 patients with de novo nodal DLBCL treated at National Cancer Institute (NCI), Cairo University were investigated. Formalin-fixed, paraffin-embedded sections were analyzed for: 1) bcl-2 gene rearrangement including major break point region (mbr) and minor cluster region (mcr) by polymerase chain reaction (PCR), and 2) bcl-2 protein expression by immunohistochemistry using Dako 124 clone. Results were correlated with the clinical features and subsequent clinical course. Bcl-2 gene rearrangement was detected in 8 cases (20%): 2 cases at mbr and 6 cases at mcr. Bcl-2 protein (>10%) was expressed in 24 cases (60%), irrespective of the presence of t(14;18) translocation. The t(14;18) and bcl-2 protein overexpression were more frequently associated with failure to achieve a complete response to therapy (p=0.008 and 0.04, respectively). DLBCL patients with t(14;18) and bcl-2 protein expression had a significantly reduced 5-year disease free survival (p=0.04 and 0.01, respectively). The t(14;18) translocation and bcl-2 protein expression define a group of DLBCL patients with a poor prognosis, and could be used to tailor treatment and to identify candidates for therapeutic approaches. Geographic differences in t(14;18) may be related to the

  20. Delimiting Coalescence Genes (C-Genes) in Phylogenomic Data Sets.

    Science.gov (United States)

    Springer, Mark S; Gatesy, John

    2018-02-26

    Summary coalescence methods have emerged as a popular alternative for inferring species trees with large genomic datasets, because these methods explicitly account for incomplete lineage sorting. However, statistical consistency of summary coalescence methods is not guaranteed unless several model assumptions are true, including the critical assumption that recombination occurs freely among but not within coalescence genes (c-genes), which are the fundamental units of analysis for these methods. Each c-gene has a single branching history, and large sets of these independent gene histories should be the input for genome-scale coalescence estimates of phylogeny. By contrast, numerous studies have reported the results of coalescence analyses in which complete protein-coding sequences are treated as c-genes even though exons for these loci can span more than a megabase of DNA. Empirical estimates of recombination breakpoints suggest that c-genes may be much shorter, especially when large clades with many species are the focus of analysis. Although this idea has been challenged recently in the literature, the inverse relationship between c-gene size and increased taxon sampling in a dataset (the 'recombination ratchet') is a fundamental property of c-genes. For taxonomic groups characterized by genes with long intron sequences, complete protein-coding sequences are likely not valid c-genes and are inappropriate units of analysis for summary coalescence methods unless they occur in recombination deserts that are devoid of incomplete lineage sorting (ILS). Finally, it has been argued that coalescence methods are robust when the no-recombination within loci assumption is violated, but recombination must matter at some scale because ILS, a by-product of recombination, is the raison d'etre for coalescence methods. That is, extensive recombination is required to yield the large number of independently segregating c-genes used to infer a species tree. If coalescent methods are powerful

  1. Model-Based GUI Testing Using Uppaal at Novo Nordisk

    DEFF Research Database (Denmark)

    H. Hjort, Ulrik; Rasmussen, Jacob Illum; Larsen, Kim Guldstrand

    2009-01-01

    This paper details a collaboration between Aalborg University and Novo Nordisk in developing an automatic model-based test generation tool for system testing of the graphical user interface of a medical device on an embedded platform. The tool takes as input a UML Statemachine model and generates...

  2. Engineering and introduction of de novo disulphide bridges in ...

    Indian Academy of Sciences (India)

    The engineering of de novo disulphide bridges has been explored as a means to increase the thermal stability of enzymes in the rational method of protein engineering. In this study, Disulphide by Design software, homology modelling and molecular dynamics simulations were used to select appropriate amino acid pairs for ...

  3. De novo synthesis of milk triglycerides in humans

    Science.gov (United States)

    Mammary gland (MG) de novo lipogenesis contributes significantly to milk fat in animals but little is known in humans. Objective: To test the hypothesis that the incorporation of 13C carbons from [U-13C]glucose into fatty acids (FA) and glycerol in triglycerides (TG) will be greater: 1) in milk tha...

  4. Response monitoring in de novo patients with Parkinson's disease.

    Directory of Open Access Journals (Sweden)

    Rita Willemssen

    Full Text Available BACKGROUND: Parkinson's disease (PD) is accompanied by dysfunctions in a variety of cognitive processes. One of these is error processing, which depends upon phasic decreases of medial prefrontal dopaminergic activity. Until now, there is no study evaluating these processes in newly diagnosed, untreated patients with PD ("de novo PD"). METHODOLOGY/PRINCIPAL FINDINGS: Here we report large changes in performance monitoring processes using event-related potentials (ERPs) in de novo PD patients. The results suggest that increases in medial frontal dopaminergic activity after an error (Ne) are decreased, relative to age-matched controls. In contrast, neurophysiological processes reflecting general motor response monitoring (Nc) are enhanced in de novo patients. CONCLUSIONS/SIGNIFICANCE: It may be hypothesized that the Nc increase comes at the cost of dopaminergic activity after an error; on a functional level, errors may not always be detected and correct responses may sometimes be misinterpreted as errors. This pattern differs from studies examining patients with a longer history of PD and may reflect compensatory processes, frequently occurring in pre-manifest stages of PD. From a clinical point of view, the clearly attenuated Ne in the de novo PD patients may prove a useful additional tool for the early diagnosis of basal ganglia dysfunction in PD.

  5. Towards accurate de novo assembly for genomes with repeats

    NARCIS (Netherlands)

    Bucur, Doina

    2017-01-01

    De novo genome assemblers designed for short k-mer length or using short raw reads are unlikely to recover complex features of the underlying genome, such as repeats hundreds of bases long. We implement a stochastic machine-learning method which obtains accurate assemblies with repeats and

  6. On the performance of de novo pathway enrichment

    DEFF Research Database (Denmark)

    Batra, Richa; Alcaraz, Nicolas; Gitzhofer, Kevin

    2017-01-01

    De novo pathway enrichment is a powerful approach to discover previously uncharacterized molecular mechanisms in addition to already known pathways. To achieve this, condition-specific functional modules are extracted from large interaction networks. Here, we give an overview of the state...

  7. Illumina-based de novo transcriptome sequencing and analysis

    Indian Academy of Sciences (India)

    In the present study, we used Illumina HiSeq technology to perform de novo assembly of heart and musk gland transcriptomes from the Chinese forest musk deer. A total of 239,383 transcripts and 176,450 unigenes were obtained, of which 37,329 unigenes were matched to known sequences in the NCBI nonredundant ...

  8. De novo structural modeling and computational sequence analysis ...

    African Journals Online (AJOL)

    Different bioinformatics tools and machine learning techniques were used for protein structural classification. De novo protein modeling was performed by using I-TASSER server. The final model obtained was assessed by PROCHECK and DFIRE2, which confirmed that the final model is reliable. Until complete biochemical ...

  9. Direct Visualization of De novo Lipogenesis in Single Living Cells

    Science.gov (United States)

    Li, Junjie; Cheng, Ji-Xin

    2014-10-01

    Increased de novo lipogenesis is being increasingly recognized as a hallmark of cancer. Despite recent advances in fluorescence microscopy, autoradiography and mass spectrometry, direct observation of de novo lipogenesis in living systems remains challenging. Here, by coupling stimulated Raman scattering (SRS) microscopy with isotope-labeled glucose, we were able to trace the dynamic metabolism of glucose in single living cells with high spatial-temporal resolution. As the first direct visualization, we observed that glucose was largely utilized for lipid synthesis in pancreatic cancer cells, which occurs at a much lower rate in immortalized normal pancreatic epithelial cells. By inhibition of glycolysis and fatty acid synthase (FAS), the key enzyme for fatty acid synthesis, we confirmed the deuterium-labeled lipids in cancer cells were from de novo lipid synthesis. Interestingly, we also found that prostate cancer cells exhibit a relatively lower level of de novo lipogenesis, but higher fatty acid uptake compared to pancreatic cancer cells. Together, our results demonstrate a valuable tool to study dynamic lipid metabolism in cancer and other disorders.

  10. Particulated articular cartilage: CAIS and DeNovo NT.

    Science.gov (United States)

    Farr, Jack; Cole, Brian J; Sherman, Seth; Karas, Vasili

    2012-03-01

    Cartilage Autograft Implantation System (CAIS; DePuy/Mitek, Raynham, MA) and DeNovo Natural Tissue (NT; ISTO, St. Louis, MO) are novel treatment options for focal articular cartilage defects in the knee. These methods involve the implantation of particulated articular cartilage from either autograft or juvenile allograft donor, respectively. In the laboratory and in animal models, both CAIS and DeNovo NT have demonstrated the ability of the transplanted cartilage cells to "escape" from the extracellular matrix, migrate, multiply, and form a new hyaline-like cartilage tissue matrix that integrates with the surrounding host tissue. In clinical practice, the technique for both CAIS and DeNovo NT is straightforward, requiring only a single surgery to effect cartilage repair. Clinical experience is limited, with short-term studies demonstrating both procedures to be safe, feasible, and effective, with improvements in subjective patient scores, and with magnetic resonance imaging evidence of good defect fill. While these treatment options appear promising, prospective randomized controlled studies are necessary to refine the indications and contraindications for both CAIS and DeNovo NT.

  11. A novel TBP-TAF complex on RNA polymerase II-transcribed snRNA genes.

    Science.gov (United States)

    Zaborowska, Justyna; Taylor, Alice; Roeder, Robert G; Murphy, Shona

    2012-01-01

    Initiation of transcription of most human genes transcribed by RNA polymerase II (RNAP II) requires the formation of a preinitiation complex comprising TFIIA, B, D, E, F, H and RNAP II. The general transcription factor TFIID is composed of the TATA-binding protein and up to 13 TBP-associated factors. During transcription of snRNA genes, RNAP II does not appear to make the transition to long-range productive elongation, as happens during transcription of protein-coding genes. In addition, recognition of the snRNA gene-type specific 3' box RNA processing element requires initiation from an snRNA gene promoter. These characteristics may, at least in part, be driven by factors recruited to the promoter. For example, differences in the complement of TAFs might result in differential recruitment of elongation and RNA processing factors. As precedent, it already has been shown that the promoters of some protein-coding genes do not recruit all the TAFs found in TFIID. Although TAF5 has been shown to be associated with RNAP II-transcribed snRNA genes, the full complement of TAFs associated with these genes has remained unclear. Here we show, using a ChIP and siRNA-mediated approach, that the TBP/TAF complex on snRNA genes differs from that found on protein-coding genes. Interestingly, the largest TAF, TAF1, and the core TAFs, TAF10 and TAF4, are not detected on snRNA genes. We propose that this snRNA gene-specific TAF subset plays a key role in gene type-specific control of expression.

  12. De novo assembly and comparative transcriptome analysis of the foot from Chinese green mussel (Perna viridis) in response to cadmium stimulation.

    Directory of Open Access Journals (Sweden)

    Xinhui Zhang

    Full Text Available The Chinese green mussel, Perna viridis, is a marine bivalve with important economic value as well as a biomonitoring role for aquatic pollution. Byssus, secreted by the foot gland, has been shown to bind heavy metals effectively. In this study, using RNA sequencing technology, we performed comparative transcriptomic analysis on mussel feet with or without induction by cadmium (Cd). Our current work aims to provide insights into the molecular mechanisms of byssus binding to heavy metal ions. The transcriptome sequencing generated a total of 26.13 Gb of raw data. After a careful assembly of clean data, we obtained a primary set of 105,127 unigenes, of which 32,268 unigenes were annotated. Based on the expression profiles, we identified 9,048 differentially expressed genes (DEGs) between Cd treatment (50 or 100 μg/L at 48 h) and the control, suggesting an extensive transcriptome response of the mussels during the Cd stimulation. Moreover, we observed that the expression levels of 54 byssus protein-coding genes increased significantly after the 48-h Cd stimulation. In addition, 16 critical byssus protein-coding genes were picked for profiling by quantitative real-time PCR (qRT-PCR). Finally, we reached a primary conclusion that a high content of tyrosine (Tyr), cysteine (Cys), histidine (His) residues or the special motif plays an important role in the accumulation of heavy metals in byssus. We also proposed an interesting model for the confirmed byssal Cd accumulation, in which biosynthesis of byssus proteins may simultaneously play critical roles since their transcription levels were significantly elevated.

  13. Genome-wide patterns and properties of de novo mutations in humans

    NARCIS (Netherlands)

    Francioli, Laurent C.; Polak, Paz P.; Koren, Amnon; Menelaou, Androniki; Chun, Sung; Renkens, Ivo; van Duijn, Cornelia M.; Swertz, Morris; Wijmenga, Cisca; van Ommen, Gertjan; Slagboom, P. Eline; Boomsma, Dorret I.; Ye, Kai; Guryev, Victor; Arndt, Peter F.; Kloosterman, Wigard P.; de Bakker, Paul I. W.; Sunyaev, Shamil R.

    Mutations create variation in the population, fuel evolution and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect. Here we analyze 11,020 de novo mutations from the whole genomes of 250 families. We show that de novo mutations in the offspring

  14. Genome-wide patterns and properties of de novo mutations in humans

    NARCIS (Netherlands)

    Francioli, L.C.; Polak, P.P.; Koren, A.; Menelaou, A.; Chun, S.; Renkens, I.; van Duijn, C.M.; Swertz, M.A.; Wijmenga, C.; van Ommen, G.J.; Slagboom, P.E.; Boomsma, D.I.; Ye, K.; Guryev, V.; Arndt, P.F.; Kloosterman, W.P.; Bakker, P.I.W.; Sunyaev, S.R.; Dijk, F.; Neerincx, P.B.T.; Pulit, S.L.; Deelen, P.; Elbers, C.C.; Palamara, P.F.; Pe'er, I.; Abdellaoui, A.; van Oven, M.; Vermaat, M.; Li, M.; Laros, J.F.J.; Stoneking, M.; de Knijff, P.; Kayser, M.; Veldink, J.H.; Van den Berg, L.H.; Byelas, H.; den Dunnen, J.T.; Dijkstra, M.; Amin, N.; van der Velde, K.J.; Hottenga, J.J.; van Setten, J.; van Leeuwen, E.M.; Kanterakis, A.; Kattenberg, V.M.; Karssen, L.C.; van Schaik, B.D.C.; Bot, J.; Nijman, I.J.; van Enckevort, D.; Mei, H.; Koval, V.; Estrada, K.; Medina-Gomez, C.; Lameijer, E.W.; Moed, M.H.; Hehir-Kwa, J.Y.; Handsaker, R.E.; McCarroll, S.A.; Vuzman, D.; Sohail, M.; Hormozdiari, F.; Marschall, T.; Schönhuth, A.; Beekman, M.; de Craen, A.J.; Suchiman, H.E.D.; Hofman, A.; Oostra, B.; Isaacs, A.; Rivadeneira, F.; Uitterlinden, A.G.; Willemsen, G.; Platteel, M.; Pitts, S.J.; Potluri, S.; Sundar, P.; Cox, D.R.; Li, Q.; Li, Y.; Du, Y.; Chen, R.; Cao, H.; Li, N.; Cao, S.; Wang, J.; Bovenberg, J.A.; Brandsma, M.

    2015-01-01

    Mutations create variation in the population, fuel evolution and cause genetic diseases. Current knowledge about de novo mutations is incomplete and mostly indirect. Here we analyze 11,020 de novo mutations from the whole genomes of 250 families. We show that de novo mutations in the offspring of

  15. The complete mitochondrial genome of Setaria digitata (Nematoda: Filarioidea): Mitochondrial gene content, arrangement and composition compared with other nematodes.

    Science.gov (United States)

    Yatawara, Lalani; Wickramasinghe, Susiji; Rajapakse, R P V J; Agatsuma, Takeshi

    2010-09-01

    In the present study, we determined the complete mitochondrial (mt) genome sequence (13,839 bp) of the parasitic nematode Setaria digitata and compared its structure and organization with those of Onchocerca volvulus, Dirofilaria immitis and Brugia malayi. The mt genome of S. digitata is slightly larger than the mt genomes of other filarial nematodes. The S. digitata mt genome contains 36 genes (12 protein-coding genes, 22 transfer RNAs and 2 ribosomal RNAs) that are typically found in metazoans. This genome has a high A+T content (75.1%) and a low G+C content (24.9%). The mt gene order for S. digitata is the same as those for O. volvulus, D. immitis and B. malayi but it is distinctly different from other nematodes compared. The start codons inferred in the mt genome of S. digitata are TTT, ATT, TTG, ATG, GTT and ATA. Interestingly, the initiation codon TTT is unique to the S. digitata mt genome and four protein-coding genes use this codon as a translation initiation codon. Five protein-coding genes use TAG as a stop codon whereas three genes use TAA and four genes use T as a termination codon. Out of 64 possible codons, only 57 are used for mitochondrial protein-coding genes of S. digitata. T-rich codons such as TTT (18.9%), GTT (7.9%), TTG (7.8%), TAT (7%), ATT (5.7%), TCT (4.8%) and TTA (4.1%) are used more frequently. This pattern of codon usage reflects the strong bias for T in the mt genome of S. digitata. In conclusion, the present investigation provides new molecular data for future studies of the comparative mitochondrial genomics and systematics of parasitic nematodes of socio-economic importance. © 2010 Elsevier B.V. All rights reserved.
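
    The codon-usage percentages and A+T content quoted in this record can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function names and the toy sequence are hypothetical and are not taken from the study, whose actual tooling is not described here.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Return the percentage usage of each codon in an in-frame sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 100.0 * n / total for codon, n in counts.items()}


def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a non-empty sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# Hypothetical toy sequence built from T-rich codons (illustration only).
toy_cds = "".join(["TTT", "GTT", "TTG", "TAT", "ATT", "TCT", "TTA", "TTT", "TAG"])

if __name__ == "__main__":
    for codon, pct in sorted(codon_usage(toy_cds).items(), key=lambda kv: -kv[1]):
        print(f"{codon}: {pct:.1f}%")
    print(f"A+T content: {at_content(toy_cds):.1f}%")
```

    Applied to the concatenated protein-coding genes of a real mt genome, the same counting would show how many of the 64 possible codons are actually used.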

  16. A de novo frameshift in HNRNPK causing a Kabuki-like syndrome with nodular heterotopia.

    Science.gov (United States)

    Lange, L; Pagnamenta, A T; Lise, S; Clasper, S; Stewart, H; Akha, E S; Quaghebeur, G; Knight, S J L; Keays, D A; Taylor, J C; Kini, U

    2016-09-01

    Kabuki syndrome is a heterogeneous condition characterized by distinctive facial features, intellectual disability, growth retardation, skeletal abnormalities and a range of organ malformations. Although at least two major causative genes have been identified, these do not explain all cases. Here we describe a patient with a complex Kabuki-like syndrome that included nodular heterotopia, in whom testing for several single-gene disorders had proved negative. Exome sequencing uncovered a de novo c.931_932insTT variant in HNRNPK (heterogeneous nuclear ribonucleoprotein K). Although this variant was identified in March 2012, its clinical relevance could only be confirmed following the August 2015 publication of two cases with HNRNPK mutations and an overlapping phenotype that included intellectual disability, distinctive facial dysmorphism and skeletal/connective tissue abnormalities. Whilst we had attempted (unsuccessfully) to identify additional cases through existing collaborators, the two published cases were 'matched' using GeneMatcher, a web-based tool for connecting researchers and clinicians working on identical genes. Our report therefore exemplifies the importance of such online tools in clinical genetics research and the benefits of periodically reviewing cases with variants of unproven significance. Our study also suggests that loss of function variants in HNRNPK should be considered as a molecular basis for patients with Kabuki-like syndrome. © 2016 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  17. Distinguishing between Selective Sweeps from Standing Variation and from a De Novo Mutation

    Science.gov (United States)

    Peter, Benjamin M.; Huerta-Sanchez, Emilia; Nielsen, Rasmus

    2012-01-01

    An outstanding question in human genetics has been the degree to which adaptation occurs from standing genetic variation or from de novo mutations. Here, we combine several common statistics used to detect selection in an Approximate Bayesian Computation (ABC) framework, with the goal of discriminating between models of selection and providing estimates of the age of selected alleles and the selection coefficients acting on them. We use simulations to assess the power and accuracy of our method and apply it to seven of the strongest sweeps currently known in humans. We identify two genes, ASPM and PSCA, that are most likely affected by selection on standing variation; and we find three genes, ADH1B, LCT, and EDAR, in which the adaptive alleles seem to have swept from a new mutation. We also confirm evidence of selection for one further gene, TRPV6. In one gene, G6PD, neither neutral models nor models of selective sweeps fit the data, presumably because this locus has been subject to balancing selection. PMID:23071458
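
    The general idea of ABC model choice mentioned in this record can be sketched with a rejection sampler. This is a toy illustration under stated assumptions, not the authors' method: the two simulators, the single summary statistic, the Gaussian means and the tolerance are all hypothetical placeholders.

```python
import random


def simulate(model: str, rng: random.Random) -> float:
    """Hypothetical simulator: one summary statistic drawn per model."""
    mean = 0.0 if model == "standing" else 2.0   # assumed toy values
    return rng.gauss(mean, 1.0)


def abc_model_choice(observed: float, n_sims: int = 20000,
                     tolerance: float = 0.1, seed: int = 0) -> dict[str, float]:
    """Approximate posterior model probabilities by rejection ABC."""
    rng = random.Random(seed)
    accepted = {"standing": 0, "de_novo": 0}
    for _ in range(n_sims):
        model = rng.choice(["standing", "de_novo"])   # uniform prior on models
        # Keep the draw only if the simulated statistic is close to the data.
        if abs(simulate(model, rng) - observed) <= tolerance:
            accepted[model] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase tolerance or n_sims")
    return {model: count / total for model, count in accepted.items()}


if __name__ == "__main__":
    print(abc_model_choice(observed=2.0))
```

    The share of accepted simulations per model approximates its posterior probability; a real analysis would use population-genetic simulations and several summary statistics rather than one Gaussian draw.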

  18. Distinguishing between selective sweeps from standing variation and from a de novo mutation.

    Directory of Open Access Journals (Sweden)

    Benjamin M Peter

    Full Text Available An outstanding question in human genetics has been the degree to which adaptation occurs from standing genetic variation or from de novo mutations. Here, we combine several common statistics used to detect selection in an Approximate Bayesian Computation (ABC) framework, with the goal of discriminating between models of selection and providing estimates of the age of selected alleles and the selection coefficients acting on them. We use simulations to assess the power and accuracy of our method and apply it to seven of the strongest sweeps currently known in humans. We identify two genes, ASPM and PSCA, that are most likely affected by selection on standing variation; and we find three genes, ADH1B, LCT, and EDAR, in which the adaptive alleles seem to have swept from a new mutation. We also confirm evidence of selection for one further gene, TRPV6. In one gene, G6PD, neither neutral models nor models of selective sweeps fit the data, presumably because this locus has been subject to balancing selection.

  19. gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

    Science.gov (United States)

    Nakagawa, So; Takahashi, Mahoko Ueda

    2016-01-01

    In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.

  20. A nutrient-driven tRNA modification alters translational fidelity and genome-wide protein coding across an animal genus.

    Science.gov (United States)

    Zaborske, John M; DuMont, Vanessa L Bauer; Wallace, Edward W J; Pan, Tao; Aquadro, Charles F; Drummond, D Allan

    2014-12-01

    Natural selection favors efficient expression of encoded proteins, but the causes, mechanisms, and fitness consequences of evolved coding changes remain an area of aggressive inquiry. We report a large-scale reversal in the relative translational accuracy of codons across 12 fly species in the Drosophila/Sophophora genus. Because the reversal involves pairs of codons that are read by the same genomically encoded tRNAs, we hypothesize, and show by direct measurement, that a tRNA anticodon modification from guanosine to queuosine has coevolved with these genomic changes. Queuosine modification is present in most organisms but its function remains unclear. Modification levels vary across developmental stages in D. melanogaster, and, consistent with a causal effect, genes maximally expressed at each stage display selection for codons that are most accurate given stage-specific queuosine modification levels. In a kinetic model, the known increased affinity of queuosine-modified tRNA for ribosomes increases the accuracy of cognate codons while reducing the accuracy of near-cognate codons. Levels of queuosine modification in D. melanogaster reflect bioavailability of the precursor queuine, which eukaryotes scavenge from the tRNAs of bacteria and absorb in the gut. These results reveal a strikingly direct mechanism by which recoding of entire genomes results from changes in utilization of a nutrient.

  1. A nutrient-driven tRNA modification alters translational fidelity and genome-wide protein coding across an animal genus.

    Directory of Open Access Journals (Sweden)

    John M Zaborske

    2014-12-01

    Full Text Available Natural selection favors efficient expression of encoded proteins, but the causes, mechanisms, and fitness consequences of evolved coding changes remain an area of aggressive inquiry. We report a large-scale reversal in the relative translational accuracy of codons across 12 fly species in the Drosophila/Sophophora genus. Because the reversal involves pairs of codons that are read by the same genomically encoded tRNAs, we hypothesize, and show by direct measurement, that a tRNA anticodon modification from guanosine to queuosine has coevolved with these genomic changes. Queuosine modification is present in most organisms but its function remains unclear. Modification levels vary across developmental stages in D. melanogaster, and, consistent with a causal effect, genes maximally expressed at each stage display selection for codons that are most accurate given stage-specific queuosine modification levels. In a kinetic model, the known increased affinity of queuosine-modified tRNA for ribosomes increases the accuracy of cognate codons while reducing the accuracy of near-cognate codons. Levels of queuosine modification in D. melanogaster reflect bioavailability of the precursor queuine, which eukaryotes scavenge from the tRNAs of bacteria and absorb in the gut. These results reveal a strikingly direct mechanism by which recoding of entire genomes results from changes in utilization of a nutrient.

  2. A de novo Mutation in KMT2A (MLL) in monozygotic twins with Wiedemann-Steiner syndrome.

    Science.gov (United States)

    Dunkerton, Sophie; Field, Matthew; Cho, Vicki; Bertram, Edward; Whittle, Belinda; Groves, Alexandra; Goel, Himanshu

    2015-09-01

    Growth deficiency, psychomotor delay, and facial dysmorphism were originally described in a male patient in 1989 by Wiedemann et al. and later in 2000 by Steiner et al. Wiedemann-Steiner syndrome (WSS) has since been described only a few times in the literature, with the phenotypic spectrum both expanding and becoming more delineated with each patient reported. We report on the clinical and molecular features of monozygotic twins with a de novo mutation in KMT2A. Single nucleotide polymorphism (SNP) microarray was done on both twins and whole-exome sequencing was done using both parents and one of the affected twins. SNP microarray confirmed that they were monozygotic twins. A de novo heterozygous variant (p.Arg1083*) in the KMT2A gene was identified through whole-exome sequencing, confirming the diagnosis of WSS. In this study, we have identified a de novo mutation in KMT2A associated with psychomotor developmental delay, facial dysmorphism, short stature, hypertrichosis cubiti, and small kidneys. This finding in monozygotic twins gives specificity to the WSS. The description of more cases of WSS is needed for further delineation of this condition. Small kidneys with normal function have not been described in this condition in the medical literature before. © 2015 Wiley Periodicals, Inc.

  3. A Swedish family with de novo alpha-synuclein A53T mutation: evidence for early cortical dysfunction

    DEFF Research Database (Denmark)

    Puschmann, Andreas; Ross, Owen A; Vilariño-Güell, Carles

    2009-01-01

    A de novo alpha-synuclein A53T (p.Ala53Thr; c.209G > A) mutation has been identified in a Swedish family with autosomal dominant Parkinson's disease (PD). Two affected individuals had early-onset (before 31 and 40 years), severe levodopa-responsive PD with prominent dysphasia, dysarthria, and cog... ...) and the Greek-American Family H kindreds. One unaffected family member carried the mutation haplotype without the c.209A mutation, strongly suggesting its de novo occurrence within this family. Furthermore, a novel mutation c.488G > A (p.Arg163His; R163H) in the presenilin-2 (PSEN2) gene was detected...

  4. The sequence and de novo assembly of the giant panda genome

    Science.gov (United States)

    Li, Ruiqiang; Fan, Wei; Tian, Geng; Zhu, Hongmei; He, Lin; Cai, Jing; Huang, Quanfei; Cai, Qingle; Li, Bo; Bai, Yinqi; Zhang, Zhihe; Zhang, Yaping; Wang, Wen; Li, Jun; Wei, Fuwen; Li, Heng; Jian, Min; Li, Jianwen; Zhang, Zhaolei; Nielsen, Rasmus; Li, Dawei; Gu, Wanjun; Yang, Zhentao; Xuan, Zhaoling; Ryder, Oliver A.; Leung, Frederick Chi-Ching; Zhou, Yan; Cao, Jianjun; Sun, Xiao; Fu, Yonggui; Fang, Xiaodong; Guo, Xiaosen; Wang, Bo; Hou, Rong; Shen, Fujun; Mu, Bo; Ni, Peixiang; Lin, Runmao; Qian, Wubin; Wang, Guodong; Yu, Chang; Nie, Wenhui; Wang, Jinhuan; Wu, Zhigang; Liang, Huiqing; Min, Jiumeng; Wu, Qi; Cheng, Shifeng; Ruan, Jue; Wang, Mingwei; Shi, Zhongbin; Wen, Ming; Liu, Binghang; Ren, Xiaoli; Zheng, Huisong; Dong, Dong; Cook, Kathleen; Shan, Gao; Zhang, Hao; Kosiol, Carolin; Xie, Xueying; Lu, Zuhong; Zheng, Hancheng; Li, Yingrui; Steiner, Cynthia C.; Lam, Tommy Tsan-Yuk; Lin, Siyuan; Zhang, Qinghui; Li, Guoqing; Tian, Jing; Gong, Timing; Liu, Hongde; Zhang, Dejin; Fang, Lin; Ye, Chen; Zhang, Juanbin; Hu, Wenbo; Xu, Anlong; Ren, Yuanyuan; Zhang, Guojie; Bruford, Michael W.; Li, Qibin; Ma, Lijia; Guo, Yiran; An, Na; Hu, Yujie; Zheng, Yang; Shi, Yongyong; Li, Zhiqiang; Liu, Qing; Chen, Yanling; Zhao, Jing; Qu, Ning; Zhao, Shancen; Tian, Feng; Wang, Xiaoling; Wang, Haiyin; Xu, Lizhi; Liu, Xiao; Vinar, Tomas; Wang, Yajun; Lam, Tak-Wah; Yiu, Siu-Ming; Liu, Shiping; Zhang, Hemin; Li, Desheng; Huang, Yan; Wang, Xia; Yang, Guohua; Jiang, Zhi; Wang, Junyi; Qin, Nan; Li, Li; Li, Jingxiang; Bolund, Lars; Kristiansen, Karsten; Wong, Gane Ka-Shu; Olson, Maynard; Zhang, Xiuqing; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

    2013-01-01

    Using next-generation sequencing technology alone, we have successfully generated and assembled a draft sequence of the giant panda genome. The assembled contigs (2.25 gigabases (Gb)) cover approximately 94% of the whole genome, and the remaining gaps (0.05 Gb) seem to contain carnivore-specific repeats and tandem repeats. Comparisons with the dog and human showed that the panda genome has a lower divergence rate. The assessment of panda genes potentially underlying some of its unique traits indicated that its bamboo diet might be more dependent on its gut microbiome than its own genetic composition. We also identified more than 2.7 million heterozygous single nucleotide polymorphisms in the diploid genome. Our data and analyses provide a foundation for promoting mammalian genetic research, and demonstrate the feasibility for using next-generation sequencing technologies for accurate, cost-effective and rapid de novo assembly of large eukaryotic genomes. PMID:20010809

  5. De novo activating epidermal growth factor mutations (EGFR) in small-cell lung cancer.

    Science.gov (United States)

    Thai, Alesha; Chia, Puey L; Russell, Prudence A; Do, Hongdo; Dobrovic, Alex; Mitchell, Paul; John, Thomas

    2017-09-01

    In Australia, mutations in the epidermal growth factor receptor (EGFR) gene occur in 15% of patients diagnosed with non-small-cell lung cancer and are found with higher frequency in female, non-smokers of Asian ethnicity. Activating mutations in the EGFR gene are rarely described in small-cell lung cancer (SCLC). We present two cases of de novo EGFR mutations in patients with SCLC detected in tissue and in plasma cell free DNA, both of whom were of Asian ethnicity and never-smokers. These two cases add to the growing body of evidence suggesting that screening for EGFR mutations in SCLC should be considered in patients with specific clinical features. © 2017 Royal Australasian College of Physicians.

  6. De novo 12q22.q23.3 duplication associated with temporal lobe epilepsy.

    Science.gov (United States)

    Vari, Maria Stella; Traverso, Monica; Bellini, Tommaso; Madia, Francesca; Pinto, Francesca; Minetti, Carlo; Striano, Pasquale; Zara, Federico

    2017-08-01

    Temporal lobe epilepsy (TLE) is the most common form of focal epilepsy and may be associated with acquired central nervous system lesions or could be genetic. Various susceptibility genes and environmental factors are believed to be involved in the aetiology of TLE, which is considered to be a heterogeneous, polygenic, and complex disorder. Rare point mutations in LGI1, DEPDC5, and RELN as well as some copy number variations (CNVs) have been reported in families with TLE patients. We perform a genetic analysis by Array-CGH in a patient with dysmorphic features and temporal lobe epilepsy. We report a de novo duplication of the long arm of chromosome 12. We confirm that 12q22-q23.3 is a candidate locus for familial temporal lobe epilepsy with febrile seizures and highlight the role of chromosomal rearrangements in patients with epilepsy and intellectual disability. Copyright © 2017 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.

  7. Robust de novo pathway enrichment with KeyPathwayMiner 5 [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Nicolas Alcaraz

    2016-06-01

    Full Text Available Identifying functional modules or novel active pathways, recently termed de novo pathway enrichment, is a computational systems biology challenge that has gained much attention during the last decade. Given a large biological interaction network, KeyPathwayMiner extracts connected subnetworks that are enriched for differentially active entities from a series of molecular profiles encoded as binary indicator matrices. Since interaction networks constantly evolve, an important question is how robust the extracted results are when the network is modified. We enable users to study this effect through several network perturbation techniques and over a range of perturbation degrees. In addition, users may now provide a gold-standard set to determine how enriched extracted pathways are with relevant genes compared to randomized versions of the original network.

  8. De novo acute leukemia with a sole 5q-: morphological, immunological, and clinical correlations.

    Science.gov (United States)

    Duchayne, E; Dastugue, N; Kuhlein, E; Huguet, F; Pris, J

    1993-11-01

    The 5q deletion is frequently found in myelodysplastic syndromes and acute non-lymphoid leukemia, but this anomaly is usually found in secondary diseases and associated with many other chromosomal aberrations. This report describes four cases of "de novo" acute leukemia with a sole 5q- anomaly. They had no cytological, genetic or clinical characteristics of secondary disorders. It is important to note that of the four patients studied, three had proliferation of immature blast cells. One case was classified as an M0 AML and two as "undifferentiated" acute leukemia. Furthermore, these four cases of acute leukemia showed a deletion of the same portion of the long arm of chromosome 5: q22q33. Many hematopoietic growth factor genes have been located on this part of the chromosome, such as IL3 and GM-CSF, which have early undifferentiated hematopoietic stem cells as their target.

  9. Genesis by meiotic unequal crossover of a de novo deletion that contributes to steroid 21-hydroxylase deficiency

    International Nuclear Information System (INIS)

    Sinnott, P.; Collier, S.; Dyer, P.A.; Harris, R.; Strachan, T.; Costigan, C.

    1990-01-01

    The HLA-linked human steroid 21-hydroxylase gene CYP21B and its closely homologous pseudogene CYP21A are each normally located centromeric to a fourth component of complement (C4) gene, C4B and C4A, respectively, in an organization suggesting tandem duplication of a ca. 30-kilobase DNA unit containing a CYP21 gene and a C4 gene. Such an organization has been considered to facilitate gene deletion and addition events by unequal crossover between the tandem repeats. The authors have identified a steroid 21-hydroxylase deficiency patient who has a maternally inherited disease haplotype that carries a de novo deletion of a ca. 30-kilobase repeat unit including the CYP21B gene and associated C4B gene. This disease haplotype appears to have been generated as a result of meiotic unequal crossover between maternal homologous chromosomes. One of the maternal haplotypes is the frequently occurring HLA-DR3,B8,A1 haplotype that normally carries a deletion of a ca. 30-kilobase unit including the CYP21A gene and C4A gene. Haplotypes of this type may possibly act as premutations, increasing the susceptibility of developing a 21-hydroxylase deficiency mutation by facilitating unequal chromosome pairing.

  10. Genome-wide targeted prediction of ABA responsive genes in rice based on over-represented cis-motif in co-expressed genes.

    Science.gov (United States)

    Lenka, Sangram K; Lohia, Bikash; Kumar, Abhay; Chinnusamy, Viswanathan; Bansal, Kailash C

    2009-02-01

    Abscisic acid (ABA), the popular plant stress hormone, plays a key role in regulating a subset of stress-responsive genes. These genes respond to ABA through specific transcription factors which bind to cis-regulatory elements present in their promoters. We found that the CGMCACGTGB motif, which contains the ABA Responsive Element (ABRE) core (ACGT), is over-represented among the promoters of ABA-responsive co-expressed genes in rice. A targeted gene prediction strategy using this motif identified 402 protein-coding genes potentially regulated by the ABA-dependent molecular genetic network. RT-PCR analysis of 45 genes chosen arbitrarily from the 402 predicted genes confirmed 80% accuracy of our prediction. Plant Gene Ontology (GO) analysis of ABA-responsive genes showed enrichment of signal transduction and stress-related genes among diverse functional categories.
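    The motif in this record is written with IUPAC ambiguity codes (M = A or C; B = C, G or T). As a minimal sketch of how such a motif can be scanned for in a promoter sequence, the following Python translates it into a regular expression. The function names and the example sequence are made up for illustration, only the forward strand is searched, and this is not the authors' prediction pipeline.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a lookahead regex so overlapping hits are found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(seq, motif):
    """Return 0-based start positions of the motif on the forward strand of seq."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]


# Hypothetical promoter fragment (not taken from the study).
promoter = "TTTCGCCACGTGTAAACGACACGTGAGG"
hits = find_motif(promoter, "CGMCACGTGB")
print(hits)  # the second ACGT-containing site ends in A, which B excludes
```

    A real scan would also search the reverse complement and compare motif frequency against a background set of promoters before calling the motif over-represented.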

  11. Sequencing, de novo assembling, and annotating the genome of the endangered Chinese crocodile lizard Shinisaurus crocodilurus.

    Science.gov (United States)

    Gao, Jian; Li, Qiye; Wang, Zongji; Zhou, Yang; Martelli, Paolo; Li, Fang; Xiong, Zijun; Wang, Jian; Yang, Huanming; Zhang, Guojie

    2017-07-01

    The Chinese crocodile lizard, Shinisaurus crocodilurus, is the only living representative of the monotypic family Shinisauridae under the order Squamata. It is an obligate semi-aquatic, viviparous, diurnal species restricted to specific portions of mountainous locations in southwestern China and northeastern Vietnam. However, in the past several decades, this species has undergone a rapid decrease in population size due to illegal poaching and habitat disruption, making this unique reptile species endangered and listed in the Convention on International Trade in Endangered Species of Wild Fauna and Flora Appendix II since 1990. A proposal to uplist it to Appendix I was passed at the Convention on International Trade in Endangered Species of Wild Fauna and Flora Seventeenth meeting of the Conference of the Parties in 2016. To promote the conservation of this species, we sequenced the genome of a male Chinese crocodile lizard using a whole-genome shotgun strategy on the Illumina HiSeq 2000 platform. In total, we generated ∼291 Gb of raw sequencing data (×149 depth) from 13 libraries with insert sizes ranging from 250 bp to 40 kb. After filtering for polymerase chain reaction-duplicated and low-quality reads, ∼137 Gb of clean data (×70 depth) were obtained for genome assembly. We yielded a draft genome assembly with a total length of 2.24 Gb and an N50 scaffold size of 1.47 Mb. The assembled genome was predicted to contain 20 150 protein-coding genes and up to 1114 Mb (49.6%) of repetitive elements. The genomic resource of the Chinese crocodile lizard will contribute to deciphering the biology of this organism and provides an essential tool for conservation efforts. It also provides a valuable resource for future study of squamate evolution. © The Authors 2017. Published by Oxford University Press.
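    The N50 scaffold size quoted above is a standard assembly contiguity statistic: the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the assembly length. A minimal sketch in Python, with made-up scaffold lengths rather than data from this assembly:

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to half the
        # assembly length defines N50.
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one length")


# Hypothetical scaffold lengths (arbitrary units), total = 300.
example = [80, 70, 50, 40, 30, 20, 10]
print(n50(example))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```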

  12. Improved protein quality in transgenic soybean expressing a de novo synthetic protein, MB-16.

    Science.gov (United States)

    Zhang, Yunfang; Schernthaner, Johann; Labbé, Natalie; Hefford, Mary A; Zhao, Jiping; Simmonds, Daina H

    2014-06-01

    To improve soybean [Glycine max (L.) Merrill] seed nutritional quality, a synthetic gene, MB-16 was introduced into the soybean genome to boost seed methionine content. MB-16, an 11 kDa de novo protein enriched in the essential amino acids (EAAs) methionine, threonine, lysine and leucine, was originally developed for expression in rumen bacteria. For efficient seed expression, constructs were designed using the soybean codon bias, with and without the KDEL ER retention sequence, and β-conglycinin or cruciferin seed specific protein storage promoters. Homozygous lines, with single locus integrations, were identified for several transgenic events. Transgene transmission and MB-16 protein expression were confirmed to the T5 and T7 generations, respectively. Quantitative RT-PCR analysis of developing seed showed that the transcript peaked in growing seed, 5-6 mm long, remained at this peak level to the full-sized green seed and then was significantly reduced in maturing yellow seed. Transformed events carrying constructs with the rumen bacteria codon preference showed the same transcription pattern as those with the soybean codon preference, but the transcript levels were lower at each developmental stage. MB-16 protein levels, as determined by immunoblots, were highest in full-sized green seed but the protein virtually disappeared in mature seed. However, amino acid analysis of mature seed, in the best transgenic line, showed a significant increase of 16.2 and 65.9 % in methionine and cysteine, respectively, as compared to the parent. This indicates that MB-16 elevated the sulfur amino acids, improved the EAA seed profile and confirms that a de novo synthetic gene can enhance the nutritional quality of soybean.

  13. De novo peptide design and experimental validation of histone methyltransferase inhibitors.

    Directory of Open Access Journals (Sweden)

    James Smadbeck

    Full Text Available Histones are small proteins critical to the efficient packaging of DNA in the nucleus. DNA–protein complexes, known as nucleosomes, are formed when the DNA winds itself around the surface of the histones. The methylation of histone residues by enhancer of zeste homolog 2 (EZH2) maintains gene repression over successive cell generations. Overexpression of EZH2 can silence important tumor suppressor genes leading to increased invasiveness of many types of cancers. This makes the inhibition of EZH2 an important target in the development of cancer therapeutics. We employed a three-stage computational de novo peptide design method to design inhibitory peptides of EZH2. The method consists of a sequence selection stage and two validation stages for fold specificity and approximate binding affinity. The sequence selection stage consists of an integer linear optimization model that was solved to produce a rank-ordered list of amino acid sequences with increased stability in the bound peptide-EZH2 structure. These sequences were validated through the calculation of the fold specificity and approximate binding affinity of the designed peptides. Here we report the discovery of novel EZH2 inhibitory peptides using the de novo peptide design method. The computationally discovered peptides were experimentally validated in vitro using dose titrations and mechanism of action enzymatic assays. The peptide with the highest in vitro response, SQ037, was validated in nucleo using quantitative mass spectrometry-based proteomics. This peptide had an IC50 of 13.5 μM, demonstrated greater potency as an inhibitor when compared to the native and K27A mutant control peptides, and demonstrated competitive inhibition versus the peptide substrate. Additionally, this peptide demonstrated high specificity to the EZH2 target in comparison to other histone methyltransferases. The validated peptides are the first computationally designed peptides that directly inhibit EZH2.

  15. De novo assembly of the perennial ryegrass transcriptome using an RNA-Seq strategy.

    Directory of Open Access Journals (Sweden)

    Jacqueline D Farrell

    Full Text Available Perennial ryegrass is a highly heterozygous outbreeding grass species used for turf and forage production. Heterozygosity can affect de Bruijn graph assembly, making de novo transcriptome assembly of species such as perennial ryegrass challenging. Creating a reference transcriptome from a homozygous perennial ryegrass genotype can circumvent the challenge of heterozygosity. The goals of this study were to perform RNA-sequencing on multiple tissues from a highly inbred genotype to develop a reference transcriptome. This was complemented with RNA-sequencing of a highly heterozygous genotype for SNP calling. De novo transcriptome assembly of the inbred genotype created 185,833 transcripts with an average length of 830 base pairs. Within the inbred reference transcriptome 78,560 predicted open reading frames were found of which 24,434 were predicted as complete. Functional annotation found 50,890 transcripts with a BLASTp hit from the Swiss-Prot non-redundant database, 58,941 transcripts with a Pfam protein domain and 1,151 transcripts encoding putative secreted peptides. To evaluate the reference transcriptome we targeted the high-affinity K+ transporter gene family and found multiple orthologs. Using the longest unique open reading frames as the reference sequence, 64,242 single nucleotide polymorphisms were found. One thousand sixty one open reading frames from the inbred genotype contained heterozygous sites, confirming the high degree of homozygosity. Our study has developed an annotated, comprehensive transcriptome reference for perennial ryegrass that can aid in determining genetic variation, expression analysis, genome annotation, and gene mapping.
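    The record above distinguishes predicted open reading frames from those predicted as complete. As an illustrative sketch only, assuming "complete" means an ORF that begins with ATG and ends with an in-frame stop codon, the following Python finds the longest such ORF in the three forward frames of a transcript. The function name and test sequence are hypothetical, and the ORF predictor actually used in the study is not named in the abstract.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_complete_orf(seq):
    """Return the longest ATG...stop ORF in the three forward frames, or ""."""
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the currently open ATG in this frame, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]  # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


# Hypothetical transcript fragment: one complete ORF in frame 2 and an
# ATG in frame 1 that never reaches a stop codon (an incomplete ORF).
print(longest_complete_orf("CCATGAAATGGTAACC"))
```

    A fuller version would also scan the reverse complement and report incomplete ORFs separately.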

  16. De novo loss-of-function mutations in WAC cause a recognizable intellectual disability syndrome and learning deficits in Drosophila.

    Science.gov (United States)

    Lugtenberg, Dorien; Reijnders, Margot R F; Fenckova, Michaela; Bijlsma, Emilia K; Bernier, Raphael; van Bon, Bregje W M; Smeets, Eric; Vulto-van Silfhout, Anneke T; Bosch, Danielle; Eichler, Evan E; Mefford, Heather C; Carvill, Gemma L; Bongers, Ernie M H F; Schuurs-Hoeijmakers, Janneke Hm; Ruivenkamp, Claudia A; Santen, Gijs W E; van den Maagdenberg, Arn M J M; Peeters-Scholte, Cacha M P C D; Kuenen, Sabine; Verstreken, Patrik; Pfundt, Rolph; Yntema, Helger G; de Vries, Petra F; Veltman, Joris A; Hoischen, Alexander; Gilissen, Christian; de Vries, Bert B A; Schenck, Annette; Kleefstra, Tjitske; Vissers, Lisenka E L M

    2016-08-01

    Recently WAC was reported as a candidate gene for intellectual disability (ID) based on the identification of a de novo mutation in an individual with severe ID. WAC regulates transcription-coupled histone H2B ubiquitination and has previously been implicated in the 10p12p11 contiguous gene deletion syndrome. In this study, we report on 10 individuals with de novo WAC mutations which we identified through routine (diagnostic) exome sequencing and targeted resequencing of WAC in 2326 individuals with unexplained ID. All but one mutation was expected to lead to a loss-of-function of WAC. Clinical evaluation of all individuals revealed phenotypic overlap for mild ID, hypotonia, behavioral problems and distinctive facial dysmorphisms, including a square-shaped face, deep set eyes, long palpebral fissures, and a broad mouth and chin. These clinical features were also previously reported in individuals with 10p12p11 microdeletion syndrome. To investigate the role of WAC in ID, we studied the importance of the Drosophila WAC orthologue (CG8949) in habituation, a non-associative learning paradigm. Neuronal knockdown of Drosophila CG8949 resulted in impaired learning, suggesting that WAC is required in neurons for normal cognitive performance. In conclusion, we defined a clinically recognizable ID syndrome, caused by de novo loss-of-function mutations in WAC. Independent functional evidence in Drosophila further supported the role of WAC in ID. On the basis of our data WAC can be added to the list of ID genes with a role in transcription regulation through histone modification.

  17. The gene identification problem: An overview for developers

    Energy Technology Data Exchange (ETDEWEB)

    Fickett, J.W.

    1995-03-27

    The gene identification problem is the problem of interpreting nucleotide sequences by computer, in order to provide tentative annotation on the location, structure, and functional class of protein-coding genes. This problem is of self-evident importance, and is far from being fully solved, particularly for higher eukaryotes. Thus it is not surprising that the number of algorithm and software developers working in this area is rapidly increasing. The present paper is an overview of the field, with an emphasis on eukaryotes, for such developers.

  18. De novo assembly of Phlomis purpurea after challenging with Phytophthora cinnamomi.

    Science.gov (United States)

    Baldé, Aladje; Neves, Dina; García-Breijo, Francisco J; Pais, Maria Salomé; Cravador, Alfredo

    2017-09-06

    Phlomis plants are a source of biologically active substances with potential applications in the control of phytopathogens. Phlomis purpurea (Lamiaceae) is autochthonous to the southern Iberian Peninsula and Morocco and was found to be resistant to Phytophthora cinnamomi. Phlomis purpurea has revealed an antagonistic effect in the rhizosphere of Quercus suber and Q. ilex against P. cinnamomi. Phlomis purpurea roots produce bioactive compounds exhibiting antitumor and anti-Phytophthora activities with potential to protect susceptible plants. Although these important capacities of P. purpurea have been demonstrated, there is no transcriptomic or genomic information available in public databases that could bring insights on the genes underlying this anti-oomycete activity. Using Illumina technology we obtained a de novo assembly of the P. purpurea transcriptome and differential transcript abundance to identify putative defence-related genes in challenged versus non-challenged plants. A total of 1,272,600,000 reads from 18 cDNA libraries were merged and assembled into 215,739 transcript contigs. BLASTX alignment to the Nr NCBI database identified 124,386 unique annotated transcripts (57.7%) with significant hits. Functional annotation identified 83,550 out of 124,386 unique transcripts, which were mapped to 141 pathways. 39% of unigenes were assigned GO terms. Their functions cover biological processes, cellular component and molecular functions. Genes associated with response to stimuli, cellular and primary metabolic processes, catalytic and transporter functions were among those identified. Differential transcript abundance analysis using DESeq revealed significant differences among libraries depending on post-challenge times. Comparative cyto-histological studies of P. purpurea roots challenged with P. cinnamomi zoospores and controls revealed specific morphological features (exodermal strips and epi-cuticular layer), that may provide a constitutive efficient barrier against

  19. A de novo missense mutation of FGFR2 causes facial dysplasia syndrome in Holstein cattle.

    Science.gov (United States)

    Agerholm, Jørgen S; McEvoy, Fintan J; Heegaard, Steffen; Charlier, Carole; Jagannathan, Vidhya; Drögemüller, Cord

    2017-08-02

    Surveillance for bovine genetic diseases in Denmark identified a hitherto unreported congenital syndrome occurring among progeny of a Holstein sire used for artificial breeding. A genetic aetiology due to a dominant inheritance with incomplete penetrance or a mosaic germline mutation was suspected as all recorded cases were progeny of the same sire. Detailed investigations were performed to characterize the syndrome and to reveal its cause. Seven malformed calves were submitted for examination. All cases shared a common morphology with the most striking lesions being severe facial dysplasia and complete prolapse of the eyes. Consequently the syndrome was named facial dysplasia syndrome (FDS). Furthermore, extensive brain malformations, including microencephaly, hydrocephalus, lobation of the cerebral hemispheres and compression of the brain were present. Subsequent data analysis of progeny of the sire revealed that around 0.5% of his offspring suffered from FDS. High density single nucleotide polymorphism (SNP) genotyping data of the seven cases and their parents were used to map the defect in the bovine genome. Significant genetic linkage was obtained for three regions, including chromosome 26 where whole genome sequencing of a case-parent trio revealed two de novo variants perfectly associated with the disease: an intronic SNP in the DMBT1 gene and a single non-synonymous variant in the FGFR2 gene. This FGFR2 missense variant (c.927G>T) affects a gene encoding a member of the fibroblast growth factor receptor family, where amino acid sequence is highly conserved between members and across species. It is predicted to change an evolutionarily conserved tryptophan into a cysteine residue (p.Trp309Cys). Both variant alleles were proven to result from de novo mutation events in the germline of the sire. FDS is a novel genetic disorder of Holstein cattle. Mutations in the human FGFR2 gene are associated with various dominant inherited craniofacial dysostosis syndromes. 
Given
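    The relationship between the coding-DNA change (c.927G>T) and the protein change (p.Trp309Cys) reported in this record can be checked with simple arithmetic: position 927 is the third base of codon 309, and a G>T change at the third base of TGG gives TGT. The sketch below works this through in Python. It assumes the standard genetic code, in which TGG is the only tryptophan codon, so the reference codon is inferred rather than read from the FGFR2 sequence; the function names are illustrative.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_position(cdna_pos):
    """Map a 1-based coding position to (1-based codon number, 0-based offset)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def apply_substitution(codon, offset, alt):
    """Return (reference amino acid, mutant amino acid) for a single-base change."""
    mutated = codon[:offset] + alt + codon[offset + 1:]
    return CODON_TABLE[codon], CODON_TABLE[mutated]


codon_number, offset = codon_position(927)
# Assumption: the reference codon is TGG, the only Trp codon in the standard code.
ref_aa, alt_aa = apply_substitution("TGG", offset, "T")
print(codon_number, offset, ref_aa, alt_aa)  # codon 309, third base, W (Trp) to C (Cys)
```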

  20. The endocannabinoid system: a new paradigm in the treatment of the metabolic syndrome

    OpenAIRE

    Godoy-Matos, Amélio F. de; Guedes, Erika Paniago; Souza, Luciana Lopes de; Valério, Cynthia Melissa

    2006-01-01

    Energy balance is one of the most important mechanisms of homeostasis and species survival. The endocannabinoid system is a new and important component among these mechanisms. Its receptors and endogenous agonists are expressed in the central nervous system (CNS) and peripherally, at several sites, establishing a periphery-CNS communication network. A striking feature is its expression in adipose tissue, where it regulates lipogenesis and increases the expression of genes influential in metab...

  1. IDBA-tran: a more robust de novo de Bruijn graph assembler for transcriptomes with uneven expression levels.

    Science.gov (United States)

    Peng, Yu; Leung, Henry C M; Yiu, Siu-Ming; Lv, Ming-Ju; Zhu, Xin-Guang; Chin, Francis Y L

    2013-07-01

    RNA sequencing based on next-generation sequencing technology is effective for analyzing transcriptomes. Like de novo genome assembly, de novo transcriptome assembly does not rely on any reference genome or additional annotation information, but is more difficult. In particular, isoforms can have very uneven expression levels (e.g. 1:100), which make it very difficult to identify low-expressed isoforms. One challenge is to remove erroneous vertices/edges with high multiplicity (produced by high-expressed isoforms) in the de Bruijn graph without removing correct ones with not-so-high multiplicity from low-expressed isoforms. Failing to do so will result in the loss of low-expressed isoforms or having complicated subgraphs with transcripts of different genes mixed together due to erroneous vertices/edges. Contributions: Unlike existing tools, which remove erroneous vertices/edges with multiplicities lower than a global threshold, we use a probabilistic progressive approach to iteratively remove them with local thresholds. This enables us to decompose the graph into disconnected components, each containing a few genes, if not a single gene, while retaining many correct vertices/edges of low-expressed isoforms. Combined with existing techniques, IDBA-Tran is able to assemble both high-expressed and low-expressed transcripts and outperform existing assemblers in terms of sensitivity and specificity for both simulated and real data. http://www.cs.hku.hk/~alse/idba_tran. Supplementary data are available at Bioinformatics online.
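    The contrast drawn above between a single global multiplicity threshold and local thresholds can be illustrated with a toy example. The Python below is not IDBA-Tran's probabilistic progressive algorithm; it is a simplified stand-in that assumes a k-mer is erroneous when its count is far below that of a sibling k-mer sharing the same (k-1)-base prefix. All function names, reads and parameter values are made up.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count every k-mer (edge multiplicity in a de Bruijn graph) across reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def prune_global(counts, threshold):
    """Keep k-mers whose count reaches one global threshold."""
    return {kmer for kmer, c in counts.items() if c >= threshold}


def prune_local(counts, ratio):
    """Keep k-mers whose count is at least ratio * the best sibling count.

    Siblings are k-mers that leave the same (k-1)-mer node of the graph.
    """
    by_prefix = defaultdict(list)
    for kmer in counts:
        by_prefix[kmer[:-1]].append(kmer)
    kept = set()
    for siblings in by_prefix.values():
        best = max(counts[s] for s in siblings)
        kept.update(s for s in siblings if counts[s] >= ratio * best)
    return kept


# Toy data: a high-expressed transcript, a sequencing error in its last base,
# and a genuinely low-expressed transcript.
reads = ["ACGTAC"] * 100 + ["ACGTAG"] * 2 + ["GGCATG"] * 2
counts = kmer_counts(reads, k=4)
print(sorted(prune_global(counts, threshold=3)))  # loses the low-expressed transcript
print(sorted(prune_local(counts, ratio=0.1)))     # drops only the erroneous k-mer
```

    In this toy case the global threshold removes the low-expressed transcript together with the error, while the local rule removes only the error k-mer GTAG, because it is judged against its high-count sibling GTAC.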

  2. Genome-wide annotation of porcine microRNA genes and transcriptome profiling during Actinobacillus infection

    DEFF Research Database (Denmark)

    Nielsen, Mathilde

    MicroRNAs are small single-stranded non-coding RNA molecules which contribute to the regulation of gene expression by primarily binding to the 3′ end of protein-coding mRNA, thereby inhibiting the translation process or promoting degradation of the mRNA. The main focus of this PhD project was to ex...

  3. Early de novo DNA methylation and prolonged demethylation in the muscle lineage.

    Science.gov (United States)

    Tsumagari, Koji; Baribault, Carl; Terragni, Jolyon; Varley, Katherine E; Gertz, Jason; Pradhan, Sirharsa; Badoo, Melody; Crain, Charlene M; Song, Lingyun; Crawford, Gregory E; Myers, Richard M; Lacey, Michelle; Ehrlich, Melanie

    2013-03-01

    Myogenic cell cultures derived from muscle biopsies are excellent models for human cell differentiation. We report the first comprehensive analysis of myogenesis-specific DNA hyper- and hypo-methylation throughout the genome for human muscle progenitor cells (both myoblasts and myotubes) and skeletal muscle tissue vs. 30 non-muscle samples using reduced representation bisulfite sequencing. We also focused on four genes with extensive hyper- or hypo-methylation in the muscle lineage (PAX3, TBX1, MYH7B/MIR499 and OBSCN) to compare DNA methylation, DNaseI hypersensitivity, histone modification, and CTCF binding profiles. We found that myogenic hypermethylation was strongly associated with homeobox or T-box genes and muscle hypomethylation with contractile fiber genes. Nonetheless, there was no simple relationship between differential gene expression and myogenic differential methylation, rather only for subsets of these genes, such as contractile fiber genes. Skeletal muscle retained ~30% of the hypomethylated sites but only ~3% of hypermethylated sites seen in myogenic progenitor cells. By enzymatic assays, skeletal muscle was 2-fold enriched globally in genomic 5-hydroxymethylcytosine (5-hmC) vs. myoblasts or myotubes and was the only sample type enriched in 5-hmC at tested myogenic hypermethylated sites in PAX3/CCDC140 and TBX1. TET1 and TET2 RNAs, which are involved in generation of 5-hmC and DNA demethylation, were strongly upregulated in myoblasts and myotubes. Our findings implicate de novo methylation predominantly before the myoblast stage and demethylation before and after the myotube stage in control of transcription and co-transcriptional RNA processing. They also suggest that, in muscle, TET1 or TET2 are involved in active demethylation and in formation of stable 5-hmC residues.

  4. Coffee breeding: IV - Mundo Novo coffee

    Directory of Open Access Journals (Sweden)

    A. Carvalho

    1952-06-01

    Full Text Available In a group of coffee trees at Mundo Novo, now Urupês, in the Araraquarense region of the State of São Paulo, several trees were selected on the basis of their vegetative appearance, the yield present at the time of selection, and the probable yield of the following year. The origin of the initial planting of this coffee was studied, both in Urupês and in Jaú, leading to the conclusion that it probably originated in the latter locality. Progenies of "Mundo Novo" coffee, formerly known as "Sumatra" and derived from plants selected in Urupês and Jaú, are under study at six localities in the State: Campinas, Ribeirão Prêto, Pindorama, Mococa, Jaú and Monte Alegre do Sul. The present paper uses only data on the morphological variability and yield characteristics of the progenies of the first coffee trees selected in Urupês and studied in Campinas, Jaú, Pindorama and Mococa. At all localities, variation was observed in the morphological characters of the progenies, including the occurrence of almost unproductive plants. Most progenies, however, are characterized by marked vegetative vigor. The total yields of the progenies and of the individual plants were studied for the period 1946-1951, and some progenies stood out for their high yield at all localities. The seed types "moca" (peaberry), "concha" (shell) and "chato" (flat) were determined in samples from all plants over a period of three years, and the variation found was of the same order as that found in other coffee trees under selection. An attempt was made to eliminate, by selection, coffee trees with a high production of fruits lacking seeds in one or both locules, a characteristic that appears to be hereditary. The results of crosses between the best "Mundo Novo" coffee trees from Campinas and plants of the variety murta indicated that these coffee trees are of the bourbon type. Probably

  5. Recurrence risk in de novo structural chromosomal rearrangements.

    Science.gov (United States)

    Röthlisberger, Benno; Kotzot, Dieter

    2007-08-01

    According to the textbook of Gardner and Sutherland [2004], the standard on genetic counseling for chromosome abnormalities, the recurrence risk of de novo structural or combined structural and numeric chromosome rearrangements is less than 0.5-2% and takes into account recurrence by chance, gonadal mosaicism, and somatic-gonadal mosaicism. However, these figures are roughly estimated and neither any systematic study nor exact or evidence-based risk calculations are available. To address this question, an extensive literature search was performed and surprisingly only 29 case reports of recurrence of de novo structural or combined structural and numeric chromosomal rearrangements were found. Thirteen of them were with a trisomy 21 due to an i(21q) replacing one normal chromosome 21. In eight of them low-level mosaicism in one of the parents was found either in fibroblasts or in blood or in both. As a consequence of the low number of cases and theoretical considerations (clinical consequences, mechanisms of formation, etc.), the recurrence risk should be reduced to less than 1% for a de novo i(21q) and to even less than 0.3% for all other de novo structural or combined structural and numeric chromosomal rearrangements. As the latter is lower than the commonly accepted risk of approximately 0.3% for indicating an invasive prenatal diagnosis and as the risk of abortion of a healthy fetus after chorionic villous sampling or amniocentesis is higher than approximately 0.5%, invasive prenatal investigation in most cases is not indicated and should only be performed if explicitly asked by the parents subsequent to appropriate genetic counseling. (c) 2007 Wiley-Liss, Inc.

  6. New findings of anophelines in artificial containers

    Directory of Open Access Journals (Sweden)

    Oswaldo Paulo Forattini

    1998-12-01

    Full Text Available New findings of anophelines in artificial containers are reported. One concerns immature forms of Anopheles bellator in experimental breeding sites, and the other concerns the finding of An. albitarsis l.s. in an abandoned container. The selective pressure represented by the ever-increasing production of disposable objects is discussed.

  7. Infant Mortality in Novo Hamburgo: Associated Factors and Cardiovascular Causes

    Energy Technology Data Exchange (ETDEWEB)

    Brum, Camila de Andrade [Instituto de Cardiologia/Fundação Universitária de Cardiologia (IC/FUC), Porto Alegre, RS (Brazil)]; Stein, Airton Tetelbom [Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre, RS (Brazil); Grupo Hospitalar Conceição (GHC), Porto Alegre, RS (Brazil); Universidade Luterana do Brasil (ULBRA), Porto Alegre, RS (Brazil)]; Pellanda, Lucia Campos, E-mail: luciapell.pesquisa@cardiologia.org.br [Instituto de Cardiologia/Fundação Universitária de Cardiologia (IC/FUC), Porto Alegre, RS (Brazil); Universidade Federal de Ciências da Saúde de Porto Alegre (UFCSPA), Porto Alegre, RS (Brazil)]

    2015-04-15

    Infant mortality has decreased in Brazil, but remains high as compared to that of other developing countries. In 2010, the Rio Grande do Sul state had the lowest infant mortality rate in Brazil. However, the municipality of Novo Hamburgo had the highest infant mortality rate in the Porto Alegre metropolitan region. The objective was to describe the causes of infant mortality in the municipality of Novo Hamburgo from 2007 to 2010, identifying which causes were related to heart diseases and if they were diagnosed in the prenatal period, and to assess the access to healthcare services. This study assessed infants of the municipality of Novo Hamburgo, who died, and whose data were collected from the infant death investigation records. Of the 157 deaths in that period, 35.3% were reducible through diagnosis and early treatment, 25% were reducible through partnership with other sectors, 19.2% were non-preventable, 11.5% were reducible by means of appropriate pregnancy monitoring, 5.1% were reducible through appropriate delivery care, and 3.8% were ill defined. The major cause of death related to heart disease (13.4%), which was significantly associated with the variables ‘age at death’, ‘gestational age’ and ‘birth weight’. Regarding access to healthcare services, 60.9% of the pregnant women had a maximum of six prenatal visits. It is mandatory to enhance prenatal care and newborn care at hospitals and basic healthcare units to prevent infant mortality.

  8. Infant Mortality in Novo Hamburgo: Associated Factors and Cardiovascular Causes

    Directory of Open Access Journals (Sweden)

    Camila de Andrade Brum

    2015-04-01

    Full Text Available Background: Infant mortality has decreased in Brazil, but remains high as compared to that of other developing countries. In 2010, the Rio Grande do Sul state had the lowest infant mortality rate in Brazil. However, the municipality of Novo Hamburgo had the highest infant mortality rate in the Porto Alegre metropolitan region. Objective: To describe the causes of infant mortality in the municipality of Novo Hamburgo from 2007 to 2010, identifying which causes were related to heart diseases and if they were diagnosed in the prenatal period, and to assess the access to healthcare services. Methods: This study assessed infants of the municipality of Novo Hamburgo, who died, and whose data were collected from the infant death investigation records. Results: Of the 157 deaths in that period, 35.3% were reducible through diagnosis and early treatment, 25% were reducible through partnership with other sectors, 19.2% were non-preventable, 11.5% were reducible by means of appropriate pregnancy monitoring, 5.1% were reducible through appropriate delivery care, and 3.8% were ill defined. The major cause of death related to heart disease (13.4%), which was significantly associated with the variables ‘age at death’, ‘gestational age’ and ‘birth weight’. Regarding access to healthcare services, 60.9% of the pregnant women had a maximum of six prenatal visits. Conclusion: It is mandatory to enhance prenatal care and newborn care at hospitals and basic healthcare units to prevent infant mortality.

  9. Generative Recurrent Networks for De Novo Drug Design.

    Science.gov (United States)

    Gupta, Anvita; Müller, Alex T; Huisman, Berend J H; Fuchs, Jens A; Schneider, Petra; Schneider, Gisbert

    2018-01-01

    Generative artificial intelligence models present a fresh approach to chemogenomics and de novo drug design, as they provide researchers with the ability to narrow down their search of the chemical space and focus on regions of interest. We present a method for molecular de novo design that utilizes generative recurrent neural networks (RNN) containing long short-term memory (LSTM) cells. This computational model captured the syntax of molecular representation in terms of SMILES strings with close to perfect accuracy. The learned pattern probabilities can be used for de novo SMILES generation. This molecular design concept eliminates the need for virtual compound library enumeration. By employing transfer learning, we fine-tuned the RNN's predictions for specific molecular targets. This approach enables virtual compound design without requiring secondary or external activity prediction, which could introduce error or unwanted bias. The results obtained advocate this generative RNN-LSTM system for high-impact use cases, such as low-data drug discovery, fragment based molecular design, and hit-to-lead optimization for diverse drug targets. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.

  10. Infant Mortality in Novo Hamburgo: Associated Factors and Cardiovascular Causes

    International Nuclear Information System (INIS)

    Brum, Camila de Andrade; Stein, Airton Tetelbom; Pellanda, Lucia Campos

    2015-01-01

    Infant mortality has decreased in Brazil, but remains high as compared to that of other developing countries. In 2010, the Rio Grande do Sul state had the lowest infant mortality rate in Brazil. However, the municipality of Novo Hamburgo had the highest infant mortality rate in the Porto Alegre metropolitan region. The objective was to describe the causes of infant mortality in the municipality of Novo Hamburgo from 2007 to 2010, identifying which causes were related to heart diseases and if they were diagnosed in the prenatal period, and to assess the access to healthcare services. This study assessed infants of the municipality of Novo Hamburgo, who died, and whose data were collected from the infant death investigation records. Of the 157 deaths in that period, 35.3% were reducible through diagnosis and early treatment, 25% were reducible through partnership with other sectors, 19.2% were non-preventable, 11.5% were reducible by means of appropriate pregnancy monitoring, 5.1% were reducible through appropriate delivery care, and 3.8% were ill defined. The major cause of death related to heart disease (13.4%), which was significantly asso