WorldWideScience

Sample records for cdna deep sequencing

  1. The venom gland transcriptome of Latrodectus tredecimguttatus revealed by deep sequencing and cDNA library analysis.

    Directory of Open Access Journals (Sweden)

    Quanze He

    Latrodectus tredecimguttatus, commonly known as the black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider, in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of the α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from attack by their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and of the coordination mechanism for toxin production in terms of protein expression quantity.

  2. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: cDNA sequence quality data. Data name: cDNA sequence quality... data. Description of data contents: Phred's quality score. PHD format, one file to a single cDNA data, and co...
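The record above refers to Phred quality scores. As a minimal sketch of what such a score means, the snippet below uses the standard Phred definition Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The function names are hypothetical and are not part of the archive or of any Phred tooling.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into the estimated base-call error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability P (0 < P <= 1) into a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Q20 corresponds to roughly 1 error in 100 base calls, Q30 to roughly 1 in 1000.
q20_error = phred_to_error_prob(20)
q_for_one_in_thousand = error_prob_to_phred(0.001)
```

A base with Q below about 20 is commonly treated as low quality, but any such cutoff is a convention chosen by the analyst and is not stated in the record.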

  3. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    In recent years, industrial breeding of ostrich (Struthio camelus) has been widely developed in Iran. Growth hormone (GH) is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH-specific PCR primers. The GH cDNA was then cloned by the T/A cloning technique and the construct was transformed into E. coli. Finally, the GH cDNA sequence was submitted to GenBank (Accession number: JN559394). The results of the present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; a BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  4. Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    ...od - Number of data entries: 7 entries

  5. Cloning, sequencing and expression of cDNA encoding growth hormone from Indian catfish (Heteropneustes fossilis)

    Indian Academy of Sciences (India)

    Vikas Anathy; Thayanithy Venugopal; Ramanathan Koteeswaran; Thavamani J Pandian; Sinnakaruppan Mathavan

    2013-03-01

    A tissue-specific cDNA library was constructed using polyA+ RNA from pituitary glands of the Indian catfish Heteropneustes fossilis (Bloch) and a cDNA clone encoding growth hormone (GH) was isolated. Using polymerase chain reaction (PCR) primers representing the conserved regions of fish GH sequences, the 3′ region of catfish GH cDNA (540 bp) was cloned by rapid amplification of cDNA ends and the clone was used as a probe to isolate recombinant phages carrying the full-length cDNA sequence. The full-length cDNA clone is 1132 bp in length, coding for an open reading frame (ORF) of 603 bp; the reading frame encodes a putative polypeptide of 200 amino acids including the signal sequence of 22 amino acids. The 5′ and 3′ untranslated regions of the cDNA are 58 bp and 456 bp long, respectively. The predicted amino acid sequence of H. fossilis GH shared 98% homology with other catfishes. Mature GH protein was efficiently expressed in bacterial and zebrafish systems using appropriate expression vectors. The successful expression of the cloned GH cDNA of catfish confirms the functional viability of the clone.
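The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the reported ORF length includes the stop codon, which the abstract does not state explicitly.

```python
def orf_summary(orf_bp: int, utr5_bp: int, utr3_bp: int, total_bp: int, signal_aa: int) -> dict:
    """Relate ORF, UTR and total cDNA lengths to the encoded polypeptide length."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1  # assumption: the last codon of the ORF is the stop codon
    return {
        "codons": codons,
        "residues": residues,
        "mature_residues": residues - signal_aa,
        # bases not covered by 5' UTR + ORF + 3' UTR; the abstract does not say what they are
        "unaccounted_bp": total_bp - (utr5_bp + orf_bp + utr3_bp),
    }


# Values taken from the catfish GH abstract above.
summary = orf_summary(orf_bp=603, utr5_bp=58, utr3_bp=456, total_bp=1132, signal_aa=22)
print(summary)
```

Under that assumption, 603 bp gives 201 codons, that is 200 residues plus a stop codon, which agrees with the 200-amino-acid polypeptide reported. The 15 bp difference between the total length and the sum of the parts is not explained in the abstract.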

  6. Cloning, sequencing and expression of cDNA encoding growth hormone from Indian catfish (Heteropneustes fossilis)

    Indian Academy of Sciences (India)

    Vikas Anathy; Thayanithy Venugopal; Ramanathan Koteeswaran; Thavamani J Pandian; Sinnakaruppan Mathavan

    2001-09-01

    A tissue-specific cDNA library was constructed using polyA+ RNA from pituitary glands of the Indian catfish Heteropneustes fossilis (Bloch) and a cDNA clone encoding growth hormone (GH) was isolated. Using polymerase chain reaction (PCR) primers representing the conserved regions of fish GH sequences, the 3′ region of catfish GH cDNA (540 bp) was cloned by rapid amplification of cDNA ends and the clone was used as a probe to isolate recombinant phages carrying the full-length cDNA sequence. The full-length cDNA clone is 1132 bp in length, coding for an open reading frame (ORF) of 603 bp; the reading frame encodes a putative polypeptide of 200 amino acids including the signal sequence of 22 amino acids. The 5′ and 3′ untranslated regions of the cDNA are 58 bp and 456 bp long, respectively. The predicted amino acid sequence of H. fossilis GH shared 98% homology with other catfishes. Mature GH protein was efficiently expressed in bacterial and zebrafish systems using appropriate expression vectors. The successful expression of the cloned GH cDNA of catfish confirms the functional viability of the clone.

  7. Mouse tetranectin: cDNA sequence, tissue-specific expression, and chromosomal mapping

    DEFF Research Database (Denmark)

    Ibaraki, K; Kozak, C A; Wewer, U M;

    1995-01-01

    regulation, mouse tetranectin cDNA was cloned from a 16-day-old mouse embryo library. Sequence analysis revealed a 992-bp cDNA with an open reading frame of 606 bp, which is identical in length to the human tetranectin cDNA. The deduced amino acid sequence showed high homology to the human cDNA with 76......(s) of tetranectin. The sequence analysis revealed a difference in both sequence and size of the noncoding regions between mouse and human cDNAs. Northern analysis of the various tissues from mouse, rat, and cow showed the major transcript(s) to be approximately 1 kb, which is similar in size to that observed......, was determined to be on distal mouse Chromosome (Chr) 9 by analysis of two sets of multilocus crosses....

  8. Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

    OpenAIRE

    2005-01-01

    The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) isolated from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence, three ESTs aligned with sequences designated to be on...

  9. Cloning and sequencing of dolphinfish (Coryphaena hippurus, Coryphaenidae) growth hormone-encoding cDNA.

    Science.gov (United States)

    Peduel, A D; Elizur, A; Knibb, W

    1994-01-01

    The cDNA encoding the preprotein growth hormone from the dolphinfish (Coryphaena hippurus) has been cloned and sequenced. The cDNA was derived by reverse transcription of RNA from the pituitary of a young fish using the method known as Rapid Amplification of cDNA Ends (RACE). An oligonucleotide primer corresponding to the 5' region of Pagrus major and the universal RACE primer enabled amplification using the Polymerase Chain Reaction (PCR). The dolphinfish and the yellowtail, Seriola quinqueradiata, are both members of the sub-order Percoidei (Perciformes) and their GH sequences show a high level of homology.

  10. 5'-end sequences of budding yeast full-length cDNA clones and quality scores - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: 5'-end sequences of budding yeast full-length cDNA clones and quality ...scores. Data name: 5'-end sequences of budding yeast full-length cDNA clones and quality scores. De...from the budding yeast full-length cDNA library by the vector-capping method, the sequence quality score gen...s accession only. Sequence: 5'-end sequence data of budding yeast full-length cDNA clones. FASTA format. Quality: Phred's quality...

  11. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    Directory of Open Access Journals (Sweden)

    Jesionek-Kupnicka Dorota

    2009-08-01

    Background: Recently published data showed discrepancies between P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. Methods: To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. Results: We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. Conclusion: In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of the P53 gene. Additionally, for heterozygous mutations, unknown mechanism(s) causing selective overproduction of the mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis.

  12. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    Energy Technology Data Exchange (ETDEWEB)

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S. (Istituto Nazionale Neurologico C. Besta, Milan (Italy)); Rocchi, M. (Istituto G. Gaslini, Genoa (Italy))

    1991-01-15

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  13. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    Energy Technology Data Exchange (ETDEWEB)

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted M_r, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying the lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  14. Budding yeast cDNA sequencing project: S03052-76_F01 [Budding yeast cDNA sequencing project]

    Lifescience Database Archive (English)

    Full Text Available EST - Link to UCSC Genome Browser - Sequence >S03052-76_F01.phd NNNNNNNNNNNNNNNNNNNNNNNNNTNTAAAANNNNGANNNGANNNGTGGNTNTNTNTNT TNT...ANTTTNAANAAANAACNNNCCCTNNNNCNCNNNNNNNGAGNAAAAANNGGGTNTNNT NTTTTNNTNNTNTNTNNNNCNNN Qualit

  15. Deep sequencing approach for investigating infectious agents causing fever.

    Science.gov (United States)

    Susilawati, T N; Jex, A R; Cantacessi, C; Pearson, M; Navarro, S; Susianto, A; Loukas, A C; McBride, W J H

    2016-07-01

    Acute undifferentiated fever (AUF) poses a diagnostic challenge due to the variety of possible aetiologies. While the majority of AUFs resolve spontaneously, some cases become prolonged and cause significant morbidity and mortality, necessitating improved diagnostic methods. This study evaluated the utility of deep sequencing in fever investigation. DNA and RNA were isolated from plasma/sera of AUF cases being investigated at Cairns Hospital in northern Australia, including eight control samples from patients with a confirmed diagnosis. Following isolation, DNA and RNA were bulk amplified and RNA was reverse transcribed to cDNA. The resulting DNA and cDNA amplicons were subjected to deep sequencing on an Illumina HiSeq 2000 platform. Bioinformatics analysis was performed using the program Kraken and the CLC assembly-alignment pipeline. The results were compared with the outcomes of clinical tests. We generated between 4 and 20 million reads per sample. The results of Kraken and CLC analyses concurred with diagnoses obtained by other means in 87.5 % (7/8) and 25 % (2/8) of control samples, respectively. Some plausible causes of fever were identified in ten patients who remained undiagnosed following routine hospital investigations, including Escherichia coli bacteraemia and scrub typhus that eluded conventional tests. Achromobacter xylosoxidans, Alteromonas macleodii and Enterobacteria phage were prevalent in all samples. A deep sequencing approach of patient plasma/serum samples led to the identification of aetiological agents putatively implicated in AUFs and enabled the study of microbial diversity in human blood. The application of this approach in hospital practice is currently limited by sequencing input requirements and complicated data analysis.

  16. Download - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: Download. First of all, please read the license of this database. Data ...names and data descriptions are about the downloadable data in this page. They might not correspond to the c...f the data. Columns: #; Data name; File; Simple search and download. 1: README; README_e.html; -. 2: 5'-end sequences of buddi...ng yeast full-length cDNA clones and quality scores; yeast_seq_qual.zip (59.9MB); Simple search and download. 3...Download via FTP

  17. [cDNA cloning and sequence analysis of pluripotency genes in tree shrews (Tupaia belangeri)].

    Science.gov (United States)

    Wang, Cai-Yun; Ma, Yun-Han; He, Da-Jian; Yang, Shi-Hua

    2013-04-01

    In this paper, partial sequences of the tree shrew (Tupaia belangeri) Klf4, Sox2, and c-Myc genes were cloned and sequenced; they were 382, 612, and 485 bp in length and encoded 127, 204, and 161 amino acids, respectively. Their cDNA sequence identities with those of human were 89%, 98%, and 89%, respectively. The phylogenetic trees showed different topologies, suggesting individual evolutionary pathways. These results can facilitate further functional studies.

  18. Cloning and sequencing of complete τ-crystallin cDNA from embryonic lens of Crocodylus palustris

    Indian Academy of Sciences (India)

    Raman Agrawal; Reena Chandrashekhar; Anurag Kumar Mishra; Jetty Ramadevi; Yogendra Sharma; Ramesh K Aggarwal

    2002-06-01

    τ-Crystallin is a taxon-specific structural protein found in eye lenses. We present here the cloning and sequencing of complete τ-crystallin cDNA from the embryonic lens of Crocodylus palustris and establish it to be identical to the α-enolase gene from non-lenticular tissues. Quantitatively, τ-crystallin was found to be the least abundant crystallin of the crocodilian embryonic lenses. Crocodile τ-crystallin cDNA was isolated by RT-PCR using primers designed from the only other reported sequence, from duck, and completed by 5′- and 3′-rapid amplification of cDNA ends (RACE) using crocodile gene-specific primers designed in the study. The complete τ-crystallin cDNA of crocodile comprises a 1305 bp long ORF and 92 and 409 bp long untranslated 5′- and 3′-ends, respectively. Further, it was found to be identical to its putative counterpart enzyme α-enolase, from brain, heart and gonad, suggesting both to be the product of the same gene. The study thus provides the first report on the cDNA sequence of τ-crystallin from a reptilian species and also re-confirms it to be an example of the phenomenon of gene sharing, as was demonstrated earlier in the case of the Peking duck. Moreover, the gene lineage reconstruction analysis helps our understanding of the evolution of crocodilian and avian species.

  19. NGS-based deep bisulfite sequencing.

    Science.gov (United States)

    Lee, Suman; Kim, Joomyeong

    2016-01-01

    We have developed an NGS-based deep bisulfite sequencing protocol for the DNA methylation analysis of genomes. This approach allows the rapid and efficient construction of NGS-ready libraries with a large number of PCR products that have been individually amplified from bisulfite-converted DNA. This approach also employs a bioinformatics strategy to sort the raw sequence reads generated from NGS platforms and subsequently to derive DNA methylation levels for individual loci. The results demonstrated that this NGS-based deep bisulfite sequencing approach provides not only DNA methylation levels but also informative DNA methylation patterns that have not been seen through other existing methods.
    • This protocol provides an efficient method for generating NGS-ready libraries from individually amplified PCR products.
    • This protocol provides a bioinformatics strategy for sorting NGS-derived raw sequence reads.
    • This protocol provides deep bisulfite sequencing results that can measure DNA methylation levels and patterns of individual loci.
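To illustrate what "methylation levels" and "methylation patterns" mean for bisulfite reads, here is a minimal sketch. It is not the authors' pipeline: the function names and the toy data are hypothetical, and it assumes forward-strand reads that are already aligned to the reference without gaps. After bisulfite conversion, a C that is still read as C at a CpG site is taken as methylated and a T as unmethylated.

```python
from collections import Counter


def cpg_positions(reference: str) -> list:
    """Return the 0-based index of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference: str, reads: list) -> dict:
    """Fraction of reads carrying an unconverted C at each CpG position."""
    levels = {}
    for pos in cpg_positions(reference):
        calls = [read[pos] for read in reads if read[pos] in "CT"]
        if calls:
            levels[pos] = calls.count("C") / len(calls)
    return levels


def methylation_patterns(reference: str, reads: list) -> Counter:
    """Count per-read patterns: M = methylated (C), U = unmethylated (T), ? = other base."""
    sites = cpg_positions(reference)
    code = {"C": "M", "T": "U"}
    return Counter("".join(code.get(read[pos], "?") for pos in sites) for read in reads)


# Toy locus with CpG sites at positions 1 and 5, and four aligned bisulfite reads.
reference = "ACGTTCGA"
reads = ["ACGTTCGA", "ACGTTTGA", "ATGTTTGA", "ACGTTCGA"]

levels = methylation_levels(reference, reads)
patterns = methylation_patterns(reference, reads)
```

The per-site level summarizes each CpG independently, whereas the per-read pattern keeps the combination of states seen on a single molecule, which is the kind of extra information the abstract attributes to deep sequencing of individual loci.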

  20. Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA

    Science.gov (United States)

    Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...

  1. Rapid Amplification of cDNA Ends for RNA Transcript Sequencing in Staphylococcus.

    Science.gov (United States)

    Miller, Eric

    2016-01-01

    Rapid amplification of cDNA ends (RACE) is a technique that was developed to swiftly and efficiently amplify full-length RNA molecules in which the terminal ends have not been characterized. Current usage of this procedure has been more focused on sequencing and characterizing RNA 5' and 3' untranslated regions. Herein is described an adapted RACE protocol to amplify bacterial RNA transcripts.

  2. Complete amino acid sequence of human intestinal aminopeptidase N as deduced from cloned cDNA

    DEFF Research Database (Denmark)

    Cowell, G M; Kønigshøfer, E; Danielsen, E M;

    1988-01-01

    The complete primary structure (967 amino acids) of an intestinal human aminopeptidase N (EC 3.4.11.2) was deduced from the sequence of a cDNA clone. Aminopeptidase N is anchored to the microvillar membrane via an uncleaved signal for membrane insertion. A domain constituting amino acid 250-555 p...

  3. Cloning and Sequence Analysis of cDNA Encoding MRJP3 of Apis cerana cerana

    Institute of Scientific and Technical Information of China (English)

    SU Song-kun; ZHENG Huo-qing; CHEN Sheng-lu; ZHONG Bo-xiong; Stefan Albert

    2005-01-01

    By screening the worker (Apis cerana cerana) head cDNA library using a fragment of the mrjp3 gene of Apis cerana as a probe, 120 positive clones were obtained. The clone containing A. cerana cerana MRJP3 (AccMRJP3) cDNA was selected. Based on the sequencing of the inserts of the positive clone, a sequence of AccMRJP3 cDNA which is 1 887 bp long, including a poly(A) tail, was obtained. The AccMRJP3 cDNA encompassed an open reading frame (ORF) of 1 779 bp encoding 593 amino acids. The untranslated regions (UTR) of the 5' end and 3' end are 46 bp and 160 bp in length, respectively. Similar to AmMRJP3 and AdMRJP3, the putative AccMRJP3 also has a repetitive region. The comparison of the repetitive regions of AccMRJP3, AmMRJP3 and AdMRJP3 shows some differences between them.

  4. Revised sequence and expression of cyclin B cDNA from the starfish Asterina pectinifera.

    Science.gov (United States)

    Miyake, Y; Deshimaru, S; Toraya, T

    2001-05-01

    Cyclin B cDNA was cloned from the ovary of the starfish Asterina pectinifera and analyzed by RT-PCR and 3'- and 5'-RACE techniques. The cDNA consists of a 0.13-kb upstream untranslated region, a 1.22-kb coding region, and a 0.86-kb downstream untranslated region. The open reading frame encoded a polypeptide of 404 amino acid residues with a calculated molecular weight of 45,692. All the characteristic sequences, such as destruction and cyclin boxes, cyclin B motif, and cytoplasmic retention and nuclear export signals, were found in the newly cloned cyclin B cDNA. The deduced amino acid sequence of the cyclin B cDNA was highly homologous in the middle and carboxy terminal regions to that from mature eggs of the same organism, but quite different in the amino terminal region. Evidence was obtained which suggested that this cyclin B is expressed in immature and maturing oocytes and is the same as that cloned from mature eggs.

  5. Cloning and sequencing of Indian Water buffalo (Bubalus bubalis) interleukin-3 cDNA

    KAUST Repository

    Sugumar, Thennarasu

    2011-12-12

    Full-length cDNA (435 bp) of the interleukin-3 (IL-3) gene of the Indian water buffalo was amplified by reverse transcriptase-polymerase chain reaction and sequenced. This sequence had 96% nucleotide identity and 92% amino acid identity with bovine IL-3. There are 10 amino acid substitutions in buffalo IL-3 compared with bovine IL-3. The amino acid sequence of buffalo IL-3 also showed very high identity with that of other ruminants, indicating functional cross-reactivity. Structural homology modelling of buffalo IL-3 protein with human IL-3 showed the presence of five helical structures.

  6. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to being helpful in distinguishing between duplicate genome regions and in determining correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This

  7. Generation and Analysis of Full-length cDNA Sequences from Elephant Shark (Callorhinchus milii)

    KAUST Repository

    Kodzius, Rimantas

    2009-03-17

    Cartilaginous fishes are the oldest living group of jawed vertebrates and therefore are an important group for understanding the evolution of vertebrate genomes, including the human genome. Our laboratory has proposed the elephant shark (C. milii) as a model cartilaginous fish genome because of its relatively small genome size (910 Mb). The whole genome of C. milii is being sequenced (the first cartilaginous fish genome to be sequenced completely). To characterize the transcriptome of C. milii and to assist in annotating exon-intron boundaries, transcriptional start sites and alternatively spliced transcripts, we are generating full-length cDNA sequences from C. milii.

  8. Molecular cloning and sequencing of a cDNA encoding partial putative molt-inhibiting hormone from Penaeus chinensis

    Science.gov (United States)

    Wang, Zai-Zhao; Xiang, Jian-Hai

    2002-09-01

    Total RNA was extracted from eyestalks of the shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was initiated using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from the shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoded a peptide of 67 amino acid residues. The peptide of P. chinensis had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could be a partial gene of molt-inhibiting hormones from P. chinensis. This paper reports for the first time a cDNA encoding a neuropeptide of P. chinensis.

  9. MOLECULAR CLONING AND SEQUENCING OF A cDNA ENCODING PARTIAL PUTATIVE MOLT-INHIBITING HORMONE FROM PENAEUS CHINENSIS

    Institute of Scientific and Technical Information of China (English)

    王在照; 相建海

    2002-01-01

    Total RNA was extracted from eyestalks of the shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was initiated using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from the shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoded a peptide of 67 amino acid residues. The peptide of P. chinensis had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could be a partial gene of molt-inhibiting hormones from P. chinensis. This paper reports for the first time a cDNA encoding a neuropeptide of P. chinensis.

  10. MOLECULAR CLONING AND SEQUENCING OF A cDNA ENCODING PARTIAL PUTATIVE MOLT-INHIBITING HORMONE FROM PENAEUS CHINENSIS

    Institute of Scientific and Technical Information of China (English)

    王在照; 相建海

    2002-01-01

    Total RNA was extracted from eyestalks of the shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was initiated using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from the shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoded a peptide of 67 amino acid residues. The peptide of P. chinensis had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could be a partial gene of molt-inhibiting hormones from P. chinensis. This paper reports for the first time a cDNA encoding a neuropeptide of P. chinensis.

  11. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    Science.gov (United States)

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  12. cDNA Cloning and Sequence Analysis of Rice Sbe1 and Sbe3 Genes

    Institute of Scientific and Technical Information of China (English)

    CHEN Xiu-hua; LIU Qiao-quan; WU Hsin-kan; WANG Zong-yang; GU Ming-hong

    2004-01-01

    Starch-branching enzyme (SBE) is known to be a key enzyme in amylopectin biosynthesis in rice. The cDNAs of two SBE genes, Sbe1 and Sbe3, encoding SBE I and SBE III (two major isoforms in rice), were cloned by an improved RT-PCR technique from a template cDNA library derived from the total mRNA extracted from the immature seeds of the japonica rice Wuyunjing 7. DNA sequence analysis showed that the cloned Sbe1 and Sbe3 cDNAs were 2490 and 2481 bp long, respectively, including their entire coding sequences. Comparison analysis indicated that the nucleotide sequence of Sbe3 was the same as that of sbe3 (GenBank Accession No. D16201) as reported previously. There were only four base-pair differences, which resulted in changes of two deduced amino acids, between the cloned Sbe1 cDNA and the reported sbe1 (GenBank Accession No. D11082). The cloned Sbe1 and Sbe3 cDNAs make it possible to improve rice starch quality through genetic engineering.

  13. cDNA Cloning and Sequence Analysis of Rice Sbe1 and Sbe3 Genes

    Institute of Scientific and Technical Information of China (English)

    CHEN Xiu-hua; LIU Qiao-quan; WU Hsin-kan; WANG Zong-yang; GU Ming-hong

    2004-01-01

    Starch-branching enzyme (SBE) is known to be a key enzyme in amylopectin biosynthesis in rice. The cDNAs of two SBE genes, Sbe1 and Sbe3, encoding SBE I and SBE III (two major isoforms in rice), were cloned by an improved RT-PCR technique from a template cDNA library derived from the total mRNA extracted from the immature seeds of the japonica rice Wuyunjing 7. DNA sequence analysis showed that the cloned Sbe1 and Sbe3 cDNAs were 2490 and 2481 bp long, respectively, including their entire coding sequences. Comparison analysis indicated that the nucleotide sequence of Sbe3 was the same as that of sbe3 (GenBank Accession No. D16201) as reported previously. There were only four base-pair differences, which resulted in changes of two deduced amino acids, between the cloned Sbe1 cDNA and the reported sbe1 (GenBank Accession No. D11082). The cloned Sbe1 and Sbe3 cDNAs make it possible to improve rice starch quality through genetic engineering.

  14. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    Science.gov (United States)

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  15. Microarray and cDNA sequence analysis of transcription during nerve-dependent limb regeneration

    Directory of Open Access Journals (Sweden)

    Bryant Susan V

    2009-01-01

    Full Text Available Background: Microarray analysis and 454 cDNA sequencing were used to investigate a centuries-old problem in regenerative biology: the basis of nerve-dependent limb regeneration in salamanders. Innervated (NR) and denervated (DL) forelimbs of Mexican axolotls were amputated and transcripts were sampled after 0, 5, and 14 days of regeneration. Results: Considerable similarity was observed between NR and DL transcriptional programs at 5 and 14 days post amputation (dpa). Genes with extracellular functions that are critical to wound healing were upregulated while muscle-specific genes were downregulated. Thus, many processes that are regulated during early limb regeneration do not depend upon nerve-derived factors. The majority of the transcriptional differences between NR and DL limbs were correlated with blastema formation; cell numbers increased in NR limbs after 5 dpa and this yielded distinct transcriptional signatures of cell proliferation in NR limbs at 14 dpa. These transcriptional signatures were not observed in DL limbs. Instead, gene expression changes within DL limbs suggest more diverse and protracted wound-healing responses. 454 cDNA sequencing complemented the microarray analysis by providing deeper sampling of transcriptional programs and associated biological processes. Assembly of new 454 cDNA sequences with existing expressed sequence tag (EST) contigs from the Ambystoma EST database more than doubled (3935 to 9411) the number of non-redundant human-A. mexicanum orthologous sequences. Conclusion: Many new candidate gene sequences were discovered for the first time and these will greatly enable future studies of wound healing, epigenetics, genome stability, and nerve-dependent blastema formation and outgrowth using the axolotl model.

  16. Rat serum amyloid P component. Analysis of cDNA sequence and gene expression.

    Science.gov (United States)

    Dowton, S B; McGrew, S D

    1990-09-01

    cDNA clones for rat serum amyloid P component (SAP) were isolated, and the derived amino acid sequence for pre-SAP was determined from the complete nucleotide sequence. Rat SAP is encoded by approximately 1 kb of mRNA, and the mature SAP protein is predicted to be 208 amino acids long. An increase in hepatic mRNA levels for rat SAP was found after administration of lipopolysaccharide, and SAP mRNA levels in livers of unstimulated male rats were lower than in hepatic RNA from female rats.

  17. Rice bicoid-related cDNA sequence and its expression during early embryogenesis

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Bicoid is one of the important Drosophila maternal genes involved in the control of embryo polarity and larval segmentation. To clone and characterize rice bicoid-related genes, one cDNA clone, Rb24 (EMBL accession number: AJ2771380), was isolated by screening a rice immature seed cDNA library. Sequence analysis indicates that Rb24 contains a putative amino acid sequence that is homologous to a unique 8-amino-acid sequence within the Drosophila bicoid homeodomain (50% identity, 75% similarity) and includes a Lys-9 in putative helix 3. Northern blot analysis of rice RNA showed that this sequence is expressed in a tissue-specific manner. The transcript was detected strongly in young panicles, but less in young leaves and roots. These results were further confirmed by in situ hybridization on paraffin sections. The signal is intense in the rice globular embryo and located at the apical tip of the embryo; then, as the embryo develops, the signal weakens and shifts to both sides of the embryo. The existence of a bicoid-related sequence in the rice embryo and the similar polar distribution of bicoid and Rb24 mRNA in early embryo development may imply that a conserved maternal regulation mechanism of body axis is present in Drosophila and in rice.

  18. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    OpenAIRE

    Kasarda, D.D.; Okita, T W; Bernardin, J. E.; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with a...

  19. Sequence characterization of a human embryonic craniofacial cDNA library

    Energy Technology Data Exchange (ETDEWEB)

    Padanilam, B.J.; Barsel, S.; Solursh, M. [and others]

    1994-09-01

    Broad-based sequencing approaches for the characterization of human cDNA libraries have proven successful in identifying large numbers of novel genes of specific tissue or developmental stages. To pursue our interests in human craniofacial development, we have made use of both subtracted and unsubtracted cDNA libraries constructed from embryonic craniofacial tissue obtained from pooled samples at 42-54 days gestation. Single-pass sequencing was carried out using an ABI automated sequencer and T3 or T7 primers. Sequences were characterized using BLAST and GRAIL, and the identified homologous sequences grouped according to gene class and family. Four genes have been mapped using repeat sequence elements identified in the clones. Using primers developed from sequence data, other genes are being mapped using a panel of somatic cell hybrids. To date, a total of 786 sequences have been returned with 35% identifying no homologies, and 35% with strong homologies to previously identified genes. A number of genes previously identified to play a role in human embryonic development have been returned from the sequence comparisons providing evidence that the library is representative of this tissue and stage of development. Previous characterization of the library has also identified a number of novel embryonically expressed human homeobox genes. Genes felt to be of special relevance based on their homology to characterized genes known to play a role in development or that are members of novel classes but with high scores on GRAIL searches are being characterized using whole mount in situ hybridization with mouse embryos. Characterization of the library with respect to chromosomal mapping, gene types and make-up, and embryonic expression patterns will be presented.

  20. Detection of reverse transcriptase termination sites using cDNA ligation and massive parallel sequencing

    DEFF Research Database (Denmark)

    Kielpinski, Lukasz J; Boyd, Mette; Sandelin, Albin;

    2013-01-01

    Detection of reverse transcriptase termination sites is important in many different applications, such as structural probing of RNAs, rapid amplification of cDNA 5' ends (5' RACE), cap analysis of gene expression, and detection of RNA modifications and protein-RNA cross-links. The throughput of these methods can be increased by applying massive parallel sequencing technologies. Here, we describe a versatile method for detection of reverse transcriptase termination sites based on ligation of an adapter to the 3' end of cDNA with bacteriophage TS2126 RNA ligase (CircLigase™). In the following PCR... that do not require formal bioinformatics training. As an example, we apply the method to detection of transcription start sites in mouse liver cells.

  1. Analysis of cDNA sequence, protein structure and expression of parotid secretory protein in pig

    Institute of Scientific and Technical Information of China (English)

    YIN Haifang; FAN Baoliang; ZHAO Zhihui; LIU Zhaoliang; FEI Jing; LI Ning

    2003-01-01

    Parotid secretory protein (PSP) is secreted abundantly in saliva, and its function is related to an anti-bacterial effect. The PSP cDNA was isolated from pig parotid glands by 3′ and 5′ rapid amplification of cDNA ends (RACE), based on the conserved signal peptide region among the known mammalian PSPs. Homology comparison shows that pig PSP and human PSP share high identity at the level of primary, secondary and tertiary protein structure. A search for functionally significant protein motifs revealed a unique amino acid sequence pattern consisting of the residues Leu-X(6)-Leu-X(6)-Leu-X(7)-Leu-X(6)-Leu-X(6)-Leu near the amino-terminal portion of the protein, which is important to its function. RT-PCR, dot blot and Northern blot analysis demonstrated that PSP was strongly expressed in parotid glands, but not in other tissues.

  2. Characterisation of full-length cDNA sequences provides insights into the Eimeria tenella transcriptome

    Directory of Open Access Journals (Sweden)

    Amiruddin Nadzirah

    2012-01-01

    Full Text Available Background: Eimeria tenella is an apicomplexan parasite that causes coccidiosis in the domestic fowl. Infection with this parasite is diagnosed frequently in intensively reared poultry and its control is usually accorded a high priority, especially in chickens raised for meat. Prophylactic chemotherapy has been the primary method used for the control of coccidiosis. However, drug efficacy can be compromised by drug-resistant parasites and the lack of new drugs highlights demands for alternative control strategies including vaccination. In the long term, sustainable control of coccidiosis will most likely be achieved through integrated drug and vaccination programmes. Characterisation of the E. tenella transcriptome may provide a better understanding of the biology of the parasite and aid in the development of a more effective control for coccidiosis. Results: More than 15,000 partial sequences were generated from the 5' and 3' ends of clones randomly selected from an E. tenella second generation merozoite full-length cDNA library. Clustering of these sequences produced 1,529 unique transcripts (UTs). Based on the transcript assembly and subsequent primer walking, 433 full-length cDNA sequences were successfully generated. These sequences varied in length, ranging from 441 bp to 3,083 bp, with an average size of 1,647 bp. Simple sequence repeat (SSR) analysis identified CAG as the most abundant trinucleotide motif, while codon usage analysis revealed that the ten most infrequently used codons in E. tenella are UAU, UGU, GUA, CAU, AUA, CGA, UUA, CUA, CGU and AGU. Subsequent analysis of the E. tenella complete coding sequences identified 25 putative secretory and 60 putative surface proteins, all of which are now rational candidates for development as recombinant vaccines or drug targets in the effort to control avian coccidiosis. Conclusions: This paper describes the generation and characterisation of full-length cDNA sequences from E

  3. Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger

    Directory of Open Access Journals (Sweden)

    Chang-Qing Liu, Tao-Feng Lu, Bao-Gang Feng, Dan Liu, Wei-Jun Guan, Yue-Hui Ma

    2010-01-01

    Full Text Available In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed. The BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal and 7.4% as weak matches. Of the ESTs, 28.0% were related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, and 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic research on Siberian tigers.
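
    The percentages quoted in this abstract can be sanity-checked with a little arithmetic. The sketch below is only an illustration: it reads the final category figure (8.9) as a percentage, and all variable and function names are hypothetical rather than taken from the study.

```python
# Figures copied from the abstract above; names are illustrative only.
TOTAL_ESTS = 282

# BLASTX match classes (percent of sequences).
match_pct = {"strong": 53.9, "nominal": 38.6, "weak": 7.4}

# Functional categories (percent of ESTs).
category_pct = {
    "enzyme/catalytic protein": 28.0,
    "metabolism": 20.9,
    "transport": 13.1,
    "signal transducer/cell communication": 12.1,
    "structure protein": 9.9,
    "immunity protein/defense metabolism": 3.9,
    "cell cycle": 3.2,
    "novel genes": 8.9,  # assumption: the abstract's "8.9" is a percentage
}


def approx_counts(percentages, total):
    """Convert percentages of `total` into rounded whole-number counts."""
    return {name: round(total * pct / 100) for name, pct in percentages.items()}


match_counts = approx_counts(match_pct, TOTAL_ESTS)
category_total = round(sum(category_pct.values()), 1)

if __name__ == "__main__":
    print("approximate ESTs per match class:", match_counts)
    print("functional categories sum to:", category_total, "%")
```

    Under that reading the eight functional categories account for all of the ESTs, and the rounded match-class counts add back up to the 282 sequences analysed.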

  4. THE TREE SHREW APOLIPOPROTEIN C-I cDNA: SEQUENCE AND ITS EXPRESSION

    Institute of Scientific and Technical Information of China (English)

    王克勤; 吕新跃; 吴钢; 薛红; 陈保生

    2001-01-01

    A rabbit antiserum to tree shrew apolipoprotein C-I (apo C-I) was used to screen an expression cDNA library that we constructed from tree shrew (TS) liver tissue. Two apo C-I cDNA clones were obtained. The longer one consists of 380 nucleotides, including 21 bp and 95 bp of non-translated region at the 5' and 3' ends, respectively, and a 264-bp open reading frame encoding an 88-amino-acid prepropeptide that contains a 26-amino-acid signal peptide and a mature protein (62 amino acids). Comparing the amino acid sequence deduced from this cDNA with those of the published mammalian apo C-Is reveals that it shares some structural similarity with rat, mouse and dog apo C-I, but has 5 more amino acids than human and baboon apo C-I. The expression of apo C-I mRNA in 8 different tissues was also assayed by Northern blot. The results demonstrated that liver had the highest expression, intestine had much less expression, and the other tissues had no expression, which is quite different from human and other species. This study has laid a good foundation for further study of the function and structure of the tree shrew apo C-I gene.
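
    The lengths reported for the longer clone are internally consistent, as a short check shows. This is a minimal sketch using only the numbers in the abstract; the constant and function names are hypothetical, and it assumes the quoted open reading frame length counts only amino-acid-coding codons.

```python
# Lengths quoted in the abstract (nucleotides and amino acids).
TOTAL_NT = 380   # full clone
UTR5_NT = 21     # 5' non-translated region
UTR3_NT = 95     # 3' non-translated region
ORF_NT = 264     # open reading frame
SIGNAL_AA = 26   # signal peptide
MATURE_AA = 62   # mature protein


def codon_count(orf_nt: int) -> int:
    """Number of codons in an open reading frame of `orf_nt` nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3


# Assumption: every codon in the 264-nt frame encodes an amino acid.
encoded_aa = codon_count(ORF_NT)

if __name__ == "__main__":
    print("UTRs + ORF =", UTR5_NT + ORF_NT + UTR3_NT, "nt")
    print("ORF encodes", encoded_aa, "amino acids")
    print("signal + mature =", SIGNAL_AA + MATURE_AA, "amino acids")
```

    The three segments add up to the 380-nucleotide clone, and the 264-nucleotide frame corresponds to the 88 residues split between signal peptide and mature protein.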

  5. Infectivity and complete nucleotide sequence of cucumber fruit mottle mosaic virus isolate Cm cDNA.

    Science.gov (United States)

    Rhee, Sun-Ju; Hong, Jin-Sung; Lee, Gung Pyo

    2014-07-01

    Three isolates of cucumber fruit mottle mosaic virus (CFMMV) were collected from melon, cucumber, and pumpkin plants in Korea. A full-length cDNA clone of CFMMV-Cm (melon isolate) was produced and evaluated for infectivity after T7 transcription in vitro (pT7CF-Cmflc). The complete CFMMV genome sequence of the infectious clone pT7CF-Cmflc was determined. The genome of CFMMV-Cm consisted of 6,571 nucleotides and shared high nucleotide sequence identity (98.8 %) with the Israel isolate of CFMMV. Based on the infectious clone pT7CF-Cmflc, a CaMV 35S-promoter driven cDNA clone (p35SCF-Cmflc) was subsequently constructed and sequenced. Mechanical inoculation with RNA transcripts of pT7CF-Cmflc and agro-inoculation with p35SCF-Cmflc resulted in systemic infection of cucumber and melon, producing symptoms similar to those produced by CFMMV-Cm. Progeny virus in infected plants was detected by RT-PCR, western blot assay, and transmission electron microscopy.

  6. Rabbit serum amyloid protein A: expression and primary structure deduced from cDNA sequences.

    Science.gov (United States)

    Rygg, M; Marhaug, G; Husby, G; Dowton, S B

    1991-12-01

    Serum amyloid A protein (SAA), the precursor of amyloid protein A (AA) in deposits of secondary amyloidosis, is an acute phase plasma apolipoprotein produced by hepatocytes. The primary structure of SAA demonstrates high interspecies homology. Several isoforms exist in individual species, probably with different amyloidogenic potential. The nucleotide sequences of two different rabbit serum amyloid A cDNA clones have been analysed, one (corresponding to SAA1) 569 base pairs (bp) long and the other (corresponding to SAA2) 513 bp long. Their deduced amino acid sequences differ at five amino acid positions, four of which are located in the NH2-terminal region of the protein. The deduced amino acid sequence of SAA2 corresponds to rabbit protein AA previously described except for one amino acid in position 22. Eighteen hours after turpentine stimulation, rabbit SAA mRNA is abundant in liver, while lower levels are present in spleen. None of the other extrahepatic organs studied showed any SAA mRNA expression. A third mRNA species (1.9 kb) hybridizing with a single-stranded RNA probe transcribed from the rabbit SAA cDNA, was identified. SAA1 and SAA2 mRNA were found in approximately equal amounts in turpentine-stimulated rabbit liver, but seem to be coordinately decreased after repeated inflammatory stimulation.

  7. License - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Standard License, as long as you comply with the following conditions: You must attribute this database in t...Budding yeast cDNA sequencing project License to Use This Database Last updated : 2010/02/15 You may use thi... of this database and the requirements you must follow in using this database. The Additional License specif...ecified in the Creative Commons Attribution-Share Alike 2.1 Japan . If you use data from this database, plea...n . The summary of the Creative Commons Attribution-Share Alike 2.1 Japan is found here . With regard to this database, you

  8. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    Science.gov (United States)

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-08-24

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on an Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (rpearson = 0.82, isoforms with length more than 700bp) and between Illumina HiSeq 2500 and ONT MinION (rpearson = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.
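
    The platform comparison above is summarised by Pearson correlation coefficients. As a rough illustration of how such a coefficient is computed from paired abundance measurements, the sketch below uses invented numbers and a hypothetical function name; it is not the authors' pipeline, and any normalisation or transformation they applied is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical abundance estimates for the same transcripts
    # on two platforms (made-up values, for illustration only).
    platform_a = [12.0, 150.0, 33.0, 870.0, 5.0]
    platform_b = [10.0, 170.0, 41.0, 790.0, 9.0]
    print(round(pearson_r(platform_a, platform_b), 3))
```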

  9. Achieving high throughput sequencing of a cDNA library utilizing an alternative protocol for the bench top next-generation sequencing system.

    Science.gov (United States)

    Wan, Minxi; Faruq, Junaid; Rosenberg, Julian N; Xia, Jinlan; Oyler, George A; Betenbaugh, Michael J

    2013-02-15

    The development of next-generation sequencing (NGS) technologies has provided novel tools for genome analysis and expression profiling. A high throughput cDNA sequencing method using a bench top next-generation sequencing system, GS Junior, is now available. Here, we used an alternative protocol to the standard method for generating the cDNA library. This protocol can decrease the number of processing steps to manipulate RNA when constructing a cDNA library from an RNA sample, and does not require mRNA isolation from total RNA. Thus it can decrease the risk of RNA degradation and the cost for preparing a cDNA library. Also, the efficiency of sequencing data obtained with this approach is comparable to the standard method as verified by sequencing characteristics and expression levels of the reference gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH).

  10. Cloning and Sequencing cDNA Encoding for Rhoptry-2 Toxoplasma Gondii Tachyzoite Local Isolate

    Directory of Open Access Journals (Sweden)

    Wayan T. Artama

    2015-10-01

    Full Text Available Rhoptry proteins belong to the excretory and secretory antigens (ESAs) that play an important role during active penetration of the parasite into the target cell. This protein enables Toxoplasma gondii to actively penetrate the targeted cell, while ESA proteins stimulate intracellular vacuole modification. After the parasite successfully enters the target cell, granule (GRA) proteins are responsible for the formation of the parasitophorous vacuole, which is protected from fusion with other intracellular compartments such as the lysosomal vacuole. Consequently, the parasite is able to survive and multiply in the target cell. The current study aimed to clone and sequence cDNA encoding ROP-2 of a local isolate of T. gondii tachyzoites through recombinant DNA techniques. Total ribonucleic acid (RNA) was isolated from tachyzoites of the local T. gondii isolate that were grown in Balb/c mice. Messenger RNA was isolated from total RNA using the PolyAtract mRNA Isolation System. Messenger RNA was used as a template for cDNA synthesis using the Riboclone cDNA Synthesis System AMV-RT. An EcoRI adaptor from the Riboclone EcoRI Adaptor Ligation System was added to the complementary DNA, which was then ligated into pUC19. The recombinant plasmid was transformed into E. coli (XL1-Blue). The transformed E. coli XL1-Blue cells were plated on LB agar containing X-Gal, IPTG and ampicillin. Recombinant clones (white colonies) were picked and grown in LB medium at 37°C overnight. Expression of recombinant protein was analysed by immunoblotting in order to identify recombinant cDNA that expresses ESA of the T. gondii local isolate. Recombinant plasmids were isolated using the alkaline lysis method and electrophoresed in a 1% agarose gel. The isolated recombinant plasmid DNA was cut using EcoRI and then sequenced with Big Dye Terminator Mix on an ABI 377A Sequencer using M13 Forward and M13 Reverse primers. The results showed that the recombinant clone was coding for excretory and secretory

  11. Cloning and Sequencing cDNA Encoding for Rhoptry-2 Toxoplasma Gondii Tachyzoite Local Isolate

    Directory of Open Access Journals (Sweden)

    Murwantoko M

    2015-11-01

    Full Text Available Rhoptry proteins belong to the excretory and secretory antigens (ESAs) that play an important role during active penetration of the parasite into the target cell. This protein enables Toxoplasma gondii to actively penetrate the targeted cell, while ESA proteins stimulate intracellular vacuole modification. After the parasite successfully enters the target cell, granule (GRA) proteins are responsible for the formation of the parasitophorous vacuole, which is protected from fusion with other intracellular compartments such as the lysosomal vacuole. Consequently, the parasite is able to survive and multiply in the target cell. The current study aimed to clone and sequence cDNA encoding ROP-2 of a local isolate of T. gondii tachyzoites through recombinant DNA techniques. Total ribonucleic acid (RNA) was isolated from tachyzoites of the local T. gondii isolate that were grown in Balb/c mice. Messenger RNA was isolated from total RNA using the PolyAtract mRNA Isolation System. Messenger RNA was used as a template for cDNA synthesis using the Riboclone cDNA Synthesis System AMV-RT. An EcoRI adaptor from the Riboclone EcoRI Adaptor Ligation System was added to the complementary DNA, which was then ligated into pUC19. The recombinant plasmid was transformed into E. coli (XL1-Blue). The transformed E. coli XL1-Blue cells were plated on LB agar containing X-Gal, IPTG and ampicillin. Recombinant clones (white colonies) were picked and grown in LB medium at 37°C overnight. Expression of recombinant protein was analysed by immunoblotting in order to identify recombinant cDNA that expresses ESA of the T. gondii local isolate. Recombinant plasmids were isolated using the alkaline lysis method and electrophoresed in a 1% agarose gel. The isolated recombinant plasmid DNA was cut using EcoRI and then sequenced with Big Dye Terminator Mix on an ABI 377A Sequencer using M13 Forward and M13 Reverse primers. The results showed that the recombinant clone was coding for excretory

  12. Mink serum amyloid A protein. Expression and primary structure based on cDNA sequences.

    Science.gov (United States)

    Marhaug, G; Husby, G; Dowton, S B

    1990-06-15

    The nucleotide sequences of two mink serum amyloid A (SAA) cDNA clones have been analyzed, one (SAA1) 776 base pairs long and the other (SAA2) 552 base pairs long. Significant differences were discovered when derived amino acid sequences were compared with data for apoSAA isolated from high density lipoprotein. Previous studies of mink protein SAA and amyloid protein A (AA) suggest that only one SAA isotype is amyloidogenic. The cDNA clone for SAA2 defines the "amyloid prone" isotype while SAA1 is found only in serum. Mink SAA1 has alanine in position 10, isoleucine in positions 24, 67, and 71, lysine in position 27, and proline in position 105. Residue 10 in mink SAA2 is valine while arginine and asparagine are at positions 24 and 27, respectively, all characteristics of protein AA isolated from mink amyloid fibrils. Mink SAA2 also has valine in position 67, phenylalanine in position 71, and amino acid 105 is serine. It remains unknown why these six amino acid substitutions render SAA2 more amyloidogenic than SAA1. Eighteen hours after lipopolysaccharide stimulation, mink SAA mRNA is abundant in liver with relatively minor accumulations in brain and lung. Genes encoding both SAA isotypes are expressed in all three organs while no SAA mRNA was detectable in amyloid prone organs, including spleen and intestine, indicating that deposition of AA from locally synthesized SAA is unlikely. A third mRNA species (2.2 kilobases) was identified and hybridizes with cDNA probes for mink SAA1 and SAA2. In addition to a major primary translation product (molecular mass 14,400 Da) an additional product with molecular mass 28,000 Da was immunoprecipitable.

  13. Identification and complete sequencing of novel human transcripts through the use of mouse orthologs and testis cDNA sequences

    DEFF Research Database (Denmark)

    Ferreira, Elisa N; Pires, Lilian C; Parmigiani, Raphael B;

    2004-01-01

    The correct identification of all human genes, and their derived transcripts, has not yet been achieved, and it remains one of the major aims of the worldwide genomics community. Computational programs suggest the existence of 30,000 to 40,000 human genes. However, definitive gene identification can only be achieved by experimental approaches. We used two distinct methodologies, one based on the alignment of mouse orthologous sequences to the human genome, and another based on the construction of a high-quality human testis cDNA library, in an attempt to identify new human transcripts within...

  14. Molecular cloning and sequence analysis of growth hormone cDNA of Neotropical freshwater fish Pacu (Piaractus mesopotamicus)

    Directory of Open Access Journals (Sweden)

    Janeth Silva Pinheiro

    2008-01-01

    Full Text Available RT-PCR was used to amplify Piaractus mesopotamicus growth hormone (GH) cDNA obtained from mRNA extracted from pituitary cells. The amplified fragment was cloned and the complete cDNA sequence was determined. The cloned cDNA encompassed a sequence of 543 nucleotides that encoded a polypeptide of 178 amino acids corresponding to mature P. mesopotamicus GH. Comparison with other GH sequences showed a gap of 10 amino acids localized in the N terminus of the putative polypeptide of P. mesopotamicus. This same gap was also observed in other members of the family. Neighbor-joining tree analysis with GH sequences from fishes belonging to different taxonomic groups placed the P. mesopotamicus GH within the Otophysi group. To our knowledge, this is the first GH sequence of a Neotropical characiform fish deposited in GenBank.

  15. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones

    Science.gov (United States)

    Butterfield, Yaron S. N.; Marra, Marco A.; Asano, Jennifer K.; Chan, Susanna Y.; Guin, Ranabir; Krzywinski, Martin I.; Lee, Soo Sen; MacDonald, Kim W. K.; Mathewson, Carrie A.; Olson, Teika E.; Pandoh, Pawan K.; Prabhu, Anna-Liisa; Schnerch, Angelique; Skalska, Ursula; Smailus, Duane E.; Stott, Jeff M.; Tsai, Miranda I.; Yang, George S.; Zuyderduyn, Scott D.; Schein, Jacqueline E.; Jones, Steven J. M.

    2002-01-01

    We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. Amenable to large scale projects, we have used the method to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites. PMID:12034834
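
    The insertion-site figure in this abstract (1015 of the possible 1024 5-mers) follows from 4^5 = 1024 distinct DNA 5-mers. The sketch below shows one way such coverage could be tallied. It is an illustration only, not the authors' pipeline: the function names and the toy site list are hypothetical, and it assumes each insertion event has already been reduced to a 5-nt string.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Return every possible DNA k-mer as a set of strings."""
    return {"".join(p) for p in product(alphabet, repeat=k)}


def kmer_coverage(observed_sites, k=5):
    """Count how many of the possible k-mers occur among the observed sites.

    Sites that are not exactly k unambiguous bases long are ignored.
    Returns (number seen, number possible, fraction seen).
    """
    possible = all_kmers(k)
    seen = {site.upper() for site in observed_sites if len(site) == k} & possible
    return len(seen), len(possible), len(seen) / len(possible)


# Hypothetical toy input; the study itself analysed 22,785 insertion events.
toy_sites = ["ACGTA", "acgta", "TTTTT", "NNNNN"]
n_seen, n_possible, fraction = kmer_coverage(toy_sites)
print(n_seen, n_possible, fraction)

# The reported result corresponds to this fraction of all 5-mers:
print(1015 / 1024)
```

    Using a set makes repeated insertions at the same 5-mer count once, which matches the "sites used at least once" reading of the reported figure.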

  16. Primary analysis of the expressed sequence tags in a pentastomid nymph cDNA library.

    Directory of Open Access Journals (Sweden)

    Jing Zhang

    Full Text Available BACKGROUND: Pentastomiasis is a rare zoonotic disease caused by pentastomids. Despite their worm-like appearance, they are commonly placed into a separate sub-class of the subphylum Crustacea, phylum Arthropoda. However, until now, the systematic classification of the pentastomids and the diagnosis of pentastomiasis remain immature, and genetic information about pentastomid nylum is almost nonexistent. The objective of this study was to obtain information on pentastomid nymph genes and identify the gene homologues related to host-parasite interactions or stage-specific antigens. METHODOLOGY/PRINCIPAL FINDINGS: Total pentastomid nymph RNA was used to construct a cDNA library and 500 colonies were sequenced. Analysis identified 197 unigenes. Of these, 147 genes were annotated, and 75 unigenes (53.19%) were mapped to 82 KEGG pathways, including 29 metabolism pathways, 29 genetic information processing pathways, 4 environmental information processing pathways, 7 cell motility pathways and 5 organismal systems pathways. Additionally, two host-parasite interaction-related gene homologues were identified: a putative Kunitz inhibitor and a putative cysteine protease. CONCLUSION/SIGNIFICANCE: For the first time, we successfully constructed a cDNA library and obtained a number of expressed sequence tags (ESTs) from pentastomid nymphs, which will lay the foundation for further study of pentastomids and pentastomiasis.

  17. Molecular cloning and sequence analysis of hamster CENP-A cDNA

    Science.gov (United States)

    Figueroa, Javier; Pendón, Carlos; Valdivia, Manuel M

    2002-01-01

    Background: The centromere is a specialized locus that mediates chromosome movement during mitosis and meiosis. This chromosomal domain comprises a uniquely packaged form of heterochromatin that acts as a nucleus for the assembly of the kinetochore, a trilaminar proteinaceous structure on the surface of each chromatid at the primary constriction. Kinetochores mediate interactions with the spindle fibers of the mitotic apparatus. Centromere protein A (CENP-A) is a histone H3-like protein specifically located at the inner plate of the kinetochore at active centromeres. CENP-A works as a component of specialized nucleosomes at centromeres bound to arrays of repeat satellite DNA. Results: We have cloned the hamster homologue of human and mouse CENP-A. The cDNA isolated was found to contain an open reading frame encoding a polypeptide consisting of 129 amino acid residues with a C-terminal histone fold domain highly homologous to those of CENP-A and H3 sequences previously released. However, significant sequence divergence was found at the N-terminal region of hamster CENP-A, which is five and eleven residues shorter than those of mouse and human, respectively. Further, a human serine 7 residue, a target site for Aurora B kinase phosphorylation involved in the mechanism of cytokinesis, was not found in the hamster protein. A human autoepitope at the N-terminal region of CENP-A described in autoimmune diseases is not conserved in the hamster protein. Conclusions: We have cloned the hamster cDNA for the centromeric protein CENP-A. Significant differences in protein sequence were found at the N-terminal tail of hamster CENP-A in comparison with that of human and mouse. Our results show a high degree of evolutionary divergence of kinetochore CENP-A proteins in mammals. This is related to the highly diverse nucleotide repeat sequences found at the centromere DNA among species and supports a current centromere model for kinetochore function and structural plasticity. PMID:12019018

  18. cDNA cloning and sequence analysis of NIb gene of soybean mosaic virus

    Institute of Scientific and Technical Information of China (English)

    刘俊君; 彭学贤; 莽克强

    1995-01-01

    cDNA of soybean mosaic virus (Beijing isolate, SMV-BJ) has been synthesized, using viral genomic RNA as template and random hexanucleotides as primers. Based on the sequences of the SMV-BJ coat protein (CP) gene as well as SMV- and WMV-II-related regions, oligonucleotides were made as primers for polymerase chain reaction (PCR). The NIb gene of SMV-BJ was amplified by PCR and cloned into pBluescript SK. The complete sequence was determined. Comparison of the NIb genes of SMV-BJ and WMV-II (USA) shows that nucleotide sequence similarity reaches 80.3% and deduced amino acid sequence similarity 91.3%. Given the high identities in the CP gene and the 3’-non-coding region between them, WMV-II might be considered a watermelon strain of SMV. In addition, some unexpected sequences were found in the 3’-region of 2 NIb gene clones. Following modification and splicing, a binary vector of the NIb gene has been constructed for its expression in higher plants for the purpose of studying the possible repl

  19. Determination of cDNA and genomic DNA sequences of hevamine, a chitinase from the rubber tree Hevea brasiliensis

    NARCIS (Netherlands)

    Bokma, E; Spiering, M; Chow, KS; Mulder, PPMFA; Subroto, T; Beintema, JJ

    2001-01-01

    Hevamine is a chitinase from the rubber tree Hevea brasiliensis and belongs to the family 18 glycosyl hydrolases. This paper describes the cloning of hevamine DNA and cDNA sequences. Hevamine contains a signal peptide at the N-terminus and a putative vacuolar targeting sequence at the C-terminus whi

  20. Isolation and characterization of sequences homologous to the tobacco clone axi 1 (auxin independent) from a Vicia sativa nodule cDNA library

    NARCIS (Netherlands)

    Yalçin-Mendi, Y.; Çetiner, S.; Bisseling, T.

    2001-01-01

    In this research, partial nucleotide sequences of the axi 1 gene, which is related to auxin perception and transduction, isolated from Vicia sativa using cDNA library screening were investigated. Four V. sativa cDNA clones representing homologues of the tobacco axi 1 (auxin independent) cDNA clone w

  1. Cloning and Expression Analysis of Downy Mildew Resistance-Related cDNA Sequences in Melon

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    Melon downy mildew caused by Pseudoperonospora cubensis leads to significant losses in melon yields worldwide. Reverse-transcription polymerase chain reaction (RT-PCR) was performed using cDNAs as templates from melon Huangdanzi induced with the fungus Pseudoperonospora cubensis, and degenerate primers designed based on the conserved amino acid sequences of known plant disease-resistance genes. A polymorphic cDNA fragment, which we named mp-19, was cloned and sequenced. The open reading frame (ORF) of this product comprised 510 base pairs, which encode a DNA- or RNA-binding protein of 170 amino acids. The putative amino acid sequence of mp-19 appeared highly homologous with those of NBS-type resistance genes isolated from other plants. Southern blot indicated that the melon genome contained more than 3 copies of mp-19. Obvious expression differences, detected by semi-quantitative RT-PCR, were observed between the resistant line Huangdanzi and the susceptible line Jiashi after Pseudoperonospora cubensis infection, which implied that the mp-19 gene may be related to downy mildew resistance in melon.

  2. Nucleotide sequence of cDNA coding for dianthin 30, a ribosome inactivating protein from Dianthus caryophyllus.

    Science.gov (United States)

    Legname, G; Bellosta, P; Gromo, G; Modena, D; Keen, J N; Roberts, L M; Lord, J M

    1991-08-27

    Rabbit antibodies raised against dianthin 30, a ribosome inactivating protein from carnation (Dianthus caryophyllus) leaves, were used to identify a full length dianthin precursor cDNA clone from a lambda gt11 expression library. N-terminal amino acid sequencing of purified dianthin 30 and dianthin 32 confirmed that the clone encoded dianthin 30. The cDNA was 1153 basepairs in length and encoded a precursor protein of 293 amino acid residues. The first 23 N-terminal amino acids of the precursor represented the signal sequence. The protein contained a carboxy-terminal region which, by analogy with barley lectin, may contain a vacuolar targeting signal.

  3. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    Science.gov (United States)

    Whitfield, Charles W; Band, Mark R; Bonaldo, Maria F; Kumar, Charu G; Liu, Lei; Pardinas, Jose R; Robertson, Hugh M; Soares, M Bento; Robinson, Gene E

    2002-04-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BI502708-BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.]
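
    The detection criterion mentioned in this abstract (signal two standard deviations greater than exogenous controls) can be sketched as a simple threshold test. This is one plausible reading of the criterion, not the authors' documented procedure: the use of the sample standard deviation, the function name, and all intensity values below are assumptions made for illustration.

```python
from statistics import mean, stdev


def call_detected(signals, controls, n_sd=2.0):
    """Flag each spot whose signal exceeds mean(controls) + n_sd * SD(controls).

    signals  : dict mapping a spot name to its measured intensity
    controls : intensities of the exogenous (negative) control spots
    """
    threshold = mean(controls) + n_sd * stdev(controls)
    return {name: value > threshold for name, value in signals.items()}


# Hypothetical intensities, for illustration only.
control_spots = [100, 110, 90, 105, 95]
bee_spots = {"est_a": 300, "est_b": 112}

calls = call_detected(bee_spots, control_spots)
fraction_detected = sum(calls.values()) / len(calls)
print(calls, fraction_detected)
```

    With real data, the fraction of spots flagged this way would be the quantity comparable to the 90% reported in the abstract.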

  4. Approaching marine bioprospecting in hexacorals by RNA deep sequencing.

    Science.gov (United States)

    Johansen, Steinar D; Emblem, Ase; Karlsen, Bård Ove; Okkenhaug, Siri; Hansen, Hilde; Moum, Truls; Coucheron, Dag H; Seternes, Ole Morten

    2010-07-31

    RNA deep sequencing represents a new complementary approach in marine bioprospecting. Next-generation sequencing platforms have recently been developed for de novo whole transcriptome analysis, small RNA discovery and gene expression profiling. Deep sequencing transcriptomics (sequencing the complete set of cellular transcripts at a specific stage or condition) leads to sequential identification of all expressed genes in a sample. When combined with high-throughput bioinformatics and protein synthesis, RNA deep sequencing represents a powerful new approach in gene product discovery and bioprospecting. Here we summarize recent progress in the analyses of hexacoral transcriptomes, with a focus on cold-water sea anemones and related organisms.

  5. cDNA sequence and protein bioinformatics analyses of MSTN in African catfish (Clarias gariepinus).

    Science.gov (United States)

    Kanjanaworakul, Poonmanee; Sawatdichaikul, Orathai; Poompuang, Supawadee

    2016-04-01

    Myostatin, also known as growth differentiation factor 8, has been identified as a potent negative regulator of skeletal muscle growth. The purpose of this study was to characterize and predict the function of the myostatin gene of the African catfish (Cg-MSTN). Expression of Cg-MSTN was determined at three growth stages to establish the relationship between the levels of MSTN transcript and skeletal muscle growth. The partial cDNA sequence of Cg-MSTN was cloned by using published information from its congener walking catfish (Cm-MSTN). The Cg-MSTN was 1194 bp in length, encoding a protein of 397 amino acids. The deduced MSTN sequence exhibited key functional sites similar to those of other members of the TGF-β superfamily, especially the proteolytic processing site (RXXR motif) and nine conserved cysteines at the C-terminus. Expression of MSTN appeared to be correlated with muscle development and growth of African catfish. Protein bioinformatics revealed that the primary sequence of Cg-MSTN shared 98% sequence identity with that of walking catfish Cm-MSTN, with only two different residues, [Formula: see text] and [Formula: see text]. The proposed model of Cg-MSTN revealed the key point mutation [Formula: see text] causing a 7.35 Å shorter distance between the N- and C-lobes and an approximately 11° narrower angle than those of Cm-MSTN. The substitution of a proline residue near the proteolytic processing site, which altered the structure of myostatin, may play a critical role in reducing proteolytic activity of this protein in African catfish.
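
    The relation between the 1194 bp sequence and the 397-residue protein in this abstract is ordinary codon arithmetic, sketched below. The sketch assumes the reported length is a coding region that includes one terminal stop codon; the abstract does not state this, and the function name is hypothetical.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an open reading frame.

    orf_nt        : ORF length in nucleotides (must be a multiple of 3)
    includes_stop : True if the length counts the terminal stop codon,
                    which encodes no amino acid
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 1194 nt = 398 codons; minus the assumed stop codon leaves 397 residues.
print(protein_length(1194))  # 397
```

    Whether a published ORF length counts the stop codon varies between papers, which is why the flag is explicit here rather than hard-coded.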

  6. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    Energy Technology Data Exchange (ETDEWEB)

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T. (The Howard Florey Institute of Experimental Physiology and Medicine, Parkville, Victoria (Australia))

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid and submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.
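
    Two pieces of this abstract lend themselves to a short worked example: the mature protein length (308 residues minus the 17-residue leader gives 291) and the search for potential N-glycosylation sites matching Asn-X-Thr/Ser. The sketch below is illustrative only. The peptide is a made-up toy string, not the CA VI sequence, and excluding proline at position X is a widely used convention that the abstract itself does not mention.

```python
import re

# Asn, then any residue except Pro (assumed convention), then Ser or Thr.
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of the Asn in each N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def mature_length(precursor_aa, leader_aa):
    """Length of the mature protein once the leader peptide is removed."""
    return precursor_aa - leader_aa


toy_peptide = "MNATKLNPSGNGSA"   # hypothetical sequence for illustration
print(find_sequons(toy_peptide))  # [2, 11]
print(mature_length(308, 17))     # 291
```

    A match only marks a potential site; as the abstract notes for sheep CA VI, not every sequon is necessarily glycosylated.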

  7. Deep sequencing: becoming a critical tool in clinical virology.

    Science.gov (United States)

    Quiñones-Mateu, Miguel E; Avila, Santiago; Reyes-Teran, Gustavo; Martinez, Miguel A

    2014-09-01

    Population (Sanger) sequencing has been the standard method in basic and clinical DNA sequencing for almost 40 years; however, next-generation (deep) sequencing methodologies are now revolutionizing the field of genomics, and clinical virology is no exception. Deep sequencing is highly efficient, producing an enormous amount of information at low cost in a relatively short period of time. High-throughput sequencing techniques have enabled significant contributions to multiple areas in virology, including virus discovery and metagenomics (viromes), molecular epidemiology, pathogenesis, and studies of how viruses escape the host immune system and antiviral pressures. In addition, new and more affordable deep sequencing-based assays are now being implemented in clinical laboratories. Here, we review the use of the current deep sequencing platforms in virology, focusing on three of the most studied viruses: human immunodeficiency virus (HIV), hepatitis C virus (HCV), and influenza virus.

  8. Update History of This Database - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Budding yeast cDNA sequencing project Update History of This Database Date Update contents 2010/03/29 Buddin...

  9. Cloning and sequencing of murine T3 gamma cDNA from a subtractive cDNA library.

    Science.gov (United States)

    Haser, W G; Saito, H; Koyama, T; Tonegawa, S

    1987-10-01

    The coding sequences of the murine and human T3 gamma chains are of identical length (182 amino acids) and contain a remarkable conservation of residues. The most striking observation is the high degree of homology between the murine and human cytosolic domains (89%), suggesting that the effector function of the T3 complex may be extremely similar or identical within human and murine lymphocytes. Both murine and human T lymphocytes can express two T3 gamma mRNA transcripts, suggesting that a second polyadenylation signal is present downstream. A poly(A) tail is not found in the 3' untranslated region of the murine gamma presented here, indicating that the murine clones analyzed represent mRNA generated by reading through the overlapping poly(A) signals at position 850-860 and possibly terminating at a position that would produce the 1.0 kb transcript.

  10. Studies of the hyperthermophile Thermotoga maritima by random sequencing of cDNA and genomic libraries. Identification and sequencing of the trpEG (D) operon.

    Science.gov (United States)

    Kim, C W; Markiewicz, P; Lee, J J; Schierle, C F; Miller, J H

    1993-06-20

    Random sequencing of cDNA and genomic libraries has been used to study the genome of the hyperthermophile Thermotoga maritima. To date, 175 unique clones have been analyzed by comparing short sequence tags with known proteins in the PIR and GenBank databases. We find that a significant proportion of sequences can be matched to previously identified proteins from non-Thermotoga sources. A high match rate was obtained from an oligo(dT)-primed cDNA library, where one-third of all unique sequences analyzed (21/65) shared high amino acid sequence similarity with proteins in the PIR and GenBank databases. Also, approximately one-third of the unique sequences from a second cDNA library (28/89), constructed with random oligo primers, could be matched to sequences in PIR and GenBank. Identification of genes from the oligo(dT)-primed cDNA library indicates that some Thermotoga mRNAs are polyadenylated. Genes have also been identified from a 1 to 2 kb genomic DNA library. Here, 3 of the 21 genomic sequences analyzed could be matched to proteins in PIR and GenBank. One of the genomic clones had high sequence similarity to the tryptophan synthesis gene anthranilate synthase component I (trpE). Using this sequence tag, the Thermotoga trp operon was isolated and sequenced. The Thermotoga maritima trp operon is arranged with trpE forming an overlapping transcript with a second protein consisting of a fusion of anthranilate synthase component II (trpG) and anthranilate phosphoribosyltransferase (trpD). With regard to the fusion, the operon organization is similar to that of Escherichia coli and Salmonella typhimurium, but lacks the classic attenuation system of enteric bacteria. Amino acid sequence comparison with 19 trpE, 18 trpG and 14 trpD genes from other organisms suggests that the Thermotoga trp genes resemble corresponding genes from other thermophiles more closely than expected.

  11. Cloning and Sequencing of Protein Kinase cDNA from Harbor Seal (Phoca vitulina) Lymphocytes

    Directory of Open Access Journals (Sweden)

    Jennifer C. C. Neale

    2004-01-01

    Full Text Available Protein kinases (PKs) play critical roles in signal transduction and activation of lymphocytes. The identification of PK genes provides a tool for understanding mechanisms of immunotoxic xenobiotics. As part of a larger study investigating persistent organic pollutants in the harbor seal and their possible immunomodulatory actions, we sequenced harbor seal cDNA fragments encoding PKs. The procedure, using degenerate primers based on conserved motifs of human protein tyrosine kinases (PTKs), successfully amplified nine phocid PK gene fragments with high homology to human and rodent orthologs. We identified eight PTKs and one dual (serine/threonine and tyrosine) kinase. Among these were several PKs important in early signaling events through the B- and T-cell receptors (FYN, LYN, ITK and SYK) and a MAP kinase involved in downstream signal transduction. V-FGR, RET and DDR2 were also expressed. Sequential activation of protein kinases ultimately induces gene transcription leading to the proliferation and differentiation of lymphocytes critical to adaptive immunity. PKs are potential targets of bioactive xenobiotics, including persistent organic pollutants of the marine environment; characterization of these molecules in the harbor seal provides a foundation for further research illuminating mechanisms of action of contaminants speculated to contribute to large-scale die-offs of marine mammals via immunosuppression.

  12. Geoseq: a tool for dissecting deep-sequencing datasets

    OpenAIRE

    Homann Robert; George Ajish; Levovitz Chaya; Shah Hardik; Cancio Anthony; Gurtowski James; Sachidanandam Ravi

    2010-01-01

    Abstract Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results: Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments...

  13. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    Directory of Open Access Journals (Sweden)

    Wallis James G

    2007-07-01

    Full Text Available Abstract Background: Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results: Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion: Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at

  14. Large-scale Identification of Expressed Sequence Tags (ESTs) from Nicotiana tabacum by Normalized cDNA Library Sequencing

    Directory of Open Access Journals (Sweden)

    Alvarez S Perez

    2014-12-01

    Full Text Available An expressed sequence tag (EST) resource for tobacco plants (Nicotiana tabacum) was established using high-throughput sequencing of randomly selected clones from one cDNA library representing a range of plant organs (leaf, stem, root and root base). Over 5000 ESTs were generated from the 3’ ends of 8000 clones, analyzed by BLAST searches and categorized functionally. All annotated ESTs were classified into 18 functional categories; unique transcripts involved in energy were the largest group, accounting for 831 (32.32%) of the annotated ESTs. After excluding 2450 non-significant tentative unique transcripts (TUTs), 100 unique sequences (1.67% of total TUTs) were identified from the N. tabacum database. In the array result, two genes strongly related to the tobacco mosaic virus (TMV) were obtained: one basic form of pathogenesis-related protein 1 precursor (TBT012G08) and ubiquitin (TBT087G01). Both were found in the variety Hongda. Other important genes were classified into two groups: one implicated in plant development, such as genes related to photosynthesis (chlorophyll a-b binding protein, photosystem I, ferredoxin I and III, ATP synthase), and a further group of genes related to plant stress response (ubiquitin, ubiquitin-like protein SMT3, glycine-rich RNA binding protein, histones and metallothionein). The interesting finding in this study is that two of these genes have never been reported before in N. tabacum (ubiquitin-like protein SMT3 and metallothionein). The array results were confirmed using quantitative PCR.

  15. Next-generation sequencing-based 5' rapid amplification of cDNA ends for alternative promoters.

    Science.gov (United States)

    Perera, Bambarendage P U; Kim, Joomyeong

    2016-02-01

    Mammalian genomes contain many unknown alternative first exons and promoters. Thus, we have modified the existing 5'RACE (5' rapid amplification of cDNA ends) approach into a new next-generation sequencing (NGS)-based protocol that can identify these alternative promoters. This protocol incorporates two main ideas: (i) 5'RACE starting from the known second exons of genes and (ii) NGS-based sequencing of the subsequent cDNA products. This protocol also provides a bioinformatics strategy that processes the sequence reads from NGS runs. This protocol has successfully identified several alternative promoters for an imprinted gene, PEG3. Overall, this NGS-based 5'RACE protocol is a sensitive and reliable method for detecting low-abundance transcripts and promoters.

  16. Human liver phosphatase 2A: cDNA and amino acid sequence of two catalytic subunit isotypes

    Energy Technology Data Exchange (ETDEWEB)

    Arino, J.; Woon, Chee Wai; Brautigan, D.L.; Miller, T.B. Jr.; Johnson, G.L. (Univ. of Massachusetts Medical School, Worcester (USA))

    1988-06-01

    Two cDNA clones were isolated from a human liver library that encode two phosphatase 2A catalytic subunits. The two cDNAs differed in eight amino acids (97% identity) with three nonconservative substitutions. All of the amino acid substitutions were clustered in the amino-terminal domain of the protein. Amino acid sequence of one human liver clone (HL-14) was identical to the rabbit skeletal muscle phosphatase 2A cDNA (with 97% nucleotide identity). The second human liver clone (HL-1) is encoded by a separate gene, and RNA gel blot analysis indicates that both mRNAs are expressed similarly in several human clonal cell lines. Sequence comparison with phosphatase 1 and 2A indicates highly divergent amino acid sequences at the amino and carboxyl termini of the proteins and identifies six highly conserved regions between the two proteins that are predicted to be important for phosphatase enzymatic activity.

  17. Isolation, characterization and cDNA sequencing of a Kazal family proteinase inhibitor from seminal plasma of turkey (Meleagris gallopavo).

    Science.gov (United States)

    Słowińska, Mariola; Olczak, Mariusz; Wojtczak, Mariola; Glogowski, Jan; Jankowski, Jan; Watorek, Wiesław; Amarowicz, Ryszard; Ciereszko, Andrzej

    2008-06-01

    The turkey reproductive tract and seminal plasma contain a serine proteinase inhibitor that seems to be unique to the reproductive tract. Our experimental objective was to isolate and characterize the Kazal family proteinase inhibitor from turkey seminal plasma and testis and to determine its cDNA sequence. Seminal plasma contains two forms of a Kazal family inhibitor: virgin (Ia), represented by an inhibitor of moderate electrophoretic migration rate (present also in the testis), and modified (Ib, a split peptide bond), represented by an inhibitor with a fast migration rate. The inhibitor from the seminal plasma was purified by affinity, ion-exchange and reverse phase chromatography. The testis inhibitor was purified by affinity and ion-exchange chromatography. The N-terminal Edman sequences of the two seminal plasma inhibitors and the testis inhibitor were identical. This sequence was used to construct primers and obtain a cDNA sequence from the testis. Analysis of the cDNA sequence indicated that the turkey proteinase inhibitor belongs to the Kazal family inhibitors (pancreatic secretory trypsin inhibitors, mammalian acrosin inhibitors) and caltrin. The turkey seminal plasma Kazal inhibitor is a low molecular mass inhibitor and is characterized by a high value of the equilibrium association constant for inhibitor/trypsin complexes.

  18. Glutamate-gated chloride channel subunit cDNA sequencing of Cochliomyia hominivorax (Diptera: Calliphoridae): cDNA variants and polymorphisms.

    Science.gov (United States)

    Lopes, Alberto Moura Mendes; de Carvalho, Renato Assis; de Azeredo-Espin, Ana Maria Lima

    2014-09-01

    The New World screwworm (NWS) Cochliomyia hominivorax (Coquerel) is one of the major myiasis-causing flies that injures livestock and leads to losses of ~US$ 2.7 billion/year in the Neotropics. Ivermectin (IVM), a macrocyclic lactone (ML), is the most widely used preventive insecticide for this parasite and targets the glutamate-gated chloride (GluClα) channels. Several authors have associated altered GluClα homologues with ML resistance in invertebrates, although studies of resistance in NWS are limited to other genes. Here, we aimed to characterise the NWS GluClα (ChGluClα) cDNA and to search for alterations associated with IVM resistance in NWS larvae from a bioassay. The open reading frame of the ChGluClα comprised 1,359 bp and encoded a sequence of 452 amino acids. The ChGluClα cDNAs of the bioassay larvae showed different sequences that could be splice variants, which agrees with the occurrence of alternative splicing in GluClα homologues. In addition, we found cDNAs with premature stop codons and the K242R SNP, which occurred more frequently in the surviving larvae and was located close to a mutation (L256F) involved in ML resistance. Although these alterations were in low frequency, the ChGluClα sequencing will allow further studies to find alterations in the gene of resistant natural populations.
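
    The relation between the reported ORF length (1,359 bp) and protein length (452 amino acids) can be checked with a short calculation. This is a minimal sketch, assuming that the reported ORF length includes the terminal stop codon; the function name and its parameter are illustrative and do not come from the abstract.

```python
def orf_to_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and therefore does not contribute an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# ORF length reported for ChGluClα in the abstract above.
print(orf_to_protein_length(1359))
```

    Under that assumption, 1,359 bp correspond to 453 codons, one of which is the stop codon, which matches the 452 residues stated in the abstract.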

  19. Nucleotide sequence and infectious cDNA clone of the L1 isolate of Pea seed-borne mosaic potyvirus.

    Science.gov (United States)

    Olsen, B S; Johansen, I E

    2001-01-01

    The complete nucleotide sequence of Pea seed-borne mosaic potyvirus isolate L1 has been determined from cloned virus cDNA. The PSbMV L1 genome is 9895 nucleotides in length excluding the poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9594 nucleotides. The ORF potentially encodes a polyprotein of 3198 amino acids with a deduced Mr of 363537. Nine putative proteolytic cleavage sites were identified by analogy to consensus sequences and genome arrangement in other potyviruses. Two full-length cDNA clones, p35S-L1-4 and p35S-L1-5, were assembled under control of an enhanced 35S promoter and nopaline synthase terminator. Clone p35S-L1-4 was constructed with four introns and p35S-L1-5 with five introns inserted in the cDNA. Clone p35S-L1-4 was unstable in Escherichia coli often resulting in amplification of plasmids with deletions. Clone p35S-L1-5 was stable and apparently less toxic to Escherichia coli resulting in larger bacterial colonies and higher plasmid yield. Both clones were infectious upon mechanical inoculation of plasmid DNA on susceptible pea cultivars Fjord, Scout, and Brutus. Eight pea genotypes resistant to L1 virus were also resistant to the cDNA derived L1 virus. Both native PSbMV L1 and the cDNA derived virus infected Chenopodium quinoa systemically giving rise to characteristic necrotic lesions on uninoculated leaves.

  20. Generation and Analysis of Expressed Sequence Tags (ESTs) from Muscle Full-Length cDNA Library of Wujin Pig

    Institute of Scientific and Technical Information of China (English)

    ZHAO Su-mei; LIU Yong-gang; PAN Hong-bing; ZHANG Xi; GE Chang-rong; JIA Jun-jing; GAO Shi-zheng

    2014-01-01

    Porcine skeletal muscle genes play a major role in determining muscle growth and meat quality. Construction of a full-length cDNA library is an effective way to understand the expression of functional genes in muscle tissues. In addition, novel genes for further research could be identified in the library. In this study, we constructed a full-length cDNA library from porcine muscle tissue. The estimated average size of the cDNA inserts was 1076 bp, and the cDNA fullness ratio was 86.2%. A total of 1058 unique sequences with 342 contigs (32.3%) and 716 singleton (67.7%) expressed sequence tags (ESTs) were obtained by clustering and assembling. Meanwhile, 826 (78.1%) ESTs were categorized as known genes, and 232 (21.9%) ESTs were categorized as unknown genes. 65 novel porcine genes that exhibit no identity in the TIGR gene index of Sus scrofa and 124 full-length sequences with unknown functions were deposited in the dbEST division of GenBank (accession numbers: EU650784-EU650788, GE843306, GH228978-GH229100). The abundantly expressed genes in porcine muscle tissue were related to muscle fiber development, energy metabolism and protein synthesis. Gene ontology analysis showed that sequences expressed in porcine muscle tissue contained a high percentage of binding activity, catalytic activity, structural molecule activity and motor activity, were involved mainly in metabolic, cellular and developmental processes, and were distributed mainly in the intracellular region. The sequence data generated in this study would provide valuable information for identifying porcine genes expressed in muscle tissue and help to advance the study on the structure and function of genes in pigs.

  1. cDNA sequence, mRNA expression and genomic DNA of trypsinogen from the indianmeal moth, Plodia interpunctella.

    Science.gov (United States)

    Zhu, Y C; Oppert, B; Kramer, K J; McGaughey, W H; Dowdy, A K

    2000-02-01

    Trypsin-like enzymes are major insect gut enzymes that digest dietary proteins and proteolytically activate insecticidal proteins produced by the bacterium Bacillus thuringiensis (Bt). Resistance to Bt in a strain of the Indianmeal moth, Plodia interpunctella, was linked to the absence of a major trypsin-like proteinase (Oppert et al., 1997). In this study, trypsin-like proteinases, cDNA sequences, mRNA expression levels and genomic DNAs from Bt-susceptible and -resistant strains of the Indianmeal moth were compared. Proteinase activity blots of gut extracts indicated that the susceptible strain had two major trypsin-like proteinases, whereas the resistant strain had only one. Several trypsinogen-like cDNA clones were isolated and sequenced from cDNA libraries of both strains using a probe deduced from a conserved sequence for a serine proteinase active site. cDNAs of 852 nucleotides from the susceptible strain and 848 nucleotides from the resistant strain contained an open reading frame of 783 nucleotides which encoded a 261-amino acid trypsinogen-like protein. There was a single silent nucleotide difference between the two cDNAs in the open reading frame and the predicted amino acid sequence from the cDNA clones was most similar to sequences of trypsin-like proteinases from the spruce budworm, Choristoneura fumiferana, and the tobacco hornworm, Manduca sexta. The encoded protein included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Northern blotting analysis showed no major difference between the two strains in mRNA expression in fourth-instar larvae, indicating that transcription was similar in the strains. Southern blotting analysis revealed that the restriction sites for the trypsinogen genes from the susceptible and resistant strains were different. Based on an enzyme size comparison, the cDNA isolated in this study corresponded to the gene for the smaller of two

  2. In-depth cDNA Library Sequencing Provides Quantitative Gene Expression Profiling in Cancer Biomarker Discovery

    Institute of Scientific and Technical Information of China (English)

    Wanling Yang; Dingge Ying; Yu-Lung Lau

    2009-01-01

    procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.

  3. The cDNA Nucleotide Sequence of EPSPS Gene from Orychophragmus violaceus

    Institute of Scientific and Technical Information of China (English)

    刘晓军; 邓运涛; 游大慧; 李旭锋

    2002-01-01

    Source: The sequence was determined using the 3′ RACE (rapid amplification of cDNA ends) RT-PCR and 5′ RACE RT-PCR products, which were ligated into the pMD18-T vector, from the cDNA of Orychophragmus violaceus.

  4. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    OpenAIRE

    In-Sook Cho; Davaajargal Igori; Seungmo Lim; Gug-Seoun Choi; John Hammond; Hyoun-Sub Lim; Jae Sun Moon

    2016-01-01

    Deep sequencing has generated 52 contigs derived from five viruses; Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV) were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. S...

  5. cDNA sequence analysis of a 29-kDa cysteine-rich surface antigen of pathogenic Entamoeba histolytica

    Energy Technology Data Exchange (ETDEWEB)

    Torian, B.E.; Stroeher, V.L.; Stamm, W.E. (Univ. of Washington, Seattle (USA)); Flores, B.M. (Louisiana State Univ. Medical Center, New Orleans (USA)); Hagen, F.S. (Zymogenetics Incorporated, Seattle, WA (USA))

    1990-08-01

    A λgt11 cDNA library was constructed from poly(U)-Sepharose-selected Entamoeba histolytica trophozoite RNA in order to clone and identify surface antigens. The library was screened with rabbit polyclonal anti-E. histolytica serum. A 700-base-pair cDNA insert was isolated and the nucleotide sequence was determined. The deduced amino acid sequence of the cDNA revealed a cysteine-rich protein. DNA hybridizations showed that the gene was specific to E. histolytica since the cDNA probe reacted with DNA from four axenic strains of E. histolytica but did not react with DNA from Entamoeba invadens, Acanthamoeba castellanii, or Trichomonas vaginalis. The insert was subcloned into the expression vector pGEX-1 and the protein was expressed as a fusion with the C terminus of glutathione S-transferase. Purified fusion protein was used to generate 22 monoclonal antibodies (mAbs) and a mouse polyclonal antiserum specific for the E. histolytica portion of the fusion protein. A 29-kDa protein was identified as a surface antigen when mAbs were used to immunoprecipitate the antigen from metabolically ³⁵S-labeled live trophozoites. The surface location of the antigen was corroborated by mAb immunoprecipitation of a 29-kDa protein from surface-¹²⁵I-labeled whole trophozoites as well as by the reaction of mAbs with live trophozoites in an indirect immunofluorescence assay performed at 4°C. Immunoblotting with mAbs demonstrated that the antigen was present on four axenic isolates tested. mAbs recognized epitopes on the 29-kDa native antigen on some but not all clinical isolates tested.

  6. Geoseq: a tool for dissecting deep-sequencing datasets

    Directory of Open Access Journals (Sweden)

    Homann Robert

    2010-10-01

    Full Text Available Abstract Background: Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO), the Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Results: Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Conclusions: Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
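
    The tiling idea described in this abstract (cut the query sequence into tiles, then count each tile in a read library through a suffix array) can be illustrated with a toy sketch. This is not Geoseq's implementation: the function names, the tile size and step, the "$" read separator and the naive suffix-array construction are all assumptions made for illustration, and the sketch counts tile occurrences on the forward strand only.

```python
def build_suffix_array(text: str) -> list[int]:
    """Naive suffix array (sorts full suffixes); adequate for toy inputs only."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text: str, sa: list[int], pattern: str) -> int:
    """Count occurrences of pattern in text by binary search on the suffix array."""
    m = len(pattern)
    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


def tile_coverage(reference: str, reads: list[str], k: int, step: int) -> list[tuple[int, int]]:
    """Return (tile offset, occurrence count) for each k-mer tile of the reference."""
    # "$" separates reads so that a tile can never match across a read boundary.
    text = "$".join(reads) + "$"
    sa = build_suffix_array(text)
    return [
        (offset, count_occurrences(text, sa, reference[offset:offset + k]))
        for offset in range(0, len(reference) - k + 1, step)
    ]


toy_reads = ["ACGTAC", "GTACGT", "TTTT"]
print(tile_coverage("ACGTACGT", toy_reads, k=4, step=2))
```

    The suffix array is built once per library and then reused for every tile, which is presumably why pre-computing it for public libraries is attractive; a production index would use a linear-time construction rather than the naive sort shown here.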

  7. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    Directory of Open Access Journals (Sweden)

    Bendahmane Abdelhafid

    2011-05-01

    Full Text Available Abstract Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but

  8. Deep sequencing in the management of hepatitis virus infections.

    Science.gov (United States)

    Quer, Josep; Rodríguez-Frias, Francisco; Gregori, Josep; Tabernero, David; Soria, Maria Eugenia; García-Cehic, Damir; Homs, Maria; Bosch, Albert; Pintó, Rosa María; Esteban, Juan Ignacio; Domingo, Esteban; Perales, Celia

    2016-12-28

    The hepatitis viruses represent a major public health problem worldwide. Procedures for characterization of the genomic composition of their populations, accurate diagnosis, identification of multiple infections, and information on inhibitor-escape mutants for treatment decisions are needed. Deep sequencing methodologies are extremely useful for these viruses since they replicate as complex and dynamic quasispecies swarms whose complexity and mutant composition are biologically relevant traits. Population complexity is a major challenge for disease prevention and control, but also an opportunity to distinguish among related but phenotypically distinct variants that might anticipate disease progression and treatment outcome. Detailed characterization of mutant spectra should permit choosing better treatment options, given the increasing number of new antiviral inhibitors available. In the present review we briefly summarize our experience on the use of deep sequencing for the management of hepatitis virus infections, particularly for hepatitis B and C viruses, and outline some possible new applications of deep sequencing for these important human pathogens.

  9. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    Science.gov (United States)

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  10. Canine amino acid transport system Xc(-): cDNA sequence, distribution and cystine transport activity in lens epithelial cells.

    Science.gov (United States)

    Maruo, Takuya; Kanemaki, Nobuyuki; Onda, Ken; Sato, Reiichiro; Ichihara, Nobuteru; Ochiai, Hideharu

    2014-04-01

    The cystine transport activity of a lens epithelial cell line originated from a canine mature cataract was investigated. The distinct cystine transport activity was observed, which was inhibited to 28% by extracellular 1 mM glutamate. The cDNA sequences of canine cysteine/glutamate exchanger (xCT) and 4F2hc were determined. The predicted amino acid sequences were 527 and 533 amino acid polypeptides, respectively. The amino acid sequences of canine xCT and 4F2hc showed high similarities (>80%) to those of humans. The expression of xCT in lens epithelial cell line was confirmed by western blot analysis. RT-PCR analysis revealed high level expression only in the brain, and it was below the detectable level in other tissues.

  11. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    Science.gov (United States)

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  12. Expressed sequence tags analysis of a liver tissue cDNA library from a highly inbred minipig line

    Institute of Scientific and Technical Information of China (English)

    CHEN You-nan; TAN Wei-dong; LU Yan-rong; QIN Sheng-fang; LI Sheng-fu; ZENG Yang-zhi; BU Hong; LI You-ping; CHENG Jing-qiu

    2007-01-01

    Background: Porcine liver performing efficient physiological functions in the human body is a prerequisite for successful liver xenotransplantation. However, the protein differences between pig and human remain largely unexplored. Therefore, we investigated the liver expression profile of a highly inbred minipig line. Methods: A cDNA library was constructed from liver tissue of an inbred Banna minipig. Two hundred randomly selected clones were sequenced then analysed with the BLAST programme. Results: Alignments of the sequences showed 44% encoded previously known porcine genes. Among the 56% unknown genes, sequences of 72 clones had high similarities with known genes of other species and the similarities to human were mostly above 0.80. The other 40 clones showing no similarity to genes in National Centre for Biotechnology Information are newly discovered, expressed sequence tags specific to liver of inbred Banna minipig. Twenty-two of the 200 clones had full length encoding regions, 38 complete 5' terminal sequences and 140 complete 3' terminal sequences. Conclusion: These newly discovered expression sequences may be an important resource for research involving physiological characteristics and medical usage of inbred pigs and contribute to matching studies in xenotransplantation.

  13. Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

    Directory of Open Access Journals (Sweden)

    Li Xiwen

    2010-03-01

    Full Text Available Abstract Background: Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods: A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results: A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion: The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum.
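
    SSR detection of the kind mentioned in this abstract can be sketched with regular expressions. The abstract does not name the online tool or its thresholds, so the function name, the motif lengths and the minimum repeat counts below are illustrative assumptions rather than the study's settings.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq: str) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat count) for each perfect tandem repeat found."""
    seq = seq.upper()
    hits = []
    for k, n in MIN_REPEATS.items():
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC").
            if motif in (motif + motif)[1:-1]:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


toy_est = "TTG" + "AC" * 7 + "GGCT" + "AAG" * 5 + "TC"
print(find_ssrs(toy_est))
```

    Only perfect repeats are reported; compound or interrupted microsatellites would need additional handling that this sketch does not attempt.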

  14. Transcription Profiling of the Model Cyanobacterium Synechococcus sp. Strain PCC 7002 by Next-Gen (SOLiD™) Sequencing of cDNA

    OpenAIRE

    Ludwig, Marcus; Bryant, Donald A.

    2011-01-01

    The genome of the unicellular, euryhaline cyanobacterium Synechococcus sp. PCC 7002 encodes about 3200 proteins. Transcripts were detected for nearly all annotated open reading frames by a global transcriptomic analysis by Next-Generation (SOLiD™) sequencing of cDNA. In the cDNA samples sequenced, ∼90% of the mapped sequences were derived from the 16S and 23S ribosomal RNAs and ∼10% of the sequences were derived from mRNAs. In cells grown photoautotrophically under standard conditions [38°C, ...

  15. Transcription profiling of the model cyanobacterium Synechococcus sp. strain PCC 7002 by NextGen (SOLiD™) Sequencing of cDNA

    OpenAIRE

    Ludwig, Marcus; Bryant, Donald A.

    2011-01-01

    The genome of the unicellular, euryhaline cyanobacterium Synechococcus sp. PCC 7002 encodes about 3200 proteins. Transcripts were detected for nearly all annotated open reading frames by a global transcriptomic analysis by Next-Generation (SOLiD™) sequencing of cDNA. In the cDNA samples sequenced, ~90% of the mapped sequences were derived from the 16S and 23S ribosomal RNAs and ~10% of the sequences were derived from mRNAs. In cells grown photoautotrophically under standard conditions (38°C, ...

  16. cDNA, genomic sequence cloning and overexpression of ribosomal protein S25 gene (RPS25) from the Giant Panda.

    Science.gov (United States)

    Hao, Yan-Zhe; Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Peng, Zheng-Song

    2009-11-01

    RPS25 is a component of the 40S small ribosomal subunit encoded by the RPS25 gene, which is specific to eukaryotes. Few studies have addressed the RPS25 gene in animals. The Giant Panda (Ailuropoda melanoleuca), known as a "living fossil", is of increasing concern to the world community. Studies on RPS25 of the Giant Panda could provide scientific data for investigating the hereditary traits of the gene and formulating a protection strategy for the Giant Panda. The cDNA of RPS25 cloned from the Giant Panda is 436 bp in size, containing an open reading frame of 378 bp encoding 125 amino acids. The genomic sequence is 1,992 bp long and contains four exons and three introns. Alignment analysis by Blast indicated that the nucleotide sequence of the coding sequence shows high homology to those of Homo sapiens, Bos taurus, Mus musculus and Rattus norvegicus (92.6, 94.4, 89.2 and 91.5%, respectively). Primary structure analysis revealed that the molecular weight of the putative RPS25 protein is 13.7421 kDa with a theoretical pI of 10.12. Topology prediction showed there is one N-glycosylation site, one cAMP- and cGMP-dependent protein kinase phosphorylation site, two protein kinase C phosphorylation sites and one tyrosine kinase phosphorylation site in the RPS25 protein of the Giant Panda. The RPS25 gene was overexpressed in E. coli BL21 and Western blotting of the RPS25 protein was also performed. The results indicated that the RPS25 gene can indeed be expressed in E. coli and that the RPS25 protein fused with an N-terminal His tag accumulated as a polypeptide of the expected 17.4 kDa. The cDNA and the genomic sequence of RPS25 were cloned successfully for the first time from the Giant Panda using RT-PCR and touchdown PCR, respectively, and both were sequenced and analyzed preliminarily; then the cDNA of the RPS25 gene was overexpressed in E. coli BL21 and immunoblotted, which is the first

  17. Targeted rapid amplification of cDNA ends (T-RACE)--an improved RACE reaction through degradation of non-target sequences.

    Science.gov (United States)

    Bower, Neil I; Johnston, Ian A

    2010-11-01

    Amplification of the 5' ends of cDNA, although simple in theory, can often be difficult to achieve. We describe a novel method for the specific amplification of cDNA ends. An oligo-dT adapter incorporating a dUTP-containing PCR primer primes first-strand cDNA synthesis incorporating dUTP. Using the Cap finder approach, another distinct dUTP-containing adapter is added to the 3' end of the newly synthesized cDNA. Second-strand synthesis incorporating dUTP is achieved by PCR, using dUTP-containing primers complementary to the adapter sequences incorporated in the cDNA ends. The double-stranded cDNA-containing dUTP serves as a universal template for the specific amplification of the 3' or 5' end of any gene. To amplify the ends of cDNA, asymmetric PCR is performed using a single gene-specific primer and standard dNTPs. The asymmetric PCR product is purified and non-target transcripts containing dUTP are degraded by uracil DNA glycosylase, leaving only those transcripts produced during the asymmetric PCR. Subsequent PCR using a nested gene-specific primer and the 3' or 5' T-RACE primer results in specific amplification of cDNA ends. This method can be used to specifically amplify the 3' and 5' ends of numerous cDNAs from a single cDNA synthesis reaction.

  18. Deep sequencing analysis of phage libraries using Illumina platform.

    Science.gov (United States)

    Matochko, Wadim L; Chu, Kiki; Jin, Bingjie; Lee, Sam W; Whitesides, George M; Derda, Ratmir

    2012-09-01

    This paper presents an analysis of phage-displayed libraries of peptides using Illumina. We describe steps for the preparation of short DNA fragments for deep sequencing and MATLAB software for the analysis of the results. Screening of peptide libraries displayed on the surface of bacteriophage (phage display) can be used to discover peptides that bind to any target. The key step in this discovery is the analysis of peptide sequences present in the library. This analysis is usually performed by Sanger sequencing, which is labor intensive and limited to examination of a few hundred phage clones. On the other hand, Illumina deep-sequencing technology can characterize over 10^7 reads in a single run. We applied Illumina sequencing to analyze phage libraries. Using PCR, we isolated the variable regions from M13KE phage vectors from a phage display library. The PCR primers contained (i) sequences flanking the variable region, (ii) barcodes, and (iii) variable 5'-terminal region. We used this approach to examine how diversity of peptides in phage display libraries changes as a result of amplification of libraries in bacteria. Using HiSeq single-end Illumina sequencing of these fragments, we acquired over 2×10^7 reads, 57 base pairs (bp) in length. Each read contained information about the barcode (6 bp), one complementary region (12 bp) and a variable region (36 bp). We applied this sequencing to a model library of 10^6 unique clones and observed that amplification enriches ∼150 clones, which dominate ∼20% of the library. Deep sequencing, for the first time, characterized the collapse of diversity in phage libraries. The results suggest that screens based on repeated amplification and small-scale sequencing identify a few binding clones and miss thousands of useful clones. The deep sequencing approach described here could identify under-represented clones in phage screens. It could also be instrumental in developing new screening strategies, which can preserve
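
    The read layout described in this abstract (a 6 bp barcode, a 12 bp complementary region and a 36 bp variable region) suggests a simple way to tally clones per barcode. The following is a hedged sketch in Python, not the authors' MATLAB software: the order of the three regions within a read, the function names and the toy sequences are assumptions. The three stated lengths sum to 54 of the 57 bp, and since the abstract does not say where the remaining bases sit, the sketch simply ignores anything after the variable region.

```python
from collections import Counter

BARCODE_LEN = 6     # barcode length from the abstract
CONSTANT_LEN = 12   # complementary (constant) region length from the abstract
VARIABLE_LEN = 36   # variable region length from the abstract
MIN_LEN = BARCODE_LEN + CONSTANT_LEN + VARIABLE_LEN


def split_read(read: str) -> tuple[str, str, str]:
    """Split a read into (barcode, constant, variable); region order is assumed."""
    a, b = BARCODE_LEN, BARCODE_LEN + CONSTANT_LEN
    return read[:a], read[a:b], read[b:b + VARIABLE_LEN]


def clone_counts(reads: list[str], constant_region: str) -> dict[str, Counter]:
    """Count variable regions per barcode, keeping only reads with the expected constant region."""
    counts: dict[str, Counter] = {}
    for read in reads:
        if len(read) < MIN_LEN:
            continue  # too short to contain all three regions
        barcode, constant, variable = split_read(read)
        if constant != constant_region:
            continue  # likely a sequencing error or an off-target amplicon
        counts.setdefault(barcode, Counter())[variable] += 1
    return counts


def top_fraction(counter: Counter, n: int) -> float:
    """Fraction of reads accounted for by the n most abundant clones."""
    total = sum(counter.values())
    return sum(count for _, count in counter.most_common(n)) / total


# Toy data: hypothetical barcode, constant region and clones.
constant = "ACGTACGTACGT"
reads = (
    ["AAAAAA" + constant + "G" * 36] * 3
    + ["AAAAAA" + constant + "C" * 36]
    + ["AAAAAA" + "T" * 12 + "T" * 36]  # wrong constant region, discarded
)
library = clone_counts(reads, constant)
print(top_fraction(library["AAAAAA"], 1))
```

    Comparing `top_fraction` before and after amplification is one way to express the kind of diversity collapse the abstract reports, where roughly 150 clones come to account for about 20% of the library.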

  19. cDNA sequence analysis of ribosomal protein S13 gene in Plutella xylostella (Lepidoptera: Plutellidae)

    Institute of Scientific and Technical Information of China (English)

    SHAO-LI WANG; CHENG-FA SHENG; CHUAN-LING QIAO; MIYATA TADASHI

    2005-01-01

    The ribosomal protein S13 gene has been cloned and analyzed in many organisms, but there are few reports relating to insects. In this communication, the full-length cDNA sequence of the ribosomal protein S13 gene in the diamondback moth, Plutella xylostella (Lepidoptera: Plutellidae), was determined using PCR amplification. The features of the ribosomal protein S13 gene sequence were analyzed and the deduced amino acid sequence was compared with those from other insects. Multiple alignment of the amino acid sequences of the diamondback moth and other insect species revealed that this gene sequence is highly conserved in insects. Based on the maximum likelihood method, a phylogenetic tree was constructed from 10 different species using PHYLIP software. It showed that the nematode forms one separate lineage and the five insect species belong to another lineage, whereas the species higher than insects form the third one. The pattern of this phylogenetic tree reflected the evolution of the different species.

  20. Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

    Science.gov (United States)

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-01-01

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
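    The reported non-redundancy figure follows directly from the clone and unigene counts given above. As a brief worked check (the function name is hypothetical and used only for illustration):

```python
def non_redundancy_percent(unigenes, clones):
    """Percentage of sequenced clones that represent distinct unigenes."""
    if clones <= 0:
        raise ValueError("clones must be a positive number")
    return 100.0 * unigenes / clones


# Counts taken from the abstract: 3320 unigenes from 3875 sequenced clones
print(round(non_redundancy_percent(3320, 3875), 1))  # 85.7
```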

  1. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    Science.gov (United States)

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-10-12

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing.

  2. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten;

    1985-01-01

    DNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence......, subcloned into M13 mp8, and sequenced at random by the dideoxy technique, thereby generating a contiguous sequence of 1703 base pairs. This clone contained coding sequence for the C-terminal 262 amino acid residues of the beta-chain, the entire C5a fragment, and the N-terminal 98 residues of the alpha......'-chain. The 3' end of the clone had a polyadenylated tail preceded by a polyadenylation recognition site, a 3'-untranslated region, and base pairs homologous to the human Alu consensus sequence. Comparison of the derived partial human C5 protein sequence with that previously determined for murine C3 and human...

  3. deepTools: a flexible platform for exploring deep-sequencing data

    OpenAIRE

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-01-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable mann...

  4. deepTools: a flexible platform for exploring deep-sequencing data.

    Science.gov (United States)

    Ramírez, Fidel; Dündar, Friederike; Diehl, Sarah; Grüning, Björn A; Manke, Thomas

    2014-07-01

    We present a Galaxy based web server for processing and visualizing deeply sequenced data. The web server's core functionality consists of a suite of newly developed tools, called deepTools, that enable users with little bioinformatic background to explore the results of their sequencing experiments in a standardized setting. Users can upload pre-processed files with continuous data in standard formats and generate heatmaps and summary plots in a straight-forward, yet highly customizable manner. In addition, we offer several tools for the analysis of files containing aligned reads and enable efficient and reproducible generation of normalized coverage files. As a modular and open-source platform, deepTools can easily be expanded and customized to future demands and developments. The deepTools webserver is freely available at http://deeptools.ie-freiburg.mpg.de and is accompanied by extensive documentation and tutorials aimed at conveying the principles of deep-sequencing data analysis. The web server can be used without registration. deepTools can be installed locally either stand-alone or as part of Galaxy.

  5. The cDNA sequence for the protein-tyrosine kinase substrate p36 (calpactin I heavy chain) reveals a multidomain protein with internal repeats

    DEFF Research Database (Denmark)

    Sarin, C T; Tack, B F; Kristensen, Torsten;

    1986-01-01

    We have isolated and sequenced a full-length cDNA clone for the protein-tyrosine kinase substrate p36 (calpactin I heavy chain). This sequence predicts a 339 amino acid (Mr 38,493) protein containing an N-terminal region of 20 amino acids, known to interact with a 10 kd protein (light chain), and...

  6. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
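    The first two steps listed above (cleanup and clustering) can be sketched as follows. This is an illustrative sketch only, not DSAP's actual implementation: the adaptor sequence, the minimum tag length, and the function names `clean_tag` and `cluster_tags` are assumptions made for the example. The input format, one tag and its copy number per tab-delimited line, follows the description in the abstract.

```python
from collections import Counter

# Hypothetical 3' adaptor fragment, chosen only for illustration
ADAPTER = "TCGTATGCCG"


def clean_tag(seq, adapter=ADAPTER, min_len=15):
    """Trim the adaptor and reject homopolymer or too-short tags.

    Returns the cleaned tag, or None if the tag should be discarded.
    """
    seq = seq.upper()
    cut = seq.find(adapter)
    if cut != -1:
        seq = seq[:cut]          # keep only the insert before the adaptor
    if len(seq) < min_len:
        return None              # too short after trimming (assumed cutoff)
    if len(set(seq)) == 1:
        return None              # poly-A/T/C/G/N tag
    return seq


def cluster_tags(lines):
    """Group cleaned tags into unique sequences, summing their copy numbers."""
    clusters = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            clusters[cleaned] += int(count)
    return clusters


# Toy input (made up): two tags collapse to one cluster, one is poly-A
example = [
    "TGAGGTAGTAGGTTGTATAGTTTCGTATGCCG\t10",
    "TGAGGTAGTAGGTTGTATAGTT\t5",
    "AAAAAAAAAAAAAAAAAAAA\t7",
]
print(dict(cluster_tags(example)))  # {'TGAGGTAGTAGGTTGTATAGTT': 15}
```

    The later matching steps against Rfam and miRBase depend on those databases and are not sketched here.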

  7. Completion sequence and cloning of the infectious cDNA of a chb isolate of cucumber green mottle mosaic virus.

    Science.gov (United States)

    Zhong, M; Zhao, X; Liu, Y; Wang, Y; Cao, K

    2015-03-01

    Cucumber green mottle mosaic virus (CGMMV) is an important and widespread seed-borne virus that infects Cucurbitaceous plants. It is a member of the genus Tobamovirus in the family Virgaviridae with a monopartite (+) ssRNA genome. Here we report the complete genome sequence, construction and testing of the infectious clones of a chb isolate of CGMMV. Full-length CGMMV cDNA was cloned into the vector pUC19. The linearized vector containing full-length cDNA was used as template for in vitro transcription, and the synthesized capped transcript was highly infectious in Chenopodium amaranticolor and cucumber (Cucumis sativus). Inoculated plants showed symptoms typical of CGMMV infection. The infectivity was confirmed by mechanical transmission to new plants, RT-PCR and western blot. Progeny virus derived from infectious transcripts had the same biological and biochemical properties as wild-type virus. To our knowledge, this is the first detailed report of a biologically active transcript from CGMMV.

  8. Purification, amino acid sequence, and cDNA cloning of trypsin inhibitors from onion (Allium cepa L.) bulbs.

    Science.gov (United States)

    Deshimaru, Masanobu; Watanabe, Akira; Suematsu, Keiko; Hatano, Maki; Terada, Shigeyuki

    2003-08-01

    Three protease inhibitors (OTI-1-3) have been purified from onion (Allium cepa L.) bulbs. Molecular masses of these inhibitors were found to be 7,370.2, 7,472.2, and 7,642.6 Da by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS), respectively. Based on amino acid composition and N-terminal sequence, OTI-1 and -2 are the N-terminal truncated proteins of OTI-3. All the inhibitors are stable to heat and extreme pH. OTI-3 inhibited trypsin, chymotrypsin, and plasmin with dissociation constants of 1.3 x 10(-9) M, 2.3 x 10(-7) M, and 3.1 x 10(-7) M, respectively. The complete amino acid sequence of OTI-3 showed a significant homology to Bowman-Birk family inhibitors, and the first reactive site (P1) was found to be Arg17 by limited proteolysis by trypsin. The second reactive site (P1) was estimated to be Leu46, that may inhibit chymotrypsin. OTI-3 lacks an S-S bond near the second reactive site, resulting in a low affinity for the enzyme. The sequence of OTI-3 was also ascertained by the nucleotide sequence of a cDNA clone encoding a 101-residue precursor of the onion inhibitor.

  9. Existence of homologous sequences corresponding to cDNA of the ver

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    The presence of DNA homologues corresponding to verc203 (vernalization-related cDNA clone) was investigated by molecular hybridization techniques. The genes were detected in 16 plant species that cover 12 subclasses of the Takhtajan system of angiosperm classification, including diverse model species. Southern blot analysis showed that the gene exists at a low copy number in rice, wheat, barley and Arabidopsis. Hybridization of PCR products demonstrated that the gene corresponding to ver203 is conserved in diverse plants. The phylogenetic tree of the ver203 gene in the tested plants was consistent with the evolutionary relationships of the species. In vernalized winter wheat, the ver203 gene was expressed in the plumule but not in the root, and the presence of the endosperm before the treatment was essential for ver203 expression during vernalization. In Arabidopsis thaliana, the expression pattern showed that the gene corresponding to ver203 was expressed at low temperature for 14 days. Gibberellin (GA3) may accelerate the expression of the ver203 gene in Arabidopsis exposed to low temperature. However, it could not replace vernalization treatment to initiate gene expression.

  10. Controlled ribonucleotide tailing of cDNA ends (CRTC) by terminal deoxynucleotidyl transferase: a new approach in PCR-mediated analysis of mRNA sequences.

    Science.gov (United States)

    Schmidt, W M; Mueller, M W

    1996-05-01

    Controlled ribonucleotide tailing of cDNA ends (CRTC) by terminal deoxynucleotidyl transferase is a polymerase chain reaction (PCR)-mediated technique that was developed to facilitate cloning and direct sequence analysis of complete 5'-terminal unknown coding regions of rare RNA molecules. In contrast with standard tailing protocols using dNTPs as the substrate, ribo-tailing of cDNA ends is easily controllable, self-limited (from two to four rNMP incorporations) and highly efficient (>98%). By virtue of the homopolymeric ribo-tail, the modified cDNA is anchored to the 3' overhang of a double-stranded DNA-adaptor in a T4 DNA ligase-dependent ligation. PCR amplification, mediated by two sequence-specific primers, yields the desired unique product suitable for cloning and dideoxy-sequencing.

  11. Deep whole-genome sequencing of 100 southeast Asian Malays.

    Science.gov (United States)

    Wong, Lai-Ping; Ong, Rick Twee-Hee; Poh, Wan-Ting; Liu, Xuanyao; Chen, Peng; Li, Ruoying; Lam, Kevin Koi-Yau; Pillai, Nisha Esakimuthu; Sim, Kar-Seng; Xu, Haiyan; Sim, Ngak-Leng; Teo, Shu-Mei; Foo, Jia-Nee; Tan, Linda Wei-Lin; Lim, Yenly; Koo, Seok-Hwee; Gan, Linda Seo-Hwee; Cheng, Ching-Yu; Wee, Sharon; Yap, Eric Peng-Huat; Ng, Pauline Crystal; Lim, Wei-Yen; Soong, Richie; Wenk, Markus Rene; Aung, Tin; Wong, Tien-Yin; Khor, Chiea-Chuen; Little, Peter; Chia, Kee-Seng; Teo, Yik-Ying

    2013-01-10

    Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in the population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, the low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, despite the fact that the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity at detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation of Malays indicated that a population-specific reference panel, such as the SSMP, outperforms a cosmopolitan panel with larger number of individuals for common SNPs. For lower-frequency (population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies.

  12. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    Science.gov (United States)

    Cho, In-Sook; Igori, Davaajargal; Lim, Seungmo; Choi, Gug-Seoun; Hammond, John; Lim, Hyoun-Sub; Moon, Jae Sun

    2016-01-01

    Deep sequencing has generated 52 contigs derived from five viruses; Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV) were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented by the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV which were withal found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. Deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown virome in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time.

  13. Deep Sequencing Analysis of Apple Infecting Viruses in Korea.

    Science.gov (United States)

    Cho, In-Sook; Igori, Davaajargal; Lim, Seungmo; Choi, Gug-Seoun; Hammond, John; Lim, Hyoun-Sub; Moon, Jae Sun

    2016-10-01

    Deep sequencing has generated 52 contigs derived from five viruses; Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV) were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented by the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV which were withal found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. Deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown virome in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time.

  14. Deep Sequencing Analysis of Apple Infecting Viruses in Korea

    Directory of Open Access Journals (Sweden)

    In-Sook Cho

    2016-10-01

    Full Text Available Deep sequencing has generated 52 contigs derived from five viruses; Apple chlorotic leaf spot virus (ACLSV), Apple stem grooving virus (ASGV), Apple stem pitting virus (ASPV), Apple green crinkle associated virus (AGCaV), and Apricot latent virus (ApLV) were identified from eight apple samples showing small leaves and/or growth retardation. Nucleotide (nt) sequence identity of the assembled contigs was from 68% to 99% compared to the reference sequences of the five respective viral genomes. Sequences of ASPV and ASGV were the most abundantly represented by the 52 contigs assembled. The presence of the five viruses in the samples was confirmed by RT-PCR using specific primers based on the sequences of each assembled contig. All five viruses were detected in three of the samples, whereas all samples had mixed infections with at least two viruses. The most frequently detected virus was ASPV, followed by ASGV, ApLV, ACLSV, and AGCaV which were withal found in mixed infections in the tested samples. AGCaV was identified in assembled contigs ID 1012480 and 93549, which showed 82% and 78% nt sequence identity with ORF1 of AGCaV isolate Aurora-1. ApLV was identified in three assembled contigs, ID 65587, 1802365, and 116777, which showed 77%, 78%, and 76% nt sequence identity respectively with ORF1 of ApLV isolate LA2. Deep sequencing assay was shown to be a valuable and powerful tool for detection and identification of known and unknown virome in infected apple trees, here identifying ApLV and AGCaV in commercial orchards in Korea for the first time.

  15. Fiscal 1999 achievement report on the analysis of the complete sequencing of full-length cDNA; 1999 nendo dai 2 ji hosei yosan kanzencho cDNA kozo kaiseki seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-01

    This report accommodates the results of study conducted during the period of April 1, 2000, through March 31, 2001. The study began with the partial sequence determination for cDNA (complementary deoxyribonucleic acid) terminals presented by the cDNA library, novel clones were then selected out of them, and efforts proceeded to sequence all the bases therein. In this study, partial sequences were determined for 519,000 clones, and entire sequences for 7252 clones. The obtained sequence data were subjected to a homological analysis and then converted into an amino acid sequence, and then protein function prediction and the like were performed using the SOSUI program or the like. A prototype system to isolate novel clones out of partial sequences and a system for the graphic display of cDNA-genome links were fabricated. As for expression profile databases, the iAFLP (introduced amplified fragment length polymorphism) method was used to construct a high-throughput system and a function analysis database. (NEDO)

  16. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    Directory of Open Access Journals (Sweden)

    Eun Soo Seong

    2015-01-01

    Full Text Available Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches, and unigenes were annotated with Gene Ontology (GO) terms. A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant.

  17. Microsatellite discovery by deep sequencing of enriched genomic libraries.

    Science.gov (United States)

    Santana, Quentin; Coetzee, Martin; Steenkamp, Emma; Mlonyeni, Osmond; Hammond, Gifty; Wingfield, Michael; Wingfield, Brenda

    2009-03-01

    Robust molecular markers such as microsatellites are important tools used to understand the dynamics of natural populations, but their identification and development are typically time consuming and labor intensive. The recent emergence of so-called next-generation sequencing raised the question as to whether this new technology might be applied to microsatellite development. Following this view, we considered whether deep sequencing using the 454 Life Sciences/Roche GS-FLX genome sequencing system could lead to a rapid protocol to develop microsatellite primers as markers for genetic studies. For this purpose, genomic DNA was sourced from three unrelated organisms: a fungus (the pine pathogen Fusarium circinatum), an insect (the pine-damaging wasp Sirex noctilio), and the wasp's associated nematode parasite (Deladenus siricidicola). Two methods, FIASCO (fast isolation by AFLP of sequences containing repeats) and ISSR-PCR (inter-simple sequence repeat PCR), were used to generate microsatellite-enriched DNA for the 454 libraries. From the resulting 1.2-1.7 megabases of DNA sequence data, we were able to identify 873 microsatellites that have sufficient flanking sequence available for primer design and potential amplification. This approach to microsatellite discovery was substantially more rapid, effective, and economical than other methods, and this study has shown that pyrosequencing provides an outstanding new technology that can be applied to this purpose.
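    The selection criterion described above, a microsatellite with enough flanking sequence on both sides for primer design, can be sketched with a regular expression. This is a minimal illustration and not the pipeline used in the study: the motif length range (2 to 6 bp), the minimum number of repeat copies, the flank length, and the function name `find_microsatellites` are all assumptions.

```python
import re


def find_microsatellites(seq, min_repeats=5, min_flank=20):
    """Find perfect tandem repeats with enough flanking sequence for primers.

    Returns a list of (motif, copies, start, end) tuples, where start/end
    are 0-based, end-exclusive coordinates of the repeat tract.
    """
    seq = seq.upper()
    # A 2-6 bp motif followed by at least (min_repeats - 1) further copies
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAA
        start, end = match.span()
        # Require room for a primer on each side of the repeat
        if start >= min_flank and len(seq) - end >= min_flank:
            copies = (end - start) // len(motif)
            hits.append((motif, copies, start, end))
    return hits


# Toy read (made up): 20 bp flank + (CA)x8 + 20 bp flank
left = "ACGTTGCAAGCTTGCATGCC"
right = "GGATCCTTAGCGTACGATCG"
read = left + "CA" * 8 + right
print(find_microsatellites(read))  # [('CA', 8, 20, 36)]
```

    A repeat that sits at the very start or end of a read is rejected by the flank test, which mirrors why only a subset of detected repeats is usable for primer design.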

  18. The cDNA sequences encoding two components of the polymeric fraction of the intracellular hemoglobin of Glycera dibranchiata.

    Science.gov (United States)

    Zafar, R S; Chow, L H; Stern, M S; Scully, J S; Sharma, P R; Vinogradov, S N; Walz, D A

    1990-12-15

    The intracellular hemoglobin of the polychaete Glycera dibranchiata consists of several components, some of which self-associate into a "polymeric" fraction. The cDNA library constructed from the poly(A+) mRNA of Glycera erythrocytes (Simons, P. C., and Satterlee, J. D. (1989) Biochemistry 28, 8525-8530) was screened with two oligodeoxynucleotide probes corresponding to the amino acid sequences MEEKVP and AMNSKV. Each of the two probes identified a full-length positive insert; these were sequenced using the dideoxynucleotide chain termination method. One clone was 630 bases long and contained 36 bases of 5'-untranslated RNA, a reading frame of 441 bases coding for the 147 amino acids of globin P2 including the residues MEEKVP, and a 3'-untranslated region of 153 bases. The other clone was 540 bases long and contained 24 bases of 5'-untranslated RNA, an open reading frame of 441 bases coding for globin P3 including the residues AMNSKV, and a 3'-untranslated region of 75 bases. The inferred amino acid sequences of the two globins were in agreement with the partial amino acid sequences obtained by chemical methods. The P2 and P3 globin sequences, together with the previously determined P1 sequence of a complete insert and partial sequences P4, P5, and P6 obtained from partial inserts (Zafar, R. S., Chow, L. H., Stern, M. S., Vinogradov, S. N., and Walz, D. A. (1990) Biochim. Biophys. Acta, in press) suggest that there are at least six components in the polymeric fraction of Glycera hemoglobin, which is in agreement with the results of polyacrylamide gel electrophoresis in Tris/glycine buffer, pH 8.3, 6 M urea. Northern and dot blot analyses of Glycera erythrocyte poly(A+) mRNA using the foregoing two cDNA probes clearly demonstrated the presence of mature messages encoding both types of globins. Comparison of the polymeric sequences P1, P2, and P3 with the "monomeric" globins M-II and M-IV using the alignment and templates of Bashford et al. (Bashford, D., Chothia, C

  19. [Molecular cloning and analysis of cDNA sequences encoding serine proteinase and Kunitz type inhibitor in venom gland of Vipera nikolskii viper].

    Science.gov (United States)

    Ramazanova, A S; Fil'kin, S Iu; Starkov, V G; Utkin, Iu N

    2011-01-01

    Serine proteinases and Kunitz type inhibitors are widely represented in venoms of snakes from different genera. During the study of the venoms from snakes inhabiting Russia we have cloned cDNAs encoding new proteins belonging to these protein families. Thus, a new serine proteinase called nikobin was identified in the venom gland of the Vipera nikolskii viper. According to the amino acid sequence deduced from the cDNA sequence, nikobin differs from serine proteinases identified in other snake species. The nikobin amino acid sequence contains 15 unique substitutions. This is the first serine proteinase from a viper of the Vipera genus for which a complete amino acid sequence has been established. The cDNA encoding a Kunitz type inhibitor was also cloned. The deduced amino acid sequence of the inhibitor is homologous to those of other proteins from snakes of the Vipera genus. However, there are several unusual amino acid substitutions that might result in a change in the biological activity of the inhibitor.

  20. Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]

    Directory of Open Access Journals (Sweden)

    Bashasab Fakrudin

    2011-01-01

    Full Text Available Abstract Background Pigeonpea [Cajanus cajan (L.) Millspaugh], one of the most important food legumes of semi-arid tropical and subtropical regions, has limited genomic resources, particularly expressed sequence based (genic) markers. We report a comprehensive set of validated genic simple sequence repeat (SSR) markers using deep transcriptome sequencing, and its application in genetic diversity analysis and mapping. Results In this study, 43,324 transcriptome shotgun assembly unigene contigs were assembled from 1.696 million 454 GS-FLX sequence reads of separate pooled cDNA libraries prepared from leaf, root, stem and immature seed of two pigeonpea varieties, Asha and UPAS 120. A total of 3,771 genic-SSR loci, excluding homopolymeric and compound repeats, were identified; of which 2,877 PCR primer pairs were designed for marker development. Dinucleotide was the most common repeat motif with a frequency of 60.41%, followed by tri- (34.52%), hexa- (2.62%), tetra- (1.67%) and pentanucleotide (0.76%) repeat motifs. Primers were synthesized and tested for 772 of these loci with repeat lengths of ≥18 bp. Of these, 550 markers were validated for consistent amplification in eight diverse pigeonpea varieties; 71 were found to be polymorphic on agarose gel electrophoresis. Genetic diversity analysis was done on 22 pigeonpea varieties and eight wild species using 20 highly polymorphic genic-SSR markers. The number of alleles at these loci ranged from 4-10 and the polymorphism information content values ranged from 0.46 to 0.72. Neighbor-joining dendrogram showed distinct separation of the different groups of pigeonpea cultivars and wild species. Deep transcriptome sequencing of the two parental lines helped in silico identification of polymorphic genic-SSR loci to facilitate the rapid development of an intra-species reference genetic map, a subset of which was validated for expected allelic segregation in the reference mapping population. Conclusion We
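    The polymorphism information content (PIC) values mentioned above are computed from allele frequencies at each locus. The abstract does not state which formula was used; the sketch below assumes the commonly used definition of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², and the function name `pic` is hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one locus.

    allele_freqs: allele frequencies at the locus, which must sum to 1.
    Uses the Botstein et al. (1980) definition (an assumption here; the
    abstract does not say which formula the authors applied).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term for uninformative matings between like heterozygotes
    cross = sum(2 * (p * p) * (q * q)
                for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Toy locus (made up): two alleles at equal frequency
print(pic([0.5, 0.5]))  # 0.375
```

    Loci with more alleles at balanced frequencies give higher values, which is why highly polymorphic markers are preferred for diversity analysis.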

  1. Preparation of microbial community cDNA for metatranscriptomic analysis in marine plankton.

    Science.gov (United States)

    Stewart, Frank J

    2013-01-01

    High-throughput sequencing and analysis of microbial community cDNA (metatranscriptomics) are providing valuable insight into in situ microbial activity and metabolism in the oceans. A critical first step in metatranscriptomic studies is the preparation of high-quality cDNA. At the minimum, preparing cDNA for sequencing involves steps of biomass collection, RNA preservation, total RNA extraction, and cDNA synthesis. Each of these steps may present unique challenges for marine microbial samples, particularly for deep-sea samples whose transcriptional profiles may change between water collection and RNA preservation. Because bacterioplankton community RNA yields may be relatively low (microbiology research.

  2. Cloning and sequencing of a rice ( Oryza sativa L.) RAPB cDNA using yeast one-hybrid system

    Institute of Scientific and Technical Information of China (English)

    姚泉洪; 邢彦彦; 王宗阳; 张景六; 彭日荷; 洪孟民

    1999-01-01

    Cis-acting elements containing the CCAAT core sequence are located in the 5' upstream regions of numerous eukaryotic genes. CCAAT-binding factors interact with these cis-acting elements as a heteromeric complex and thereby control gene transcription. CCAAT-binding factors contain at least three subunits, and no subunit alone can bind to the CCAAT box. We report the cloning of a rice cDNA called RAPB, which is homologous to yeast HAP2 (one of the subunits of CCAAT-binding factors), using the yeast one-hybrid system and functional complementation approaches. The analysis indicates that the deduced amino acid sequence in the C-terminal region of RAPB contains the functional domain of 60 amino acids that is highly homologous with yeast HAP2, whereas the deduced amino acid sequence in the N-terminal region differs significantly, and no Gln-rich region such as that in HAP2 is found in the RAPB protein. Southern blotting analysis demonstrates that only one copy of the RAPB gene exists in the rice genome.

  3. UMD‐Predictor: A High‐Throughput Sequencing Compliant System for Pathogenicity Prediction of any Human cDNA Substitution

    Science.gov (United States)

    Salgado, David; Desvignes, Jean‐Pierre; Rai, Ghadi; Blanchard, Arnaud; Miltgen, Morgane; Pinard, Amélie; Lévy, Nicolas; Collod‐Béroud, Gwenaëlle

    2016-01-01

    ABSTRACT Whole‐exome sequencing (WES) is increasingly applied to research and clinical diagnosis of human diseases. It typically results in large amounts of genetic variations. Depending on the mode of inheritance, only one or two correspond to pathogenic mutations responsible for the disease and present in affected individuals. Therefore, it is crucial to filter out nonpathogenic variants and limit downstream analysis to a handful of candidate mutations. We have developed a new computational combinatorial system UMD‐Predictor (http://umd‐predictor.eu) to efficiently annotate cDNA substitutions of all human transcripts for their potential pathogenicity. It combines biochemical properties, impact on splicing signals, localization in protein domains, variation frequency in the global population, and conservation through the BLOSUM62 global substitution matrix and a protein‐specific conservation among 100 species. We compared its accuracy with the seven most used and reliable prediction tools, using the largest reference variation datasets including more than 140,000 annotated variations. This system consistently demonstrated better accuracy, specificity, Matthews correlation coefficient, diagnostic odds ratio, and speed, and provided the shortest list of candidate mutations for WES. Webservices allow its implementation in any bioinformatics pipeline for next‐generation sequencing analysis. It could benefit a wide range of users and applications varying from gene discovery to clinical diagnosis. PMID:26842889

  4. deepTools2: a next generation web server for deep-sequencing data analysis.

    Science.gov (United States)

    Ramírez, Fidel; Ryan, Devon P; Grüning, Björn; Bhardwaj, Vivek; Kilpert, Fabian; Richter, Andreas S; Heyne, Steffen; Dündar, Friederike; Manke, Thomas

    2016-07-08

    We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continues to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available.

  5. Deep sequencing of HIV: clinical and research applications.

    Science.gov (United States)

    Chabria, Shiven B; Gupta, Shaili; Kozal, Michael J

    2014-01-01

    Human immunodeficiency virus (HIV) exhibits remarkable diversity in its genomic makeup and exists in any given individual as a complex distribution of closely related but nonidentical genomes called a viral quasispecies, which is subject to genetic variation, competition, and selection. This viral diversity clinically manifests as a selection of mutant variants based on viral fitness in treatment-naive individuals and based on drug-selective pressure in those on antiretroviral therapy (ART). The current standard-of-care ART consists of a combination of antiretroviral agents, which ensures maximal viral suppression while preventing the emergence of drug-resistant HIV variants. Unfortunately, transmission of drug-resistant HIV does occur, affecting 5% to >20% of newly infected individuals. To optimize therapy, clinicians rely on viral genotypic information obtained from conventional population sequencing-based assays, which cannot reliably detect viral variants that constitute <20% of the circulating viral quasispecies. These low-frequency variants can be detected by highly sensitive genotyping methods collectively grouped under the moniker of deep sequencing. Low-frequency variants have been correlated to treatment failures and HIV transmission, and detection of these variants is helping to inform strategies for vaccine development. Here, we discuss the molecular virology of HIV, viral heterogeneity, drug-resistance mutations, and the application of deep sequencing technologies in research and the clinical care of HIV-infected individuals.
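    The detection limit described above (variants below roughly 20% of the quasispecies are missed by population sequencing) can be illustrated with a minimal sketch. It assumes the bases observed at one aligned position are already available as a string; the function names, the 1% lower cutoff, and the example read counts are hypothetical and are not taken from the article.

```python
from collections import Counter

def variant_frequencies(bases):
    """Return {base: fraction of reads} for the bases seen at one position."""
    counts = Counter(b for b in bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {base: n / depth for base, n in counts.items()}

def minority_variants(bases, min_freq=0.01, max_freq=0.20):
    """Bases with min_freq <= frequency < max_freq.

    max_freq mirrors the ~20% limit quoted for population sequencing;
    min_freq is an assumed noise floor, not a value from the article.
    """
    freqs = variant_frequencies(bases)
    return {b: f for b, f in freqs.items() if min_freq <= f < max_freq}

# Hypothetical column of 1,000 reads covering one position
column = "A" * 920 + "G" * 60 + "T" * 20
print(minority_variants(column))
```

    In this toy column, G (6%) and T (2%) fall below the 20% limit and would be reported only by a deep-sequencing assay; a real pipeline would also need to model sequencing error before trusting frequencies this low.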

  6. Deep Sequencing Analysis of the Ixodes ricinus Haemocytome.

    Directory of Open Access Journals (Sweden)

    Michalis Kotsyfakis

    2015-05-01

    Full Text Available Ixodes ricinus is the main tick vector of the microbes that cause Lyme disease and tick-borne encephalitis in Europe. Pathogens transmitted by ticks have to overcome innate immunity barriers present in tick tissues, including midgut, salivary glands epithelia and the hemocoel. Molecularly, invertebrate immunity is initiated when pathogen recognition molecules trigger serum or cellular signalling cascades leading to the production of antimicrobials, pathogen opsonization and phagocytosis. We presently aimed at identifying hemocyte transcripts from semi-engorged female I. ricinus ticks by mass sequencing a hemocyte cDNA library and annotating immune-related transcripts based on their hemocyte abundance as well as their ubiquitous distribution. De novo assembly of 926,596 pyrosequence reads plus 49,328,982 Illumina reads (148 nt length) from a hemocyte library, together with over 189 million Illumina reads from salivary gland and midgut libraries, generated 15,716 extracted coding sequences (CDS); these are displayed in an annotated hyperlinked spreadsheet format. Read mapping allowed the identification and annotation of tissue-enriched transcripts. A total of 327 transcripts were found significantly overexpressed in the hemocyte libraries, including those coding for scavenger receptors, antimicrobial peptides, pathogen recognition proteins, proteases and protease inhibitors. Vitellogenin and lipid metabolism transcription enrichment suggests fat body components. We additionally annotated ubiquitously distributed transcripts associated with immune function, including immune-associated signal transduction proteins and transcription factors, including the STAT transcription factor. This is the first systems biology approach to describe the genes expressed in the haemocytes of this neglected disease vector. A total of 2,860 coding sequences were deposited to GenBank, increasing to 27,547 the number so far deposited by our previous transcriptome studies

  7. Amino acid sequence of Coprinus macrorhizus peroxidase and cDNA sequence encoding Coprinus cinereus peroxidase. A new family of fungal peroxidases.

    Science.gov (United States)

    Baunsgaard, L; Dalbøge, H; Houen, G; Rasmussen, E M; Welinder, K G

    1993-04-01

    Sequence analysis and cDNA cloning of Coprinus peroxidase (CIP) were undertaken to expand the understanding of the relationships of structure, function and molecular genetics of the secretory heme peroxidases from fungi and plants. Amino acid sequencing of Coprinus macrorhizus peroxidase, and cDNA sequencing of Coprinus cinereus peroxidase showed that the mature proteins are identical in amino acid sequence, 343 residues in size and preceded by a 20-residue signal peptide. Their likely identity to peroxidase from Arthromyces ramosus is discussed. CIP has an 8-residue, glycine-rich N-terminal extension blocked with a pyroglutamate residue which is absent in other fungal peroxidases. The presence of pyroglutamate, formed by cyclization of glutamine, and the finding of a minor fraction of a variant form lacking the N-terminal residue, indicate that signal peptidase cleavage is followed by further enzymic processing. CIP is 40-45% identical in amino-acid sequence to 11 lignin peroxidases from four fungal species, and 42-43% identical to the two known Mn-peroxidases. Like these white-rot fungal peroxidases, CIP has an additional segment of approximately 40 residues at the C-terminus which is absent in plant peroxidases. Although CIP is much more similar to horseradish peroxidase (HRP C) in substrate specificity, specific activity and pH optimum than to white-rot fungal peroxidases, the sequences of CIP and HRP C showed only 18% identity. Hence, CIP qualifies as the first member of a new family of fungal peroxidases. The nine invariant residues present in all plant, fungal and bacterial heme peroxidases are also found in CIP. The present data support the hypothesis that only one chromosomal CIP gene exists. In contrast, a large number of secretory plant and fungal peroxidases are expressed from several peroxidase gene clusters. Analyses of three batches of CIP protein and of 49 CIP clones revealed the existence of only two highly similar alleles indicating less

  8. Cloning, characterization and heterologous expression of epoxide hydrolase-encoding cDNA sequences from yeasts belonging to the genera Rhodotorula and Rhodosporidium

    NARCIS (Netherlands)

    Visser, H.; Weijers, C.A.G.M.; Ooyen, van A.J.J.; Verdoes, J.C.

    2002-01-01

    Epoxide hydrolase-encoding cDNA sequences were isolated from the basidiomycetous yeast species Rhodosporidium toruloides CBS 349, Rhodosporidium toruloides CBS 14 and Rhodotorula araucariae CBS 6031 in order to evaluate the molecular data and potential application of this type of enzymes. The deduce

  9. Nucleotide sequence of a cDNA clone encoding a major allergenic protein in rice seeds. Homology of the deduced amino acid sequence with members of alpha-amylase/trypsin inhibitor family.

    Science.gov (United States)

    Izumi, H; Adachi, T; Fujii, N; Matsuda, T; Nakamura, R; Tanaka, K; Urisu, A; Kurosawa, Y

    1992-05-18

    A cDNA clone of rice major allergenic protein (RAP) was isolated from a cDNA library of maturing rice seeds. The cDNA had an open reading frame (486 nucleotides) which coded a 162 amino acid residue polypeptide comprising a 27-residue signal peptide and a 135-residue mature protein of M(r) 14,764. The deduced amino acid sequence of RAP showed a considerable similarity to barley trypsin inhibitor [1983, J. Biol. Chem. 258, 7998-8003] and wheat alpha-amylase inhibitor [1981, Phytochemistry 20, 1781-1784].

  10. Selective amplification of cDNA sequence from total RNA by cassette-ligation mediated polymerase chain reaction (PCR): application to sequencing 6.5 kb genome segment of hantavirus strain B-1.

    Science.gov (United States)

    Isegawa, Y; Sheng, J; Sokawa, Y; Yamanishi, K; Nakagomi, O; Ueda, S

    1992-12-01

    A method, referred to as cassette-ligation mediated polymerase chain reaction (PCR), has been developed to permit selective and specific amplification of cDNA sequence from total cellular RNA. This technique comprises (i) digestion of cDNA with multiple restriction enzymes, (ii) ligation of cleavage products to double-stranded DNA cassettes possessing a corresponding restriction site and (iii) amplification of cassette-ligated restriction fragments containing a short, known sequence (but not all the other ligation products) by PCR using the specific and cassette primers; the specific primer is designed to prime synthesis from the known sequence of the cDNA whereas the cassette primer anneals to one strand of the cassette. Sequencing from the cassette primer provides information to design a new primer for the next walking step. The amplified cDNA fragments are often larger than the maximum DNA fragments (500-600 bp) that can be sequenced without the need of synthesizing internal sequencing primer. Each of such large cDNA fragments is dissected into smaller DNA fragments by repeating cassette-ligation mediated PCR exploiting different restriction sites and different sets of cassette primers. This dissection process reduces the number of specific primers to a minimum, thereby increasing the speed of sequencing and minimizing the overall cost. We have successfully applied this cDNA walking and sequencing by the cassette-ligation mediated PCR to the sequencing of an entire 6.5 kb genome segment of hantavirus strain B-1.(ABSTRACT TRUNCATED AT 250 WORDS)

  11. Coagulant thrombin-like enzyme (barnettobin) from Bothrops barnetti venom: molecular sequence analysis of its cDNA and biochemical properties.

    Science.gov (United States)

    Vivas-Ruiz, Dan E; Sandoval, Gustavo A; Mendoza, Julio; Inga, Rosalina R; Gontijo, Silea; Richardson, Michael; Eble, Johannes A; Yarleque, Armando; Sanchez, Eladio F

    2013-07-01

    The thrombin-like enzyme from Bothrops barnetti named barnettobin was purified. We report some biochemical features of barnettobin including the complete amino acid sequence that was deduced from the cDNA. Snake venom serine proteases affect several steps of human hemostasis ranging from the blood coagulation cascade to platelet function. Barnettobin is a monomeric glycoprotein of 52 kDa as shown by reducing SDS-PAGE, and contains approx. 52% carbohydrate by mass which could be removed by N-glycosidase. The complete amino acid sequence was deduced from the cDNA sequence. Its sequence contains a single chain of 233 amino acids including three N-glycosylation sites. The sequence exhibits significant homology with those of mammalian serine proteases, e.g. thrombin, and with homologous TLEs. Its specific coagulant activity was 251.7 NIH thrombin units/mg; it released fibrinopeptide A from human fibrinogen and showed a defibrinogenating effect in mouse. Both coagulant and amidolytic activities were inhibited by PMSF. N-deglycosylation impaired its temperature and pH stability. Its cDNA sequence of 750 bp encodes a protein of 233 residues. Indications that carbohydrate moieties may play a role in the interaction with substrates are presented. Barnettobin is a new defibrinogenating agent which may provide an opportunity for the development of new types of anti-thrombotic drugs.

  12. Cathepsin B from the white shrimp Litopenaeus vannamei: cDNA sequence analysis, tissues-specific expression and biological activity.

    Science.gov (United States)

    Stephens, A; Rojo, L; Araujo-Bernal, S; Garcia-Carreño, F; Muhlia-Almazan, A

    2012-01-01

    Cathepsin B is a cysteine proteinase scarcely studied in crustaceans. Its function has not been clearly described in shrimp species belonging to the sub-order Dendrobranchiata, which includes the white shrimp Litopenaeus vannamei and other species from the Penaeidae family. Studies on vertebrates suggest that these lysosomal enzymes hydrolyze protein intracellularly, as other cysteine proteinases do. However, the expression of the gene encoding the shrimp cathepsin B in the midgut gland was affected by starvation in a similar way to other digestive proteinases which hydrolyze food protein extracellularly. In this study the white shrimp L. vannamei cathepsin B (LvCathB) cDNA was sequenced and characterized. Its gene expression was evaluated in various shrimp tissues, and changes in the mRNA amounts were compared with those observed for other digestive proteinases from the midgut gland during starvation. By using qRT-PCR it was found that LvCathB is expressed in most shrimp tissues except pleopods and eye stalk. Changes in LvCathB mRNA during starvation suggest that the enzyme participates in intracellular protein hydrolysis but also, after food ingestion, in hydrolyzing food proteins extracellularly, as confirmed by the high activity levels we found in the gastric juice and midgut gland of the white shrimp.

  13. Sequencing and analysis of 10967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis

    Energy Technology Data Exchange (ETDEWEB)

    Morin, R D; Chang, E; Petrescu, A; Liao, N; Kirkpatrick, R; Griffith, M; Butterfield, Y; Stott, J; Barber, S; Babakaiff, R; Matsuo, C; Wong, D; Yang, G; Smailus, D; Brown-John, M; Mayo, M; Beland, J; Gibson, S; Olson, T; Tsai, M; Featherstone, R; Chand, S; Siddiqui, A; Jang, W; Lee, E; Klein, S; Prange, C; Myers, R M; Green, E D; Wagner, L; Gerhard, D; Marra, M; Jones, S M; Holt, R

    2005-10-31

    Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection initiative. Here we present an analysis of 10967 clones (8049 from X. laevis and 2918 from X. tropicalis). The clone set contains 2013 orthologs between X. laevis and X. tropicalis as well as 1795 paralog pairs within X. laevis. 1199 are in-paralogs, believed to have resulted from an allotetraploidization event approximately 30 million years ago, and the remaining 546 are likely out-paralogs that have resulted from more ancient gene duplications, prior to the divergence between the two species. We do not detect any evidence for positive selection by the Yang and Nielsen maximum likelihood method of approximating dN/dS. However, dN/dS for X. laevis in-paralogs is elevated relative to X. tropicalis orthologs. This difference is highly significant, and indicates an overall relaxation of selective pressures on duplicated gene pairs. Within both groups of paralogs, we found evidence of subfunctionalization, manifested as differential expression of paralogous genes among tissues, as measured by EST information from public resources. We have observed, as expected, a higher instance of subfunctionalization in out-paralogs relative to in-paralogs.

  14. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    Science.gov (United States)

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-10-19

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were found, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to contain SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm samples. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
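    The SSR screening step summarized above can be sketched as a regular-expression scan over unigene sequences. This is an illustrative sketch only: the minimum repeat counts, the function names, and the test sequence are assumptions, since the abstract does not state which SSR search tool or thresholds were used.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (not taken from the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def is_primitive(unit):
    """True if the unit is not a repeat of a shorter unit (rejects 'AA', 'AGAG')."""
    return unit not in (unit + unit)[1:-1]

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, unit, repeat_count) for each perfect SSR, shortest motifs first."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One unit followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):  # drops homopolymers and disguised shorter motifs
                hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits

def motif_class_shares(hits):
    """Percentage of SSR loci per motif length (2 = dinucleotide, 3 = trinucleotide, ...)."""
    counts = Counter(len(unit) for _, unit, _ in hits)
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in counts.items()} if total else {}

# Hypothetical sequence with one (AG)8 and one (CAT)5 repeat
toy = "TTT" + "AG" * 8 + "CCGA" + "CAT" * 5 + "GG"
hits = find_ssrs(toy)
print(hits, motif_class_shares(hits))
```

    With 26 SSR loci in total, the reported dinucleotide share of 61.54% is consistent with 16 of the 26 loci (16/26 ≈ 61.54%).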

  15. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution.

    Directory of Open Access Journals (Sweden)

    Cassandra M Modahl

    2016-06-01

    Full Text Available Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. This approach, requiring only

  16. Deep sequencing the transcriptome reveals seasonal adaptive mechanisms in a hibernating mammal.

    Directory of Open Access Journals (Sweden)

    Marshall Hampton

    Full Text Available Mammalian hibernation is a complex phenotype involving metabolic rate reduction, bradycardia, profound hypothermia, and a reliance on stored fat that allows the animal to survive for months without food in a state of suspended animation. To determine the genes responsible for this phenotype in the thirteen-lined ground squirrel (Ictidomys tridecemlineatus) we used the Roche 454 platform to sequence mRNA isolated at six points throughout the year from three key tissues: heart, skeletal muscle, and white adipose tissue (WAT). Deep sequencing generated approximately 3.7 million cDNA reads from 18 samples (6 time points × 3 tissues) with a mean read length of 335 bases. Of these, 3,125,337 reads were assembled into 140,703 contigs. Approximately 90% of all sequences were matched to proteins in the human UniProt database. The total number of distinct human proteins matched by ground squirrel transcripts was 13,637 for heart, 12,496 for skeletal muscle, and 14,351 for WAT. Extensive mitochondrial RNA sequences enabled a novel approach of using the transcriptome to construct the complete mitochondrial genome for I. tridecemlineatus. Seasonal and activity-specific changes in mRNA levels that met our stringent false discovery rate cutoff (1.0 × 10^-11) were used to identify patterns of gene expression involving various aspects of the hibernation phenotype. Among these patterns are differentially expressed genes encoding heart proteins AT1A1, NAC1 and RYR2 controlling ion transport required for contraction and relaxation at low body temperatures. Abundant RNAs in skeletal muscle coding ubiquitin pathway proteins ASB2, UBC and DDB1 peak in October, suggesting an increase in muscle proteolysis. Finally, genes in WAT that encode proteins involved in lipogenesis (ACOD, FABP4) are highly expressed in August, but gradually decline in expression during the seasonal transition to lipolysis.

  17. Sequence analysis of keratin-like proteins and cloning of intermediate filament-like cDNA from higher plant cells

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Two keratin-like proteins of 64 and 55 ku were purified from suspension cells of Daucus carota L., and their partial amino acid sequences were determined. Homology analysis showed that the sequence from the 64 ku protein was highly homologous to β-glucosidase, and that the sequence from the 55 ku protein had no significant homologue in GenBank. Using a conserved sequence of animal IF proteins as primer, we cloned a cDNA fragment from Daucus carota L. Southern blot and Northern blot results indicated that this cDNA fragment was a single-copy gene and was expressed in both suspension cells and leaves. Homology analysis revealed that it had moderate homology to a variety of α-helical proteins. Our results might shed more light on the molecular characterization of IF existence in higher plants.

  18. Sequence analysis of keratin-like proteins and cloning of intermediate filament-like cDNA from higher plant cells

    Institute of Scientific and Technical Information of China (English)

    赵大中; 陈丹英; 杨橙; 翟中和

    2000-01-01

    Two keratin-like proteins of 64 and 55 ku were purified from suspension cells of Daucus carota L., and their partial amino acid sequences were determined. Homology analysis showed that the sequence from the 64 ku protein was highly homologous to β-glucosidase, and that the sequence from the 55 ku protein had no significant homologue in GenBank. Using a conserved sequence of animal IF proteins as primer, we cloned a cDNA fragment from Daucus carota L. Southern blot and Northern blot results indicated that this cDNA fragment was a single-copy gene and was expressed in both suspension cells and leaves. Homology analysis revealed that it had moderate homology to a variety of α-helical proteins. Our results might shed more light on the molecular characterization of IF existence in higher plants.

  19. Cloning and Sequence Analysis of Interleukin 10 (IL-10) Full-length cDNA from Cyprinus carpio L.

    Institute of Scientific and Technical Information of China (English)

    Xiangru FENG; Yilong CHEN; Xiao ZHAO; Wendong WANG; Junhui ZHANG; Zhenguo YANG SUN; Shengmei JIA; Qiang LU

    2012-01-01

    Abstract [Objective] This study aimed to obtain the IL-10 (interleukin 10) full-length cDNA of common carp (Cyprinus carpio L.) and conduct sequence analysis. [Method] The differentially expressed cDNA fragment was obtained by DD-RT-PCR (differential display RT-PCR). A cDNA library of peripheral blood leukocytes, separated from common carp and stimulated with mitogen, was screened with a probe labeled with DIG (digoxigenin). The IL-10 full-length cDNA was cloned from 0.8×10^4 pfu of recombinant phages, and sequence analysis and homology comparison were carried out. [Result] Sequence analysis indicated that the IL-10 full-length cDNA of common carp was 1,117 bp long, containing a 55 bp 5'-UTR, a 522 bp 3'-UTR, and a 540 bp open reading frame (ORF) encoding 179 amino acids. In addition, there were three mRNA instability motifs (ATTTA) in the 3'-untranslated region. The deduced protein sequence shared typical sequence features of the IL-10 family. Homology comparison indicated that the obtained sequence shared 89.1% homology with the carp IL-10 gene from GenBank. [Conclusion] This study laid a foundation for further study of the expression pattern, functional characteristics and regulation mechanism of IL-10 in vivo and of its interaction mechanism in the inflammatory reaction and immune response.
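    The reported transcript layout can be checked with simple arithmetic: a 55 bp 5'-UTR, a 540 bp ORF and a 522 bp 3'-UTR add up to 1,117 bp, and a 540 bp ORF that ends in a stop codon encodes 179 residues. The sketch below reproduces that check and counts ATTTA instability motifs in a 3'-UTR; the function names and the toy 3'-UTR string are hypothetical, and the real carp sequence is not reproduced here.

```python
def transcript_layout(utr5_bp, orf_bp, utr3_bp):
    """Return (total cDNA length in bp, encoded residues) for a simple mRNA layout.

    Assumes the ORF length includes the stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    total = utr5_bp + orf_bp + utr3_bp
    residues = orf_bp // 3 - 1  # the final codon is the stop codon
    return total, residues

def count_motif(seq, motif="ATTTA"):
    """Count occurrences of motif in seq, allowing overlaps (e.g. ATTTATTTA has 2)."""
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count

print(transcript_layout(55, 540, 522))      # lengths taken from the abstract
print(count_motif("GGATTTACCATTTATTTAGG"))  # toy 3'-UTR, not the carp sequence
```

    Overlapping matches are counted on purpose, because tandem AUUUA pentamers in AU-rich elements frequently share their terminal A.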

  20. Cloning, sequence analysis, and expression of cDNA coding for the major house dust mite allergen, Der f 1, in Escherichia coli

    Directory of Open Access Journals (Sweden)

    Y. Cui

    2008-05-01

    Full Text Available Our objective was to clone, express and characterize adult Dermatophagoides farinae group 1 (Der f 1) allergens to further produce recombinant allergens for future clinical applications in order to eliminate side reactions from crude extracts of mites. Based on GenBank data, we designed primers and amplified the cDNA fragment coding for Der f 1 by nested-PCR. After purification and recovery, the cDNA fragment was cloned into the pMD19-T vector. The fragment was then sequenced, subcloned into the plasmid pET28a(+), expressed in Escherichia coli BL21 and identified by Western blotting. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Sequence analysis showed the presence of an open reading frame containing 966 bp that encodes a protein of 321 amino acids. Interestingly, homology analysis showed that Der p 1 shared more than 87% identity in amino acid sequence with Eur m 1 but only 80% with Der f 1. Furthermore, phylogenetic analyses suggested that D. pteronyssinus was evolutionarily closer to Euroglyphus maynei than to D. farinae, even though D. pteronyssinus and D. farinae belong to the same Dermatophagoides genus. A total of three cysteine peptidase active sites were found in the predicted amino acid sequence, including 127-138 (QGGCGSCWAFSG), 267-277 (NYHAVNIVGYG) and 284-303 (YWIVRNSWDTTWGDSGYGYF). Moreover, secondary structure analysis revealed that Der f 1 contained α helix (33.96%), extended strand (17.13%), β turn (5.61%), and random coil (43.30%). A simple three-dimensional model of this protein was constructed using a Swiss-model server. The cDNA coding for Der f 1 was cloned, sequenced and expressed successfully. Alignment and phylogenetic analysis suggests that D. pteronyssinus is evolutionarily more similar to E. maynei than to D. farinae.
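    The figures quoted in this abstract are internally consistent, which can be confirmed with a short sketch: a 966 bp ORF (stop codon included) gives 321 residues, each active-site coordinate range has the same length as its peptide, and the four secondary-structure percentages sum to 100%. The helper names below are hypothetical; all numbers and peptides are copied from the abstract.

```python
# Active-site motifs as (start, end) residue coordinates -> peptide, from the abstract
MOTIFS = {
    (127, 138): "QGGCGSCWAFSG",
    (267, 277): "NYHAVNIVGYG",
    (284, 303): "YWIVRNSWDTTWGDSGYGYF",
}

# Predicted secondary-structure composition in percent, from the abstract
STRUCTURE_PERCENT = {
    "alpha helix": 33.96,
    "extended strand": 17.13,
    "beta turn": 5.61,
    "random coil": 43.30,
}

def residues_from_orf(orf_bp):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1

def spans_match(motifs):
    """True if every inclusive coordinate range is as long as its peptide."""
    return all(end - start + 1 == len(pep) for (start, end), pep in motifs.items())

print(residues_from_orf(966))
print(spans_match(MOTIFS))
print(round(sum(STRUCTURE_PERCENT.values()), 2))
```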

  1. Cloning and Sequence Analysis of the Full-length cDNA of a Novel yp05 Gene Associated With Citrinin Production in Monascus aurantiacus

    Institute of Scientific and Technical Information of China (English)

    YONG-HUA XIONG; YANG XU; WEI-HUA LAI; YAN-PIN LI; HUA WEI

    2007-01-01

    Objective To obtain the full-length cDNA of a novel gene (named yp05) associated with citrinin production in Monascus aurantiacus. Methods Total RNA was extracted from mycelium, the 3' and 5' cDNA ends of the yp05 gene were amplified using the SMART™ RACE cDNA amplification kit, and the full-length cDNA of yp05 was obtained by electronic assembly of the 3'-RACE and 5'-RACE products. Results The yp05 cDNA was 787 bp long, including a 597 bp open reading frame (ORF) encoding a deduced protein of 199 amino acid residues. BLASTp showed that the amino acid sequence of this protein is similar to those of many fungal manganese-superoxide dismutases in GenBank. The transcription of the yp05 gene in Monascus strains was analyzed by Northern blotting and was detected only in strains that produced citrinin. Conclusion The yp05 gene is differentially expressed in citrinin-producing Monascus and has no correlation with the biosynthesis pathway of red pigments.

  2. Cloning and sequence analysis of Ndufb2 full-length cDNA derived from Oncorhynchus mykiss

    Institute of Scientific and Technical Information of China (English)

    王家庆; 边佳; 李代宗; 马爽; 王亮; 那广宁

    2013-01-01

    Summary Rainbow trout is an aerobic fish of the family Salmonidae that requires water with a high dissolved oxygen content. If the dissolved oxygen content of the water falls below 5 mg/L, the respiratory rate increases, which is the so-called "aquaculture floating head" phenomenon. Because fish live in a hypoxic environment and 90% of oxygen consumption takes place in the mitochondria, the composition and electron transfer mechanism of their respiratory chain may differ from those of terrestrial animals. At the mitochondrial inner membrane, electrons from NADH and succinate pass through the electron transport chain to oxygen, which is reduced to water. Complex I is one of the main sites at which premature electron leakage to oxygen occurs, and thus one of the main sites of production of harmful superoxide. Since mitochondrial complex I was first isolated in 1961, a preliminary understanding of its composition and structure has been reached, but the specific mechanism of its participation in respiration, and especially the function of each subunit, is not clear. The protein encoded by the Ndufb2 gene is a subunit of the multisubunit NADH:ubiquinone oxidoreductase (complex I). Mammalian complex I is composed of 45 different subunits. This protein has NADH dehydrogenase activity and oxidoreductase activity. It plays an important role in transferring electrons from NADH to the respiratory chain. Reverse transcription PCR (RT-PCR) and rapid amplification of cDNA ends (RACE) were used to isolate the whole cDNA of the Ndufb2 gene from the brain of Oncorhynchus mykiss. The 3'- and 5'-RACE sequences were assembled with the DNAman program. A pair of gene-specific primers was designed to amplify the full-length cDNA sequence. ClustalX 1.81 and MEGA 3.0 software were used to calculate the amino acid sequence differences, and the phylogenetic relationships of the rainbow trout Ndufb2 sequence with those of other species were then analyzed. Protein phosphorylation sites and

  3. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    Directory of Open Access Journals (Sweden)

    Wadim L. Matochko

    2013-01-01

    Full Text Available Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low abundant clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by the Illumina method. The low theoretical complexity of this phage library, as compared to the complexity of long genetic reads and genomes, allowed us to describe this library using a convenient linear vector and operator framework. We describe a phage library as an N×1 frequency vector n with elements n_i, where n_i is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation of the library is an operator acting on n. Selection, amplification, or sequencing can be described as a product of an N×N matrix and a stochastic sampling operator (Sa). The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq). Sequencing without any bias and errors is Seq = Sa I_N, where I_N is an N×N unity matrix. Any bias in sequencing changes I_N to a nonunity matrix. We identified a diagonal censorship matrix (CEN), which describes elimination, or statistically significant downsampling, of specific reads during the sequencing process.
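
    The operator framework in this abstract can be illustrated with a toy library. The following is a minimal sketch under stated assumptions, not the authors' implementation: the copy numbers, the censorship diagonal, the sequencing depth and all function names are hypothetical, and sampling is modelled as drawing reads with probability proportional to copy number.

```python
import random


def apply_diagonal(diagonal, vector):
    """Multiply a diagonal matrix (given by its diagonal) with a vector."""
    return [d * v for d, v in zip(diagonal, vector)]


def sample_library(vector, depth, rng):
    """Return Sa applied to `vector`: read counts after drawing `depth` reads.

    Assumption: each read is drawn with probability proportional to the
    copy number of the clone it comes from.
    """
    draws = rng.choices(range(len(vector)), weights=vector, k=depth)
    counts = [0] * len(vector)
    for index in draws:
        counts[index] += 1
    return counts


rng = random.Random(0)

# Hypothetical frequency vector n for a library with N = 5 possible sequences
n = [500, 300, 150, 50, 0]

# Unbiased sequencing: Seq = Sa I_N
identity_diagonal = [1.0] * len(n)
unbiased_reads = sample_library(apply_diagonal(identity_diagonal, n), 200, rng)

# Biased sequencing: the unity matrix is replaced by a censorship matrix CEN.
# Hypothetical CEN that eliminates the third sequence entirely.
cen_diagonal = [1.0, 1.0, 0.0, 1.0, 1.0]
censored_reads = sample_library(apply_diagonal(cen_diagonal, n), 200, rng)
```

    In this sketch a censored clone never appears in the read counts regardless of its abundance in the library, whereas an uncensored clone is absent only if it has zero copies or is missed by sampling.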

  4. Construction of cDNA libraries and ESTs sequencing of Apis cerana cerana workers

    Institute of Scientific and Technical Information of China (English)

    张卫星; 郗学鹏; 秦明; 王帅; 刘春蕾; 王红芳; 胥保华

    2016-01-01

    [Objectives] To build a cDNA library to improve understanding of how honey bee workers respond to adverse conditions, and to analyze the quality of the resultant library. [Methods] A full-length cDNA library of Apis cerana cerana workers was constructed using the SMART technique. [Results] The library's capacity was 3.6×10^6 cfu/mL, the recombination rate was 97% and the average length of inserts was approximately 1,000 bp. EST sequencing of cDNA clones generated 306 ESTs. After initial assembly, these formed 234 non-redundant sequences (unigenes), comprising 207 singletons and 27 contigs. Blastx queries against GenBank revealed that 141 sequences could be assigned putative functions because they were homologous to known genes. The other sequences had no obvious homology, which suggests there is potential for the discovery of new functional genes. [Conclusion] The construction of this cDNA library has important benefits for the isolation, cloning and screening of functional genes and for gene function research in Apis cerana cerana.

  5. Cloning and Sequencing of cDNA Encoding Islet Cell Autoantigen 69 kD Protein from Chinese

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    Objective: The cloning of a cDNA encoding the human islet cell autoantigen 69 kD protein (hiICA69) has been reported; the aim here was to clone this cDNA from Chinese insulinoma cells and confirm its nucleotide sequence. Methods: The cDNA encoding hiICA69 was amplified by PCR from a cDNA library of Chinese insulinoma cells. The PCR product was inserted into the pSPORT 1 vector and subcloned into the pUC18 plasmid. After positive colonies were screened by blue/white selection and restriction analysis, the nucleotide sequence of the full-length cDNA was determined by the dideoxy chain termination method. Results: The amplified fragment contained 1449 bp and encoded 483 amino acids. The nucleotide sequence of the ICA69 recombinant from the Chinese insulinoma agreed with that reported by Miyazaki and with that in the EMBL data bank, except for a single base difference in one codon: base 416 changed from A to T (codon 139, CAA→CTA), which changed one amino acid (Gln→Leu). Conclusion: The gene obtained by genetic engineering and confirmed by sequence analysis lays a foundation for follow-up research.
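
    The relationship between the reported base position and the affected codon can be worked through explicitly. The sketch below is illustrative and assumes that position 416 is counted from the first base of the start codon; the helper names are hypothetical, and the codon table lists only the two codons named in the record.

```python
# Only the two codons named in the record (standard genetic code)
CODON_TABLE = {"CAA": "Gln", "CTA": "Leu"}


def locate_base(position):
    """Map a 1-based coding-sequence position to (codon number, offset in codon).

    Both returned values are 1-based. Assumption: position 1 is the first
    base of the start codon.
    """
    index = position - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, offset, new_base):
    """Replace the base at 1-based `offset` within a three-base codon."""
    return codon[: offset - 1] + new_base + codon[offset:]


codon_number, offset = locate_base(416)
mutant_codon = substitute("CAA", offset, "T")
amino_acid_change = (CODON_TABLE["CAA"], CODON_TABLE[mutant_codon])
```

    With this numbering, base 416 is the second base of codon 139, so an A→T change turns CAA into CTA and glutamine into leucine, consistent with the record.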

  6. Identification and characterization of cDNA sequences encoding the HIS3 and LEU2 genes of the fungus Alternaria tenuissima

    Institute of Scientific and Technical Information of China (English)

    Ying Wan; Xuli Wang; Yun Huang; Dewen Qiu; Linghuo Jiang

    2008-01-01

    Alternaria tenuissima is a fungus widely present in the environment and could cause diseases in plants and humans. In this study, through a yeast genetic approach, cDNA sequences were isolated and characterized for the AtHIS3 and AtLEU2 genes. AtHIS3 cDNA encodes a protein of 238 amino acids, while AtLEU2 cDNA encodes a protein of 363 amino acids. Based on the phylogenetic analysis of amino acid sequences of AtHis3p and AtLeu2p, A. tenuissima is closely related to the plant pathogenic fungus Phaeosphaeria nodorum. This study provides two genetic markers for studies of functions of genes regulating development, morphology, and virulence of A. tenuissima.

  7. Uroporphyrinogen-III synthase: Molecular cloning, nucleotide sequence, expression of a mouse full-length cDNA, and its localization on mouse chromosome 7

    Energy Technology Data Exchange (ETDEWEB)

    Xu, W.; Desnick, R.J. [Mount Sinai School of Medicine, New York, NY (United States)]; Kozak, C.A. [National Institute of Health, Bethesda, MD (United States)]

    1995-04-10

    Uroporphyrinogen-III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for the conversion of hydroxymethylbilane to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-S is the enzymatic defect in congenital erythropoietic porphyria (CEP), an autosomal recessive disorder. For the generation of a mouse model of CEP, the human URO-S cDNA was used to screen 2 × 10^6 recombinants from a mouse adult liver cDNA library. Ten positive clones were isolated, and dideoxy sequencing of the entire 1.6-kb insert of clone pmUROS-1 revealed 5′ and 3′ untranslated sequences of 144 and 623 bp, respectively, and an open reading frame of 798 bp encoding a 265-amino-acid polypeptide with a predicted molecular mass of 28,501 Da. The mouse and human coding sequences had 80.5 and 77.8% nucleotide and amino acid identity, respectively. The authenticity of the mouse cDNA was established by expression of the active monomeric enzyme in Escherichia coli. In addition, the analysis of two multilocus genetic crosses localized the mouse gene on chromosome 7, consistent with the mapping of the human gene to a position of conserved synteny on chromosome 10. The isolation, expression, and chromosomal mapping of this full-length cDNA should facilitate studies of the structure and organization of the mouse genomic sequence and the development of a mouse model of CEP for characterization of the disease pathogenesis and evaluation of gene therapy. 38 refs., 1 tab.

  8. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Bradley Michael Zamft

    Full Text Available High-throughput recording of signals embedded within inaccessible micro-environments is a technological challenge. The ideal recording device would be a nanoscale machine capable of quantitatively transducing a wide range of variables into a molecular recording medium suitable for long-term storage and facile readout in the form of digital data. We have recently proposed such a device, in which cation concentrations modulate the misincorporation rate of a DNA polymerase (DNAP) on a known template, allowing DNA sequences to encode information about the local cation concentration. In this work we quantify the cation sensitivity of DNAP misincorporation rates, making possible the indirect readout of cation concentration by DNA sequencing. Using multiplexed deep sequencing, we quantify the misincorporation properties of two DNA polymerases, Dpo4 and Klenow exo(-), obtaining the probability and base selectivity of misincorporation at all positions within the template. We find that Dpo4 acts as a DNA recording device for Mn(2+) with a misincorporation rate gain of ∼2%/mM. This modulation of misincorporation rate is selective to the template base: the probability of misincorporation on template T by Dpo4 increases >50-fold over the range tested, while the other template bases are affected less strongly. Furthermore, cation concentrations act as scaling factors for misincorporation: on a given template base, Mn(2+) and Mg(2+) change the overall misincorporation rate but do not alter the relative frequencies of incoming misincorporated nucleotides. Characterization of the ion dependence of DNAP misincorporation serves as the first step towards repurposing it as a molecular recording device.
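
    The indirect readout described above amounts to inverting a calibration curve. The sketch below is a simplified illustration, not the authors' analysis: it assumes a linear response, interprets the reported gain of about 2%/mM as an absolute increase of 0.02 in misincorporation probability per mM, and uses a hypothetical baseline rate and hypothetical read counts.

```python
def misincorporation_rate(mismatched_reads, total_reads):
    """Fraction of reads carrying a misincorporated base at one template position."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mismatched_reads / total_reads


def estimate_concentration_mM(rate, baseline_rate, gain_per_mM=0.02):
    """Invert an assumed linear calibration: rate = baseline + gain * concentration.

    Assumptions: linear response, and gain expressed as absolute probability
    per mM (0.02 corresponds to the roughly 2%/mM quoted in the abstract).
    """
    return (rate - baseline_rate) / gain_per_mM


# Hypothetical counts at one template position
observed_rate = misincorporation_rate(mismatched_reads=5_000, total_reads=100_000)

# Hypothetical baseline misincorporation rate in the absence of added cation
estimated_mM = estimate_concentration_mM(observed_rate, baseline_rate=0.01)
```

    A real calibration would be fitted per template base, since the abstract reports that the response depends strongly on the template base.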

  9. Cloning, sequencing and expression analysis of cDNA encoding a constitutive heat shock protein 70 (HSC70) in Fenneropenaeus chinensis

    Institute of Scientific and Technical Information of China (English)

    JIAO Chuanzhen; WANG Zaizhao; LI Fuhua; ZHANG Chengsong; XIANG Jianhai

    2004-01-01

    The cDNA encoding hsc70 of the Chinese shrimp Fenneropenaeus chinensis was cloned from hepatopancreas by RT-PCR based on its EST sequence. The full-length cDNA of 2090 bp contained an open reading frame of 1956 nucleotides and partial 5′- and 3′-untranslated regions (5′- and 3′-UTR). PCR amplification and sequencing analysis showed the existence of introns in the region of 1-547 bp, but not in the region of 548-2090 bp of hsc70 cDNA. When the deduced 652 amino acid sequence of HSC70 was compared with members of the HSP70 family from other organisms, the results showed 85.9% similarity with HSC71 from Oncorhynchus mykiss and HSC70 from Homo sapiens. It also exhibited 85.8% similarity with HSP70 from Mus musculus and 85.4% with HSC70 from Manduca sexta. Expression analysis showed that hsc70 mRNA was expressed constitutively in hepatopancreas, muscle, eyestalks, haemocytes, heart, ovary, intestine and gills of Fenneropenaeus chinensis. No difference in hsc70 mRNA level in muscle could be detected between heat-shocked and control animals.

  10. cDNA cloning, sequence analysis, and recombinant expression of akitonin beta, a C-type lectin-like protein from Agkistrodon acutus

    Institute of Scientific and Technical Information of China (English)

    Xiang-dong ZHA; Jing LIU; Kang-sen XU

    2004-01-01

    AIM: To clone the cDNA of a new member of the snake venom C-type lectin-like proteins, to study its structure-function relationships and to achieve its recombinant production. METHODS: PCR primers were designed based on homology, and cDNA was amplified by RT-PCR using total RNA from snake venom gland as the template. The PCR products were cloned into the plasmid pGEM-T and sequenced. The deduced protein sequence was analyzed with bioinformatic programs. A recombinant expression plasmid was constructed using pBADTOPO as the vector and transformed into E. coli TOP10 competent cells. RESULTS: A novel cDNA sequence encoding akitonin β was found and accepted by GenBank (accession number AF387100). Akitonin β consists of a typical carbohydrate recognition domain (CRD) of C-type lectins, and it is homologous with other snake venom C-type lectin-like proteins. It was predicted to be a platelet antagonist. Upon induction with arabinose, rAkitonin β was expressed in E. coli at a high level (more than 150 mg/L). The recombinant fusion protein exhibited inhibitory activity on rat platelet aggregation in vitro. CONCLUSION: A new member of snake venom C-type lectin-like proteins was discovered and characterized, and an efficient recombinant expression system was established for its production.

  11. Screening target specificity of siRNAs by rapid amplification of cDNA ends (RACE) for non-sequenced species.

    Science.gov (United States)

    Sabirzhanov, Boris; Sabirzhanova, Inna B; Keifer, Joyce

    2011-05-01

    RNA interference (RNAi) is the process of sequence-specific posttranscriptional gene silencing triggered by double-stranded RNAs (dsRNAs). RNAi is a widely used approach for studying gene function. However, studies have shown that using siRNA can lead to off-target effects when the siRNA contains sufficient sequence identity to non-target mRNA sequences. One of the important steps in designing dsRNA is verification that it has sequence identity to only the target mRNA. In this report, we propose an approach for primary screening of dsRNAs for potential off-target effects by using rapid amplification of cDNA ends. This method can be especially useful for model systems using species that have limited availability of sequence data.

  12. Ultra-deep sequencing of foraminiferal microbarcodes unveils hidden richness of early monothalamous lineages in deep-sea sediments.

    Science.gov (United States)

    Lecroq, Béatrice; Lejzerowicz, Franck; Bachar, Dipankar; Christen, Richard; Esling, Philippe; Baerlocher, Loïc; Østerås, Magne; Farinelli, Laurent; Pawlowski, Jan

    2011-08-09

    Deep-sea floors represent one of the largest and most complex ecosystems on Earth but remain essentially unexplored. The vastness and remoteness of this ecosystem make deep-sea sampling difficult, hampering traditional taxonomic observations and diversity assessment. This problem is particularly true in the case of the deep-sea meiofauna, which largely comprises small-sized, fragile, and difficult-to-identify metazoans and protists. Here, we introduce an ultra-deep sequencing-based metagenetic approach to examine the richness of benthic foraminifera, a principal component of deep-sea meiofauna. We used Illumina sequencing technology to assess foraminiferal richness in 31 unsieved deep-sea sediment samples from five distinct oceanic regions. We sequenced an extremely short fragment (36 bases) of the small subunit ribosomal DNA hypervariable region 37f, which has been shown to accurately distinguish foraminiferal species. In total, we obtained 495,978 unique sequences that were grouped into 1,643 operational taxonomic units, of which about half (841) could be reliably assigned to foraminifera. The vast majority of the operational taxonomic units (nearly 90%) were either assigned to early (ancient) lineages of soft-walled, single-chambered (monothalamous) foraminifera or remained undetermined and yet possibly belong to unknown early lineages. Contrasting with the classical view of multichambered taxa dominating foraminiferal assemblages, our work reflects an unexpected diversity of monothalamous lineages that are as yet unknown using conventional micropaleontological observations. Although we can only speculate about their morphology, the immense richness of deep-sea phylotypes revealed by this study suggests that ultra-deep sequencing can improve understanding of deep-sea benthic diversity considered until now as unknowable based on a traditional taxonomic approach.
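
    Grouping short unique reads into operational taxonomic units can be sketched as dereplication followed by clustering. The example below is a generic greedy, abundance-ordered clustering on Hamming distance and is not the procedure used in the cited study: the one-mismatch threshold, the toy six-base reads and the function names are all hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def dereplicate(reads):
    """Collapse identical reads into unique sequences with copy counts."""
    return Counter(reads)


def greedy_otus(unique_counts, max_mismatches=1):
    """Assign each unique sequence to the first sufficiently close centroid.

    Sequences are processed from most to least abundant; a sequence that
    matches no existing centroid becomes a new centroid.
    """
    otus = {}
    ordered = sorted(unique_counts.items(), key=lambda item: (-item[1], item[0]))
    for sequence, count in ordered:
        for centroid in otus:
            if len(centroid) == len(sequence) and hamming(centroid, sequence) <= max_mismatches:
                otus[centroid] += count
                break
        else:
            otus[sequence] = count
    return otus


# Toy reads (six bases instead of the 36-base fragments used in the study)
reads = ["ACGTAC"] * 5 + ["ACGTAA"] * 2 + ["TTGCAA"] * 3 + ["TTGCAT"]
unique_sequences = dereplicate(reads)
otus = greedy_otus(unique_sequences)
```

    Here eleven reads collapse to four unique sequences and then to two clusters, which mirrors on a small scale how many unique sequences reduce to far fewer operational taxonomic units.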

  13. Deep Sequencing the MicroRNA Transcriptome in Colorectal Cancer.

    Directory of Open Access Journals (Sweden)

    Kristina Schee

    Full Text Available Colorectal cancer (CRC) is one of the leading causes of cancer related deaths and the search for prognostic biomarkers that might improve treatment decisions is warranted. MicroRNAs (miRNAs) are short non-coding RNA molecules involved in regulating gene expression and have been proposed as possible biomarkers in CRC. In order to characterize the miRNA transcriptome, a large cohort including 88 CRC tumors with long-term follow-up was deep sequenced. 523 mature miRNAs were expressed in our cohort, and they exhibited largely uniform expression patterns across tumor samples. Few associations were found between clinical parameters and miRNA expression; among them, low expression of miR-592 and high expression of miR-10b-5p and miR-615-3p were associated with tumors located in the right colon relative to the left colon and rectum. High expression of miR-615-3p was also associated with poorly differentiated tumors. No prognostic biomarker candidates for overall and metastasis-free survival were identified by applying the LASSO method in a Cox proportional hazards model or univariate Cox. Examination of the five most abundantly expressed miRNAs in the cohort (miR-10a-5p, miR-21-5p, miR-22-3p, miR-143-3p and miR-192-5p) revealed that their collective expression represented 54% of the detected miRNA sequences. Pathway analysis of the target genes regulated by the five most highly expressed miRNAs uncovered a significant number of genes involved in the CRC pathway, including APC, TGFβ and PI3K, thus suggesting that these miRNAs are relevant in CRC.

  14. Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

    Directory of Open Access Journals (Sweden)

    Sakaki Yoshiyuki

    2007-12-01

    Full Text Available Abstract Background Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses, is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome.

  15. Sequence determination of cDNA clones of transcripts from the tumor-associated region of the Marek's disease virus genome.

    Science.gov (United States)

    Iwata, A; Ueda, S; Ishihama, A; Hirai, K

    1992-04-01

    The number of 132-bp tandem direct repeats within the long inverted repeat region of the Marek's disease virus type 1 (MDV1) genome increases concomitantly with the loss of oncogenicity during serial passages in cultured cells. Twelve clones carrying the 132-bp sequence were isolated from a cDNA library constructed from chicken embryo fibroblasts infected with the MDV1 Md5 strain. Through sequence analysis of a cDNA clone and primer extension analysis, the corresponding mRNA was found to be a linear transcript which included the two 132-bp tandem direct repeats. Two open reading frames were found in this transcript. One had a weak homology with v-fms. The other should increase its size concomitantly with expansion of the 132-bp tandem direct repeat. PCR analysis of both cDNA clones and RNA gave amplified products which were as large as that produced from the genomic clone, indicating that a majority of mRNA from this region is composed of unspliced transcripts.

  16. Detection of germline mutations of hMLH1 and hMSH2 based on cDNA sequencing in China

    Institute of Scientific and Technical Information of China (English)

    Chao-Fu Wang; Xiao-Yan Zhou; Tai-Ming Zhang; Meng-Hong Sun; Da-Ren Shi

    2005-01-01

    AIM: To detect germline mutations of hMLH1 and hMSH2 based on mRNA sequencing in order to identify hereditary nonpolyposis colorectal cancer (HNPCC) families. METHODS: Total RNA was extracted from the peripheral blood of 14 members of 12 different families fulfilling Amsterdam criteria Ⅱ. mRNA of hMLH1 and hMSH2 was reverse-transcribed with specific primers and heat-resistant reverse transcriptase. cDNA was amplified with expand long template PCR, followed by cDNA sequencing analysis. RESULT: Seven germline mutations were found in 6 families (6/12, 50%): 4 hMLH1 mutations (4/12, 33.3%) and 3 hMSH2 mutations (3/12, 25%). The mutation types comprised 4 missense, 1 silent and 1 frameshift mutation, as well as 1 mutation in the non-coding area. Four of the seven mutations have not been reported previously. The 4 hMLH1 mutations were distributed in exons 8, 12, 16, and 19. The 3 hMSH2 mutations were distributed in exons 1 and 2. Six of the 7 mutations were pathological and were distributed in 5 HNPCC families. CONCLUSION: Germline mutations of hMLH1 and hMSH2 can be detected by cDNA sequencing to identify HNPCC families; the method is highly sensitive and saves cost and time.

  17. Global Identification of Significantly Expressed Genes in Developing Endosperm of Rice by Expression Sequence Tags and cDNA Array Approaches

    Institute of Scientific and Technical Information of China (English)

    Qichao Tu; Haitao Dong; Haigen Yao; Yongqi Fang; Cheng'en Dai; Hongmei Luo; Jian Yao; Dong Zhao; Debao Li

    2008-01-01

    Rice endosperm plays a very important role in seedling germination and determines the qualities of rice grain. Although studies on specific gene categories in endosperm have been carried out, a global view of gene expression at the transcription level in rice endosperm is still limited. To gain a better understanding of the global and tissue-specific gene expression profiles in rice endosperm, a cDNA library from rice endosperm of immature seeds was sequenced. A cDNA array was constructed based on the tentative unique transcripts derived from expression sequence tag (EST) assembling results and then hybridized with cDNAs from five different tissues or organs including endosperm, embryo, leaf, stem and root of rice. Significant redundancy was found for genes encoding prolamin, glutelin, allergen, and starch synthesis proteins, accounting for ~34% of the total ESTs obtained. The cDNA array revealed 87 significantly expressed genes in endosperm compared with the other four organs or tissues. These genes included 13 prolamin family proteins, 17 glutelin family proteins, 12 binding proteins, nine catalytic proteins and four ribosomal proteins, indicating a complicated biological processing in rice endosperm. In addition, Northern verification of 1,4-alpha-glucan branching enzyme detected two isoforms in rice endosperm, the larger one of which only existed in endosperm.

  18. Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

    Science.gov (United States)

    Qi, Fei; Guo, Huarong; Wang, Jian

    2008-02-01

    Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative protein of 327 amino acids with a calculated molecular mass of 37,193 Da and a pI of 5.8. The N-terminal section of this protein, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, is highly acidic, a common feature of PP1 catalytic subunits that is absent in protein phosphatase 2B (PP2B). Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif around the ATG start codon in the 5'UTR is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different translation initiation in turbot. The 3'UTR of SmPP1cb is highly diverse in sequence and length compared with those of other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.

  19. Molecular cloning and nucleotide sequence of a full-length cDNA for human alpha enolase.

    Science.gov (United States)

    Giallongo, A; Feo, S; Moore, R; Croce, C M; Showe, L C

    1986-01-01

    We previously purified a 48-kDa protein (p48) that specifically reacts with an antiserum directed against the 12 carboxyl-terminal amino acids of the c-myc gene product. Using an antiserum directed against the purified p48, we have cloned a cDNA from a human expression library. This cDNA hybrid-selects an mRNA that translates to a 48-kDa protein that specifically reacts with anti-p48 serum. We have isolated a full-length cDNA that encodes p48 and spans 1755 bases. The coding region is 1299 bases long; 94 bases are 5' noncoding and 359 bases are 3' noncoding. The cDNA encodes a 433 amino acid protein that is 67% homologous to yeast enolase and 94% homologous to the rat non-neuronal enolase. The purified protein has been shown to have enolase activity and has been identified to be of the alpha type by isoenzyme analysis. The transcriptional regulation of enolase expression in response to mitogenic stimulation of peripheral blood lymphocytes and in response to heat shock is also discussed. PMID:3529090

  20. Nucleotide sequence of a cDNA coding for the barley seed protein CMa: an inhibitor of insect α-amylase

    DEFF Research Database (Denmark)

    Rasmussen, Søren Kjærsgård; Johansson, A.

    1992-01-01

    The primary structure of the insect alpha-amylase inhibitor CMa of barley seeds was deduced from a full-length cDNA clone pc43F6. Analysis of RNA from barley endosperm shows high levels 15 and 20 days after flowering. The cDNA predicts an amino acid sequence of 119 residues preceded by a signal peptide of 25 amino acids. Ala and Leu account for 55% of the signal peptide. CMa is 60-85% identical with alpha-amylase inhibitors of wheat, but shows less than 50% identity to trypsin inhibitors of barley and wheat. The 10 Cys residues are located in identical positions compared to the cereal inhibitor...

  1. Immunological responses of turbot (Psetta maxima) to nodavirus infection or polyriboinosinic polyribocytidylic acid (pIC) stimulation, using expressed sequence tags (ESTs) analysis and cDNA microarrays.

    Science.gov (United States)

    Park, Kyoung C; Osborne, Jane A; Montes, Ariana; Dios, Sonia; Nerland, Audun H; Novoa, Beatriz; Figueras, Antonio; Brown, Laura L; Johnson, Stewart C

    2009-01-01

    To investigate the immunological responses of turbot to nodavirus infection or pIC stimulation, we constructed cDNA libraries from liver, kidney and gill tissues of nodavirus-infected fish and examined the differential gene expression within turbot kidney in response to nodavirus infection or pIC stimulation using a turbot cDNA microarray. Turbot were experimentally infected with nodavirus and samples of each tissue were collected at selected time points post-infection. Using equal amounts of total RNA at each sampling time, we made three tissue-specific cDNA libraries. After sequencing 3230 clones we obtained 3173 (98.2%) high quality sequences from our liver, kidney and gill libraries. Of these, 2568 (80.9%) were identified as known genes and 605 (19.1%) as unknown genes. A total of 768 unique genes were identified. The two largest groups resulting from the classification of ESTs according to function were the cell/organism defense genes (71 uni-genes) and apoptosis-related processes (23 uni-genes). Using these clones, a 1920 element cDNA microarray was constructed and used to investigate the differential gene expression within turbot in response to experimental nodavirus infection or pIC stimulation. Kidney tissue was collected at selected times post-infection (HPI) or stimulation (HPS), and total RNA was isolated for microarray analysis. Of the 1920 genes studied on the microarray, we identified a total of 121 differentially expressed genes in the kidney: 94 genes from nodavirus-infected animals and 79 genes from those stimulated with pIC. Within the nodavirus-infected fish we observed the highest number of differentially expressed genes at 24 HPI. Our results indicate that certain genes in turbot have important roles in immune responses to nodavirus infection and dsRNA stimulation.
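
    The counts in this abstract can be reproduced with a short calculation. The sketch below is illustrative only; the overlap figure in particular rests on the assumption that the total of 121 genes is the union of the nodavirus and pIC gene sets, which the abstract does not state explicitly.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Sequencing yield and annotation fractions quoted in the abstract
high_quality_pct = percent(3173, 3230)
known_pct = percent(2568, 3173)
unknown_pct = percent(605, 3173)

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
# Assumption: 121 is the size of the union of both gene sets.
nodavirus_genes = 94
pic_genes = 79
total_genes = 121
shared_genes = nodavirus_genes + pic_genes - total_genes
```

    Under that assumption, 52 genes would be differentially expressed in both treatments, leaving 42 specific to nodavirus infection and 27 specific to pIC stimulation.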

  2. A subtractive cDNA library from an identified regenerating neuron is enriched in sequences up-regulated during nerve regeneration.

    Science.gov (United States)

    Korneev, S; Fedorov, A; Collins, R; Blackshaw, S E; Davies, J A

    1997-01-01

    We have constructed a subtractive cDNA library from regenerating Retzius cells of the leech, Hirudo medicinalis. It is highly enriched in sequences up-regulated during nerve regeneration. Sequence analysis of selected recombinants has identified both novel sequences and sequences homologous to molecules characterised in other species. Homologies include alpha-tubulin, a calmodulin-like protein, CAAT/enhancer-binding protein (C/EBP), protein 4.1 and synapsin. These types of proteins are exactly those predicted to be associated with axonal growth and their identification confirms the quality of the library. Most interesting, however, is the isolation of 5 previously uncharacterised cDNAs which appear to be up-regulated during regeneration. Their analysis is likely to provide new information on the molecular mechanisms of neuronal regeneration.

  3. Deep-Sea, Deep-Sequencing: Metabarcoding Extracellular DNA from Sediments of Marine Canyons.

    Directory of Open Access Journals (Sweden)

    Magdalena Guardiola

    Full Text Available Marine sediments are home to one of the richest species pools on Earth, but logistics and a dearth of taxonomic workforce hinder the knowledge of their biodiversity. We characterized α- and β-diversity of deep-sea assemblages from submarine canyons in the western Mediterranean using environmental DNA metabarcoding. We used a new primer set targeting a short eukaryotic 18S sequence (ca. 110 bp). We applied a protocol designed to obtain extractions enriched in extracellular DNA from replicated sediment corers. With this strategy we captured information from DNA (local or deposited from the water column) that persists adsorbed to inorganic particles and buffered short-term spatial and temporal heterogeneity. We analysed replicated samples from 20 localities including 2 deep-sea canyons, 1 shallower canal, and two open slopes (depth range 100-2,250 m). We identified 1,629 MOTUs, among which the dominant groups were Metazoa (with representatives of 19 phyla), Alveolata, Stramenopiles, and Rhizaria. There was a marked small-scale heterogeneity as shown by differences in replicates within corers and within localities. The spatial variability between canyons was significant, as was the depth component in one of the canyons where it was tested. Likewise, the composition of the first layer (1 cm) of sediment was significantly different from deeper layers. We found that qualitative (presence-absence) and quantitative (relative number of reads) data showed consistent trends of differentiation between samples and geographic areas. The subset of exclusively benthic MOTUs showed similar patterns of β-diversity and community structure as the whole dataset. Separate analyses of the main metazoan phyla (in number of MOTUs) showed some differences in distribution attributable to different lifestyles. Our results highlight the differentiation that can be found even between geographically close assemblages, and set the ground for future monitoring and conservation

  4. Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B antigen

    Energy Technology Data Exchange (ETDEWEB)

    Habets, W.J.; Sillekens, P.T.G.; Hoet, M.H.; Schalken, J.A.; Roebroek, A.J.M.; Leunissen, J.A.M.; Van de Ven, W.J.M.; Van Venrooij, W.J.

    1987-04-01

    A U2 small nuclear RNA-associated protein, designated B'', was recently identified as the target antigen for autoimmune sera from certain patients with systemic lupus erythematosus and other rheumatic diseases. Such antibodies enabled them to isolate cDNA clone lambdaHB''-1 from a phage lambdagt11 expression library. This clone appeared to code for the B'' protein as established by in vitro translation of hybrid-selected mRNA. The identity of clone lambdaHB''-1 was further confirmed by partial peptide mapping and analysis of the reactivity of the recombinant antigen with monospecific and monoclonal antibodies. Analysis of the nucleotide sequence of the 1015-base-pair cDNA insert of clone lambdaHB''-1 revealed a large open reading frame of 800 nucleotides containing the coding sequence for a polypeptide of 25,457 daltons. In vitro transcription of the lambdaHB''-1 cDNA insert and subsequent translation resulted in a protein product with the molecular size of the B'' protein. These data demonstrate that clone lambdaHB''-1 contains the complete coding sequence of this antigen. The deduced polypeptide sequence contains three very hydrophilic regions that might constitute RNA binding sites and/or antigenic determinants. These findings might have implications both for the understanding of the pathogenesis of rheumatic diseases as well as for the elucidation of the biological function of autoimmune antigens.

  5. Cloning and Sequencing of a Full-Length cDNA Encoding the RuBPCase Small Subunit (RbcS) in Tea (Camellia sinensis)

    Institute of Scientific and Technical Information of China (English)

    YE Ai-hua; JIANG Chang-jun; ZHU Lin; YU Mei; WANG Zhao-xia; DENG Wei-wei; WEI Chao-lin

    2009-01-01

    This study aimed to isolate the ribulose-1,5-bisphosphate carboxylase/oxygenase small subunit (RbcS) from tea plant [Camellia sinensis (L.) O. Kuntze]. In a study of transcriptional profiling of gene expression during tea flower bud development by cDNA-AFLP (cDNA amplified fragment length polymorphism), we isolated some transcript-derived fragments (TDFs) occurring in both the young and mature flower bud. One of them showed a high degree of similarity to RbcS. Based on this fragment, the full-length RbcS cDNA of 769 bp (EF011075) was obtained via rapid amplification of cDNA ends (RACE). It contained an open reading frame of 176 amino acids consisting of a chloroplast transit peptide of 52 amino acids and a mature protein of 124 amino acids. The amino acid sequence showed high identity to those of other plant RbcS genes. It also contains three conserved domains, a protein kinase C phosphorylation site, one tyrosine kinase phosphorylation site and two N-myristoylation sites. Analysis by RT-PCR showed that the expression of RbcS in tea, from high to low, was leaf, young stem, young flower bud and mature flower bud. The isolation of the tea Rubisco small subunit gene establishes a good foundation for further study of photosynthesis in the tea plant.

  6. Identification and characterization of a novel legume-like lectin cDNA sequence from the red marine algae Gracilaria fisheri

    Indian Academy of Sciences (India)

    Sukanya Suttisrisung; Saengchan Senapin; Boonsirm Withyachumnarnkul; Kanokpan Wongprasert

    2011-12-01

    A legume-type lectin (L-lectin) gene of the red algae Gracilaria fisheri (GFL) was cloned by rapid amplification of cDNA ends (RACE). The full-length cDNA of GFL was 1714 bp and contained a 1542 bp open reading frame encoding 513 amino acids with a predicted molecular mass of 56.5 kDa. Analysis of the putative amino acid sequence with NCBI-BLAST revealed a high homology (30–68%) with legume-type lectins (L-lectin) from Griffithsia japonica, Clavispora lusitaniae, Acyrthosiphon pisum, Tetraodon nigroviridis and Xenopus tropicalis. Phylogenetic relationship analysis showed the highest sequence identity to a glycoprotein of the red algae Griffithsia japonica (68%) (GenBank number AAM93989). Conserved Domain Database analysis detected an N-terminal carbohydrate recognition domain (CRD), the characteristic of L-lectins, which contained two sugar binding sites and a metal binding site. The secondary structure prediction of GFL showed a β-sheet structure, connected with turn and coil. The most abundant structural element of GFL was the random coil, while the α-helices were distributed at the N- and C-termini, and 21 β-sheets were distributed in the CRD. Computer analysis of three-dimensional structure showed a common feature of L-lectins of GFL, which included an overall globular shape that was composed of a β-sandwich of two anti-parallel β-sheets; monosaccharide binding sites were on the top of the structure and in proximity with a metal binding site. Northern blot analysis using a DIG-labelled probe derived from a partial GFL sequence revealed a hybridization signal of ∼1.7 kb consistent with the length of the full-length GFL cDNA identified by RACE. No detectable band was observed from control total RNA extracted from filamentous green algae.

  7. Identification and characterization of a novel legume-like lectin cDNA sequence from the red marine algae Gracilaria fisheri.

    Science.gov (United States)

    Suttisrisung, Sukanya; Senapin, Saengchan; Withyachumnarnkul, Boonsirm; Wongprasert, Kanokpan

    2011-12-01

    A legume-type lectin (L-Lectin) gene of the red algae Gracilaria fisheri (GFL) was cloned by rapid amplification of cDNA ends (RACE). The full-length cDNA of GFL was 1714 bp and contained a 1542 bp open reading frame encoding 513 amino acids with a predicted molecular mass of 56.5 kDa. Analysis of the putative amino acid sequence with NCBI-BLAST revealed a high homology (30-68%) with legume-type lectins (L-lectin) from Griffithsia japonica, Clavispora lusitaniae, Acyrthosiphon pisum, Tetraodon nigroviridis and Xenopus tropicalis. Phylogenetic relationship analysis showed the highest sequence identity to a glycoprotein of the red algae Griffithsia japonica (68%) (GenBank number AAM93989). Conserved Domain Database analysis detected an N-terminal carbohydrate recognition domain (CRD), the characteristic of L-lectins, which contained two sugar binding sites and a metal binding site. The secondary structure prediction of GFL showed a beta-sheet structure, connected with turn and coil. The most abundant structural element of GFL was the random coil, while the alpha-helices were distributed at the N- and C-termini, and 21 beta-sheets were distributed in the CRD. Computer analysis of three-dimensional structure showed a common feature of L-lectins of GFL, which included an overall globular shape that was composed of a beta-sandwich of two anti-parallel beta-sheets; monosaccharide binding sites were on the top of the structure and in proximity with a metal binding site. Northern blot analysis using a DIG-labelled probe derived from a partial GFL sequence revealed a hybridization signal of (approx.) 1.7 kb consistent with the length of the full-length GFL cDNA identified by RACE. No detectable band was observed from control total RNA extracted from filamentous green algae.

  8. Transcriptome analysis of the model protozoan, Tetrahymena thermophila, using Deep RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Jie Xiong

    Full Text Available BACKGROUND: The ciliated protozoan Tetrahymena thermophila is a well-studied single-celled eukaryote model organism for cellular and molecular biology. However, the lack of extensive T. thermophila cDNA libraries or a large expressed sequence tag (EST) database limited the quality of the original genome annotation. METHODOLOGY/PRINCIPAL FINDINGS: This RNA-seq study describes the first deep sequencing analysis of the T. thermophila transcriptome during the three major stages of the life cycle: growth, starvation and conjugation. Uniquely mapped reads covered more than 96% of the 24,725 predicted gene models in the somatic genome. More than 1,000 new transcribed regions were identified. The great dynamic range of RNA-seq allowed detection of a nearly six order-of-magnitude range of measurable gene expression orchestrated by this cell. RNA-seq also allowed the first prediction of transcript untranslated regions (UTRs) and an updated (larger) size estimate of the T. thermophila transcriptome: 57 Mb, or about 55% of the somatic genome. Our study identified nearly 1,500 alternative splicing (AS) events distributed over 5.2% of T. thermophila genes. This percentage represents a two order-of-magnitude increase over previous EST-based estimates in Tetrahymena. Evidence of stage-specific regulation of alternative splicing was also obtained. Finally, our study allowed us to completely confirm about 26.8% of the genes originally predicted by the gene finder, to correct coding sequence boundaries and intron-exon junctions for about a third, and to reassign microarray probes and correct earlier microarray data. CONCLUSIONS/SIGNIFICANCE: RNA-seq data significantly improve the genome annotation and provide a fully comprehensive view of the global transcriptome of T. thermophila. To our knowledge, 5.2% of T. thermophila genes with AS is the highest percentage of genes showing AS reported in a unicellular eukaryote. Tetrahymena thus becomes an excellent unicellular

  9. Synthetic oligonucleotides with particular base sequences from the cDNA encoding proteins of Mycobacterium bovis BCG induce interferons and activate natural killer cells.

    Science.gov (United States)

    Tokunaga, T; Yano, O; Kuramoto, E; Kimura, Y; Yamamoto, T; Kataoka, T; Yamamoto, S

    1992-01-01

    Thirteen kinds of 45-mer single-stranded oligonucleotide, having sequence randomly selected from the known cDNA encoding BCG proteins, were tested for their capability to augment natural killer (NK) cell activity of mouse spleen cells in vitro. Six out of the 13 oligonucleotides showed the activity, while the others did not. In order to know the minimal and essential sequence(s) responsible for the biological activity, 2 kinds of 30-mer and 5 kinds of 15-mer oligonucleotide fragments of an active 45-mer nucleotide were tested for their activity. One of the 30-mer oligonucleotides, designated BCG-A4a, was active, but the other 30-mer was inactive. All of the 15-mer oligonucleotide fragments were inactive. The BCG-A4a also stimulated the spleen cells to produce interferon (IFN)-alpha and -gamma. An experiment using anti-IFN antisera showed that the NK cell activation by the oligonucleotide was ascribed to the IFN-alpha produced. It was noticed that all of the biologically active oligonucleotides possessed one or more palindrome sequence(s), and the inactive ones did not, with an exception of a 45-mer inactive oligonucleotide containing overlapping palindrome sequences (GGGCCCGGG). These findings strongly suggest that certain palindrome sequences, like GACGTC, GGCGCC and TGCGCA, are essential for 30-mer oligonucleotides, like BCG-A4a, to induce IFNs.
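
    The "palindrome sequences" in this abstract are palindromes in the molecular-biology sense: a sequence that equals its own reverse complement. The sketch below is one way to test for that and to locate such hexamers inside a longer stretch. The function names and the fixed window of six bases are illustrative assumptions, not part of the original study.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase/lowercase DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same as its reverse complement."""
    seq = seq.upper()
    return seq == reverse_complement(seq)


def palindromic_windows(seq: str, k: int = 6) -> list:
    """List (offset, k-mer) pairs for every palindromic window of length k."""
    seq = seq.upper()
    return [
        (i, seq[i:i + k])
        for i in range(len(seq) - k + 1)
        if is_palindrome(seq[i:i + k])
    ]


# The three hexamers named in the abstract.
for motif in ("GACGTC", "GGCGCC", "TGCGCA"):
    print(motif, is_palindrome(motif))

# The overlapping palindromes reported for the inactive 45-mer.
print(palindromic_windows("GGGCCCGGG"))
```

    All three named hexamers pass the test, and the scan of GGGCCCGGG finds two palindromic hexamers (GGGCCC at offset 0 and CCCGGG at offset 3) that share the central CCC, which matches the abstract's description of overlapping palindromes.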

  10. A combined de novo protein sequencing and cDNA library approach to the venomic analysis of Chinese spider Araneus ventricosus.

    Science.gov (United States)

    Duan, Zhigui; Cao, Rui; Jiang, Liping; Liang, Songping

    2013-01-14

    In past years, spider venoms have attracted increasing attention due to their extraordinary chemical and pharmacological diversity. The recently popularized proteomic method has greatly improved our ability to analyze the proteins in the venom. However, the lack of information about isolated venom protein sequences dramatically limits the ability to confidently identify venom proteins. In the present paper, the venom from Araneus ventricosus was analyzed using two complementary approaches: 2-DE/Shotgun-LC-MS/MS coupled to MASCOT search and 2-DE/Shotgun-LC-MS/MS coupled to manual de novo sequencing followed by local venom protein database (LVPD) search. The LVPD was constructed with toxin-like protein sequences obtained from the analysis of a cDNA library from A. ventricosus venom glands. Our results indicate that a total of 130 toxin-like protein sequences were unambiguously identified by manual de novo sequencing coupled to LVPD search, accounting for 86.67% of all toxin-like proteins in the LVPD. Thus manual de novo sequencing coupled to LVPD search proved to be an extremely effective approach for the analysis of venom proteins. In addition, the approach displays a clear advantage in validating mutant positions of isoforms from the same toxin-like family. Intriguingly, methyl esterification of glutamic acid was discovered for the first time in animal venom proteins by manual de novo sequencing.

  11. Acetylcholinesterase of the Sand Fly, Phlebotomus papatasi (Scopoli): cDNA Sequence, Baculovirus Expression, and Biochemical Properties

    Science.gov (United States)

    2013-01-01

    temperature of 62.5°C and extension time of 3 min. at 72°C. The cDNA was cloned into pBlueBac4.5/V5-His TOPO (Applied Biosystems/Life Technologies) and...Parasitol 2004, 20:328–332. 4. Kravchenko V, Wasserberg G, Warburg A: Bionomics of phlebotomine sandflies in the Galilee focus of cutaneous...phlebotomine sandflies. Med Vet Entomol 2011, 25:227–231. 7. Mirzaei A, Rouhani S, Taherkhani H, Farahmand M, Kazemi B, Hedayati M, Baghaei A, Davari B

  12. A juvenile hormone-repressible transferrin-like protein from the bean bug, Riptortus clavatus: cDNA sequence analysis and protein identification during diapause and vitellogenesis.

    Science.gov (United States)

    Hirai, M; Watanabe, D; Chinzei, Y

    2000-05-01

    We found several juvenile hormone-responsive cDNAs in the bean bug, Riptortus clavatus, by using mRNA differential display (Hirai et al., 1998). One of them, a juvenile hormone-repressible cDNA, JR-3, was cloned, sequenced, characterized and identified as a transferrin (RcTf). RcTf cDNA encoded 652 amino acids with a calculated molecular weight of 71,453 Da. The deduced amino acid sequence showed significant homology with the transferrin genes of several insects, Manduca sexta (43% identity), Blaberus discoidalis (43%), Aedes aegypti (43%), Drosophila melanogaster (36%) and Sarcophaga peregrina (36%), and with that of human (25%). Antiserum was prepared by using recombinant RcTf protein expressed in Escherichia coli as an antigen. The antiserum reacted specifically with both the recombinant protein and the native protein from the bugs, with sizes of 70 and 75 kDa, respectively. The 75 kDa protein was partially purified from hemolymph of diapausing female bugs and the first ten amino acids were found to be identical to those deduced from RcTf cDNA, indicating that the 75 kDa protein is RcTf. The tissue distribution of RcTf in the bug was examined by Western blot analysis. In diapausing animals, RcTf was detected in the fat body, hemolymph and ovary but not in the gut. In the post-diapause stage, RcTf was also detected in eggs, in addition to the fat body and ovary. These results indicate that RcTf is incorporated into the oocytes during vitellogenesis, and suggest that it may provide iron for the developing embryos.

  13. Sequencing and rescuing a highly virulent classical swine fever virus: Chinese strain cF114 from a full-length cDNA clone

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The complete nucleotide sequence of classical swine fever virus (CSFV) strain cF114 (F114 strain propagated on PK-15 cells) was cloned by RT-PCR. The nucleotide identities between cF114 and the F114, Brescia, Alfort and C strains were 99.41%, 96.80%, 86.03% and 95.70%, and the amino acid identities were 99.28%, 98.54%, 93.33% and 97.41%, respectively. The cDNA fragments with correct sequence were ligated into a full-length cDNA and inserted into pMC18 plasmid (pMC12297). A full-length infectious viral RNA was synthesized by runoff transcription and transfected into PK-15 cells. Viruses were recovered from transfected cells and were titrated on PK-15 cells by endpoint dilution and indirect immunofluorescence with a CSFV-specific monoclonal antibody. The antigenicity and replication kinetics of the plasmid-derived virus (vM12297) were similar to those of the parental virus in vitro. The E01 or E2 gene was replaced with the genes from strain C, and pM/CE01 and pM/CE2, carrying chimeric full-length cDNA of cF114, were generated. Infectious viruses were obtained from pM/CE01 and pM/CE2. Both of the chimeric viruses can infect PK-15, SK-6 and primary swine testicle cells. The chimeric viruses can grow to a titer of 8 × 10^5 F-PFU/mL. These results are very important for understanding the genes related to CSFV propagation and pathogenesis.

  14. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    Science.gov (United States)

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-05-24

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% of ESTs were found to be related to all kinds of metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover and chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structural proteins and 3.8% to cell cycle, and only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers.
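
    The library statistics quoted above can be cross-checked with a little arithmetic. The sketch below back-calculates EST counts from the BLASTX percentages and computes the fold increase in titer on amplification. It assumes the three BLASTX percentages refer to all 212 ESTs; the resulting counts are inferred here and are not reported in the abstract.

```python
# Figures taken from the abstract.
BLASTX_PERCENT = {"strong": 48.1, "nominal": 45.3, "weak": 6.6}
TOTAL_ESTS = 212
PRIMARY_TITER = 1.28e6    # pfu/mL, primary library
AMPLIFIED_TITER = 1.56e9  # pfu/mL, amplified library


def percent_to_counts(percentages: dict, total: int) -> dict:
    """Convert rounded percentages back to the nearest whole-number counts."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}


counts = percent_to_counts(BLASTX_PERCENT, TOTAL_ESTS)
amplification_fold = AMPLIFIED_TITER / PRIMARY_TITER

print(counts)
print(sum(counts.values()))
print(amplification_fold)
```

    Under that assumption the percentages correspond to 102, 96 and 14 ESTs, which add up to the stated 212, and the amplified library titer is roughly 1,200 times that of the primary library.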

  15. Comparison of the tyrosine aminotransferase cDNA and genomic DNA sequences of normal mink and mink affected with tyrosinemia type II.

    Science.gov (United States)

    Leib, S R; McGuire, T C; Prieur, D J

    2005-01-01

    Type II tyrosinemia, designated Richner-Hanhart syndrome in humans, is a hereditary metabolic disorder with autosomal recessive inheritance characterized by a deficiency of tyrosine aminotransferase activity. Mutations occur in the human tyrosine aminotransferase gene, resulting in high levels of tyrosine and disease. Type II tyrosinemia occurs in mink, and our hypothesis was that it would also be associated with mutation(s) in the tyrosine aminotransferase gene. Therefore, the transcribed cDNA and the genomic tyrosine aminotransferase gene were sequenced from normal and affected mink. The gene extended over 11.9 kb and had 12 exons coding for a predicted 454-amino-acid protein with 93% homology with human tyrosine aminotransferase. FISH analysis mapped the gene to chromosome 8 using the Mandahl and Fredga (1975) nomenclature and chromosome 5 using the Christensen et al. (1996) nomenclature. The hypothesis was rejected because sequence analysis disclosed no mutations in either cDNA or introns that were associated with affected mink. This suggests that an unlinked gene regulatory mutation may be the cause of tyrosinemia in mink.

  16. Avocado cellulase: nucleotide sequence of a putative full-length cDNA clone and evidence for a small gene family.

    Science.gov (United States)

    Tucker, M L; Durbin, M L; Clegg, M T; Lewis, L N

    1987-05-01

    A cDNA library was prepared from ripe avocado fruit (Persea americana Mill. cv. Hass) and screened for clones hybridizing to a 600 bp cDNA clone (pAV5) coding for avocado fruit cellulase. This screening led to the isolation of a clone (pAV363) containing a 2021 nucleotide transcribed sequence and an approximately 150 nucleotide poly(A) tail. Hybridization of pAV363 to a northern blot shows that the length of the homologous message is approximately 2.2 kb. The nucleotide sequence of this putative full-length mRNA clone contains an open reading frame of 1482 nucleotides which codes for a polypeptide of 54.1 kD. The deduced amino acid composition compares favorably with the amino acid composition of native avocado cellulase determined by amino acid analysis. Southern blot analysis of Hind III and Eco RI endonuclease digested genomic DNA indicates a small family of cellulase genes.

  17. Comparative analysis of gene expression at early seedling stage between a rice hybrid and its parents using a cDNA microarray of 9198 uni-sequences

    Institute of Scientific and Technical Information of China (English)

    HUANG Yi; LI Lihua; CHEN Ying; LI Xianghua; XU Caiguo; WANG Shiping; ZHANG Qifa

    2006-01-01

    Using a cDNA microarray consisting of 9198 expressed sequence tags, we surveyed the gene expression profiles in shoots and roots of a rice hybrid, Liangyoupei 9 and its parents Peiai 64s and 93-11 at 72 h after germination. A total of 8587 sequences had detectable signals in both shoots and roots of the three genotypes. A total of 1571 sequences exhibited significant (P<0.01) expression differences in shoots or roots among the three genotypes, of which 121 showed expression polymorphisms in both shoots and roots, and 870 revealed significant expression differences between the hybrid and one of the parents. The expression polymorphism of the sequences was associated with the functional categories of the sequences. They occurred more frequently in categories of carbohydrate, energy and lipid metabolisms and stress response than expected, while less frequently in categories of amino acid metabolism, transcription and translation regulation, and signal transduction. A total of 214 sequences exhibited significant (P<0.05) mid-parent heterosis in expression, of which 117 had homology to genes with known functions, assigned in the categories of basic metabolism, genetic information processing, cell growth and death, signal transduction, transportation and stress response. The results may provide useful information for exploring the relationship between gene expression polymorphism and phenotypic variation, and for characterizing the molecular mechanism of seedling development and heterosis in rice.

  18. Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

    Institute of Scientific and Technical Information of China (English)

    QI Fei; GUO Huarong; WANG Jian

    2008-01-01

    Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was for the first time isolated and sequenced from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The cDNA sequence of SmPP1cb we obtained contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein, and the N-terminal section of this protein is highly acidic (Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp), a common feature of the PP1 catalytic subunit but absent in protein phosphatase 2B (PP2B). Its calculated molecular mass is 37,193 Da and its pI is 5.8. Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif in the 5' UTR around the ATG start codon is GXXAXXGXXATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3' UTR of SmPP1cb is highly diverse in sequence similarity and length compared with other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.
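
    The lengths reported for SmPP1cb are internally consistent, as the following sketch shows. It assumes the 984 bp ORF is counted through the stop codon and that the transcript length excludes any poly(A) tail; neither point is stated in the abstract, and the function name `orf_summary` is hypothetical.

```python
def orf_summary(utr5_bp: int, orf_bp: int, utr3_bp: int, mass_da: float) -> dict:
    """Derive codon, residue and length figures from reported cDNA features.

    Assumes orf_bp includes the stop codon, so the protein has one
    fewer residue than the ORF has codons.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    residues = codons - 1  # the stop codon encodes no amino acid
    return {
        "codons": codons,
        "amino_acids": residues,
        "transcript_bp": utr5_bp + orf_bp + utr3_bp,  # poly(A) tail not counted
        "mean_residue_mass_da": round(mass_da / residues, 1),
    }


# Values quoted in the abstract: 39 bp 5' UTR, 984 bp ORF, 462 bp 3' UTR, 37,193 Da.
summary = orf_summary(utr5_bp=39, orf_bp=984, utr3_bp=462, mass_da=37193)
print(summary)
```

    With those assumptions, 984 bp gives 328 codons and therefore the stated 327 amino acids, the three regions sum to 1,485 bp, and the mean residue mass of about 113.7 Da is in the range typical for proteins.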

  19. Deep sequencing of RNA from ancient maize kernels.

    Directory of Open Access Journals (Sweden)

    Sarah L Fordyce

    Full Text Available The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited - perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication.

  20. Deep Sequencing of RNA from Ancient Maize Kernels

    Science.gov (United States)

    Rasmussen, Morten; Cappellini, Enrico; Romero-Navarro, J. Alberto; Wales, Nathan; Alquezar-Planas, David E.; Penfield, Steven; Brown, Terence A.; Vielle-Calzada, Jean-Philippe; Montiel, Rafael; Jørgensen, Tina; Odegaard, Nancy; Jacobs, Michael; Arriaza, Bernardo; Higham, Thomas F. G.; Ramsey, Christopher Bronk; Willerslev, Eske; Gilbert, M. Thomas P.

    2013-01-01

    The characterization of biomolecules from ancient samples can shed otherwise unobtainable insights into the past. Despite the fundamental role of transcriptomal change in evolution, the potential of ancient RNA remains unexploited – perhaps due to dogma associated with the fragility of RNA. We hypothesize that seeds offer a plausible refuge for long-term RNA survival, due to the fundamental role of RNA during seed germination. Using RNA-Seq on cDNA synthesized from nucleic acid extracts, we validate this hypothesis through demonstration of partial transcriptomal recovery from two sources of ancient maize kernels. The results suggest that ancient seed transcriptomics may offer a powerful new tool with which to study plant domestication. PMID:23326310

  1. Sequencing and identification of expressed Schistosoma mansoni genes by random selection of cDNA clones from a directional library

    Directory of Open Access Journals (Sweden)

    Glória R. Franco

    1995-04-01

    Full Text Available We have initiated a gene discovery program in Schistosoma mansoni based on the technique of Expressed Sequence Tags (ESTs), i.e. partial sequences of cDNAs obtained from single passes in automatic DNA sequencers. ESTs can be used to identify genes on the basis of their homology with sequences from other species deposited in DNA or protein databases. Transcripts with sequences without matches in the databases may represent novel parasite-specific genes. This approach has proven to be very efficient, and in less than two years a broad range of novel genes has already been ascertained, more than doubling the number of known S. mansoni genes.

  2. The enzyme and the cDNA sequence of a thermolabile and double-strand specific DNase from Northern shrimps (Pandalus borealis).

    Directory of Open Access Journals (Sweden)

    Inge W Nilsen

    Full Text Available BACKGROUND: We have previously isolated a thermolabile nuclease specific for double-stranded DNA from industrial processing water of Northern shrimps (Pandalus borealis) and developed an application of the enzyme in removal of contaminating DNA in PCR-related technologies. METHODOLOGY/PRINCIPAL FINDINGS: A 43 kDa nuclease with a high specific activity of hydrolysing linear as well as circular forms of DNA was purified from hepatopancreas of Northern shrimp (Pandalus borealis). The enzyme displayed a substrate preference that was shifted from exclusively double-stranded DNA in the presence of magnesium to also encompass significant activity against single-stranded DNA when calcium was added. No activity against RNA was detected. Although originating from a cold-environment animal, the shrimp DNase has only minor low-temperature activity. Still, the enzyme was irreversibly inactivated by moderate heating with a half-life of 1 min at 65 degrees C. The purified protein was partly sequenced and derived oligonucleotides were used to prime amplification of the encoding cDNA. This cDNA sequence revealed an open reading frame encoding a 404 amino acid protein containing a signal peptide. By sequence similarity the enzyme is predicted to belong to a family of DNA/RNA non-specific nucleases even though this shrimp DNase lacks RNase activity and is highly double-strand specific in some respects. These features are in agreement with those previously established for endonucleases classified as similar to the Kamchatka crab duplex-specific nuclease (Par_DSN). Sequence comparisons and phylogenetic analyses confirmed that the Northern shrimp nuclease resembles the Par_DSN-like nucleases and displays a more distant relationship to the Serratia family of nucleases. CONCLUSIONS/SIGNIFICANCE: The shrimp nuclease contains enzyme activity that may be controlled by temperature or buffer compositions. The double-stranded DNA specificity, as well as the thermolabile feature

  3. Transcription profiling of the model cyanobacterium Synechococcus sp. strain PCC 7002 by NextGen (SOLiD™) Sequencing of cDNA

    Directory of Open Access Journals (Sweden)

    Marcus Ludwig

    2011-03-01

    Full Text Available The genome of the unicellular, euryhaline cyanobacterium Synechococcus sp. PCC 7002 encodes about 3200 proteins. Transcripts were detected for nearly all annotated open reading frames by a global transcriptomic analysis by Next-Generation (SOLiD™) sequencing of cDNA. In the cDNA samples sequenced, ~90% of the mapped sequences were derived from the 16S and 23S ribosomal RNAs and ~10% of the sequences were derived from mRNAs. In cells grown photoautotrophically under standard conditions (38 °C, 1% (v/v) CO2 in air, 250 µmol photons m⁻² s⁻¹), the highest transcript levels (up to 2% of the total mRNA for the most abundantly transcribed genes; e.g., cpcAB, psbA, psaA) were generally derived from genes encoding structural components of the photosynthetic apparatus. High light exposure for one hour caused changes in transcript levels for genes encoding proteins of the photosynthetic apparatus, Type-1 NADH dehydrogenase complex and ATP synthase, whereas dark incubation for one hour resulted in a global decrease in transcript levels for photosynthesis-related genes and an increase in transcript levels for genes involved in carbohydrate degradation. Transcript levels for pyruvate kinase and the pyruvate dehydrogenase complex decreased sharply in cells incubated in the dark. Under dark anoxic (fermentative) conditions, transcript changes indicated a global decrease in transcripts for respiratory proteins and suggested that cells employ an alternative phosphoenolpyruvate degradation pathway via phosphoenolpyruvate synthase (ppsA) and the pyruvate:ferredoxin oxidoreductase (nifJ). Finally, the data suggested that an apparent operon involved in tetrapyrrole biosynthesis and fatty acid desaturation, acsF2-ho2-hemN2-desF, may be regulated by oxygen concentration.

  4. Assessment of adaptive evolution between wheat and rice as deduced from full-length common wheat cDNA sequence data and expression patterns

    Directory of Open Access Journals (Sweden)

    Hayashizaki Yoshihide

    2009-06-01

    Full Text Available Abstract Background Wheat is an allopolyploid plant that harbors a huge, complex genome. Therefore, accumulation of expressed sequence tags (ESTs) for wheat is becoming particularly important for functional genomics and molecular breeding. We prepared a comprehensive collection of ESTs from the various tissues that develop during the wheat life cycle and from tissues subjected to stress. We also examined their expression profiles in silico. As full-length cDNAs are indispensable to certify the collected ESTs and annotate the genes in the wheat genome, we performed a systematic survey and sequencing of the full-length cDNA clones. This sequence information is a valuable genetic resource for functional genomics and will enable carrying out comparative genomics in cereals. Results As part of the functional genomics and development of genomic wheat resources, we have generated a collection of full-length cDNAs from common wheat. By grouping the ESTs of recombinant clones randomly selected from the full-length cDNA library, we were able to sequence 6,162 independent clones with high accuracy. About 10% of the clones were wheat-unique genes, without any counterparts within the DNA database. Wheat clones that showed high homology to those of rice were selected in order to investigate their expression patterns in various tissues throughout the wheat life cycle and in response to abiotic-stress treatments. To assess the variability of genes that have evolved differently in wheat and rice, we calculated the substitution rate (Ka/Ks) of the counterparts in wheat and rice. Genes that were preferentially expressed in certain tissues or treatments had higher Ka/Ks values than those in other tissues and treatments, which suggests that the more variable genes expressed in these tissues are under adaptive selection. Conclusion We have generated a high-quality full-length cDNA resource for common wheat, which is essential for continuation of the

  5. Error rates, PCR recombination, and sampling depth in HIV-1 whole genome deep sequencing.

    Science.gov (United States)

    Zanini, Fabio; Brodin, Johanna; Albert, Jan; Neher, Richard A

    2016-12-27

    Deep sequencing is a powerful and cost-effective tool to characterize the genetic diversity and evolution of virus populations. While modern sequencing instruments readily cover viral genomes many thousand fold and very rare variants can in principle be detected, sequencing errors, amplification biases, and other artifacts can limit sensitivity and complicate data interpretation. For this reason, the number of studies using whole genome deep sequencing to characterize viral quasi-species in clinical samples is still limited. We have previously undertaken a large scale whole genome deep sequencing study of HIV-1 populations. Here we discuss the challenges, error profiles, control experiments, and computational tests we developed to quantify the accuracy of variant frequency estimation.

  6. KRAS, BRAF, and TP53 deep sequencing for colorectal carcinoma patient diagnostics.

    Science.gov (United States)

    Rechsteiner, Markus; von Teichman, Adriana; Rüschoff, Jan H; Fankhauser, Niklaus; Pestalozzi, Bernhard; Schraml, Peter; Weber, Achim; Wild, Peter; Zimmermann, Dieter; Moch, Holger

    2013-05-01

    In colorectal carcinoma, KRAS (alias Ki-ras) and BRAF mutations have emerged as predictors of resistance to anti-epidermal growth factor receptor antibody treatment and worse patient outcome, respectively. In this study, we aimed to establish a high-throughput deep sequencing workflow according to 454 pyrosequencing technology to cope with the increasing demand for sequence information at medical institutions. A cohort of 81 patients with known KRAS mutation status detected by Sanger sequencing was chosen for deep sequencing. The workflow allowed us to analyze seven amplicons (one BRAF, two KRAS, and four TP53 exons) of nine patients in parallel in one deep sequencing run. Target amplification and variant calling showed reproducible results with input DNA derived from FFPE tissue that ranged from 0.4 to 50 ng with the use of different targets and multiplex identifiers. Equimolar pooling of each amplicon in a deep sequencing run was necessary to counterbalance differences in patient tissue quality. Five BRAF and 49 TP53 mutations with functional consequences were detected. The lowest mutation frequency detected in a patient tumor population was 5% in TP53 exon 5. This low-frequency mutation was successfully verified in a second PCR and deep sequencing run. In summary, our workflow allows us to process 315 targets a week and provides the quality, flexibility, and speed needed to be integrated as standard procedure for mutational analysis in diagnostics.
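    The throughput figures in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract; the derived runs-per-week and patients-per-week values are inferences from those numbers, not figures stated by the authors.

```python
# Figures quoted in the abstract above.
AMPLICONS_PER_PATIENT = 7   # one BRAF, two KRAS, and four TP53 exons
PATIENTS_PER_RUN = 9
TARGETS_PER_WEEK = 315

# Derived quantities (inferred here, not stated in the abstract).
targets_per_run = AMPLICONS_PER_PATIENT * PATIENTS_PER_RUN
runs_per_week, remainder = divmod(TARGETS_PER_WEEK, targets_per_run)
patients_per_week = runs_per_week * PATIENTS_PER_RUN

print(f"targets per run:   {targets_per_run}")
print(f"runs per week:     {runs_per_week} (remainder {remainder})")
print(f"patients per week: {patients_per_week}")
```

    Under these assumptions the weekly figure corresponds to a whole number of sequencing runs, which suggests the quoted numbers are internally consistent.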

  7. Isolation of a human anti-haemophilic factor IX cDNA clone using a unique 52-base synthetic oligonucleotide probe deduced from the amino acid sequence of bovine factor IX.

    Science.gov (United States)

    Jaye, M; de la Salle, H; Schamber, F; Balland, A; Kohli, V; Findeli, A; Tolstoshev, P; Lecocq, J P

    1983-04-25

    A unique 52mer oligonucleotide deduced from the amino acid sequence of bovine Factor IX was synthesized and used as a probe to screen a human liver cDNA bank. The Factor IX clone isolated shows 5 differences in nucleotide and deduced amino acid sequence as compared to a previously isolated clone. In addition, precisely one codon has been deleted.

  8. Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori.

    Science.gov (United States)

    Suetsugu, Yoshitaka; Futahashi, Ryo; Kanamori, Hiroyuki; Kadono-Okuda, Keiko; Sasanuma, Shun-ichi; Narukawa, Junko; Ajimura, Masahiro; Jouraku, Akiya; Namiki, Nobukazu; Shimomura, Michihiko; Sezutsu, Hideki; Osanai-Futahashi, Mizuko; Suzuki, Masataka G; Daimon, Takaaki; Shinoda, Tetsuro; Taniai, Kiyoko; Asaoka, Kiyoshi; Niwa, Ryusuke; Kawaoka, Shinpei; Katsuma, Susumu; Tamura, Toshiki; Noda, Hiroaki; Kasahara, Masahiro; Sugano, Sumio; Suzuki, Yutaka; Fujiwara, Haruhiko; Kataoka, Hiroshi; Arunkumar, Kallare P; Tomar, Archana; Nagaraju, Javaregowda; Goldsmith, Marian R; Feng, Qili; Xia, Qingyou; Yamamoto, Kimiko; Shimada, Toru; Mita, Kazuei

    2013-09-01

    The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes.

  9. Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana

    NARCIS (Netherlands)

    Vanhoutte, Kürt; Eggen, BJL; Janssen, JJM; Stavenga, DG

    2002-01-01

    The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Ma

  10. HIV-1 quasispecies delineation by tag linkage deep sequencing.

    Science.gov (United States)

    Wu, Nicholas C; De La Cruz, Justin; Al-Mawsawi, Laith Q; Olson, C Anders; Qi, Hangfei; Luan, Harding H; Nguyen, Nguyen; Du, Yushen; Le, Shuai; Wu, Ting-Ting; Li, Xinmin; Lewis, Martha J; Yang, Otto O; Sun, Ren

    2014-01-01

    Trade-offs between throughput, read length, and error rates in high-throughput sequencing limit certain applications such as monitoring viral quasispecies. Here, we describe a molecular-based tag linkage method that allows assemblage of short sequence reads into long DNA fragments. It enables haplotype phasing with high accuracy and sensitivity to interrogate individual viral sequences in a quasispecies. This approach is demonstrated to deduce ∼ 2000 unique 1.3 kb viral sequences from HIV-1 quasispecies in vivo and after passaging ex vivo with a detection limit of ∼ 0.005% to ∼ 0.001%. Reproducibility of the method is validated quantitatively and qualitatively by a technical replicate. This approach can improve monitoring of the genetic architecture and evolution dynamics in any quasispecies population.

  11. Protein sequences bound to mineral surfaces persist into deep time

    OpenAIRE

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L.; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna Katerina; Fischer, Roman; Kessler, Benedikt M; Jersie-Christensen, Rosa Rakownikow; Olsen, Jesper Velgaard; Haile, James; Thomas, Jessica; Marean, Curtis W.

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Mol...

  12. Gene discovery from Jatropha curcas by sequencing of ESTs from normalized and full-length enriched cDNA library from developing seeds

    Directory of Open Access Journals (Sweden)

    Sugantham Priyanka Annabel

    2010-10-01

    Full Text Available Abstract Background Jatropha curcas L. is promoted as an important non-edible biodiesel crop worldwide. Jatropha oil, which is a triacylglycerol, can be directly blended with petro-diesel or transesterified with methanol and used as biodiesel. Genetic improvement in jatropha is needed to increase the seed yield, oil content, drought and pest resistance, and to modify oil composition so that it becomes a technically and economically preferred source for biodiesel production. However, genetic improvement efforts in jatropha could not take advantage of genetic engineering methods due to lack of cloned genes from this species. To overcome this hurdle, the current gene discovery project was initiated with an objective of isolating as many functional genes as possible from J. curcas by large scale sequencing of expressed sequence tags (ESTs). Results A normalized and full-length enriched cDNA library was constructed from developing seeds of J. curcas. The cDNA library contained about 1 × 10^6 clones and the average insert size of the clones was 2.1 kb. In total, 12,084 ESTs were sequenced to an average high-quality read length of 576 bp. Contig analysis revealed 2258 contigs and 4751 singletons. Contig size ranged from 2 to 23 ESTs and there were 7333 ESTs in the contigs. This resulted in 7009 unigenes which were annotated by BLASTX. It showed 3982 unigenes with significant similarity to known genes and 2836 unigenes with significant similarity to genes of unknown, hypothetical and putative proteins. The remaining 191 unigenes, which did not show similarity with any genes in the public database, may encode unique genes. Functional classification revealed unigenes related to a broad range of cellular, molecular and biological functions. Among the 7009 unigenes, 6233 unigenes were identified to be potential full-length genes. Conclusions The high quality normalized cDNA library was constructed from developing seeds of J. curcas for the first time and 7009 unigenes coding
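    The EST counts reported in the abstract above can be checked against each other. The following sketch uses only the figures quoted there; the variable names and the derived ratios (mean ESTs per contig, full-length fraction) are introduced here for illustration and are not reported by the authors.

```python
# Counts quoted in the abstract above.
total_ests = 12084
contigs = 2258
singletons = 4751
ests_in_contigs = 7333
unigenes = 7009
similar_to_known = 3982
similar_to_unknown = 2836
no_similarity = 191
potential_full_length = 6233

# Consistency checks: each pair should agree if the counts are coherent.
checks = {
    "contigs + singletons == unigenes": contigs + singletons == unigenes,
    "ESTs in contigs + singletons == total ESTs":
        ests_in_contigs + singletons == total_ests,
    "annotation classes sum to unigenes":
        similar_to_known + similar_to_unknown + no_similarity == unigenes,
}
for label, ok in checks.items():
    print(f"{label}: {ok}")

# Derived ratios (illustrative only).
mean_ests_per_contig = ests_in_contigs / contigs
full_length_fraction = potential_full_length / unigenes
print(f"mean ESTs per contig: {mean_ests_per_contig:.2f}")
print(f"potential full-length fraction: {full_length_fraction:.1%}")
```

    All three checks hold for the quoted numbers, so the unigene set is simply the union of assembled contigs and unassembled singletons.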

  13. cDNA sequence and gene locus of the human retinal phosphoinositide-specific phospholipase-Cβ4 (PLCB4)

    Energy Technology Data Exchange (ETDEWEB)

    Alvarez, R.A.; Ghalayini, A.J.; Anderson, R.E. [Baylor College of Medicine, Houston, TX (United States)] [and others]

    1995-09-01

    Defects in the Drosophila norpA (no receptor potential A) gene encoding a phosphoinositide-specific phospholipase C (PLC) block invertebrate phototransduction and lead to retinal degeneration. The mammalian homolog, PLCB4, is expressed in rat brain, bovine cerebellum, and the bovine retina in several splice variants. To determine a possible role of PLCB4 gene defects in human disease, we isolated several overlapping cDNA clones from a human retina library. The composite cDNA sequence predicts a human PLCβ4 polypeptide of 1022 amino acid residues (MW 117,000). This PLCβ4 variant lacks a 165-amino-acid N-terminal domain characteristic for the rat brain isoforms, but has a distinct putative exon 1 unique for human and bovine retina isoforms. A PLCβ4 monospecific antibody detected a major (130 kDa) and a minor (160 kDa) isoform in retina homogenates. Somatic cell hybrids and deletion panels were used to localize the PLCB4 gene to the short arm of chromosome 20. The gene was further sublocalized to 20p12 by fluorescence in situ hybridization. 4 refs., 5 figs.

  14. Deep amplicon sequencing reveals mixed phytoplasma infection within single grapevine plants

    DEFF Research Database (Denmark)

    Nicolaisen, Mogens; Contaldo, Nicoletta; Makarova, Olga

    2011-01-01

    The diversity of phytoplasmas within single plants has not yet been fully investigated. In this project, deep amplicon sequencing was used to generate 50,926 phytoplasma sequences from 11 phytoplasma-infected grapevine samples from a PCR amplicon in the 5' end of the 16S region. After clustering ...

  15. Construction and packaging of pseudotype retrovirus containing human N-ras cDNA antisense sequence and its biological effects on human hepatoma cells

    Institute of Scientific and Technical Information of China (English)

    JIALIBIN; WANGXIANG; et al.

    1990-01-01

    N-ras is one of the transforming genes in human hepatic cancer cells. It has been found that N-ras was overexpressed at the mRNA and protein level in hepatoma cells. In order to explore the biological roles of N-ras in human hepatic carcinogenesis and the potential application in control of cancer cell growth, a pseudotype retrovirus containing the antisense sequence of human N-ras was constructed and packaged. A recombinant retrovirus vector containing antisense or sense sequences of N-ras cDNA was constructed by pZIP-NeoSV(X)1. The pseudotype virus was packaged and rescued by transfection and infection in PA317 and ψ2 helper cells. It has been demonstrated that the pseudotype retrovirus containing the antisense N-ras sequence did inhibit the growth of human PLC/PRF/5 hepatoma cells, accompanied by inhibition of p21 expression, while the retrovirus containing the sense sequence had no such effect. The pseudotype virus had no effect on human diploid fibroblasts.

  16. Localization of the human fibromodulin gene (FMOD) to chromosome 1q32 and completion of the cDNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Sztrolovics, R.; Grover, J.; Roughley, P.J. [McGill Univ., Montreal (Canada)] [and others]

    1994-10-01

    This report describes the cloning of the 3′-untranslated region of the human fibromodulin cDNA and its use to map the gene. For somatic cell hybrids, the generation of the PCR product was concordant with the presence of chromosome 1 and discordant with the presence of all other chromosomes, confirming that the fibromodulin gene is located within region q32 of chromosome 1. The physical mapping of genes is a critical step in the process of identifying which genes may be responsible for various inherited disorders. Specifically, the mapping of the fibromodulin gene now provides the information necessary to evaluate its potential role in genetic disorders of connective tissues. The analysis of previously reported diseases mapped to chromosome 1 reveals two genes located in the proximity of the fibromodulin locus. These are Usher syndrome type II, a recessive disorder characterized by hearing loss and retinitis pigmentosa, and Van der Woude syndrome, a dominant condition associated with abnormalities such as cleft lip and palate and hyperdontia. The genes for both of these disorders have been projected to be localized to 1q32 of a physical map that integrates available genetic linkage and physical data. However, it seems improbable that either of these disorders, exhibiting restricted tissue involvement, could be linked to the fibromodulin gene, given the wide tissue distribution of the encoded proteoglycan, although it remains possible that the relative importance of the quantity and function of the proteoglycan may vary between tissues. 11 refs., 1 fig.

  17. Protein sequences bound to mineral surfaces persist into deep time

    Science.gov (United States)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa; Freeman, Colin L; Woolley, Jos; Crisp, Molly K; Wilson, Julie; Fotakis, Anna; Fischer, Roman; Kessler, Benedikt M; Rakownikow Jersie-Christensen, Rosa; Olsen, Jesper V; Haile, James; Thomas, Jessica; Marean, Curtis W; Parkington, John; Presslee, Samantha; Lee-Thorp, Julia; Ditchfield, Peter; Hamilton, Jacqueline F; Ward, Martyn W; Wang, Chunting Michelle; Shaw, Marvin D; Harrison, Terry; Domínguez-Rodrigo, Manuel; MacPhee, Ross DE; Kwekason, Amandus; Ecker, Michaela; Kolska Horwitz, Liora; Chazan, Michael; Kröger, Roland; Thomas-Oates, Jane; Harding, John H; Cappellini, Enrico; Penkman, Kirsty; Collins, Matthew J

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C). DOI: http://dx.doi.org/10.7554/eLife.17092.001 PMID:27668515

  18. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa;

    2016-01-01

    of Laetoli (3.8 Ma) and Olduvai Gorge (1.3 Ma) in Tanzania. By tracking protein diagenesis back in time we find consistent patterns of preservation, demonstrating authenticity of the surviving sequences. Molecular dynamics simulations of struthiocalcin-1 and -2, the dominant proteins within the eggshell, reveal that distinct domains bind to the mineral surface. It is the domain with the strongest calculated binding energy to the calcite surface that is selectively preserved. Thermal age calculations demonstrate that the Laetoli and Olduvai peptides are 50 times older than any previously authenticated sequence (equivalent to ~16 Ma at a constant 10°C).

  19. Cloning and Sequence Analysis of PGIP gene and cDNA from Prunus persica

    Institute of Scientific and Technical Information of China (English)

    谌悦; 张军科; 熊帅

    2009-01-01

    The PGIP gene and its cDNA were amplified by PCR from peach (Prunus persica), and the sequences were aligned and analyzed. The results showed that the PGIP gene was 1 092 bp long, while the cDNA sequence was only 1 045 bp. The PGIP gene contained a 147 bp intron that conformed to the GT-AG rule, a structure essentially the same as that of the PGIP genes of other stone fruit plants. Alignment of the gene sequence with three peach sequences and other stone fruit sequences deposited in GenBank showed a homology of 92%-99%. The homology with Malus, Pyrus and Eucalyptus was 85%, 84% and 84%, respectively.

  20. Protein sequences bound to mineral surfaces persist into deep time

    DEFF Research Database (Denmark)

    Demarchi, Beatrice; Hall, Shaun; Roncal-Herrero, Teresa

    2016-01-01

    Proteins persist longer in the fossil record than DNA, but the longevity, survival mechanisms and substrates remain contested. Here, we demonstrate the role of mineral binding in preserving the protein sequence in ostrich (Struthionidae) eggshell, including from the palaeontological sites of Laet...

  1. Characterization of the Melanoma miRNAome by Deep Sequencing.

    Directory of Open Access Journals (Sweden)

    Mitchell S Stark

    Full Text Available BACKGROUND: MicroRNAs (miRNAs) are 18-23 nucleotide non-coding RNAs that regulate gene expression in a sequence specific manner. Little is known about the repertoire and function of miRNAs in melanoma or the melanocytic lineage. We therefore undertook a comprehensive analysis of the miRNAome in a diverse range of pigment cells including: melanoblasts, melanocytes, congenital nevocytes, acral, mucosal, cutaneous and uveal melanoma cells. METHODOLOGY/PRINCIPAL FINDINGS: We sequenced 12 small RNA libraries using Illumina's Genome Analyzer II platform. This massively parallel sequencing approach of a diverse set of melanoma and pigment cell libraries revealed a total of 539 known mature and mature-star sequences, along with the prediction of 279 novel miRNA candidates, of which 109 were common to 2 or more libraries and 3 were present in all libraries. CONCLUSIONS/SIGNIFICANCE: Some of the novel candidate miRNAs may be specific to the melanocytic lineage and as such could be used as biomarkers to assist in the early detection of distant metastases by measuring the circulating levels in blood. Follow up studies of the functional roles of these pigment cell miRNAs and the identification of the targets should shed further light on the development and progression of melanoma.

  2. cDNA Cloning and Sequence Analysis of ADH Gene in Delia antiqua

    Institute of Scientific and Technical Information of China (English)

    陈春露; 陈斌; 司风玲; 何正波

    2012-01-01

    [Objective] The aim was to clone the ADH gene of Delia antiqua and carry out a sequence analysis. [Method] The cDNA sequence of the ADH gene was cloned by the RACE method, and then studied by homology analysis, comparison of amino acid sequences and phylogenetic analysis. [Result] The full length of the cDNA obtained was 1 088 bp, of which 771 bp formed the ORF, encoding a protein of 256 amino acids with a calculated molecular weight of 30.80 kDa and a theoretical isoelectric point of 8.22. Based on homology analysis, the deduced amino acid sequence had the highest identity with that of Glossina morsitans, and a phylogenetic tree was inferred with homologous ADH sequences from other insects. [Conclusion] The study provides a basis for further research on the ADH gene.
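    The sequence lengths reported in the abstract above are related by simple codon arithmetic. The sketch below checks them under one assumption that the abstract does not state: that the 771 bp ORF length includes the stop codon. The untranslated length and mean residue mass are derived here for illustration only.

```python
# Figures quoted in the abstract above.
CDNA_LENGTH_BP = 1088
ORF_LENGTH_BP = 771
PROTEIN_LENGTH_AA = 256
MOLECULAR_WEIGHT_DA = 30_800  # 30.80 kDa

# An ORF must be a whole number of codons.
codons, leftover = divmod(ORF_LENGTH_BP, 3)

# Assumption: the ORF length counts the stop codon, which encodes no residue.
predicted_residues = codons - 1

# Derived values (illustrative, not reported in the abstract).
untranslated_bp = CDNA_LENGTH_BP - ORF_LENGTH_BP
mean_residue_mass = MOLECULAR_WEIGHT_DA / PROTEIN_LENGTH_AA

print(f"codons in ORF: {codons} (leftover bases: {leftover})")
print(f"predicted residues: {predicted_residues}")
print(f"combined 5' and 3' untranslated length: {untranslated_bp} bp")
print(f"mean residue mass: {mean_residue_mass:.1f} Da")
```

    Under that assumption the predicted residue count matches the 256 amino acids quoted in the abstract.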

  3. Characterization of gamma-crystallin from a catfish: structural characterization of one major isoform with high methionine by cDNA sequencing.

    Science.gov (United States)

    Pan, F M; Chang, W C; Lin, C H; Hsu, A L; Chiou, S H

    1995-04-01

    gamma-Crystallin is the major and most abundant lens protein present in the eye lens of most teleostean fishes. To facilitate structural characterization of gamma-crystallins isolated from the lens of the catfishes (Clarias fuscus), a cDNA mixture was synthesized from the poly(A)+mRNA isolated from fresh eye lenses, and amplification by polymerase chain reaction (PCR) was adopted to obtain cDNAs encoding various gamma-crystallins. Plasmids of transformed E. coli strain JM109 containing amplified gamma-crystallin cDNAs were purified and prepared for nucleotide sequencing by the dideoxynucleotide chain-termination method. Sequencing more than five clones containing DNA inserts of 0.52 kb revealed the presence of one major isoform with a complete reading frame of 534 base pairs, covering a gamma-crystallin (gamma M1) with a deduced protein sequence of 177 amino acids excluding the initiating methionine. It was of interest to find that this crystallin of pI 9.1 contains a high-methionine content of 15.3% in contrast to those gamma-crystallins of low-methionine content from most mammalian lenses. Sequence comparisons of catfish gamma M1-crystallin with those published sequences of gamma-crystallins from carp, bovine and mouse lenses indicate that there is approx. an 82% sequence homology between the catfish and the carp species of piscine class whereas only 51-58% homology is found between mammals and the catfish. Moreover the differences in the hydropathy profiles for these two groups of gamma-crystallins, i.e. one with a high-methionine content from teleostean fishes and the other with a low-methionine content from mammalian species, reflect a distinct variance in the polarity distributions of surface amino acids in these crystallins.(ABSTRACT TRUNCATED AT 250 WORDS)

  4. Improved rapid amplification of cDNA ends (RACE) for mapping both the 5' and 3' terminal sequences of paramyxovirus genomes.

    Science.gov (United States)

    Li, Zhuo; Yu, Meng; Zhang, Hong; Wang, Hai-Yan; Wang, Lin-Fa

    2005-12-01

    Rapid amplification of cDNA ends (RACE) is a powerful PCR-based technique for determination of RNA terminal sequences. However, most of the RACE methods reported in the literature are developed specifically for the mapping of eukaryotic transcripts with 3' poly-A tail and 5' cap structure. In this study, an improved RACE strategy was developed which allows both 5' and 3' RACE of paramyxovirus genomic RNA using the same set of common molecular biology reagents without having to rely on expensive RACE kits. Mapping of RNA genome terminal sequences is an essential part of characterizing novel paramyxoviruses since these sequences contain important signals for genome replication and transcription, and are important molecular markers for studying virus evolution. The usefulness of this strategy was demonstrated by rapid characterization of both genome ends for a novel paramyxovirus recently isolated from human kidney primary cells. The RACE strategy described in this paper is simple, cost-effective and can be used to map genome ends of any RNA viruses.

  5. cDNA Cloning, Sequence Analysis of the Porcine LIM and Cysteine-rich Domain 1 Gene

    Institute of Scientific and Technical Information of China (English)

    Jun WANG; Chang-Yan DENG; Yuan-Zhu XIONG; Bo ZUO; Lei XING; Feng-E LI; Ming-Gang LEI; Rong ZHENG; Si-Wen JIANG

    2005-01-01

    LIM domain proteins are important regulators in cell growth, cell fate determination, cell differentiation and remodeling of the cell cytoskeleton by their interaction with various structural proteins, kinases and transcriptional regulators. Using molecular biology combined with in silico cloning, we have cloned the complete coding sequence of pig LIM and the cysteine-rich domain 1 gene (LMCD1) which encodes a 363 amino acid protein. The estimated molecular weight of the LMCD1 protein is 40,788 Da with a pI of 8.39. It was found to be highly expressed in both skeletal muscle and cardiac muscle. Alignment analysis revealed that the deduced protein sequence shares 86%, 91% and 93% homology with that of its human, mouse and rat counterparts, respectively. The LMCD1 protein was predicted by bioinformatics software to contain a novel cysteine-rich domain in the N-terminal region, two LIM domains in the C-terminal region, nine potential protein kinase C phosphorylation sites, seven casein kinase Ⅱ phosphorylation sites, a tyrosine kinase phosphorylation site, seven N-glycosylation and N-myristoylation sites and a single potential N-glycosylation site, which is similar to the protein's human counterpart. Phylogenetic tree was constructed by aligning the amino acid sequences of the LIM domain from different species. In addition, four base mutations were detected by comparing the sequences of Large White pigs with those of Chinese Meishan pigs. The G294A mutation site was confirmed by polymerase chain reaction-single-strand conformation polymorphism analysis. Its allele frequencies were studied in five pig breeds.

  6. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
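    The idea in the record above (estimate a per-base error rate from control sequences, then use it to filter false positive variant calls) can be sketched as follows. This is a minimal illustration and not the author's method: the binomial test, the function names and the `alpha` threshold are all assumptions introduced here.

```python
import math


def estimate_error_rate(mismatches, total_bases):
    """Per-base error rate from control reads (mismatches / bases sequenced)."""
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    return mismatches / total_bases


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed in log space to avoid overflow."""
    if k <= 0:
        return 1.0
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1)
                   - math.lgamma(n - i + 1) + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def call_variant(alt_count, depth, error_rate, alpha=1e-6):
    """Hypothetical rule: call a variant when the observed alternate-base count
    is very unlikely to arise from sequencing error alone."""
    return binom_sf(alt_count, depth, error_rate) < alpha
```

    With an assumed control error rate of 0.0005 and a depth of 10,000, about five erroneous bases are expected per site, so a site with five alternate reads would not be called while a site with 100 would. A real analysis would also need to account for error rates that vary by base, position and strand.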

  7. cDNA Cloning and Sequence Analysis of ADH Gene in Delia antiqua

    Institute of Scientific and Technical Information of China (English)

    陈春露; 陈斌; 司风玲; 何正波

    2012-01-01

    [Objective] This study aimed to clone the ADH gene of Delia antiqua and analyze its sequence. [Method] The full-length cDNA of the ADH gene in D. antiqua was cloned using RACE (GenBank accession number: JQ666006). The homology, characteristics and functional domains of the ADH sequence and its phylogenetic relationship to other dipteran ADHs were analyzed. [Result] The full-length ADH cDNA is 1,088 bp and contains a 771 bp ORF encoding 256 amino acids, with a calculated relative molecular weight of 30.80 kDa and a theoretical isoelectric point of 8.22. Based on homology and phylogenetic analyses, the deduced amino acid sequence shares the highest homology with that of Glossina morsitans morsitans. [Conclusion] This study provides a basis for further research on the ADH gene.

  8. Cloning and sequence of cDNA encoding 1-aminocyclopropane-1-carboxylate oxidase in Vanda flowers

    Directory of Open Access Journals (Sweden)

    Pattana Srifah Huehne

    2013-08-01

    Full Text Available The 1-aminocyclopropane-1-carboxylate oxidase (ACO) gene, which acts in the final step of ethylene biosynthesis, was isolated from ethylene-sensitive Vanda Miss Joaquim flowers. It consists of 1,242 base pairs (bp) encoding 326 amino acid residues. To investigate the specific divergence in orchid ACO sequences, the deduced Vanda ACO was aligned with five other orchid ACOs. The results reveal that the ACO sequences within Doritaenopsis, Phalaenopsis and Vanda are highly conserved and almost 95% identical, while the ACOs isolated from Cymbidium, Dendrobium and Cattleya are 87-88% identical to Vanda ACO. In addition, the 2-oxoglutarate-Fe(II) oxygenase (Oxy) domain of orchid ACOs shows a higher degree of amino acid conservation than the non-haem dioxygenase (DIOX_N) domain. The homologous regions of Vanda ACO fold into 12 α-helices and 12 β-sheets, similar to the three-dimensional template structure of Petunia ACO. The cloned Vanda ACO gene is highly expressed in flower tissue compared with root and leaf tissues. In particular, ACO transcripts accumulate most abundantly in the column, followed by the lip and the perianth, of Vanda Miss Joaquim flowers at the fully open stage.

  9. Analysis of expression sequence tags from a full-length-enriched cDNA library of developing sesame seeds (Sesamum indicum)

    Directory of Open Access Journals (Sweden)

    Ke Tao

    2011-12-01

    Full Text Available Abstract Background Sesame (Sesamum indicum) is one of the most important oilseed crops, with high oil content and rich nutrient value. However, genetic improvement efforts in sesame have not benefited from molecular biology technology because of scarce DNA and RNA sequence resources. In this study, we carried out large-scale expressed sequence tag (EST) sequencing from developing sesame seeds and further analyzed genes related to seed storage products. Results A normalized and full-length-enriched cDNA library from 5- to 30-day-old immature seeds was constructed and randomly sequenced, generating 41,248 expressed sequence tags (ESTs), which formed 4,713 contigs and 27,708 singletons, with 44.9% of uniESTs being putative full-length open reading frames. Approximately 26,091 of these uniESTs have significant matches to counterparts in the Nr database of GenBank, and 21,628 of them were assigned to one or more Gene Ontology (GO) terms. Homologous genes involved in oil biosynthesis were identified, including some conserved transcription factors regulating oil biosynthesis such as LEAFY COTYLEDON1 (LEC1), PICKLE (PKL) and WRINKLED1 (WRI1), and the majority of them were found for the first time in sesame seeds. A total of 117 ESTs were identified as possibly involved in the biosynthesis of the sesame lignans sesamin and sesamolin. In total, 9,347 putative functional genes from developing seeds were identified, which accounts for one third of the total genes in the sesame genome. Further analysis of the uniESTs identified 1,949 non-redundant simple sequence repeats (SSRs). Conclusions This study has provided an overview of genes expressed during sesame seed development. This collection of sesame full-length cDNAs covers a wide variety of genes in seeds, in particular candidate genes involved in the biosynthesis of sesame oils and lignans. These full-length-enriched EST sequences will contribute to comparative genomic studies on sesame and

  10. Using Small RNA Deep Sequencing Data to Detect Human Viruses

    Directory of Open Access Journals (Sweden)

    Fang Wang

    2016-01-01

    Full Text Available Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans.

  11. Using Small RNA Deep Sequencing Data to Detect Human Viruses.

    Science.gov (United States)

    Wang, Fang; Sun, Yu; Ruan, Jishou; Chen, Rui; Chen, Xin; Chen, Chengjie; Kreuze, Jan F; Fei, ZhangJun; Zhu, Xiao; Gao, Shan

    2016-01-01

    Small RNA sequencing (sRNA-seq) can be used to detect viruses in infected hosts without the necessity to have any prior knowledge or specialized sample preparation. The sRNA-seq method was initially used for viral detection and identification in plants and then in invertebrates and fungi. However, it is still controversial to use sRNA-seq in the detection of mammalian or human viruses. In this study, we used 931 sRNA-seq runs of data from the NCBI SRA database to detect and identify viruses in human cells or tissues, particularly from some clinical samples. Six viruses including HPV-18, HBV, HCV, HIV-1, SMRV, and EBV were detected from 36 runs of data. Four viruses were consistent with the annotations from the previous studies. HIV-1 was found in clinical samples without the HIV-positive reports, and SMRV was found in Diffuse Large B-Cell Lymphoma cells for the first time. In conclusion, these results suggest the sRNA-seq can be used to detect viruses in mammals and humans.

  12. Toward a better knowledge of the molecular evolution of phosphoenolpyruvate carboxylase by comparison of partial cDNA sequences.

    Science.gov (United States)

    Gehrig, H H; Heute, V; Kluge, M

    1998-01-01

    To gain deeper insight into the evolution of phosphoenolpyruvate carboxylase (PEPC), we have identified PEPC fragments (about 1,100 bp) from a further 12 plant species not yet investigated in this context. The selected plants include one Chlorophyta, two Bryophyta, four Pteridophyta, and five Spermatophyta species. The resulting phylogenetic trees of PEPC isoforms are the most complete available to date. Independent of their manner of construction, the resulting dendrograms are very similar and fully consistent with the main topology postulated for the evolution of the higher terrestrial plants. We found a distinct clustering of the PEPC sequences of the prokaryotes, the algae, and the spermatophytes. PEPC isoforms of the archegoniates are located in the phylogenetic trees between the algae and the spermatophytes. Our results strengthen the view that PEPC is a very useful molecular marker with which to visualize phylogenetic trends at both the metabolic and organismic levels.

  13. cDNA cloning and sequence analysis of genome segments S8 from rice black-streaked dwarf virus

    Institute of Scientific and Technical Information of China (English)

    张恒木; 陈剑平; 薛庆中; 雷娟利

    2002-01-01

    Genome segments S8 of two Chinese isolates of rice black-streaked dwarf virus (RBSDV), one from Zhejiang Province and another from Hebei Province, were amplified by RT-PCR and sequenced. Both segments were 1,936 nt in length (EMBL accession numbers AJ297431 and AJ297432, respectively) and contained a single large open reading frame encoding a polypeptide with a molecular weight of 68 kDa. The two Chinese isolates shared 94.0% and 96.5% identity at the nucleotide and amino acid levels, respectively. They shared 94.5-94.9% and 92.5-92.9% homology with S8 of the RBSDV Japanese isolate at the nucleotide and amino acid levels, respectively, and 85.1-87.6% and 91.7-91.9% homology with S7 of the Italian MRDV (maize rough dwarf virus).

  14. Complete cDNA sequence of the preproform of human pregnancy-associated plasma protein-A. Evidence for expression in the brain and induction by cAMP

    DEFF Research Database (Denmark)

    Haaning, Jesper; Oxvig, Claus; Overgaard, Michael Toft

    1996-01-01

    A cDNA that encodes the prepropeptide of pregnancy-associated plasma protein-A (preproPAPP-A), a putative metalloproteinase, has been cloned and sequenced. PAPP-A is synthesized in the placenta as a 1627-residue precursor preproprotein with a putative 22-residue signal peptide and a highly basic ...

  15. cDNA sequence and tissue distribution of the mRNA for bovine and murine p11, the S100-related light chain of the protein-tyrosine kinase substrate p36 (calpactin I)

    DEFF Research Database (Denmark)

    Saris, Chris J M; Kristensen, Torsten; D’Eustachio, Peter

    1987-01-01

    We have isolated and sequenced cDNA clones of bovine and murine p11 mRNAs. The nonpolyadenylated mRNAs are predicted to be 614 and 600 nucleotides, respectively. The p11 mRNAs both contain a 291 nucleotide open reading frame, preceded by a 5' untranslated region of 73 nucleotides in bovine p11...

  16. Deep sequencing of the murine olfactory receptor neuron transcriptome.

    Directory of Open Access Journals (Sweden)

    Ninthujah Kanageswaran

    Full Text Available The ability of animals to sense and differentiate among thousands of odorants relies on a large set of olfactory receptors (ORs) and a multitude of accessory proteins within the olfactory epithelium (OE). ORs and related signaling mechanisms have been the subject of intensive studies over the past years, but our knowledge regarding olfactory processing remains limited. The recent development of next generation sequencing (NGS) techniques encouraged us to assess the transcriptome of the murine OE. We analyzed RNA from OEs of female and male adult mice and from fluorescence-activated cell sorting (FACS)-sorted olfactory receptor neurons (ORNs) obtained from transgenic OMP-GFP mice. The Illumina RNA-Seq protocol was utilized to generate up to 86 million reads per transcriptome. In OE samples, nearly all OR and trace amine-associated receptor (TAAR) genes involved in the perception of volatile amines were detectably expressed. Other genes known to participate in olfactory signaling pathways were among the 200 genes with the highest expression levels in the OE. To identify OE-specific genes, we compared olfactory neuron expression profiles with RNA-Seq transcriptome data from different murine tissues. By analyzing different transcript classes, we detected the expression of non-olfactory GPCRs in ORNs and established an expression ranking for GPCRs detected in the OE. We also identified other previously undescribed membrane proteins as potential new players in olfaction. The quantitative and comprehensive transcriptome data provide a virtually complete catalogue of genes expressed in the OE and present a useful tool to uncover candidate genes involved in, for example, olfactory signaling, OR trafficking and recycling, and proliferation.

  17. Targeted rapid amplification of cDNA ends (T-RACE)—an improved RACE reaction through degradation of non-target sequences

    OpenAIRE

    Neil I Bower; Johnston, Ian A

    2010-01-01

    Amplification of the 5' ends of cDNA, although simple in theory, can often be difficult to achieve. We describe a novel method for the specific amplification of cDNA ends. An oligo-dT adapter incorporating a dUTP-containing PCR primer primes first-strand cDNA synthesis incorporating dUTP. Using the Cap finder approach, another distinct dUTP containing adapter is added to the 3' end of the newly synthesized cDNA. Second-strand synthesis incorporating dUTP is achieved by PCR, using dUTP-contain...

  18. Whitefly (Bemisia tabaci) genome project: analysis of sequenced clones from egg, instar, and adult (viruliferous and non-viruliferous) cDNA libraries

    Directory of Open Access Journals (Sweden)

    Czosnek Henryk

    2006-04-01

    Full Text Available Abstract Background The past three decades have witnessed a dramatic increase in interest in the whitefly Bemisia tabaci, owing to its nature as a taxonomically cryptic species, the damage it causes to a large number of herbaceous plants because of its specialized feeding in the phloem, and its ability to serve as a vector of plant viruses. Among the most important plant viruses transmitted by B. tabaci are those in the genus Begomovirus (family Geminiviridae). Surprisingly, little is known about the genome of this whitefly. The haploid genome size for male B. tabaci has been estimated by flow cytometry analysis to be approximately one billion bp, about five times the size of that of the fruitfly Drosophila melanogaster. The genes involved in whitefly development, in host range plasticity, and in begomovirus vector specificity and competency are unknown. Results To address this general shortage of genomic sequence information, we have constructed three cDNA libraries from non-viruliferous whiteflies (eggs, immature instars, and adults) and two from adult insects that fed on tomato plants infected by two geminiviruses: Tomato yellow leaf curl virus (TYLCV) and Tomato mottle virus (ToMoV). In total, the sequence of 18,976 clones was determined. After quality control and removal of 5,542 clones of mitochondrial origin, 9,110 sequences remained, which included 3,843 singletons and 1,017 contigs. Comparisons with public databases indicated that the libraries contained genes involved in cellular and developmental processes. In addition, approximately 1,000 bases aligned with the genome of the B. tabaci endosymbiotic bacterium Candidatus Portiera aleyrodidarum, originating primarily from the egg and instar libraries. Apart from the mitochondrial sequences, the longest and most abundant sequence encodes vitellogenin, which originated from whitefly adult libraries, indicating that much of the gene expression in this insect is directed toward the production

  19. Deep RNA Sequencing of the Skeletal Muscle Transcriptome in Swimming Fish

    NARCIS (Netherlands)

    Palstra, A.P.; Beltran, S.; Burgerhout, E.; Brittijn, S.A.; Magnoni, L.J.; Henkel, C.V.; Jansen, A.; Thillart, G.E.E.J.M.; Spaink, H.P.; Planas, J.V.

    2013-01-01

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss) with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming

  20. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

    Directory of Open Access Journals (Sweden)

    Holt Robert A

    2010-04-01

    Full Text Available Abstract Background Salmonids are one of the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate.

  1. Molecular cloning, sequencing, and expression analysis of cDNA encoding metalloprotein II (MP II) induced by single and combined metals (Cu(II), Cd(II)) in polychaeta Perinereis aibuhitensis.

    Science.gov (United States)

    Yang, Dazuo; Zhou, Yibing; Zhao, Huan; Zhou, Xiaoxiao; Sun, Na; Wang, Bin; Yuan, Xiutang

    2012-11-01

    We amplified and analyzed the complete cDNA of metalloprotein II (MP II) from the somatic muscle of the polychaete Perinereis aibuhitensis. The full-length cDNA is 904 bp and encodes 119 amino acids. A BLAST search of the MP II cDNA sequence at NCBI showed that it shares high homology with hemerythrin of other worms. MP II expression in P. aibuhitensis exposed to single and combined metals (Cu(II), Cd(II)) was analyzed using real-time PCR. MP II mRNA expression increased at the start of Cu(II) exposure, then decreased and finally returned to the normal level. The expression pattern of MP II under Cd(II) exposure was time- and dose-dependent. MP II expression induced by a combination of Cd(II) and Cu(II) was similar to that induced by Cd(II) alone.

  2. Ultra deep sequencing of Listeria monocytogenes sRNA transcriptome revealed new antisense RNAs.

    Directory of Open Access Journals (Sweden)

    Sebastian Behrens

    Full Text Available Listeria monocytogenes, a gram-positive pathogen and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and additionally was size fractioned into three different size ranges from 150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes.

  3. Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

    Science.gov (United States)

    Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

    2014-01-01

    Listeria monocytogenes, a gram-positive pathogen and causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and additionally was size fractioned into three different size ranges from 150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259

  4. Exploring fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing

    Science.gov (United States)

    Zhang, Xiao-Yong; Wang, Guang-Hua; Xu, Xin-Ya; Nong, Xu-Hua; Wang, Jie; Amin, Muhammad; Qi, Shu-Hua

    2016-10-01

    The present study investigated the fungal diversity in four different deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing of the nuclear ribosomal internal transcribed spacer-1 (ITS1). A total of 40,297 fungal ITS1 sequences clustered into 420 operational taxonomic units (OTUs) at 97% sequence similarity, and 170 taxa were recovered from these sediments. Most ITS1 sequences (78%) belonged to the phylum Ascomycota, followed by Basidiomycota (17.3%), Zygomycota (1.5%) and Chytridiomycota (0.8%), and a small proportion (2.4%) belonged to unassigned fungal phyla. Compared with previous studies of fungal diversity in deep-sea sediments based on culture-dependent approaches and clone library analysis, the present results suggest that Illumina sequencing dramatically accelerates the discovery of fungal communities in deep-sea sediments. Furthermore, our results revealed that Sordariomycetes was the most diverse and abundant fungal class in this study, challenging the traditional view that the diversity of Sordariomycetes phylotypes is low in deep-sea environments. In addition, more than 12 taxa, accounting for 21.5% of sequences, have rarely been reported as deep-sea fungi, suggesting that the deep-sea sediments from Okinawa Trough harbor fungal communities distinct from those of other deep-sea environments. To our knowledge, this study is the first exploration of the fungal diversity in deep-sea sediments from Okinawa Trough using high-throughput Illumina sequencing.
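    Grouping sequences into OTUs at a 97% similarity threshold, as mentioned in the record above, can be illustrated with a simplified greedy centroid clustering. The record does not say which clustering tool was used, so this is only a sketch under stated assumptions: the sequences are taken to be pre-aligned and of equal length, identity is the fraction of matching positions, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches at >= threshold;
    otherwise the sequence founds a new OTU and becomes its centroid."""
    centroids = []
    assignments = []
    for seq in seqs:
        for idx, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                assignments.append(idx)
                break
        else:
            # No existing centroid is similar enough: start a new OTU.
            centroids.append(seq)
            assignments.append(len(centroids) - 1)
    return centroids, assignments
```

    Greedy clustering depends on input order, which is why real pipelines usually sort reads (for example by abundance) before clustering and use alignment-based identity rather than position-by-position comparison.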

  5. Cloning and sequencing of full length growth hormone cDNA from Lepomis cyanellus

    Institute of Scientific and Technical Information of China (English)

    曹运长; 李文笙; 叶卫; 林浩然

    2004-01-01

    The full-length cDNA encoding the growth hormone of a freshwater fish, Lepomis cyanellus (LcGH), was cloned from pituitary RNA by RT-PCR and 3' and 5' RACE (rapid amplification of cDNA ends). The LcGH cDNA (GenBank No. AY530822), about 989 nt (nucleotides) long, consists of an open reading frame of 615 nt, 5' and 3' untranslated regions of 93 nt and 224 nt, respectively, and a 57 nt poly(A) tail. DNA sequence analysis showed a typical Kozak sequence and polyadenylation signal. The pre-growth hormone peptide of 204 aa deduced from the LcGH cDNA includes a putative signal peptide (17 aa) located at its N-terminus. There is an Asn-Cys-Thr glycosylation site at amino acid 201, and 4 cysteine residues (Nos. 69, 177, 194, 202) that are essential for forming two S-S bonds in this pre-growth hormone peptide. Homology comparison between LcGH and the growth hormones of other species showed high homology (more than 85%) between the growth hormone of Lepomis cyanellus and that of most Perciformes fish, but low homology (less than 70%) with other groups such as Siluriformes and Cypriniformes fish.

  6. Ultra-deep sequencing of intra-host rabies virus populations during cross-species transmission.

    Directory of Open Access Journals (Sweden)

    Monica K Borucki

    2013-11-01

    Full Text Available One of the hurdles to understanding the role of viral quasispecies in RNA virus cross-species transmission (CST) events is the need to analyze a densely sampled outbreak using deep sequencing in order to measure the amount of mutation occurring on a small time scale. In 2009, the California Department of Public Health reported a dramatic increase (350 in the number of gray foxes infected with a rabies virus variant for which striped skunks serve as a reservoir host in Humboldt County. To better understand the evolution of rabies, deep sequencing was applied to 40 unpassaged rabies virus samples from the Humboldt outbreak. For each sample, approximately 11 kb of the 12 kb genome was amplified and sequenced using the Illumina platform. Average coverage was 17,448 and this allowed characterization of the rabies virus population present in each sample at unprecedented depths. Phylogenetic analysis of the consensus sequence data demonstrated that samples clustered according to date (1995 vs. 2009) and geographic location (northern vs. southern). A single amino acid change in the G protein distinguished a subset of northern foxes from a haplotype present in both foxes and skunks, suggesting this mutation may have played a role in the observed increased transmission among foxes in this region. Deep-sequencing data indicated that many genetic changes associated with the CST event occurred prior to 2009, since several nonsynonymous mutations that were present in the consensus sequences of skunk and fox rabies samples obtained from 2003-2010 were present at the sub-consensus level (as rare variants in the viral population) in skunk and fox samples from 1995. These results suggest that analysis of rare variants within a viral population may yield clues to ancestral genomes and identify rare variants that have the potential to be selected for if environmental conditions change.
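    The distinction drawn above between consensus bases and sub-consensus (rare) variants can be made concrete with a small sketch. It is not the authors' pipeline: the per-site base counts, the `min_freq` cutoff and the function names are hypothetical, and a real analysis would also have to separate true rare variants from sequencing error.

```python
def site_summary(base_counts, min_freq=0.001):
    """Return the consensus base at a site and any sub-consensus variants.

    base_counts maps each base to its read count at one genome position;
    min_freq is an assumed reporting cutoff for minor variants.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        raise ValueError("site has no coverage")
    consensus = max(base_counts, key=base_counts.get)
    minor = {
        base: count / depth
        for base, count in base_counts.items()
        if base != consensus and count / depth >= min_freq
    }
    return consensus, minor


def was_rare_variant_earlier(early_site, late_site, min_freq=0.001):
    """True if the later sample's consensus base was already present as a
    sub-consensus variant in the earlier sample at the same position."""
    _, early_minor = site_summary(early_site, min_freq)
    late_consensus, _ = site_summary(late_site, min_freq)
    return late_consensus in early_minor
```

    For example, a site with counts `{"A": 17000, "G": 448}` in an earlier sample and `{"A": 300, "G": 17148}` in a later one would be flagged, because G is the later consensus and was present at about 2.6% in the earlier population.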

  7. Cloning of the full-length cDNA sequence of a novel human UCA1 spliced variant

    Institute of Scientific and Technical Information of China (English)

    王宇; 陈葳; 李旭

    2012-01-01

    Objective To clone the full-length cDNA sequence of a novel UCA1 spliced isoform as a basis for understanding the mechanism of this type of alternative splicing. Methods The full-length cDNA was amplified from BLZ-211 cells using in silico sequence elongation together with 5'-RACE and 3'-RACE (rapid amplification of cDNA ends). The RT-PCR products were sequenced and assembled. Results The new UCA1 spliced isoform sequence was 2,202 bp. Conclusion A combination of in silico sequence elongation, 5'-RACE and 3'-RACE is an effective way to obtain full-length cDNA, and lays a foundation for further research on the mechanism of this type of alternative splicing.

  8. A simple method for the parallel deep sequencing of full influenza A genomes

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen;

    2011-01-01

    Given the major threat of influenza A to human and animal health, and its ability to evolve rapidly through mutation and reassortment, tools that enable its timely characterization are necessary to help monitor its evolution and spread. For this purpose, deep sequencing can be a very valuable tool... from human and H1N1 virus from swine, on a single lane of a GAIIx flow cell to an average depth of 122-fold. This technique can be applied to cultivated and uncultivated virus.

  9. Metagenomes obtained by "deep sequencing" - what do they tell about the EBPR communities

    DEFF Research Database (Denmark)

    Albertsen, Mads; Saunders, Aaron Marc; Nielsen, Kåre Lehmann

    Keywords: Metagenomics; Accumulibacter; Micro-diversity; Enhanced Biological Phosphorus Removal. Introduction: Metagenomics, or environmental genomics, provides comprehensive information about the entire microbial community of a certain ecosystem, e.g. a wastewater treatment plant. So far, metagenomic analyses have been hampered by the high costs and high level of expertise needed to conduct the investigations, but this is now changing with the development of new technologies that allow analyses of billions of DNA sequences (deep sequencing) and user-friendly pipelines for analyses of the huge data sets... in Albertsen et al. (2011). Results and Discussion: We sequenced two metagenomes from the Aalborg East and West EBPR wastewater treatment plants at a depth of 12 and 8 Gb using Illumina short read sequencing. The EBPR plants form a distinct group when compared to metagenomes from a wide range of environments, both...

  10. MicroRNA identity and abundance in porcine skeletal muscles determined by deep sequencing

    DEFF Research Database (Denmark)

    Nielsen, M; Hansen, J H; Hedegaard, J;

    2010-01-01

    MicroRNAs (miRNA) are short single-stranded RNA molecules that regulate gene expression post-transcriptionally by binding to complementary sequences in the 3' untranslated region (3' UTR) of target mRNAs. MiRNAs participate in the regulation of myogenesis, and identification of the complete set...... of miRNAs expressed in muscles is likely to significantly increase our understanding of muscle growth and development. To determine the identity and abundance of miRNA in porcine skeletal muscle, we applied a deep sequencing approach. This allowed us to identify the sequences and relative expression...... levels of 212 annotated miRNA genes, thereby providing a thorough account of the miRNA transcriptome in porcine muscle tissue. The expression levels displayed a very large range, as reflected by the number of sequence reads, which varied from single counts for rare miRNAs to several million reads...

  11. Prognostic value of deep sequencing method for minimal residual disease detection in multiple myeloma

    Science.gov (United States)

    Lahuerta, Juan J.; Pepin, François; González, Marcos; Barrio, Santiago; Ayala, Rosa; Puig, Noemí; Montalban, María A.; Paiva, Bruno; Weng, Li; Jiménez, Cristina; Sopena, María; Moorhead, Martin; Cedena, Teresa; Rapado, Immaculada; Mateos, María Victoria; Rosiñol, Laura; Oriol, Albert; Blanchard, María J.; Martínez, Rafael; Bladé, Joan; San Miguel, Jesús; Faham, Malek; García-Sanz, Ramón

    2014-01-01

    We assessed the prognostic value of minimal residual disease (MRD) detection in multiple myeloma (MM) patients using a sequencing-based platform in bone marrow samples from 133 MM patients in at least very good partial response (VGPR) after front-line therapy. Deep sequencing was carried out in patients in whom a high-frequency myeloma clone was identified and MRD was assessed using the IGH-VDJH, IGH-DJH, and IGK assays. The results were contrasted with those of multiparametric flow cytometry (MFC) and allele-specific oligonucleotide polymerase chain reaction (ASO-PCR). The applicability of deep sequencing was 91%. Concordance between sequencing and MFC and ASO-PCR was 83% and 85%, respectively. Patients who were MRD– by sequencing had a significantly longer time to tumor progression (TTP) (median 80 vs 31 months; P < .0001) and overall survival (median not reached vs 81 months; P = .02), compared with patients who were MRD+. When stratifying patients by different levels of MRD, the respective TTP medians were: MRD ≥10^-3, 27 months; MRD 10^-3 to 10^-5, 48 months; and MRD <10^-5, 80 months (P = .003 to .0001). Ninety-two percent of VGPR patients were MRD+. In complete response patients, the TTP remained significantly longer for MRD– compared with MRD+ patients (131 vs 35 months; P = .0009). PMID:24646471
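    The three MRD strata named in this abstract can be written as a small classification helper. This is a minimal sketch in Python, assuming the MRD level is supplied as a fraction of clonal sequences; the function name `mrd_stratum`, the label strings, and the treatment of values exactly at a boundary are illustrative assumptions and are not taken from the study.

```python
def mrd_stratum(mrd_level: float) -> str:
    """Map an MRD level (assumed: fraction of clonal sequences) to one of the
    three strata reported in the abstract. Boundary handling is an assumption."""
    if mrd_level < 0:
        raise ValueError("MRD level must be non-negative")
    if mrd_level >= 1e-3:
        return ">=1e-3"        # abstract: median TTP 27 months
    if mrd_level >= 1e-5:
        return "1e-5 to 1e-3"  # abstract: median TTP 48 months
    return "<1e-5"             # abstract: median TTP 80 months


if __name__ == "__main__":
    for level in (5e-3, 1e-4, 2e-6):
        print(level, "->", mrd_stratum(level))
```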

  12. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics.

    Directory of Open Access Journals (Sweden)

    Ehsaneddin Asgari

    Full Text Available We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences, this representation can be widely used in applications of deep learning in proteomics and genomics. In the present paper, we focus on protein-vectors that can be utilized in a wide array of bioinformatics investigations such as family classification, protein visualization, structure prediction, disordered protein identification, and protein-protein interaction prediction. In this method, we adopt artificial neural network approaches and represent a protein sequence with a single dense n-dimensional vector. To evaluate this method, we apply it in classification of 324,018 protein sequences obtained from Swiss-Prot belonging to 7,027 protein families, where an average family classification accuracy of 93%±0.06% is obtained, outperforming existing family classification methods. In addition, we use ProtVec representation to predict disordered proteins from structured proteins. Two databases of disordered sequences are used: the DisProt database as well as a database featuring the disordered regions of nucleoporins rich with phenylalanine-glycine repeats (FG-Nups). Using support vector machine classifiers, FG-Nup sequences are distinguished from structured protein sequences found in Protein Data Bank (PDB) with a 99.8% accuracy, and unstructured DisProt sequences are differentiated from structured DisProt sequences with 100.0% accuracy. These results indicate that by only providing sequence data for various proteins into this model, accurate information about protein structure can be determined. Importantly, this model needs to be trained only once and can then be applied to extract a comprehensive set of information regarding proteins of interest. 
Moreover, this representation can be

  13. Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics.

    Science.gov (United States)

    Asgari, Ehsaneddin; Mofrad, Mohammad R K

    2015-01-01

    We introduce a new representation and feature extraction method for biological sequences. Named bio-vectors (BioVec) to refer to biological sequences in general with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences, this representation can be widely used in applications of deep learning in proteomics and genomics. In the present paper, we focus on protein-vectors that can be utilized in a wide array of bioinformatics investigations such as family classification, protein visualization, structure prediction, disordered protein identification, and protein-protein interaction prediction. In this method, we adopt artificial neural network approaches and represent a protein sequence with a single dense n-dimensional vector. To evaluate this method, we apply it in classification of 324,018 protein sequences obtained from Swiss-Prot belonging to 7,027 protein families, where an average family classification accuracy of 93%±0.06% is obtained, outperforming existing family classification methods. In addition, we use ProtVec representation to predict disordered proteins from structured proteins. Two databases of disordered sequences are used: the DisProt database as well as a database featuring the disordered regions of nucleoporins rich with phenylalanine-glycine repeats (FG-Nups). Using support vector machine classifiers, FG-Nup sequences are distinguished from structured protein sequences found in Protein Data Bank (PDB) with a 99.8% accuracy, and unstructured DisProt sequences are differentiated from structured DisProt sequences with 100.0% accuracy. These results indicate that by only providing sequence data for various proteins into this model, accurate information about protein structure can be determined. Importantly, this model needs to be trained only once and can then be applied to extract a comprehensive set of information regarding proteins of interest. 
Moreover, this representation can be considered as
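    The abstract states only that a protein sequence is represented by a single dense n-dimensional vector learned with neural-network approaches. The sketch below illustrates one way such a vector could be assembled, under explicit assumptions: the sequence is split into overlapping k-mers (k = 3 here) and per-k-mer vectors are summed. The helper `toy_kmer_vector` is a hypothetical stand-in that produces deterministic pseudo-random numbers; it is not a trained embedding and does not reproduce the authors' model.

```python
import random


def kmers(sequence: str, k: int = 3) -> list[str]:
    """Return all overlapping k-mers of a sequence (assumed tokenisation)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def toy_kmer_vector(kmer: str, dim: int = 8) -> list[float]:
    """Hypothetical placeholder for a trained k-mer embedding:
    a deterministic pseudo-random vector seeded by the k-mer string."""
    rng = random.Random(kmer)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def sequence_vector(sequence: str, k: int = 3, dim: int = 8) -> list[float]:
    """Sum the k-mer vectors into one dense vector of length `dim`."""
    total = [0.0] * dim
    for kmer in kmers(sequence, k):
        for i, value in enumerate(toy_kmer_vector(kmer, dim)):
            total[i] += value
    return total


if __name__ == "__main__":
    print(kmers("MKTAY"))
    print(len(sequence_vector("MKTAYIAKQR")))
```

    A fixed-length vector of this kind is what makes downstream classifiers such as the support vector machines mentioned in the abstract applicable to sequences of different lengths.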

  14. Deep sequencing analysis of HBV genotype shift and correlation with antiviral efficiency during adefovir dipivoxil therapy.

    Directory of Open Access Journals (Sweden)

    Yuwei Wang

    Full Text Available Viral genotype shift in chronic hepatitis B (CHB) patients during antiviral therapy has been reported, but the underlying mechanism remains elusive. 38 CHB patients treated with ADV for one year were selected for studying genotype shift by both deep sequencing and the Sanger sequencing method. The Sanger sequencing method found that 7.9% of patients showed mixed genotype before ADV therapy. In contrast, all 38 patients showed mixed genotype before ADV treatment by deep sequencing. A 95.5% mixed genotype rate was also obtained from an additional 200 treatment-naïve CHB patients. Of the 13 patients with genotype shift, the fraction of the minor genotype in 5 patients (38%) increased gradually during the course of ADV treatment. Furthermore, responses to ADV and HBeAg seroconversion were associated with the high rate of genotype shift, suggesting drug and immune pressure may be key factors to induce genotype shift. Interestingly, patients with genotype C had a significantly higher rate of genotype shift than those with genotype B. In the genotype shift group, ADV treatment induced a marked enhancement of the genotype B ratio accompanied by a reduction of the genotype C ratio, suggesting genotype C may be more sensitive to ADV than genotype B. Moreover, patients with dominant genotype C may have a better therapeutic effect. Finally, genotype shift was correlated with clinical improvement in terms of ALT. Our findings provided a rational explanation for genotype shift among ADV-treated CHB patients. The genotype and genotype shift might be associated with antiviral efficiency.

  15. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki, E-mail: maru@kuhp.kyoto-u.ac.jp [Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507 (Japan)

    2015-06-15

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  16. Complete genome sequence of a new tobamovirus naturally infecting tomatoes in Mexico

    Science.gov (United States)

    The complete genomic sequence of a new tobamovirus in tomato was determined through deep sequencing and assembly of small RNAs, then validated through Sanger sequencing of the overlapping RT-PCR products and rapid amplification of cDNA ends (RACE). Based on the genomic sequence identity (85%) to kn...

  17. Deep sequence characterisation of a divergent HPIV-4a from an adult with prolonged influenza-like illness

    Directory of Open Access Journals (Sweden)

    Katherine E. Arden

    2015-12-01

    Deep sequencing allowed identification and genomic characterisation of a possible pathogen from an ILI as well as being an important tool to aid future understanding of the linkages between viral genetic variation, transmission and disease prognosis.

  18. Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata)

    Directory of Open Access Journals (Sweden)

    Yu Huaping

    2010-07-01

    Full Text Available Abstract Background MicroRNAs (miRNAs) play a critical role in post-transcriptional gene regulation and have been shown to control many genes involved in various biological and metabolic processes. There have been extensive studies to discover miRNAs and analyze their functions in model plant species, such as Arabidopsis and rice. Deep sequencing technologies have facilitated identification of species-specific or lowly expressed as well as conserved or highly expressed miRNAs in plants. Results In this research, we used Solexa sequencing to discover new microRNAs in trifoliate orange (Citrus trifoliata), which is an important rootstock of citrus. A total of 13,106,753 reads representing 4,876,395 distinct sequences were obtained from a short RNA library generated from small RNA extracted from C. trifoliata flower and fruit tissues. Based on sequence similarity and hairpin structure prediction, we found that 156,639 reads representing 63 sequences from 42 highly conserved miRNA families have perfect matches to known miRNAs. We also identified 10 novel miRNA candidates whose precursors were all potentially generated from citrus ESTs. In addition, five miRNA* sequences were also sequenced. These sequences had not been earlier described in other plant species and accumulation of the 10 novel miRNAs was confirmed by qRT-PCR analysis. Potential target genes were predicted for most conserved and novel miRNAs. Moreover, four target genes including one encoding IRX12 copper ion binding/oxidoreductase and three genes encoding NB-LRR disease resistance proteins have been experimentally verified by detection of the miRNA-mediated mRNA cleavage in C. trifoliata. Conclusion Deep sequencing of short RNAs from C. trifoliata flowers and fruits identified 10 new potential miRNAs and 42 highly conserved miRNA families, indicating that specific miRNAs exist in C. trifoliata. These results show that regulatory miRNAs exist in agronomically important trifoliate orange
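    The distinction the abstract draws between total reads (13,106,753) and distinct sequences (4,876,395) corresponds to collapsing identical reads and keeping a count for each. The following is a minimal Python sketch of that bookkeeping; the function name `collapse_reads` and the toy reads are hypothetical and do not come from the study's data or pipeline.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse identical reads into distinct sequences with read counts."""
    return Counter(reads)


if __name__ == "__main__":
    # Toy reads for illustration only; not real small RNA sequences.
    toy_reads = ["AAAC", "AAAC", "GGTT", "AAAC", "CCGA"]
    collapsed = collapse_reads(toy_reads)
    print("total reads:", sum(collapsed.values()))
    print("distinct sequences:", len(collapsed))
    print("most abundant:", collapsed.most_common(1))
```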

  19. Mayday SeaSight: combined analysis of deep sequencing and microarray data.

    Directory of Open Access Journals (Sweden)

    Florian Battke

    Full Text Available Recently emerged deep sequencing technologies offer new high-throughput methods to quantify gene expression, epigenetic modifications and DNA-protein binding. From a computational point of view, the data is very different from that produced by the already established microarray technology, providing a new perspective on the samples under study and complementing microarray gene expression data. Software offering the integrated analysis of data from different technologies is of growing importance as new data emerge in systems biology studies. Mayday is an extensible platform for visual data exploration and interactive analysis and provides many methods for dissecting complex transcriptome datasets. We present Mayday SeaSight, an extension that allows the integration of data from different platforms such as deep sequencing and microarrays. It offers methods for computing expression values from mapped reads and raw microarray data, background correction and normalization, and linking microarray probes to genomic coordinates. It is now possible to use Mayday's wealth of methods to analyze sequencing data and to combine data from different technologies in one analysis.

  20. Cloning and sequence analysis of the full-length cDNA encoding defensin, an antimicrobial peptide from the housefly (Musca domestica)

    Institute of Scientific and Technical Information of China (English)

    王来城; 王金星; 王来元; 赵小凡

    2003-01-01

    Defensin is a cationic, inducible antimicrobial peptide found in a wide range of living organisms that contributes to host defense by disrupting the cytoplasmic membrane of microorganisms. With their broad antimicrobial spectrum and strong pharmaceutical effects, antimicrobial peptides, including defensins, represent a source of novel antibiotic agents. A novel full-length 430 base pair cDNA of an insect defensin was cloned using polymerase chain reaction (PCR) from the cDNA library of houseflies (Musca domestica) that had been challenged by E. coli and Staphylococcus ... contained an NH2-terminal signal sequence (1-22) followed by a propeptide and the mature peptide (53-92). The sequence identity with other insect defensins is between 51% and 73%. The mature peptide, with a predicted molecular weight of 4.0 kDa and a pI of 8.69, has 1 negatively charged amino acid and 4 positively charged ones. The putative housefly defensin is characterized by 6 invariant cysteine residues forming 3 disulfide bonds: Cys1-Cys4, Cys2-Cys5 and Cys3-Cys6. These results suggest that the novel full-length cDNA of the defensin gene, denominated Mdde, has been successfully cloned from houseflies.

  1. Genetic variation of human papillomavirus type 16 in individual clinical specimens revealed by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Iwao Kukimoto

    Full Text Available Viral genetic diversity within infected cells or tissues, called viral quasispecies, has been mostly studied for RNA viruses, but has also been described among DNA viruses, including human papillomavirus type 16 (HPV16) present in cervical precancerous lesions. However, the extent of HPV genetic variation in cervical specimens, and its involvement in HPV-induced carcinogenesis, remains unclear. Here, we employ deep sequencing to comprehensively analyze genetic variation in the HPV16 genome isolated from individual clinical specimens. Through overlapping full-circle PCR, approximately 8-kb DNA fragments covering the whole HPV16 genome were amplified from HPV16-positive cervical exfoliated cells collected from patients with either low-grade squamous intraepithelial lesion (LSIL) or invasive cervical cancer (ICC). Deep sequencing of the amplified HPV16 DNA enabled de novo assembly of the full-length HPV16 genome sequence for each of 7 specimens (5 LSIL and 2 ICC samples). Subsequent alignment of read sequences to the assembled HPV16 sequence revealed that 2 LSILs and 1 ICC contained nucleotide variations within E6, E1 and the non-coding region between E5 and L2 with mutation frequencies of 0.60% to 5.42%. In transient replication assays, a novel E1 mutant found in ICC, E1 Q381E, showed reduced ability to support HPV16 origin-dependent replication. In addition, partially deleted E2 genes were detected in 1 LSIL sample in a mixed state with the intact E2 gene. Thus, the methods used in this study provide a fundamental framework for investigating the influence of HPV somatic genetic variation on cervical carcinogenesis.

  2. Ultra-deep sequencing of VHSV isolates contributes to understanding the role of viral quasispecies.

    Science.gov (United States)

    Schönherz, Anna A; Lorenzen, Niels; Guldbrandtsen, Bernt; Buitenhuis, Bart; Einer-Jensen, Katja

    2016-01-08

    The high mutation rate of RNA viruses enables the generation of a genetically diverse viral population, termed a quasispecies, within a single infected host. This high in-host genetic diversity enables an RNA virus to adapt to a diverse array of selective pressures such as host immune response and switching between host species. The negative-sense, single-stranded RNA virus, viral haemorrhagic septicaemia virus (VHSV), was originally considered an epidemic virus of cultured rainbow trout in Europe, but was later proved to be endemic among a range of marine fish species in the Northern hemisphere. To better understand the nature of a virus quasispecies related to the evolutionary potential of VHSV, a deep-sequencing protocol specific to VHSV was established and applied to 4 VHSV isolates, 2 originating from rainbow trout and 2 from Atlantic herring. Each isolate was subjected to Illumina paired end shotgun sequencing after PCR amplification and the 11.1 kb genome was successfully sequenced with an average coverage of 0.5-1.9 × 10^6 sequenced copies. Differences in single nucleotide polymorphism (SNP) frequency were detected both within and between isolates, possibly related to their stage of adaptation to host species and host immune reactions. The N, M, P and Nv genes appeared nearly fixed, while genetic variation in the G and L genes demonstrated presence of diverse genetic populations particularly in two isolates. The results demonstrate that deep sequencing and analysis methodologies can be useful for future in vivo host adaption studies of VHSV.
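    To give a sense of scale for the coverage figures in this abstract, the short Python sketch below relates genome length, mean depth and total sequenced bases. It assumes that "average coverage" means mean per-base depth (total aligned bases divided by genome length); the function names are illustrative, and the base totals it prints are derived arithmetic, not values reported by the study.

```python
def mean_depth(total_aligned_bases: float, genome_length: int) -> float:
    """Mean per-base depth, assuming all aligned bases fall on the genome."""
    return total_aligned_bases / genome_length


def bases_required(target_depth: float, genome_length: int) -> float:
    """Total aligned bases needed to reach a target mean depth."""
    return target_depth * genome_length


if __name__ == "__main__":
    genome_length = 11_100  # 11.1 kb VHSV genome, from the abstract
    for depth in (0.5e6, 1.9e6):  # coverage range, from the abstract
        total = bases_required(depth, genome_length)
        print(f"depth {depth:.1e} -> {total:.3e} aligned bases")
```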

  3. Deep transcriptome sequencing of Pecten maximus hemocytes: a genomic resource for bivalve immunology.

    Science.gov (United States)

    Pauletto, Marianna; Milan, Massimo; Moreira, Rebeca; Novoa, Beatriz; Figueras, Antonio; Babbucci, Massimiliano; Patarnello, Tomaso; Bargelloni, Luca

    2014-03-01

    Pecten maximus, the king scallop, is a bivalve species with important commercial value for both fisheries and aquaculture, traditionally consumed in several European countries. Major problems in larval rearing, however, still limit hatchery-based seed production. High mortalities during early larval stages, likely related to bacterial pathogens, represent the most relevant bottleneck. To address this issue, understanding host defense mechanisms against microbes is extremely important. In this study next-generation RNA-sequencing was carried out on scallop hemocytes. To enrich for immune-related transcripts, cDNA libraries from hemocytes challenged in vivo with inactivated Vibrio anguillarum and in vitro with pathogen-associated molecular patterns, as well as unchallenged controls, were sequenced yielding 216,444,674 sequence reads. De novo assembly of the scallop hemocyte transcriptome consisted of 73,732 contigs (31% annotated). A total of 934 contigs encoded proteins with a known immune function, grouped into several functional categories. Particular attention was reserved to Toll-like receptors (TLRs), a family of pattern recognition receptors (PRRs) involved in non-self recognition. Through mining the scallop hemocyte transcriptome, at least four TLRs could be identified. The organization of canonical TLR domains demonstrated that single cysteine cluster and multiple cysteine cluster TLRs co-exist in this species. In addition, preliminary data concerning their mRNA level following bacterial challenge suggested that different members of this family could exhibit opposite responses to pathogenic stimuli. Finally, a global analysis of differential expression comparing gene-expression levels in in vitro and in vivo stimulated hemocytes against controls provided evidence on a large set of transcripts involved in the great scallop immune response.

  4. High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus

    Directory of Open Access Journals (Sweden)

    Gomes Paula

    2010-10-01

    Full Text Available Abstract Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of the freshly collected B. azoricus transcriptome using gill tissues as the primary source of immune transcripts given their strategic role in filtering the surrounding waterborne, potentially infectious microorganisms. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino-acid sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative Reverse Transcription-Polymerase Chain Reaction (RT-PCR) and their RNA transcription level by quantitative PCR (q

  5. The tetramethylammonium chloride method for screening of cDNA libraries using highly degenerate oligonucleotides obtained by backtranslation of amino-acid sequences

    DEFF Research Database (Denmark)

    Honoré, B; Madsen, Peder; Leffers, H

    1993-01-01

    We describe a method for screening of cDNA libraries with highly degenerate oligonucleotides using tetramethylammonium chloride (TMAC). This method is a convenient alternative to using probes generated by the polymerase chain reaction (PCR), especially when these cannot easily be made. Nylon filt...

  6. Ultra-deep sequencing of VHSV isolates contributes to understanding the role of viral quasispecies

    DEFF Research Database (Denmark)

    Schönherz, Anna A.; Lorenzen, Niels; Guldbrandtsen, Bernt

    2016-01-01

    The high mutation rate of RNA viruses enables the generation of a genetically diverse viral population, termed a quasispecies, within a single infected host. This high in-host genetic diversity enables an RNA virus to adapt to a diverse array of selective pressures such as host immune response....... To better understand the nature of a virus quasispecies related to the evolutionary potential of VHSV, a deep-sequencing protocol specific to VHSV was established and applied to 4 VHSV isolates, 2 originating from rainbow trout and 2 from Atlantic herring. Each isolate was subjected to Illumina paired end...

  7. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne Vibeke;

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue...... a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2...... callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high...

  8. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    DEFF Research Database (Denmark)

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne Vibeke;

    2016-01-01

    a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2...... and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual...... callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high...

  9. Inside the intraterrestrials: The deep biosphere seen through massively parallel sequencing

    Science.gov (United States)

    Biddle, J.

    2009-12-01

    Deeply buried marine sediments may house a large amount of the Earth’s microbial population. Initial studies based on 16S rRNA clone libraries suggest that these sediments contain unique phylotypes of microorganisms, particularly from the archaeal domain. Since this environment is so difficult to study, microbiologists are challenged to find ways to examine these populations remotely. A major approach taken to study this environment uses massively parallel sequencing to examine the inner genetic workings of these microorganisms after the sediment has been drilled. Both metagenomics and tagged amplicon sequencing have been employed on deep sediments, and initial results show that different geographic regions can be differentiated through genomics and also minor populations may cause major geochemical changes.

  10. Metatranscriptomic analysis of small RNAs present in soybean deep sequencing libraries

    Directory of Open Access Journals (Sweden)

    Lorrayne Gomes Molina

    2012-01-01

    Full Text Available A large number of small RNAs unrelated to the soybean genome were identified after deep sequencing of soybean small RNA libraries. A metatranscriptomic analysis was carried out to identify the origin of these sequences. Comparative analyses of small interfering RNAs (siRNAs) present in samples collected in open areas corresponding to soybean field plantations and samples from soybean cultivated in greenhouses under a controlled environment were made. Different pathogenic, symbiotic and free-living organisms were identified from samples of both growth systems. They included viruses, bacteria and different groups of fungi. This approach can be useful not only to identify potentially unknown pathogens and pests, but also to understand the relations that soybean plants establish with microorganisms that may affect, directly or indirectly, plant health and crop production.

  11. Detection and characterization of mycoviruses in arbuscular mycorrhizal fungi by deep-sequencing.

    Science.gov (United States)

    Ezawa, Tatsuhiro; Ikeda, Yoji; Shimura, Hanako; Masuta, Chikara

    2015-01-01

    Fungal viruses (mycoviruses) often have a significant impact not only on phenotypic expression of the host fungus but also on higher order biological interactions, e.g., conferring plant stress tolerance via an endophytic host fungus. Arbuscular mycorrhizal (AM) fungi in the phylum Glomeromycota associate with most land plants and supply mineral nutrients to the host plants. So far, little information about mycoviruses has been obtained in the fungi due to their obligate biotrophic nature. Here we provide a technical breakthrough, "two-step strategy" in combination with deep-sequencing, for virological study in AM fungi; dsRNA is first extracted and sequenced using material obtained from highly productive open pot culture, and then the presence of viruses is verified using pure material produced in the in vitro monoxenic culture. This approach enabled us to demonstrate the presence of several viruses for the first time from a glomeromycotan fungus.

  12. Optimization of affinity, specificity and function of designed influenza inhibitors using deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Whitehead, Timothy A.; Chevalier, Aaron; Song, Yifan; Dreyfus, Cyrille; Fleishman, Sarel J.; De Mattos, Cecilia; Myers, Chris A.; Kamisetty, Hetunandan; Blair, Patrick; Wilson, Ian A.; Baker, David (UWASH); (Scripps); (NRL)

    2012-06-19

    We show that comprehensive sequence-function maps obtained by deep sequencing can be used to reprogram interaction specificity and to leapfrog over bottlenecks in affinity maturation by combining many individually small contributions not detectable in conventional approaches. We use this approach to optimize two computationally designed inhibitors against H1N1 influenza hemagglutinin and, in both cases, obtain variants with subnanomolar binding affinity. The most potent of these, a 51-residue protein, is broadly cross-reactive against all influenza group 1 hemagglutinins, including human H2, and neutralizes H1N1 viruses with a potency that rivals that of several human monoclonal antibodies, demonstrating that computational design followed by comprehensive energy landscape mapping can generate proteins with potential therapeutic utility.

  13. Cloning and sequencing of the silkworm (Bombyx mori) copper zinc-superoxide dismutase cDNA

    Institute of Scientific and Technical Information of China (English)

    唐云明; 鲁成; 向仲怀; 许禾声

    2003-01-01

    Total RNA of the silkworm (Bombyx mori) was obtained by caesium chloride density gradient centrifugation, and mRNA was converted to cDNA by reverse transcription. This cDNA was used as a template for two PCR amplifications with three primers: DP1, 5′-ATGGT (GT) GT (GT) AA (AG) GC (TC) GT-3′; DP2, 5′-ATGGT (GT) GT (GT) AA (AG) GC (TC) GT (GT) (CT) T-3′; and PA, 5′-GAGGACTCGAGCTCAAGC-3′. Northern blot hybridization of poly A+ mRNA extracted from silkworms with the amplified product detected a single mRNA transcript of approximately 500 bp, showing that the amplified product was derived from silkworm mRNA. Amplified products were separated by agarose gel electrophoresis and purified. The purified products were ligated into the pUC19-T or pUCm-T vector and transformed into E. coli DH5α competent cells. Recombinant colonies were screened with X-gal, yielding a clone of silkworm CuZn-SOD cDNA. DNA sequencing was done using the Sanger dideoxy method. Sequencing of the recombinant plasmid with a DNA sequencer showed that 591 bp had been cloned. Sequence analysis with DNAStar and DNAClub revealed that bp 1-3 at the 5′ end form the initiation codon ATG, that the first 459 bp from the 5′ end encode 153 amino acids, and that bp 460-462 form the termination codon TAA. The clone includes the degenerate primer (DP2) derived from the N-terminal amino acid sequence at the 5′ end, PA at the 3′ end, and a 73 bp poly A+ tail immediately upstream of PA at the 3′ end.
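
    The degenerate primer notation in this record, where each parenthesised pair such as (GT) marks a position that may hold either base, can be expanded into the concrete oligonucleotides it stands for. The sketch below is only an illustration: the function name `expand_degenerate` is hypothetical, and reading each parenthesised group as "one base chosen from this set" is an assumption about the notation rather than something the abstract states.

```python
import re
from itertools import product

def expand_degenerate(primer: str) -> list[str]:
    """Expand a primer written with parenthesised alternatives, e.g. 'AT(GT)C'.

    Assumption: each (...) group means exactly one base picked from the group.
    """
    compact = primer.replace(" ", "")
    # Each token is either a parenthesised choice group or a fixed run of bases.
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT]+)", compact)
    options = [list(choice) if choice else [fixed] for choice, fixed in tokens]
    return ["".join(parts) for parts in product(*options)]

# DP1 as printed in the abstract (5' to 3')
DP1 = "ATGGT (GT) GT (GT) AA (AG) GC (TC) GT"
variants = expand_degenerate(DP1)
print(len(variants))   # four two-base positions give 2**4 = 16 sequences
print(variants[0])
```

    Four degenerate positions with two options each give 16 distinct 17-mers, which is the pool of oligonucleotides a primer written this way would represent.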

  14. Pathogen-specific deep sequence-coupled biopanning: A method for surveying human antibody responses

    Science.gov (United States)

    Pascale, Juan M.; Moreno, Brechla; Chackerian, Bryce; Peabody, David S.

    2017-01-01

    Identifying the targets of antibody responses during infection is important for designing vaccines, developing diagnostic and prognostic tools, and understanding pathogenesis. We developed a novel deep sequence-coupled biopanning approach capable of identifying the protein epitopes of antibodies present in human polyclonal serum. Here, we report the adaptation of this approach for the identification of pathogen-specific epitopes recognized by antibodies elicited during acute infection. As a proof-of-principle, we applied this approach to assessing antibodies to Dengue virus (DENV). Using a panel of sera from patients with acute secondary DENV infection, we panned a DENV antigen fragment library displayed on the surface of bacteriophage MS2 virus-like particles and characterized the population of affinity-selected peptide epitopes by deep sequence analysis. Although there was considerable variation in the responses of individuals, we found several epitopes within the Envelope glycoprotein and Non-Structural Protein 1 that were commonly enriched. This report establishes a novel approach for characterizing pathogen-specific antibody responses in human sera, and has future utility in identifying novel diagnostic and vaccine targets. PMID:28152075

  15. Advanced methylome analysis after bisulfite deep sequencing: an example in Arabidopsis.

    Directory of Open Access Journals (Sweden)

    Huy Q Dinh

    Deep sequencing after bisulfite conversion (BS-Seq) is the method of choice to generate whole genome maps of cytosine methylation at single base-pair resolution. Its application to genomic DNA of Arabidopsis flower bud tissue resulted in the first complete methylome, determining a methylation rate of 6.7% in this tissue. BS-Seq reads were mapped onto an in silico converted reference genome, applying the so-called 3-letter genome method. Here, we present BiSS (Bisulfite Sequencing Scorer), a new method applying Smith-Waterman alignment to map bisulfite-converted reads to a reference genome. In addition, we introduce a comprehensive adaptive error estimate that accounts for sequencing errors, erroneous bisulfite conversion and also wrongly mapped reads. The re-analysis of the Arabidopsis methylome data with BiSS mapped substantially more reads to the genome. As a result, it determines the methylation status of an extra 10% of cytosines and estimates the methylation rate to be 7.7%. We validated the results by individual traditional bisulfite sequencing for selected genomic regions. In addition to predicting the methylation status of each cytosine, BiSS also provides an estimate of the methylation degree at each genomic site. Thus, BiSS explores BS-Seq data more extensively and provides more information for downstream analysis.
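
    The "3-letter genome method" mentioned in this record can be illustrated with a toy example. This is a minimal sketch of the earlier mapping idea, not of BiSS itself (which the abstract says uses Smith-Waterman alignment): both reference and read are converted C to T in silico, the read is placed by exact match, and methylation is then called at each reference cytosine. The function names, the forward-strand-only handling and the exact-match placement are all simplifying assumptions.

```python
def three_letter(seq: str) -> str:
    """In silico bisulfite conversion of the forward strand: every C becomes T."""
    return seq.upper().replace("C", "T")

def methylation_calls(reference: str, read: str, offset: int) -> list[tuple[int, bool]]:
    """For each reference C covered by the read, return (position, is_methylated).

    A read C at a reference C means the cytosine resisted conversion (methylated);
    a read T means it was converted (unmethylated).
    """
    calls = []
    for i, base in enumerate(read.upper()):
        pos = offset + i
        if reference[pos] == "C" and base in "CT":
            calls.append((pos, base == "C"))
    return calls

reference = "ACGTCCGATC"   # toy reference, cytosines at 0-based positions 1, 4, 5, 9
read = "ACGTTTGATT"        # toy bisulfite read: only the C at position 1 survived

# Place the read by exact match in 3-letter space (hypothetical stand-in for a real aligner).
offset = three_letter(reference).find(three_letter(read))
calls = methylation_calls(reference, read, offset)
rate = sum(methylated for _, methylated in calls) / len(calls)
print(offset, calls, rate)
```

    A genome-wide methylation rate such as the 6.7% or 7.7% quoted above is the same kind of ratio, aggregated over all covered cytosines and many reads.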

  16. Molecular Cloning and Sequence Analysis on cDNA of Cystatin Gene from Tea Leaves

    Institute of Scientific and Technical Information of China (English)

    王朝霞; 李叶云; 江昌俊; 余有本

    2005-01-01

    Two degenerate primers were designed according to the conserved region among the known plant cystatins. A cDNA fragment of 204 bp was amplified by RT-PCR (reverse transcription polymerase chain reaction) from total RNA extracted from fresh leaves of the tea plant (Camellia sinensis cv. Longjing43). A full-length cDNA of the cystatin gene was then obtained by 3'/5' RACE (rapid amplification of cDNA ends). The 627 bp cDNA sequence of this clone contained an open reading frame encoding a polypeptide of 101 amino acid residues with a predicted molecular mass of 11.026 kDa. The deduced amino acid sequence contained the QXVXG motif, which is conserved among most members of the cystatin superfamily and is related to their activity. A BLAST search of the GenBank database showed that the sequence closely matched cystatin genes from other plants, such as European chestnut, cassava, cowpea, tomato and soybean, with amino acid sequence homology of 54% to 77%. All sequences retrieved by the search were cystatins, so we conclude that the cloned sequence is a cystatin gene from the tea plant.

  17. MicroRNA expression signatures of bladder cancer revealed by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Yonghua Han

    BACKGROUND: MicroRNAs (miRNAs) are a class of small noncoding RNAs that regulate gene expression. They are aberrantly expressed in many types of cancers. In this study, we determined the genome-wide miRNA profiles in bladder urothelial carcinoma by deep sequencing. METHODOLOGY/PRINCIPAL FINDINGS: We detected 656 differentially expressed known human miRNAs and miRNA antisense sequences (miRNA*s) in nine bladder urothelial carcinoma patients by deep sequencing. Many miRNAs and miRNA*s were significantly upregulated or downregulated in bladder urothelial carcinoma compared to matched histologically normal urothelium. hsa-miR-96 was the most significantly upregulated miRNA and hsa-miR-490-5p was the most significantly downregulated one. Upregulated miRNAs were more common than downregulated ones. The hsa-miR-183, hsa-miR-200b ∼ 429, hsa-miR-200c ∼ 141 and hsa-miR-17 ∼ 92 clusters were significantly upregulated. The hsa-miR-143 ∼ 145 cluster was significantly downregulated. hsa-miR-182, hsa-miR-183, hsa-miR-200a, hsa-miR-143 and hsa-miR-195 were evaluated by real-time qPCR in a total of fifty-one bladder urothelial carcinoma patients. They were aberrantly expressed in bladder urothelial carcinoma compared to matched histologically normal urothelium (p < 0.001 for each miRNA). CONCLUSIONS/SIGNIFICANCE: To date, this is the first study to determine genome-wide miRNA expression patterns in human bladder urothelial carcinoma by deep sequencing. We found that a collection of miRNAs were aberrantly expressed in bladder urothelial carcinoma compared to matched histologically normal urothelium, suggesting that they might play roles as oncogenes or tumor suppressors in the development and/or progression of this cancer. Our data provide novel insights into cancer biology.

  18. Cloning and Sequence Analysis of Porcine Mx1 Gene cDNA

    Institute of Scientific and Technical Information of China (English)

    李想; 沈学文; 李文贵; 毕峻龙; 杨贵树; 尹革芬

    2011-01-01

    To obtain the cDNA sequence of the Mx1 gene of the Yunnan local pig, three pairs of primers were designed according to the reference sequences of the porcine Mx1 gene in GenBank. Fragments were amplified by RT-PCR from peripheral lymphocytes of Yunnan local pigs, and the amplified fragments were cloned, sequenced and assembled. The results showed that the Mx1 cDNA amplified from Yunnan local pigs was 2,546 bp long, with an open reading frame of 1,992 bp encoding 663 amino acids. Compared with Mx1 gene sequences of other pig breeds in GenBank, nucleotide sequence homology ranged from 99.4% to 99.8% and amino acid sequence homology from 98.8% to 99.5%. The cDNA sequence of the Mx1 gene of Yunnan local pigs was successfully obtained, which may facilitate study of the antiviral activity and mechanism of the Mx1 protein in Yunnan local pigs.

  19. Transcriptome and small RNA deep sequencing reveals deregulation of miRNA biogenesis in human glioma.

    Science.gov (United States)

    Moore, Lynette M; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N; Zhang, Wei; Nykter, Matti

    2013-02-01

    Altered expression of oncogenic and tumour-suppressing microRNAs (miRNAs) is widely associated with tumourigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumours. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and examined expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression.

  20. Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing

    Directory of Open Access Journals (Sweden)

    Claverie Jean-Michel

    2011-03-01

    Background: Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the Mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. Findings: We now report a much deeper analysis using the SOLiD™ technology, combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 million reads) and a complete genome re-sequencing (45.3 million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchaea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. Conclusions: This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.

  1. Mapping vaccinia virus DNA replication origins at nucleotide level by deep sequencing.

    Science.gov (United States)

    Senkevich, Tatiana G; Bruno, Daniel; Martens, Craig; Porcella, Stephen F; Wolf, Yuri I; Moss, Bernard

    2015-09-01

    Poxviruses reproduce in the host cytoplasm and encode most or all of the enzymes and factors needed for expression and synthesis of their double-stranded DNA genomes. Nevertheless, the mode of poxvirus DNA replication and the nature and location of the replication origins remain unknown. A current but unsubstantiated model posits only leading strand synthesis starting at a nick near one covalently closed end of the genome and continuing around the other end to generate a concatemer that is subsequently resolved into unit genomes. The existence of specific origins has been questioned because any plasmid can replicate in cells infected by vaccinia virus (VACV), the prototype poxvirus. We applied directional deep sequencing of short single-stranded DNA fragments enriched for RNA-primed nascent strands isolated from the cytoplasm of VACV-infected cells to pinpoint replication origins. The origins were identified as the switching points of the fragment directions, which correspond to the transition from continuous to discontinuous DNA synthesis. Origins containing a prominent initiation point mapped to a sequence within the hairpin loop at one end of the VACV genome and to the same sequence within the concatemeric junction of replication intermediates. These findings support a model for poxvirus genome replication that involves leading and lagging strand synthesis and is consistent with the requirements for primase and ligase activities as well as earlier electron microscopic and biochemical studies implicating a replication origin at the end of the VACV genome.

  2. Construction of Midgut Tissue-Specific cDNA Library of Bombyx mandarina M. and Isolation and Sequence Analysis of Serine Protease Gene Fragment

    Institute of Scientific and Technical Information of China (English)

    王燕红; 李兵; 王东; 朱莎; 赵华强; 卫正国; 沈卫德

    2008-01-01

    [Objective] The aim of the study was to construct a cDNA library of midgut tissue of the wild silkworm and isolate the serine protease gene. [Method] The midgut tissue-specific cDNA library of the wild silkworm was constructed with the cDNA Library Construction Kit (TaKaRa), and the serine protease gene was then cloned by sequencing clones from the resulting library. [Result] The titer of the cDNA library reached 6.2×10^5 pfu/ml, and the average insert size was about 1.2 kb. The serine protease gene cDNA fragment was obtained by colony sequencing (Accession No: EU672968). The nucleotide sequence of the cloned 854 bp fragment encodes 284 amino acid residues. Homology analyses showed some homology between the putative amino acid sequence of the cloned fragment and the amino acid sequences of serine proteases from ten other insects. [Conclusion] The results may help to reveal the resistance of the silkworm and the wild silkworm to exotic intrusion.

  3. First complete genome sequence of an emerging cucumber green mottle mosaic virus isolate in North America

    Science.gov (United States)

    The complete genome sequence (6,423 nt) of an emerging Cucumber green mottle mosaic virus (CGMMV) isolate on cucumber in North America was determined through deep sequencing of sRNA and rapid amplification of cDNA ends. It shares 99% nucleotide sequence identity to the Asian genotype, but only 90% t...

  4. Purification, cDNA cloning and sequence analysis of thrombin-like enzyme from Gloydius saxatilis

    Institute of Scientific and Technical Information of China (English)

    孙德军; 杨春伟; 杨同书; 颜炜群; 王伟

    2003-01-01

    Thrombin-like enzymes have important medical applications in treating thrombosis. A thrombin-like enzyme from Gloydius saxatilis snake venom was isolated and purified to homogeneity by a rapid and effective method using ion-exchange chromatography on DEAE-Sepharose and affinity chromatography on heparin-Sepharose. SDS-polyacrylamide gel electrophoresis under reducing conditions showed that the purified enzyme gave a single protein band with a molecular weight of 32,000 Da. Total RNA was extracted from the venom gland of G. saxatilis. Using degenerate primers, the cDNA of the thrombin-like enzyme gene was amplified by reverse transcription-polymerase chain reaction (RT-PCR). The cDNA fragment was inserted into the pGEM-T vector and cloned, and its nucleotide sequence was determined. Its open reading frame consists of 774 nucleotides and encodes a prezymogen of 258 amino acids, including a putative secretory signal peptide of 18 amino acids and a proposed propeptide of 6 amino acid residues. It contains 12 cysteine residues. Sequence analysis indicates that the deduced amino acid sequence shares high identity with the thrombin-like enzymes of other snakes in the gene bank: 88%, 88% and 86% with the serine protease of T. gramineus, thrombin-like defibrase Ⅰ of D. acutus and the serine protease catroxase Ⅱ of C. atrox, respectively. Based on the amino acid sequences of other thrombin-like enzymes, the catalytic residues and disulfide bridges of this enzyme are deduced as follows: catalytic residues His65, Asp110 and Ser204; and six disulfide bridges, Cys31-Cys163, Cys50-Cys66, Cys98-Cys256, Cys142-Cys210, Cys174-Cys189 and Cys200-Cys225. According to the N-linked glycosylation motifs N-X-T (Asn-X-Thr) or N-X-S (Asn-X-Ser), its possible glycosylation sites are N44-S45-T46 and N251-T252-T253 [Acta Zoologica Sinica 49(6):878-882, 2003].

  5. Advancing Eucalyptus genomics: identification and sequencing of lignin biosynthesis genes from deep-coverage BAC libraries

    Directory of Open Access Journals (Sweden)

    Kudrna David

    2011-03-01

    Background: Eucalyptus species are among the most planted hardwoods in the world because of their rapid growth, adaptability and valuable wood properties. The development and integration of genomic resources into breeding practice will be increasingly important in the decades to come. Bacterial artificial chromosome (BAC) libraries are key genomic tools that enable positional cloning of important traits, synteny evaluation, and the development of genome framework physical maps for genetic linkage and genome sequencing. Results: We describe the construction and characterization of two deep-coverage BAC libraries, EG_Ba and EG_Bb, obtained from nuclear DNA fragments of E. grandis (clone BRASUZ1) digested with HindIII and BstYI, respectively. Genome coverages of 17 and 15 haploid genome equivalents were estimated for EG_Ba and EG_Bb, respectively. Both libraries contained large inserts, with average sizes ranging from 135 Kb (EG_Bb) to 157 Kb (EG_Ba), and very low extra-nuclear genome contamination, providing a probability of finding a single-copy gene ≥ 99.99%. Libraries were screened for the presence of several genes of interest via hybridizations to high-density BAC filters followed by PCR validation. Five selected BAC clones were sequenced and assembled using the Roche GS FLX technology, providing the whole sequence of the E. grandis chloroplast genome and complete genomic sequences of important lignin biosynthesis genes. Conclusions: The two E. grandis BAC libraries described in this study represent an important milestone for the advancement of Eucalyptus genomics and forest tree research. These BAC resources have a highly redundant genome coverage (> 15×), contain large average inserts and have a very low percentage of clones with organellar DNA or empty vectors. These publicly available BAC libraries are thus suitable for a broad range of applications in genetic and genomic research in Eucalyptus and possibly in related species of Myrtaceae.
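
    The link between the genome coverages of 17 and 15 haploid equivalents and the quoted probability of at least 99.99% of finding a single-copy gene can be checked with a short calculation. The sketch below assumes clones sample the genome at random, so that the chance a given locus is missing from a library with coverage c is roughly e^(-c) (the usual Clarke-Carbon style approximation). The abstract does not say which formula the authors used, so this is an assumption, and the function names are hypothetical.

```python
import math

def prob_clone_present(coverage: float) -> float:
    """Probability that a given single-copy locus is in at least one clone,
    assuming random sampling: P = 1 - exp(-coverage)."""
    return 1.0 - math.exp(-coverage)

def coverage_needed(probability: float) -> float:
    """Coverage required to reach a target probability under the same assumption."""
    return -math.log(1.0 - probability)

for name, coverage in [("EG_Ba", 17), ("EG_Bb", 15)]:
    print(name, coverage, prob_clone_present(coverage))

# Coverage needed for a 99.99% chance of recovering a given locus
print(coverage_needed(0.9999))
```

    Under this model roughly 9.2 genome equivalents already give 99.99%, so both libraries clear that threshold with a wide margin.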

  6. Deep HST-WFPC2 photometry of NGC 288. II. The Main Sequence Luminosity Function

    CERN Document Server

    Bellazzini, M; Montegriffo, P; Messineo, M; Monaco, L; Rood, R T; Pecci, Flavio Fusi; Montegriffo, Paolo; Messineo, Maria

    2002-01-01

    The Main Sequence Luminosity Function (LF) of the Galactic globular cluster NGC 288 has been obtained using deep WFPC2 photometry. We have employed a new method to correct for completeness and fully account for bin-to-bin migration due to blending and/or observational scatter. The effect of the presence of binary systems in the final LF is quantified and is found to be negligible. There is a strong indication of the mass segregation of unevolved single stars and clear signs of a depletion of low mass stars in NGC 288 with respect to other clusters. The results are in good agreement with the prediction of theoretical models of the dynamical evolution of NGC 288 that take into account the extreme orbital properties of this cluster.

  7. Cloning and sequence analysis of inorganic pyrophosphatase gene from Citrus grandis var. shatianyu

    Institute of Scientific and Technical Information of China (English)

    秦新民; 万珊; 李惠敏; 覃屏生; 张渝

    2015-01-01

    related) and non-self (genetically unrelated) pollen. Inorganic pyrophosphatase (IPPase) plays important roles in regulating growth and development in plants. To better understand the role of the IPP gene in the self-incompatibility of Citrus grandis var. shatianyu, the inorganic pyrophosphatase gene of C. grandis var. shatianyu was cloned and the physicochemical properties of the pyrophosphatase were analyzed. Total RNA was isolated from the style of C. grandis var. shatianyu using the Total RNA Purification System (Invitrogen), following the manufacturer's protocols. Based on the EST sequence (an internal fragment of the inorganic pyrophosphatase gene) in suppression subtractive hybridization libraries of C. grandis var. shatianyu style, four specific primers (5′-GSP1, 5′-nGSP1, 3′-GSP2 and 3′-nGSP2) were designed for 3′-RACE and 5′-RACE amplification of the gene. The full-length cDNA sequence of the inorganic pyrophosphatase gene was obtained from the suppression subtractive hybridization libraries of C. grandis var. shatianyu style by the SMART-RACE PCR method. The full-length cDNA sequence was compared against the GenBank database using the BLAST program. DNAman software was used for amino acid sequence and homology analysis. Molecular weight, isoelectric point (pI) and hydrophobicity were predicted using the online software ExPASy and DNAman. A 150 bp band from 3′-RACE and a 900 bp band from 5′-RACE were cloned by nested and non-nested PCR. The middle sequence and the 3′-RACE and 5′-RACE sequences of the inorganic pyrophosphatase gene were spliced to form the full-length cDNA sequence. The resulting inorganic pyrophosphatase cDNA was 1,136 bp in length and contained a 654 bp open reading frame (ORF) encoding a protein of 217 amino acids, with a 170 bp 5′ untranslated region (5′ UTR) and a 321 bp 3′ UTR. The sequence of the cloned cDNA of the inorganic pyrophosphatase from C. grandis

  8. Complex Genotype Mixtures Analyzed by Deep Sequencing in Two Different Regions of Hepatitis B Virus.

    Directory of Open Access Journals (Sweden)

    Andrea Caballero

    This study assesses the presence and outcome of genotype mixtures in the polymerase/surface and X/preCore regions of the HBV genome in patients with chronic hepatitis B virus (HBV) infection. Thirty samples from ten chronic hepatitis B patients were included. The polymerase/surface and X/preCore regions were analyzed by deep sequencing (UDPS) in the first available sample at diagnosis, a pre-treatment sample, and a sample while under treatment. HBV genotype was determined by phylogenesis. Quasispecies complexity was evaluated by mutation frequency and nucleotide diversity. The polymerase/surface and X/preCore regions were validated for genotyping from 113 GenBank reference sequences. UDPS yielded a median of 10,960 sequences per sample (IQR 16,645) in the polymerase/surface region and 11,595 sequences per sample (IQR 14,682) in X/preCore. Genotype mixtures were more common in X/preCore (90%) than in polymerase/surface (30%) (p<0.001). On X/preCore genotyping, all samples were genotype A, whereas polymerase/surface yielded genotypes A (80%), D (16.7%), and F (3.3%) (p = 0.036). Genotype changes in polymerase/surface were observed in four patients during natural quasispecies dynamics and in two patients during treatment. There were no genotype changes in X/preCore. Quasispecies complexity was higher in X/preCore than in polymerase/surface (p = 0.004). The results provide evidence of genotype mixtures and differential genotype proportions in the polymerase/surface and X/preCore regions. The genotype dynamics in HBV infection and the different patterns of quasispecies complexity in the HBV genome suggest a new paradigm for HBV genotype classification.

  9. Complex Genotype Mixtures Analyzed by Deep Sequencing in Two Different Regions of Hepatitis B Virus.

    Science.gov (United States)

    Caballero, Andrea; Gregori, Josep; Homs, Maria; Tabernero, David; Gonzalez, Carolina; Quer, Josep; Blasi, Maria; Casillas, Rosario; Nieto, Leonardo; Riveiro-Barciela, Mar; Esteban, Rafael; Buti, Maria; Rodriguez-Frias, Francisco

    2015-01-01

    This study assesses the presence and outcome of genotype mixtures in the polymerase/surface and X/preCore regions of the HBV genome in patients with chronic hepatitis B virus (HBV) infection. Thirty samples from ten chronic hepatitis B patients were included. The polymerase/surface and X/preCore regions were analyzed by deep sequencing (UDPS) in the first available sample at diagnosis, a pre-treatment sample, and a sample while under treatment. HBV genotype was determined by phylogenesis. Quasispecies complexity was evaluated by mutation frequency and nucleotide diversity. The polymerase/surface and X/preCore regions were validated for genotyping from 113 GenBank reference sequences. UDPS yielded a median of 10,960 sequences per sample (IQR 16,645) in the polymerase/surface region and 11,595 sequences per sample (IQR 14,682) in X/preCore. Genotype mixtures were more common in X/preCore (90%) than in polymerase/surface (30%) (p<0.001). On X/preCore genotyping, all samples were genotype A, whereas polymerase/surface yielded genotypes A (80%), D (16.7%), and F (3.3%) (p = 0.036). Genotype changes in polymerase/surface were observed in four patients during natural quasispecies dynamics and in two patients during treatment. There were no genotype changes in X/preCore. Quasispecies complexity was higher in X/preCore than in polymerase/surface (p = 0.004). The results provide evidence of genotype mixtures and differential genotype proportions in the polymerase/surface and X/preCore regions. The genotype dynamics in HBV infection and the different patterns of quasispecies complexity in the HBV genome suggest a new paradigm for HBV genotype classification.
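
    The two quasispecies complexity measures named in this record, mutation frequency and nucleotide diversity, can be illustrated on a handful of aligned reads. The definitions used below are common ones and are an assumption here, since the abstract does not spell them out: mutation frequency is the fraction of bases that differ from the most abundant (master) sequence, and nucleotide diversity is the mean number of pairwise differences per site. The function names and toy reads are hypothetical.

```python
from collections import Counter
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of differing positions between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def mutation_frequency(reads: list[str]) -> float:
    """Fraction of all sequenced bases that differ from the most abundant read."""
    master = Counter(reads).most_common(1)[0][0]
    return sum(hamming(r, master) for r in reads) / (len(reads) * len(master))

def nucleotide_diversity(reads: list[str]) -> float:
    """Mean pairwise differences per site over all pairs of reads."""
    pairs = list(combinations(reads, 2))
    return sum(hamming(a, b) for a, b in pairs) / (len(pairs) * len(reads[0]))

# Toy aligned amplicon reads (hypothetical)
reads = ["ACGT", "ACGT", "ACGA", "TCGT"]
print(mutation_frequency(reads))     # 2 mismatches over 16 bases
print(nucleotide_diversity(reads))   # 6 differences over 6 pairs x 4 sites
```

    With thousands of reads per sample, as in this study, the pairwise loop would normally be replaced by a calculation over haplotype frequencies, but the quantities are the same.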

  10. Deep sequencing reveals low incidence of endogenous LINE-1 retrotransposition in human induced pluripotent stem cells.

    Directory of Open Access Journals (Sweden)

    Hubert Arokium

    Long interspersed element-1 (LINE-1 or L1) retrotransposition induces insertional mutations that can result in diseases. It was recently shown that the copy number of L1 and other retroelements is stable in induced pluripotent stem cells (iPSCs). However, by using an engineered reporter construct over-expressing L1, another study suggests that reprogramming activates L1 mobility in iPSCs. Given the potential of human iPSCs in therapeutic applications, it is important to clarify whether these cells harbor somatic insertions resulting from endogenous L1 retrotransposition. Here, we verified L1 expression during and after reprogramming as well as potential somatic insertions driven by the most active human endogenous L1 subfamily (L1Hs). Our results indicate that L1 over-expression is initiated during the reprogramming process and is subsequently sustained in isolated clones. To detect potential somatic insertions in iPSCs caused by L1Hs retrotransposition, we used a novel sequencing strategy. As opposed to the conventional sequencing direction, we sequenced from the 3' end of L1Hs to the genomic DNA, thus enabling the direct detection of the polyA tail signature of retrotransposition for verification of true insertions. Deep coverage sequencing thus allowed us to detect seven potential somatic insertions with low read counts from two iPSC clones. Negative PCR amplification in parental cells, presence of a polyA tail and absence from seven L1 germline insertion databases strongly suggested true somatic insertions in iPSCs. Furthermore, these insertions could not be detected in iPSCs by PCR, likely due to low abundance. We conclude that L1Hs retrotransposes at low levels in iPSCs and therefore warrants careful analyses for genotoxic effects.

  11. Deep RNA sequencing of the skeletal muscle transcriptome in swimming fish.

    Directory of Open Access Journals (Sweden)

    Arjan P Palstra

    Deep RNA sequencing (RNA-seq) was performed to provide an in-depth view of the transcriptome of red and white skeletal muscle of exercised and non-exercised rainbow trout (Oncorhynchus mykiss), with the specific objective to identify expressed genes and quantify the transcriptomic effects of swimming-induced exercise. Pubertal autumn-spawning seawater-raised female rainbow trout were rested (n = 10) or swum (n = 10) for 1176 km at 0.75 body-lengths per second in a 6,000-L swim-flume under reproductive conditions for 40 days. Red and white muscle RNA of exercised and non-exercised fish (4 lanes) was sequenced and resulted in 15-17 million reads per lane that, after de novo assembly, yielded 149,159 red and 118,572 white muscle contigs. Most contigs were annotated using an iterative homology search strategy against salmonid ESTs, the zebrafish Danio rerio genome and general Metazoan genes. When selecting for large contigs (>500 nucleotides), a number of novel rainbow trout gene sequences were identified in this study: 1,085 and 1,228 novel gene sequences for red and white muscle, respectively, which included a number of important molecules for skeletal muscle function. Transcriptomic analysis revealed that sustained swimming increased transcriptional activity in skeletal muscle and specifically an up-regulation of genes involved in muscle growth and developmental processes in white muscle. The unique collection of transcripts will contribute to our understanding of red and white muscle physiology, specifically during the long-term reproductive migration of salmonids.

  12. Dysregulation of B Cell Repertoire Formation in Myasthenia Gravis Patients Revealed through Deep Sequencing.

    Science.gov (United States)

    Vander Heiden, Jason A; Stathopoulos, Panos; Zhou, Julian Q; Chen, Luan; Gilbert, Tamara J; Bolen, Christopher R; Barohn, Richard J; Dimachkie, Mazen M; Ciafaloni, Emma; Broering, Teresa J; Vigneault, Francois; Nowak, Richard J; Kleinstein, Steven H; O'Connor, Kevin C

    2017-02-15

    Myasthenia gravis (MG) is a prototypical B cell-mediated autoimmune disease affecting 20-50 people per 100,000. The majority of patients fall into two clinically distinguishable types based on whether they produce autoantibodies targeting the acetylcholine receptor (AChR-MG) or muscle specific kinase (MuSK-MG). The autoantibodies are pathogenic, but whether their generation is associated with broader defects in the B cell repertoire is unknown. To address this question, we performed deep sequencing of the BCR repertoire of AChR-MG, MuSK-MG, and healthy subjects to generate ∼518,000 unique VH and VL sequences from sorted naive and memory B cell populations. AChR-MG and MuSK-MG subjects displayed distinct gene segment usage biases in both VH and VL sequences within the naive and memory compartments. The memory compartment of AChR-MG was further characterized by reduced positive selection of somatic mutations in the VH CDR and altered VH CDR3 physicochemical properties. The VL repertoire of MuSK-MG was specifically characterized by reduced V-J segment distance in recombined sequences, suggesting diminished VL receptor editing during B cell development. Our results identify large-scale abnormalities in both the naive and memory B cell repertoires. Particular abnormalities were unique to either AChR-MG or MuSK-MG, indicating that the repertoires reflect the distinct properties of the subtypes. These repertoire abnormalities are consistent with previously observed defects in B cell tolerance checkpoints in MG, thereby offering additional insight regarding the impact of tolerance defects on peripheral autoimmune repertoires. These collective findings point toward a deformed B cell repertoire as a fundamental component of MG.

  13. Deep sequencing reveals low incidence of endogenous LINE-1 retrotransposition in human induced pluripotent stem cells.

    Science.gov (United States)

    Arokium, Hubert; Kamata, Masakazu; Kim, Sanggu; Kim, Namshin; Liang, Min; Presson, Angela P; Chen, Irvin S

    2014-01-01

    Long interspersed element-1 (LINE-1 or L1) retrotransposition induces insertional mutations that can result in diseases. It was recently shown that the copy number of L1 and other retroelements is stable in induced pluripotent stem cells (iPSCs). However, by using an engineered reporter construct over-expressing L1, another study suggests that reprogramming activates L1 mobility in iPSCs. Given the potential of human iPSCs in therapeutic applications, it is important to clarify whether these cells harbor somatic insertions resulting from endogenous L1 retrotransposition. Here, we verified L1 expression during and after reprogramming as well as potential somatic insertions driven by the most active human endogenous L1 subfamily (L1Hs). Our results indicate that L1 over-expression is initiated during the reprogramming process and is subsequently sustained in isolated clones. To detect potential somatic insertions in iPSCs caused by L1Hs retrotransposition, we used a novel sequencing strategy. As opposed to conventional sequencing direction, we sequenced from the 3' end of L1Hs to the genomic DNA, thus enabling the direct detection of the polyA tail signature of retrotransposition for verification of true insertions. Deep coverage sequencing thus allowed us to detect seven potential somatic insertions with low read counts from two iPSC clones. Negative PCR amplification in parental cells, presence of a polyA tail and absence from seven L1 germline insertion databases highly suggested true somatic insertions in iPSCs. Furthermore, these insertions could not be detected in iPSCs by PCR, likely due to low abundance. We conclude that L1Hs retrotransposes at low levels in iPSCs and therefore warrants careful analyses for genotoxic effects.

  14. Small RNA Library Cloning Procedure for Deep Sequencing of Specific Endogenous siRNA Classes in Caenorhabditis elegans

    Science.gov (United States)

    Ow, Maria C.; Lau, Nelson C.; Hall, Sarah E.

    2017-01-01

    In recent years, distinct classes of small RNAs ranging in size from ~21 to 26 nucleotides have been discovered and shown to play important roles in a wide array of cellular functions. Because of the abundance of these small RNAs, library preparation from an RNA sample followed by deep sequencing provides the identity and quantity of a particular class of small RNAs. In this chapter we describe a detailed protocol for preparing small RNA libraries for deep sequencing on the Illumina platform from the nematode C. elegans. PMID:24920360

  15. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses from clinical and biological samples.

    Science.gov (United States)

    Matranga, Christian B; Andersen, Kristian G; Winnicki, Sarah; Busby, Michele; Gladden, Adrianne D; Tewhey, Ryan; Stremlau, Matthew; Berlin, Aaron; Gire, Stephen K; England, Eleina; Moses, Lina M; Mikkelsen, Tarjei S; Odia, Ikponmwonsa; Ehiane, Philomena E; Folarin, Onikepe; Goba, Augustine; Kahn, S Humarr; Grant, Donald S; Honko, Anna; Hensley, Lisa; Happi, Christian; Garry, Robert F; Malboeuf, Christine M; Birren, Bruce W; Gnirke, Andreas; Levin, Joshua Z; Sabeti, Pardis C

    2014-01-01

    We have developed a robust RNA sequencing method for generating complete de novo assemblies with intra-host variant calls of Lassa and Ebola virus genomes in clinical and biological samples. Our method uses targeted RNase H-based digestion to remove contaminating poly(rA) carrier and ribosomal RNA. This depletion step improves both the quality of data and quantity of informative reads in unbiased total RNA sequencing libraries. We have also developed a hybrid-selection protocol to further enrich the viral content of sequencing libraries. These protocols have enabled rapid deep sequencing of both Lassa and Ebola virus and are broadly applicable to other viral genomics studies.

  16. Improved sequence learning with subthalamic nucleus deep brain stimulation: evidence for treatment-specific network modulation.

    Science.gov (United States)

    Mure, Hideo; Tang, Chris C; Argyelan, Miklos; Ghilardi, Maria-Felice; Kaplitt, Michael G; Dhawan, Vijay; Eidelberg, David

    2012-02-22

    We used a network approach to study the effects of anti-parkinsonian treatment on motor sequence learning in humans. Eight Parkinson's disease (PD) patients with bilateral subthalamic nucleus (STN) deep brain stimulation underwent H₂¹⁵O positron emission tomography (PET) imaging to measure regional cerebral blood flow (rCBF) while they performed kinematically matched sequence learning and movement tasks at baseline and during stimulation. Network analysis revealed a significant learning-related spatial covariance pattern characterized by consistent increases in subject expression during stimulation (p = 0.008, permutation test). The network was associated with increased activity in the lateral cerebellum, dorsal premotor cortex, and parahippocampal gyrus, with covarying reductions in the supplementary motor area (SMA) and orbitofrontal cortex. Stimulation-mediated increases in network activity correlated with concurrent improvement in learning performance (p learning performance or network activity. Analysis of learning-related rCBF in network regions revealed improvement in baseline abnormalities with STN stimulation but not levodopa. These effects were most pronounced in the SMA. In this region, a consistent rCBF response to stimulation was observed across subjects and trials (p = 0.01), although the levodopa response was not significant. These findings link the cognitive treatment response in PD to changes in the activity of a specific cerebello-premotor cortical network. Selective modulation of overactive SMA-STN projection pathways may underlie the improvement in learning found with stimulation.

  17. Deep sequencing of Trichomonas vaginalis during the early infection of vaginal epithelial cells and amoeboid transition.

    Science.gov (United States)

    Gould, Sven B; Woehle, Christian; Kusdian, Gary; Landan, Giddy; Tachezy, Jan; Zimorski, Verena; Martin, William F

    2013-08-01

    The human pathogen Trichomonas vaginalis has the largest protozoan genome known, potentially encoding approximately 60,000 proteins. To what degree these genes are expressed is not well known and only a few key transcription factors and promoter domains have been identified. To shed light on the expression capacity of the parasite and transcriptional regulation during phase transitions, we deep sequenced the transcriptomes of the protozoan during two environmental stimuli of the early infection process: exposure to oxygen and contact with vaginal epithelial cells. Eleven 3' fragment libraries from different time points after exposure to oxygen only and in combination with human tissue were sequenced, generating more than 150 million reads which mapped onto 33,157 protein coding genes in total and a core set of more than 20,000 genes represented within all libraries. The data uncover gene family expression regulation in this parasite and give evidence for a concentrated response to the individual stimuli. Oxygen stress primarily reveals the parasite's strategies to deal with oxygen radicals. The exposure of oxygen-adapted parasites to human epithelial cells primarily induces cytoskeletal rearrangement and proliferation, reflecting the rapid morphological transition from spindle shaped flagellates to tissue-feeding and actively dividing amoeboids.

  18. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.
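
    The coverage and frequency figures quoted in this abstract can be related with a back-of-envelope calculation. The sketch below is only an illustration, not the authors' method: it assumes coverage is uniform along the genome and ignores sequencing error, and the function names are hypothetical.

```python
# Back-of-envelope check of the figures quoted in the abstract.
# Assumptions (not from the source): uniform coverage, no sequencing error.

AVERAGE_COVERAGE = 124_221        # average genome coverage (X), from the abstract
CONSENSUS_LENGTH_BP = 133_926     # consensus length in bp, from the abstract
MIN_VARIANT_FREQUENCY = 0.025 / 100  # 0.025 %, from the abstract


def expected_variant_reads(coverage: float, frequency: float) -> float:
    """Expected number of reads carrying a variant at one position."""
    return coverage * frequency


def total_sequenced_bases(coverage: int, genome_length_bp: int) -> int:
    """Approximate number of aligned bases implied by the average coverage."""
    return coverage * genome_length_bp


supporting_reads = expected_variant_reads(AVERAGE_COVERAGE, MIN_VARIANT_FREQUENCY)
aligned_bases = total_sequenced_bases(AVERAGE_COVERAGE, CONSENSUS_LENGTH_BP)

print(f"Expected reads supporting a 0.025% variant: {supporting_reads:.1f}")
print(f"Approximate aligned bases: {aligned_bases / 1e9:.1f} Gb")
```

    Under these assumptions a variant at the quoted 0.025% frequency would be supported by roughly thirty reads at an average position, which gives a sense of why such depth is needed to call low-frequency mutations.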

  19. Deep RNA sequencing analysis of readthrough gene fusions in human prostate adenocarcinoma and reference samples

    Directory of Open Access Journals (Sweden)

    Nacu Serban

    2011-01-01

    Full Text Available Abstract Background Readthrough fusions across adjacent genes in the genome, or transcription-induced chimeras (TICs), have been estimated using expressed sequence tag (EST) libraries to involve 4-6% of all genes. Deep transcriptional sequencing (RNA-Seq) now makes it possible to study the occurrence and expression levels of TICs in individual samples across the genome. Methods We performed single-end RNA-Seq on three human prostate adenocarcinoma samples and their corresponding normal tissues, as well as brain and universal reference samples. We developed two bioinformatics methods to specifically identify TIC events: a targeted alignment method using artificial exon-exon junctions within 200,000 bp from adjacent genes, and genomic alignment allowing splicing within individual reads. We performed further experimental verification and characterization of selected TIC and fusion events using quantitative RT-PCR and comparative genomic hybridization microarrays. Results Targeted alignment against artificial exon-exon junctions yielded 339 distinct TIC events, including 32 gene pairs with multiple isoforms. The false discovery rate was estimated to be 1.5%. Spliced alignment to the genome was less sensitive, finding only 18% of those found by targeted alignment in 33-nt reads and 59% of those in 50-nt reads. However, spliced alignment revealed 30 cases of TICs with intervening exons, in addition to distant inversions, scrambled genes, and translocations. Our findings increase the catalog of observed TIC gene pairs by 66%. We verified 6 of 6 predicted TICs in all prostate samples, and 2 of 5 predicted novel distant gene fusions, both private events among 54 prostate tumor samples tested. Expression of TICs correlates with that of the upstream gene, which can explain the prostate-specific pattern of some TIC events and the restriction of the SLC45A3-ELK4 e4-e2 TIC to ERG-negative prostate samples, as confirmed in 20 matched prostate tumor and normal

  20. cDNA sequences and mRNA levels of two hexamerin storage proteins PinSP1 and PinSP2 from the Indianmeal moth, Plodia interpunctella.

    Science.gov (United States)

    Zhu, Yu Cheng; Muthukrishnan, Subbaratnam; Kramer, Karl J

    2002-05-01

    In insects, storage proteins or hexamerins accumulate apparently to serve as sources of amino acids during metamorphosis and reproduction. Two storage protein-like cDNAs obtained from a cDNA library prepared from fourth instar larvae of the Indianmeal moth (Plodia interpunctella) were cloned and sequenced. The first clone, PinSP1, contained 2431 nucleotides with a 2295 nucleotide open reading frame (ORF) encoding a protein with 765 amino acid residues. The second cDNA, PinSP2, consisted of 2336 nucleotides with a 2250-nucleotide ORF encoding a protein with 750 amino acid residues. PinSP1 and PinSP2 shared 59% nucleotide sequence identity and 44% deduced amino acid sequence identity. A 17-amino acid signal peptide and a molecular mass of 90.4 kDa were predicted for the PinSP1 protein, whereas a 15-amino acid signal peptide and a mass of 88 kDa were predicted for PinSP2. Both proteins contained conserved insect larval storage protein signature sequence patterns and were 60-70% identical to other lepidopteran larval storage proteins. Expression of mRNA for both larval storage proteins was determined using the quantitative reverse transcription polymerase chain reaction method. Only very low levels were present in the second instar, but both mRNAs dramatically increased during the third instar, peaked in the fourth instar, decreased dramatically late in the same instar and pupal stages, and were undetectable during the adult stage. Males and females exhibited similar mRNA expression levels for both storage proteins during the pupal and adult stages. The results support the hypothesis that P. interpunctella, a species that does not feed after the larval stage, accumulates these two storage proteins as reserves during larval development for subsequent use in the pupal and adult stages.

  1. Isolation of Rice EPSP Synthase cDNA and Its Sequence Analysis and Copy Number Determination

    Institute of Scientific and Technical Information of China (English)

    徐军望; 魏晓丽; 李旭刚; 陈蕾; 冯德江; 朱祯

    2002-01-01

    To isolate the full-length cDNA of the rice (Oryza sativa L. subsp. indica) epsps gene, RT-PCR was carried out using first-strand cDNA from rice leaf RNA as template and primers designed from the rice EPSP synthase genomic sequence obtained in a previous study. A 1,585-bp cDNA fragment was amplified and cloned. It contains an open reading frame (ORF) of 1,533 nucleotides (nt) that encodes a 511-residue polypeptide, comprising a 67-amino-acid chloroplast transit peptide and a 444-amino-acid EPSP synthase mature peptide. A comparison of EPSP synthases from different sources indicates that the mature peptides share more than 51% identity, except for the fungal EPSP synthases, whereas the transit peptides show considerably less sequence conservation. Southern blotting indicates that the rice epsps gene is present as a single copy per haploid rice genome. RT-PCR indicated that the rice epsps gene is expressed in rice leaves, endosperms and roots, with the highest expression level in leaves.
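
    The lengths reported in this abstract are internally consistent, which can be checked with simple arithmetic. The sketch below is only an illustration; it assumes that the quoted ORF length does not count the stop codon (an inference from the numbers, not something the abstract states), and the function name is hypothetical.

```python
# Consistency check of the lengths quoted in the abstract.
# Assumption (not stated in the source): the 1,533-nt ORF excludes the stop codon.

CDNA_NT = 1585             # cloned cDNA fragment length, from the abstract
ORF_NT = 1533              # open reading frame length, from the abstract
TRANSIT_PEPTIDE_AA = 67    # chloroplast transit peptide, from the abstract
MATURE_PEPTIDE_AA = 444    # mature EPSP synthase, from the abstract


def codons_in_orf(orf_nt: int) -> int:
    """Return the number of complete codons in an ORF of the given length."""
    if orf_nt % 3 != 0:
        raise ValueError(f"ORF length {orf_nt} is not a multiple of 3")
    return orf_nt // 3


encoded_residues = codons_in_orf(ORF_NT)
precursor_residues = TRANSIT_PEPTIDE_AA + MATURE_PEPTIDE_AA
non_orf_nt = CDNA_NT - ORF_NT  # untranslated sequence (plus the stop codon, if uncounted)

print(f"Residues encoded by the ORF: {encoded_residues}")
print(f"Transit + mature peptide:    {precursor_residues}")
print(f"Nucleotides outside the ORF: {non_orf_nt}")
```

    Both routes give the same precursor length, so the transit peptide and mature peptide together account for the whole ORF.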

  2. Ultra-deep sequencing of mouse mitochondrial DNA: mutational patterns and their origins.

    Directory of Open Access Journals (Sweden)

    Adam Ameur

    2011-03-01

    Full Text Available Somatic mutations of mtDNA are implicated in the aging process, but there is no universally accepted method for their accurate quantification. We have used ultra-deep sequencing to study genome-wide mtDNA mutation load in the liver of normally- and prematurely-aging mice. Mice that are homozygous for an allele expressing a proof-reading-deficient mtDNA polymerase (mtDNA mutator mice) have 10-times-higher point mutation loads than their wildtype siblings. In addition, the mtDNA mutator mice have increased levels of a truncated linear mtDNA molecule, resulting in decreased sequence coverage in the deleted region. In contrast, circular mtDNA molecules with large deletions occur at extremely low frequencies in mtDNA mutator mice and can therefore not drive the premature aging phenotype. Sequence analysis shows that the main proportion of the mutation load in heterozygous mtDNA mutator mice and their wildtype siblings is inherited from their heterozygous mothers consistent with germline transmission. We found no increase in levels of point mutations or deletions in wildtype C57Bl/6N mice with increasing age, thus questioning the causative role of these changes in aging. In addition, there was no increased frequency of transversion mutations with time in any of the studied genotypes, arguing against oxidative damage as a major cause of mtDNA mutations. Our results from studies of mice thus indicate that most somatic mtDNA mutations occur as replication errors during development and do not result from damage accumulation in adult life.

  3. Reconstructing the dynamics of HIV evolution within hosts from serial deep sequence data.

    Directory of Open Access Journals (Sweden)

    Art F Y Poon

    Full Text Available At the early stage of infection, human immunodeficiency virus (HIV)-1 predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor in roughly half of all infections is associated with an accelerated decline of CD4+ T-cells and rate of progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 (per subject range 4 to 30) months before the earliest predicted CXCR4-using ancestor, which preceded the first positive cell-based assay of CXCR4 usage by 10 (range 5 to 25) months. CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages following alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.

  4. Reconstructing the dynamics of HIV evolution within hosts from serial deep sequence data.

    Science.gov (United States)

    Poon, Art F Y; Swenson, Luke C; Bunnik, Evelien M; Edo-Matas, Diana; Schuitemaker, Hanneke; van 't Wout, Angélique B; Harrigan, P Richard

    2012-01-01

    At the early stage of infection, human immunodeficiency virus (HIV)-1 predominantly uses the CCR5 coreceptor for host cell entry. The subsequent emergence of HIV variants that use the CXCR4 coreceptor in roughly half of all infections is associated with an accelerated decline of CD4+ T-cells and rate of progression to AIDS. The presence of a 'fitness valley' separating CCR5- and CXCR4-using genotypes is postulated to be a biological determinant of whether the HIV coreceptor switch occurs. Using phylogenetic methods to reconstruct the evolutionary dynamics of HIV within hosts enables us to discriminate between competing models of this process. We have developed a phylogenetic pipeline for the molecular clock analysis, ancestral reconstruction, and visualization of deep sequence data. These data were generated by next-generation sequencing of HIV RNA extracted from longitudinal serum samples (median 7 time points) from 8 untreated subjects with chronic HIV infections (Amsterdam Cohort Studies on HIV-1 infection and AIDS). We used the known dates of sampling to directly estimate rates of evolution and to map ancestral mutations to a reconstructed timeline in units of days. HIV coreceptor usage was predicted from reconstructed ancestral sequences using the geno2pheno algorithm. We determined that the first mutations contributing to CXCR4 use emerged about 16 (per subject range 4 to 30) months before the earliest predicted CXCR4-using ancestor, which preceded the first positive cell-based assay of CXCR4 usage by 10 (range 5 to 25) months. CXCR4 usage arose in multiple lineages within 5 of 8 subjects, and ancestral lineages following alternate mutational pathways before going extinct were common. We observed highly patient-specific distributions and time-scales of mutation accumulation, implying that the role of a fitness valley is contingent on the genotype of the transmitted variant.

  5. Nucleotide sequence of a tobacco cDNA encoding plastidic glutamine synthetase and light inducibility, organ specificity and diurnal rhythmicity in the expression of the corresponding genes of tobacco and tomato.

    Science.gov (United States)

    Becker, T W; Caboche, M; Carrayol, E; Hirel, B

    1992-06-01

    A full-length cDNA encoding glutamine synthetase (GS) was cloned from a lambda gt10 library of tobacco leaf RNA, and the nucleotide sequence was determined. An open reading frame accounting for a primary translation product consisting of 432 amino acids has been localized on the cDNA. The calculated molecular mass of the encoded protein is 47.2 kDa. The predicted amino acid sequence of this precursor shows higher homology to GS-2 protein sequences from other species than to a leaf GS-1 polypeptide sequence, indicating that the cDNA isolated encodes the chloroplastic isoform (GS-2) of tobacco GS. The presence of C- and N-terminal extensions which are characteristic of GS-2 proteins supports this conclusion. Genomic Southern blot analysis indicated that GS-2 is encoded by a single gene in the diploid genomes of both tomato and Nicotiana sylvestris, while two GS-2 genes are very likely present in the amphidiploid tobacco genome. Western blot analysis indicated that in etiolated and in green tomato cotyledons GS-2 subunits are represented by polypeptides of similar size, while in green tomato leaves an additional GS-2 polypeptide of higher apparent molecular weight is detectable. In contrast, tobacco GS-2 is composed of subunits of identical size in all organs examined. GS-2 transcripts and GS-2 proteins could be detected at high levels in the leaves of both tobacco and tomato. Lower amounts of GS-2 mRNA were detected in stems, corolla, and roots of tomato, but not in non-green organs of tobacco. The GS-2 transcript abundance exhibited a diurnal fluctuation in tomato leaves but not in tobacco leaves. White or red light stimulated the accumulation of GS-2 transcripts and GS-2 protein in etiolated tomato cotyledons. Far-red light cancelled this stimulation. The red light response of the GS-2 gene was reduced in etiolated seedlings of the phytochrome-deficient aurea mutant of tomato. These results indicate a phytochrome-mediated light stimulation of GS-2 gene expression

  6. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    Science.gov (United States)

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  7. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    OpenAIRE

    Sooyeon Lim; Dong-Ho Chang; Byoung-Chan Kim

    2016-01-01

    Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  8. Whole-genome sequence of Sunxiuqinia dokdonensis DH1T, isolated from deep sub-seafloor sediment in Dokdo Island

    Directory of Open Access Journals (Sweden)

    Sooyeon Lim

    2016-09-01

    Full Text Available Sunxiuqinia dokdonensis DH1T was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  9. Screening Target Specificity of siRNAs by Rapid Amplification of cDNA Ends (RACE) for Non-Sequenced Species

    OpenAIRE

    Sabirzhanov, Boris; Sabirzhanova, Inna B.; Keifer, Joyce

    2011-01-01

    RNA interference (RNAi) is the process of sequence-specific posttranscriptional gene silencing triggered by double-stranded RNAs (dsRNAs). RNAi is a widely used approach for studying gene function. However, studies have shown that using siRNA can lead to off-target effects when the siRNA contains sufficient sequence identity to non-target mRNA sequences. One of the important steps in designing dsRNA is verification that it has sequence identity to only the target mRNA. In this report, we propos...

  10. Generation and analysis of a large-scale expressed sequence Tag database from a full-length enriched cDNA library of developing leaves of Gossypium hirsutum L.

    Directory of Open Access Journals (Sweden)

    Min Lin

    Full Text Available BACKGROUND: Cotton (Gossypium hirsutum L.) is one of the world's most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. METHODOLOGY/PRINCIPAL FINDINGS: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. CONCLUSIONS/SIGNIFICANCE: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence

  11. Isolation, cDNA sequence analysis and tissue expression profile of a novel swine gene differentially expressed in the Longissimus dorsi muscle tissues from Large White × Meishan cross combination

    Institute of Scientific and Technical Information of China (English)

    LIU Yonggang; LEI Minggang; XIONG Yuanzhu; DENG Changyan

    2005-01-01

    In order to study the molecular mechanism of heterosis in pigs, the mRNA differential display technique was performed to investigate the differences in gene expression in the Longissimus dorsi muscle tissues from Large White × Meishan cross combination. One novel gene differentially expressed between the hybrids and the purebreds was isolated and subsequently identified using semi-quantitative reverse transcriptase polymerase chain reaction (RT-PCR) and its complete cDNA sequence was obtained using the rapid amplification of cDNA ends (RACE) method. The nucleotide sequence of the gene is not homologous to any of the known porcine genes. The sequence prediction revealed that the open reading frame of this gene encodes a protein of 188 amino acids that contains the putative conserved domain of the PRA1 family protein and this protein has high homology with the PRA1 family protein 3 of three species: rat (88%), human (88%), and mouse (87%), so that it can be defined as swine PRA1 family protein 3. The phylogenetic tree analysis revealed that the swine PRA1 family protein 3 has a closer genetic relationship with the human PRA1 family protein 3 than with those of mouse and rat. The tissue expression analysis indicated that swine PRA1 family protein 3 gene is highly expressed in muscle and fat, moderately in spleen, weakly in heart, kidney, ovary, lung, and almost not expressed in small intestine and liver. The function of this gene and the relationship between this gene and heterosis are also discussed.

  12. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-01-01

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations. PMID:27999334

  13. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene.

    Science.gov (United States)

    Cunha, Karin Soares; Oliveira, Nathalia Silva; Fausto, Anna Karoline; de Souza, Carolina Cruz; Gros, Audrey; Bandres, Thomas; Idrissi, Yamina; Merlio, Jean-Philippe; de Moura Neto, Rodrigo Soares; Silva, Rosane; Geller, Mauro; Cappellen, David

    2016-12-17

    Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  14. Hybridization Capture-Based Next-Generation Sequencing to Evaluate Coding Sequence and Deep Intronic Mutations in the NF1 Gene

    Directory of Open Access Journals (Sweden)

    Karin Soares Cunha

    2016-12-01

    Full Text Available Neurofibromatosis 1 (NF1) is one of the most common genetic disorders and is caused by mutations in the NF1 gene. NF1 gene mutational analysis presents a considerable challenge because of its large size, existence of highly homologous pseudogenes located throughout the human genome, absence of mutational hotspots, and diversity of mutations types, including deep intronic splicing mutations. We aimed to evaluate the use of hybridization capture-based next-generation sequencing to screen coding and noncoding NF1 regions. Hybridization capture-based next-generation sequencing, with genomic DNA as starting material, was used to sequence the whole NF1 gene (exons and introns) from 11 unrelated individuals and 1 relative, who all had NF1. All of them met the NF1 clinical diagnostic criteria. We showed a mutation detection rate of 91% (10 out of 11). We identified eight recurrent and two novel mutations, which were all confirmed by Sanger methodology. In the Sanger sequencing confirmation, we also included another three relatives with NF1. Splicing alterations accounted for 50% of the mutations. One of them was caused by a deep intronic mutation (c.1260 + 1604A > G). Frameshift truncation and missense mutations corresponded to 30% and 20% of the pathogenic variants, respectively. In conclusion, we show the use of a simple and fast approach to screen, at once, the entire NF1 gene (exons and introns) for different types of pathogenic variations, including the deep intronic splicing mutations.

  15. Stress responses in alfalfa (Medicago sativa L.) 12. Sequence analysis of phenylalanine ammonia-lyase (PAL) cDNA clones and appearance of PAL transcripts in elicitor-treated cell cultures and developing plants.

    Science.gov (United States)

    Gowri, G; Paiva, N L; Dixon, R A

    1991-09-01

    An expression library containing cDNAs derived from transcripts from fungal elicitor-treated alfalfa cell suspension cultures was screened with an antiserum raised against phenylalanine ammonia-lyase (PAL) from alfalfa. A single immunoreactive clone was isolated which encoded a full-length PAL cDNA (APAL1) consisting of a 2175 bp open reading frame, 96 bp 5'-untranslated leader and 128 bp 3'-non-coding region. The deduced amino acid sequence was 86.5% similar to that of the PAL2 gene of bean, and encoded a polypeptide of Mr 78,865. A second PAL cDNA species was isolated, whose 3'-untranslated region was 86% identical to that of APAL1. Southern blot analysis indicated that PAL is encoded by a small multigene family in alfalfa. PAL transcript levels were rapidly and massively induced, and preceded increased PAL extractable activity, on exposure of alfalfa suspension cells to elicitor from baker's yeast. PAL transcripts were most abundant in roots, stems and petioles during growth and development of alfalfa seedlings. These studies provide the basis for an examination of the developmental and environmental control of a key enzyme of phenylpropanoid synthesis in a plant species which is readily amenable to stable genetic transformation.

  16. DeepSNVMiner: a sequence analysis tool to detect emergent, rare mutations in subsets of cell populations

    Directory of Open Access Journals (Sweden)

    T. Daniel Andrews

    2016-05-01

    Full Text Available Background. Massively parallel sequencing technology is being used to sequence highly diverse populations of DNA such as that derived from heterogeneous cell mixtures containing both wild-type and disease-related states. At the core of such molecule tagging techniques is the tagging and identification of sequence reads derived from individual input DNA molecules, which must be first computationally disambiguated to generate read groups sharing common sequence tags, with each read group representing a single input DNA molecule. This disambiguation typically generates huge numbers of read groups, each of which requires additional variant detection analysis steps to be run specific to each read group, thus representing a significant computational challenge. While sequencing technologies for producing these data are approaching maturity, the lack of available computational tools for analysing such heterogeneous sequence data represents an obstacle to the widespread adoption of this technology. Results. Using synthetic data we successfully detect unique variants at dilution levels of 1 in 1,000,000 molecules, and find DeepSNVMiner obtains significantly lower false positive and false negative rates compared to popular variant callers GATK, SAMTools, FreeBayes and LoFreq, particularly as the variant concentration levels decrease. In a dilution series with genomic DNA from two cell lines, we find DeepSNVMiner identifies a known somatic variant when present at concentrations of only 1 in 1,000 molecules in the input material, the lowest concentration amongst all variant callers tested. Conclusions. Here we present DeepSNVMiner, a tool to disambiguate tagged sequence groups and robustly identify sequence variants specific to subsets of starting DNA molecules that may indicate the presence of a disease. DeepSNVMiner is an automated workflow of custom sequence analysis utilities and open source tools able to differentiate somatic DNA variants from
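
    The read-group idea in the abstract above can be illustrated with a small sketch. This is not DeepSNVMiner's code or interface: the function names, the `(tag, sequence)` input shape, the equal-length pre-aligned reads, and the `min_group_size` cutoff are all assumptions made for illustration. It groups reads by molecular tag, builds a per-group majority consensus, and reports positions where that consensus differs from a reference.

```python
from collections import Counter, defaultdict


def group_reads_by_tag(reads):
    """Group (tag, sequence) pairs into read groups keyed by tag (hypothetical input shape)."""
    groups = defaultdict(list)
    for tag, seq in reads:
        groups[tag].append(seq)
    return dict(groups)


def consensus(seqs):
    """Majority base at each position; assumes equal-length, already aligned reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*seqs))


def variants_per_group(reads, reference, min_group_size=3):
    """Return {tag: [(position, ref_base, alt_base), ...]} for groups whose consensus differs.

    min_group_size is an assumed cutoff: groups with fewer reads are skipped
    because a consensus over very few reads is unreliable.
    """
    calls = {}
    for tag, seqs in group_reads_by_tag(reads).items():
        if len(seqs) < min_group_size:
            continue
        cons = consensus(seqs)
        diffs = [(i, r, c) for i, (r, c) in enumerate(zip(reference, cons)) if r != c]
        if diffs:
            calls[tag] = diffs
    return calls
```

    Requiring a consensus within each tag group is what lets a sequencing error in a single read be outvoted, while a variant present in the original molecule appears in most reads of its group.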

  17. Main: Sequences [KOME

    Lifescience Database Archive (English)

    Full Text Available Sequences Nucleotide Sequence Nucleotide sequence of full length cDNA (trimmed sequence) kome_ine_full_seq...uence_db.fasta.zip kome_ine_full_sequence_db.zip kome_ine_full_sequence_db ...

  18. Targeted deep sequencing improves outcome stratification in chronic myelomonocytic leukemia with low risk cytogenetic features

    Science.gov (United States)

    Palomo, Laura; Garcia, Olga; Arnan, Montse; Xicoy, Blanca; Fuster, Francisco; Cabezón, Marta; Coll, Rosa; Ademà, Vera; Grau, Javier; Jiménez, Maria-José; Pomares, Helena; Marcé, Sílvia; Mallo, Mar; Millá, Fuensanta; Alonso, Esther; Sureda, Anna; Gallardo, David; Feliu, Evarist; Ribera, Josep-Maria; Solé, Francesc; Zamora, Lurdes

    2016-01-01

    Clonal cytogenetic abnormalities are found in 20-30% of patients with chronic myelomonocytic leukemia (CMML), while gene mutations are present in >90% of cases. Patients with low risk cytogenetic features account for 80% of CMML cases and often fall into the low risk categories of CMML prognostic scoring systems, but the outcome differs considerably among them. We performed targeted deep sequencing of 83 myeloid-related genes in 56 CMML patients with low risk cytogenetic features or uninformative conventional cytogenetics (CC) at diagnosis, with the aim to identify the genetic characteristics of patients with a more aggressive disease. Targeted sequencing was also performed in a subset of these patients at time of acute myeloid leukemia (AML) transformation. Overall, 98% of patients harbored at least one mutation. Mutations in cell signaling genes were acquired at time of AML progression. Mutations in ASXL1, EZH2 and NRAS correlated with higher risk features and shorter overall survival (OS) and progression free survival (PFS). Patients with SRSF2 mutations associated with poorer OS, while absence of TET2 mutations (TET2wt) was predictive of shorter PFS. A decrease in OS and PFS was observed as the number of adverse risk gene mutations (ASXL1, EZH2, NRAS and SRSF2) increased. On multivariate analyses, CMML-specific scoring system (CPSS) and presence of adverse risk gene mutations remained significant for OS, while CPSS and TET2wt were predictive of PFS. These results confirm that mutation analysis can add prognostic value to patients with CMML and low risk cytogenetic features or uninformative CC. PMID:27486981

  19. Next-Generation Analysis of Deep Sequencing Data: Bringing Light into the Black Box of SELEX Experiments.

    Science.gov (United States)

    Blank, Michael

    2016-01-01

    In silico analysis of next-generation sequencing data (NGS; also termed deep sequencing) derived from in vitro selection experiments enables the analysis of the SELEX procedure (Systematic Evolution of Ligands by EXponential enrichment) in an unprecedented depth and improves the identification of aptamers. Besides quality control and optimization of starting libraries, advanced screening strategies for difficult targets or early identification of rare but high quality aptamers which are otherwise lost in the in vitro selection experiments become possible. The high information content of sequence data obtained from selection experiments is furthermore useful for subsequent lead optimization.

  20. Distinctive Drug-resistant Mutation Profiles and Interpretations of HIV-1 Proviral DNA Revealed by Deep Sequencing in Reverse Transcriptase

    Institute of Scientific and Technical Information of China (English)

    YIN Qian Qian; SHAO Yi Ming; MA Li Ying; LI Zhen Peng; ZHAO Hai; PAN Dong; WANG Yan; XU Wei Si; XING Hui; FENG Yi; JIANG Shi Bo

    2016-01-01

    Objective: To investigate distinctive features in drug-resistant mutations (DRMs) and interpretations for reverse transcriptase inhibitors (RTIs) between proviral DNA and paired viral RNA in HIV-1-infected patients. Methods: Forty-three HIV-1-infected individuals receiving first-line antiretroviral therapy were recruited to participate in a multicenter AIDS Cohort Study in Anhui and Henan Provinces in China in 2004. Drug resistance genotyping was performed by bulk sequencing and deep sequencing on the plasma and whole blood of 77 samples, respectively. Drug-resistance interpretation was compared between viral RNA and paired proviral DNA. Results: Compared with bulk sequencing, deep sequencing could detect more DRMs and samples with DRMs in both viral RNA and proviral DNA. The mutations M184I and M230I were more prevalent in proviral DNA than in viral RNA (Fisher’s exact test, P ... Conclusion: Compared with viral RNA, the distinctive information of DRMs and drug resistance interpretations for proviral DNA could be obtained by deep sequencing, which could provide more detailed and precise information for drug resistance monitoring and the rational design of optimal antiretroviral therapy regimens.

  1. Genomic region operation kit for flexible processing of deep sequencing data.

    Science.gov (United States)

    Ovaska, Kristian; Lyly, Lauri; Sahu, Biswajyoti; Jänne, Olli A; Hautaniemi, Sampsa

    2013-01-01

    Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis. With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison. GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases. To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments. GROK is freely available with a user guide from http://csbi.ltdk.helsinki.fi/grok/.
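
    To make the set-algebra view of genomic regions concrete, here is a minimal sketch. It does not use GROK or reproduce its API: the function names and the representation of regions as half-open `(start, end)` tuples on a single chromosome are assumptions for illustration only.

```python
def normalize(regions):
    """Sort and merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def union(a, b):
    """Positions covered by a or b."""
    return normalize(list(a) + list(b))


def intersection(a, b):
    """Positions covered by both a and b (two-pointer sweep over sorted intervals)."""
    a, b = normalize(a), normalize(b)
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # advance whichever interval ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def difference(a, b):
    """Positions covered by a but not by b."""
    a, b = normalize(a), normalize(b)
    out = []
    for start, end in a:
        cur = start
        for b_start, b_end in b:
            if b_end <= cur:
                continue
            if b_start >= end:
                break
            if b_start > cur:
                out.append((cur, b_start))
            cur = max(cur, b_end)
            if cur >= end:
                break
        if cur < end:
            out.append((cur, end))
    return out
```

    With these three operations a question such as "which binding sites of factor A are not shared with factor B" becomes `difference(sites_a, sites_b)`, which is the kind of translation from research question to computation that the abstract describes.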

  2. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing.

    Science.gov (United States)

    Manske, Magnus; Miotto, Olivo; Campino, Susana; Auburn, Sarah; Almagro-Garcia, Jacob; Maslen, Gareth; O'Brien, Jack; Djimde, Abdoulaye; Doumbo, Ogobara; Zongo, Issaka; Ouedraogo, Jean-Bosco; Michon, Pascal; Mueller, Ivo; Siba, Peter; Nzila, Alexis; Borrmann, Steffen; Kiara, Steven M; Marsh, Kevin; Jiang, Hongying; Su, Xin-Zhuan; Amaratunga, Chanaki; Fairhurst, Rick; Socheat, Duong; Nosten, Francois; Imwong, Mallika; White, Nicholas J; Sanders, Mandy; Anastasi, Elisa; Alcock, Dan; Drury, Eleanor; Oyola, Samuel; Quail, Michael A; Turner, Daniel J; Ruano-Rubio, Valentin; Jyothi, Dushyanth; Amenga-Etego, Lucas; Hubbart, Christina; Jeffreys, Anna; Rowlands, Kate; Sutherland, Colin; Roper, Cally; Mangano, Valentina; Modiano, David; Tan, John C; Ferdig, Michael T; Amambua-Ngwa, Alfred; Conway, David J; Takala-Harrison, Shannon; Plowe, Christopher V; Rayner, Julian C; Rockett, Kirk A; Clark, Taane G; Newbold, Chris I; Berriman, Matthew; MacInnis, Bronwyn; Kwiatkowski, Dominic P

    2012-07-19

    Malaria elimination strategies require surveillance of the parasite population for genetic changes that demand a public health response, such as new forms of drug resistance. Here we describe methods for the large-scale analysis of genetic variation in Plasmodium falciparum by deep sequencing of parasite DNA obtained from the blood of patients with malaria, either directly or after short-term culture. Analysis of 86,158 exonic single nucleotide polymorphisms that passed genotyping quality control in 227 samples from Africa, Asia and Oceania provides genome-wide estimates of allele frequency distribution, population structure and linkage disequilibrium. By comparing the genetic diversity of individual infections with that of the local parasite population, we derive a metric of within-host diversity that is related to the level of inbreeding in the population. An open-access web application has been established for the exploration of regional differences in allele frequency and of highly differentiated loci in the P. falciparum genome.

  3. Multiple platform assessment of the EGF dependent transcriptome by microarray and deep tag sequencing analysis

    Directory of Open Access Journals (Sweden)

    Iraola Susana

    2011-06-01

    Full Text Available Abstract. Background: Epidermal Growth Factor (EGF) is a key regulatory growth factor activating many processes relevant to normal development and disease, affecting cell proliferation and survival. Here we use a combined approach to study the EGF dependent transcriptome of HeLa cells by using multiple long oligonucleotide based microarray platforms (from Agilent, Operon, and Illumina) in combination with digital gene expression profiling (DGE) with the Illumina Genome Analyzer. Results: By applying a procedure for cross-platform data meta-analysis based on RankProd and GlobalAncova tests, we establish a well validated gene set with transcript levels altered after EGF treatment. We use this robust gene list to build higher order networks of gene interaction by interconnecting associated networks, supporting and extending the important role of the EGF signaling pathway in cancer. In addition, we find an entirely new set of genes previously unrelated to the currently accepted EGF associated cellular functions. Conclusions: We propose that the use of global genomic cross-validation derived from high content technologies (microarrays or deep sequencing) can be used to generate more reliable datasets. This approach should help to improve the confidence of downstream in silico functional inference analyses based on high content data.
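
    The cross-platform meta-analysis above names RankProd. As a rough illustration of the underlying statistic only, the sketch below computes a rank product, the geometric mean of a gene's ranks across platforms. It is not the RankProd package: the function names and input layout are hypothetical, ties are not handled specially, and the permutation-based significance estimate that a full analysis would need is omitted.

```python
import math


def ranks(values, descending=True):
    """Rank 1 = largest value when descending=True; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_product(fold_changes_by_platform):
    """Geometric mean of each gene's rank across platforms.

    fold_changes_by_platform: one list per platform, with genes in the same order
    in every list (an assumed layout). Small values indicate genes that rank
    consistently near the top on all platforms.
    """
    per_platform = [ranks(fc) for fc in fold_changes_by_platform]
    k = len(per_platform)
    n_genes = len(per_platform[0])
    return [
        math.exp(sum(math.log(p[g]) for p in per_platform) / k)
        for g in range(n_genes)
    ]
```

    Working on ranks rather than raw intensities is what makes the statistic comparable across platforms with different measurement scales.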

  4. Generation and analysis of expressed sequence tags from a normalized cDNA library of young leaf from Ma bamboo (Dendrocalamus latiflorus Munro).

    Science.gov (United States)

    Gao, Z M; Li, C L; Peng, Z H

    2011-11-01

    Ma bamboo (Dendrocalamus latiflorus Munro) belongs to the genus Dendrocalamus, tribe Bambuseae, subfamily Bambusoideae, family Poaceae. It is a representative species of clumping bamboo, and a principal commercial species for various construction purposes using mature culms and for human consumption using young shoots. A normalized cDNA library was constructed from young leaves of Ma bamboo and 9,574 high-quality ESTs were generated, from which 5,317 unigenes including 1,502 contigs and 3,815 singletons were assembled. The unigenes were assigned into different gene ontology (GO) categories and summarized into 13 broad biologically functional groups according to similar functional characteristics or cellular roles by BLAST search against public databases. Eight hundred and ninety-one unigenes were assigned KO identifiers and mapped to six KEGG biochemical pathways. The transcripts involved in biosynthesis of secondary metabolites such as cytochrome P450, flavonol synthase/flavanone 3-hydroxylase, and dihydroflavonol-4-reductase were well represented by 14 unigenes in the unigene set. The candidate genes involved in phytohormone metabolism, signal transduction and encoding cell wall-associated receptor kinases were also identified. Sixty-seven unigenes related to plant resistance (R) genes, including RPP genes, RGAs and RDL/RF genes, were discovered. These results will provide genome-wide knowledge about the molecular physiology of Ma bamboo young leaves and tools for advanced studies of the molecular mechanisms underlying leaf growth and development.

  5. Analysis of 4,664 high-quality sequence-finished poplar full-length cDNA clones and their utility for the discovery of genes responding to insect feeding

    Directory of Open Access Journals (Sweden)

    Douglas Carl J

    2008-01-01

    Full Text Available Abstract. Background: The genus Populus includes poplars, aspens and cottonwoods, which will be collectively referred to as poplars hereafter unless otherwise specified. Poplars are the dominant tree species in many forest ecosystems in the Northern Hemisphere and are of substantial economic value in plantation forestry. Poplar has been established as a model system for genomics studies of growth, development, and adaptation of woody perennial plants including secondary xylem formation, dormancy, adaptation to local environments, and biotic interactions. Results: As part of the poplar genome sequencing project and the development of genomic resources for poplar, we have generated a full-length (FL)cDNA collection using the biotinylated CAP trapper method. We constructed four FLcDNA libraries using RNA from xylem, phloem and cambium, and green shoot tips and leaves from the P. trichocarpa Nisqually-1 genotype, as well as insect-attacked leaves of the P. trichocarpa × P. deltoides hybrid. Following careful selection of candidate cDNA clones, we used a combined strategy of paired end reads and primer walking to generate a set of 4,664 high-accuracy, sequence-verified FLcDNAs, which clustered into 3,990 putative unique genes. Mapping FLcDNAs to the poplar genome sequence combined with BLAST comparisons to previously predicted protein coding sequences in the poplar genome identified 39 FLcDNAs that likely localize to gaps in the current genome sequence assembly. Another 173 FLcDNAs mapped to the genome sequence but were not included among the previously predicted genes in the poplar genome. Comparative sequence analysis against Arabidopsis thaliana and other species in the non-redundant database of GenBank revealed that 11.5% of the poplar FLcDNAs display no significant sequence similarity to other plant proteins. By mapping the poplar FLcDNAs against transcriptome data previously obtained with a 15.5 K cDNA microarray, we identified 153 FLcDNA clones

  6. Transcriptome analysis of the mud crab (Scylla paramamosain) by 454 deep sequencing: assembly, annotation, and marker discovery.

    Directory of Open Access Journals (Sweden)

    Hongyu Ma

    Full Text Available In this study, we report the characterization of the first transcriptome of the mud crab (Scylla paramamosain). Pooled cDNAs of four tissue types from twelve wild individuals were sequenced using the Roche 454 FLX platform. Analyses performed included de novo assembly of transcriptome sequences, functional annotation, and molecular marker discovery. A total of 1,314,101 high quality reads with an average length of 411 bp were generated by 454 sequencing on a mixed cDNA library. De novo assembly of these 1,314,101 reads produced 76,778 contigs (consisting of 818,154 reads) with 5.4-fold average sequencing coverage. The remaining 495,947 reads were singletons. A total of 78,268 unigenes were identified based on sequence similarity with known proteins (E≤0.00001) in UniProt and non-redundant protein databases. Meanwhile, 44,433 sequences were identified (E≤0.00001) using a BLASTN search against the NCBI nucleotide database. Gene Ontology (GO) analysis indicated that biosynthetic process, cell part, and ion binding were the most abundant terms in the biological process, cellular component, and molecular function categories, respectively. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that 4,878 unigenes were distributed in 281 different pathways. In addition, 19,011 microsatellites and 37,063 potential single nucleotide polymorphisms were detected from the transcriptome of S. paramamosain. Finally, thirty polymorphic microsatellite markers were developed and used to assess genetic diversity of a wild population of S. paramamosain. So far, existing sequence resources for S. paramamosain are extremely limited. The present study provides a characterization of transcriptome from multiple tissues and individuals, as well as an assessment of genetic diversity of a wild population. These sequence resources will facilitate the investigation of population genetic diversity, the development of genetic maps, and the conduct of molecular marker

  7. Genome-wide identification of Schistosoma japonicum microRNAs using a deep-sequencing approach.

    Directory of Open Access Journals (Sweden)

    Jian Huang

    Full Text Available BACKGROUND: Human schistosomiasis is one of the most prevalent and serious parasitic diseases worldwide. Schistosoma japonicum is one of the important pathogens of this disease. MicroRNAs (miRNAs) are a large group of non-coding RNAs that play important roles in regulating gene expression and protein translation in animals. Genome-wide identification of miRNAs in a given organism is a critical step to facilitating our understanding of genome organization, genome biology, evolution, and posttranscriptional regulation. METHODOLOGY/PRINCIPAL FINDINGS: We sequenced two small RNA libraries prepared from different stages of the life cycle of S. japonicum, immature schistosomula and mature pairing adults, through a deep DNA sequencing approach, which yielded approximately 12 million high-quality short sequence reads containing a total of approximately 2 million non-redundant tags. Based on a bioinformatics pipeline, we identified 176 new S. japonicum miRNAs, of which some exhibited a differential pattern of expression between the two stages. Although 21 S. japonicum miRNAs are orthologs of known miRNAs within the metazoans, some nucleotides at many positions of Schistosoma miRNAs, such as miR-8, let-7, miR-10, miR-31, miR-92, miR-124, and miR-125, are indeed significantly distinct from other bilaterian orthologs. In addition, both miR-71 and some miR-2 family members in tandem are found to be clustered in a reversal direction model on two genomic loci, and two pairs of novel S. japonicum miRNAs were derived from sense and antisense DNA strands at the same genomic loci. CONCLUSIONS/SIGNIFICANCE: The collection of S. japonicum miRNAs could be used as a new platform to study the genomic structure, gene regulation and networks, evolutionary processes, development, and host-parasite interactions. Some S. japonicum miRNAs and their clusters could represent the ancestral forms of the conserved orthologues and a model for the genesis of novel miRNAs.

  8. Cloning and Sequence Analysis of the Full-length cDNA of PRL Gene from Alpaca Pituitary

    Institute of Scientific and Technical Information of China (English)

    薛霖莉; 董常生; 赫晓燕; 范瑞文; 王海东; 曹靖; 郝欢庆

    2011-01-01

    In order to provide a theoretical basis for studying the biological function and application of alpaca prolactin (PRL), the alpaca PRL cDNA sequence was cloned and analyzed. According to the known cDNA sequences from mammals, alpaca PRL primers were designed and the full-length cDNA of PRL from alpaca pituitary was cloned by RT-PCR and RACE techniques. The full-length cDNA of PRL from alpaca pituitary was 959 bp and contained an open reading frame (ORF) of 687 bp, which encoded a PRL precursor protein of 229 AA. The PRL precursor protein was a single-chain polypeptide composed of a 30-AA signal peptide and a 199-AA mature peptide. The spatial structure of alpaca PRL protein was similar to that of human GH. The sequence alignment showed that the amino acid composition of alpaca PRL was similar to that of most mammals, but the methionine at position 81 (position 51 of the mature peptide) might lead to a different spatial structure, which might affect the functions of alpaca PRL. A phylogenetic tree constructed based on the amino acid sequences of PRL from alpaca and other organisms showed that alpaca PRL was most closely related to camel PRL and that the evolution of alpaca PRL was very slow, with no 'episodic' evolution pattern of the kind reported for most mammals such as primates, rodents and ruminants.

  9. Cloning, sequence analysis and detection of vitellogenin cDNA from Colisa fasciata

    Institute of Scientific and Technical Information of China (English)

    梁岳; 方展强

    2011-01-01

    Using RT-PCR, partial vitellogenin (VTG) and β-actin cDNA sequences of Colisa fasciata were cloned. The VTG cDNA sequence contains 3,464 bp and encodes 1,150 amino acids; the β-actin cDNA sequence contains 1,253 bp and encodes 375 amino acids. In vivo and in vitro methods were employed to investigate VTG mRNA expression under exposure to estradiol (E2), octylphenol (OP), cadmium (Cd2+) and perfluorooctane sulfonates (PFOS) and to evaluate their estrogenic activity. The results showed that E2 and OP induced VTG mRNA expression in a dose-dependent way in both the in vivo and in vitro tests. Cd2+ induced VTG mRNA expression only at the low dose in the in vivo test, and no VTG mRNA expression was observed in the PFOS groups in either the in vivo or in vitro tests. The results indicated that the strength of estrogenic effects was in the order E2 > OP > Cd2+. The in vivo and in vitro results for the estrogenic effects of Cd2+ are inconsistent, suggesting that the mechanism by which Cd2+ induces estrogenic effects may differ from that of E2. The results also indicated that the VTG gene of C. fasciata is very sensitive to environmental hormones and well suited as a biomarker for monitoring them.

  10. De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora).

    Science.gov (United States)

    Liu, Le; Zhang, Shijie; Lian, Chunlan

    2015-12-04

    Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers were very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. Assembly yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) unigenes showed similarity to sequences in the NCBI database. The best matches in the NCBI Nr database were to Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which the tri-nucleotide repeats were most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 EST-SSR primer pairs randomly chosen, 17 markers yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and resistance-related gene(s) mapping in P. densiflora.
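
    The microsatellite search mentioned above can be sketched with a regular-expression scan. This is an illustrative stand-in, not the MISA software: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions and are not taken from the abstract. The sketch reports perfect tandem repeats of 1 to 6 bp motifs.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ATAT" is really "AT").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

    Counting hits by motif length over all unigenes would give the kind of breakdown the abstract reports, in which tri-nucleotide repeats were the most abundant class.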

  11. De Novo Transcriptome Sequencing Analysis of cDNA Library and Large-Scale Unigene Assembly in Japanese Red Pine (Pinus densiflora)

    Directory of Open Access Journals (Sweden)

    Le Liu

    2015-12-01

    Full Text Available Japanese red pine (Pinus densiflora) is extensively cultivated in Japan, Korea, China, and Russia and is harvested for timber, pulpwood, garden, and paper markets. However, genetic information and molecular markers were very scarce for this species. In this study, over 51 million clean sequencing reads from P. densiflora mRNA were produced using Illumina paired-end sequencing technology. Assembly yielded 83,913 unigenes with a mean length of 751 bp, of which 54,530 (64.98%) showed similarity to sequences in the NCBI database. The best matches in the NCBI Nr database were to Picea sitchensis (41.60%), Amborella trichopoda (9.83%), and Pinus taeda (4.15%). A total of 1953 putative microsatellites were identified in 1784 unigenes using MISA (MicroSAtellite) software, of which tri-nucleotide repeats were the most abundant (50.18%), and 629 EST-SSR (expressed sequence tag-simple sequence repeat) primer pairs were successfully designed. Among 20 randomly chosen EST-SSR primer pairs, 17 yielded amplification products of the expected size in P. densiflora. Our results will provide a valuable resource for gene-function analysis, germplasm identification, molecular marker-assisted breeding and mapping of resistance-related gene(s) in P. densiflora.

  12. Mitochondrial matR sequences help to resolve deep phylogenetic relationships in rosids

    Directory of Open Access Journals (Sweden)

    Dilcher David L

    2007-11-01

    Full Text Available Background: Rosids are a major clade in the angiosperms containing 13 orders and about one-third of angiosperm species. Recent molecular analyses recognized two major groups (i.e., fabids with seven orders and malvids with three orders). However, phylogenetic relationships within the two groups and among fabids, malvids, and potentially basal rosids including Geraniales, Myrtales, and Crossosomatales remain to be resolved with more data and a broader taxon sampling. In this study, we obtained DNA sequences of the mitochondrial matR gene from 174 species representing 72 families of putative rosids and examined phylogenetic relationships and phylogenetic utility of matR in rosids. We also inferred phylogenetic relationships within the "rosid clade" based on a combined data set of 91 taxa and four genes including matR, two plastid genes (rbcL, atpB), and one nuclear gene (18S rDNA). Results: Comparison of mitochondrial matR and two plastid genes (rbcL and atpB) showed that the synonymous substitution rate in matR was approximately four times slower than those of rbcL and atpB; however, the nonsynonymous substitution rate in matR was relatively high, close to its synonymous substitution rate, indicating that the matR has experienced a relaxed evolutionary history. Analyses of our matR sequences supported the monophyly of malvids and most orders of the rosids. However, fabids did not form a clade; instead, the COM clade of fabids (Celastrales, Oxalidales, Malpighiales, and Huaceae) was sister to malvids. Analyses of the four-gene data set suggested that Geraniales and Myrtales were successively sister to other rosids, and that Crossosomatales were sister to malvids. Conclusion: Compared to plastid genes such as rbcL and atpB, slowly evolving matR produced less homoplasious but not less informative substitutions. Thus, matR appears useful in higher-level angiosperm phylogenetics. Analysis of matR alone identified a novel deep relationship within

  13. Deep sequencing reveals microRNAs predictive of antiangiogenic drug response

    Science.gov (United States)

    García-Donas, Jesús; Beuselinck, Benoit; Inglada-Pérez, Lucía; Graña, Osvaldo; Schöffski, Patrick; Wozniak, Agnieszka; Bechter, Oliver; Apellániz-Ruiz, Maria; Leandro-García, Luis Javier; Esteban, Emilio; Castellano, Daniel E.; González del Alba, Aranzazu; Climent, Miguel Angel; Hernando, Susana; Arranz, José Angel; Morente, Manuel; Pisano, David G.; Robledo, Mercedes

    2016-01-01

    The majority of metastatic renal cell carcinoma (RCC) patients are treated with tyrosine kinase inhibitors (TKI) in first-line treatment; however, a fraction are refractory to these antiangiogenic drugs. MicroRNAs (miRNAs) are regulatory molecules proven to be accurate biomarkers in cancer. Here, we identified miRNAs predictive of progressive disease under TKI treatment through deep sequencing of 74 metastatic clear cell RCC cases uniformly treated with these drugs. Twenty-nine miRNAs were differentially expressed in the tumors of patients who progressed under TKI therapy (P values from 6 × 10⁻⁹ to 3 × 10⁻³). Among 6 miRNAs selected for validation in an independent series, the most relevant associations corresponded to miR-1307-3p, miR-155-5p, and miR-221-3p (P = 4.6 × 10⁻³, 6.5 × 10⁻³, and 3.4 × 10⁻², respectively). Furthermore, a 2-miRNA-based classifier discriminated individuals with progressive disease upon TKI treatment (AUC = 0.75, 95% CI, 0.64–0.85; P = 1.3 × 10⁻⁴) with better predictive value than clinicopathological risk factors commonly used. We also identified miRNAs significantly associated with progression-free survival and overall survival (P = 6.8 × 10⁻⁸ and 7.8 × 10⁻⁷ for top hits, respectively), and 7 overlapped with early progressive disease. In conclusion, this is the first miRNome comprehensive study, to our knowledge, that demonstrates a predictive value of miRNAs for TKI response and provides a new set of relevant markers that can help rationalize metastatic RCC treatment. PMID:27699216

  14. Deep sequencing-based identification of small regulatory RNAs in Synechocystis sp. PCC 6803.

    Directory of Open Access Journals (Sweden)

    Wen Xu

    Full Text Available Synechocystis sp. PCC 6803 is a genetically tractable model organism for photosynthesis research. The genome of Synechocystis sp. PCC 6803 consists of a circular chromosome and seven plasmids. The importance of small regulatory RNAs (sRNAs) as mediators of a number of cellular processes in bacteria has begun to be recognized. However, little is known regarding sRNAs in Synechocystis sp. PCC 6803. To provide a comprehensive overview of sRNAs in this model organism, the sRNAs of Synechocystis sp. PCC 6803 were analyzed using deep sequencing, and 7,951,189 reads were obtained. High quality mapping reads (6,127,890) were mapped onto the genome and assembled into 16,192 transcribed regions (clusters) based on read overlap. A total number of 5211 putative sRNAs were revealed from the genome and the 4 megaplasmids, and 27 of these molecules, including four from plasmids, were confirmed by RT-PCR. In addition, possible target genes regulated by all of the putative sRNAs identified in this study were predicted by IntaRNA and analyzed for functional categorization and biological pathways, which provided evidence that sRNAs are indeed involved in many different metabolic pathways, including basic metabolic pathways, such as glycolysis/gluconeogenesis, the citrate cycle, fatty acid metabolism and adaptations to environmentally stress-induced changes. The information from this study provides a valuable reservoir for understanding the sRNA-mediated regulation of the complex physiology and metabolic processes of cyanobacteria.
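
    Assembling mapped reads into transcribed regions "based on read overlap" can be pictured as interval merging. The sketch below is only an illustration of that idea, not the pipeline used in the study: the function name `cluster_reads`, the end-exclusive coordinates, and the restriction to a single replicon and strand are all assumptions.

```python
def cluster_reads(reads):
    """Merge overlapping (start, end) read intervals into clusters.

    Coordinates are end-exclusive and assumed to lie on one replicon and
    one strand. Returns a list of (start, end, read_count) tuples.
    """
    clusters = []
    for start, end in sorted(reads):
        if clusters and start < clusters[-1][1]:
            # The read overlaps the current cluster: extend it and count the read.
            c_start, c_end, count = clusters[-1]
            clusters[-1] = (c_start, max(c_end, end), count + 1)
        else:
            # No overlap: this read opens a new cluster.
            clusters.append((start, end, 1))
    return clusters

print(cluster_reads([(10, 40), (100, 130), (30, 60), (59, 80)]))
```

    Keeping the read count per cluster makes it easy to filter weakly supported regions afterwards, if a coverage threshold is wanted.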

  15. Microbial Dark Matter: Unusual intervening sequences in 16S rRNA genes of candidate phyla from the deep subsurface

    Energy Technology Data Exchange (ETDEWEB)

    Jarett, Jessica; Stepanauskas, Ramunas; Kieft, Thomas; Onstott, Tullis; Woyke, Tanja

    2014-03-17

    The Microbial Dark Matter project has sequenced genomes from over 200 single cells from candidate phyla, greatly expanding our knowledge of the ecology, inferred metabolism, and evolution of these widely distributed, yet poorly understood lineages. The second phase of this project aims to sequence an additional 800 single cells from known as well as potentially novel candidate phyla derived from a variety of environments. In order to identify whole genome amplified single cells, screening based on phylogenetic placement of 16S rRNA gene sequences is being conducted. Briefly, derived 16S rRNA gene sequences are aligned to a custom version of the Greengenes reference database and added to a reference tree in ARB using parsimony. In multiple samples from deep subsurface habitats but not from other habitats, a large number of sequences proved difficult to align and therefore to place in the tree. Based on comparisons to reference sequences and structural alignments using SSU-ALIGN, many of these "difficult" sequences appear to originate from candidate phyla, and contain intervening sequences (IVSs) within the 16S rRNA genes. These IVSs are short (39 - 79 nt) and do not appear to be self-splicing or to contain open reading frames. IVSs were found in the loop regions of stem-loop structures in several different taxonomic groups. Phylogenetic placement of sequences is strongly affected by IVSs; two out of three groups investigated were classified as different phyla after their removal. Based on data from samples screened in this project, IVSs appear to be more common in microbes occurring in deep subsurface habitats, although the reasons for this remain elusive.

  16. Acyclic identification of aptamers for human alpha-thrombin using over-represented libraries and deep sequencing.

    Directory of Open Access Journals (Sweden)

    Gillian V Kupakuwana

    Full Text Available BACKGROUND: Aptamers are oligonucleotides that bind proteins and other targets with high affinity and selectivity. Twenty years ago elements of natural selection were adapted to in vitro selection in order to distinguish aptamers among randomized sequence libraries. The primary bottleneck in traditional aptamer discovery is multiple cycles of in vitro evolution. METHODOLOGY/PRINCIPAL FINDINGS: We show that over-representation of sequences in aptamer libraries and deep sequencing enables acyclic identification of aptamers. We demonstrated this by isolating a known family of aptamers for human α-thrombin. Aptamers were found within a library containing an average of 56,000 copies of each possible randomized 15mer segment. The high affinity sequences were counted many times above the background in 2-6 million reads. Clustering analysis of sequences with more than 10 counts distinguished two sequence motifs with candidates at high abundance. Motif I contained the previously observed consensus 15mer, Thb1 (46,000 counts), and related variants with mostly G/T substitutions; secondary analysis showed that affinity for thrombin correlated with abundance (Kd = 12 nM for Thb1). The signal-to-noise ratio for this experiment was roughly 10,000:1 for Thb1. Motif II was unrelated to Thb1 with the leading candidate (29,000 counts) being a novel aptamer against hexose sugars in the storage and elution buffers for Concanavalin A (Kd = 0.5 µM for α-methyl-mannoside); ConA was used to immobilize α-thrombin. CONCLUSIONS/SIGNIFICANCE: Over-representation together with deep sequencing can dramatically shorten the discovery process, distinguish aptamers having a wide range of affinity for the target, allow an exhaustive search of the sequence space within a simplified library, reduce the quantity of the target required, eliminate cycling artifacts, and should allow multiplexing of sequencing experiments and targets.
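
    The scale of an over-represented library can be checked with a few lines of arithmetic. The sketch below only combines the two figures given in the abstract (a randomized 15mer and an average of 56,000 copies per sequence) with Avogadro's number; the function name `library_size` is hypothetical, and the resulting amount of DNA is a back-of-the-envelope estimate, not a value reported by the authors.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def library_size(random_len, copies_per_sequence):
    """Return (distinct sequences, total molecules) for a fully randomized segment."""
    distinct = 4 ** random_len          # four bases at each randomized position
    return distinct, distinct * copies_per_sequence

distinct, molecules = library_size(15, 56_000)
picomoles = molecules / AVOGADRO * 1e12

print(f"distinct 15mers : {distinct:,}")
print(f"total molecules : {molecules:,}")
print(f"amount of DNA   : {picomoles:.1f} pmol")
```

    With roughly a billion distinct 15mers, a few million reads sample each background sequence far less than once on average, which is consistent with enriched binders standing out by their counts.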

  17. Study on a cDNA sequence of cold inducible zinc finger protein in albinism tea cultivar "Xiaoxueya"

    Institute of Scientific and Technical Information of China (English)

    王开荣; 李娜娜; 陆建良; 郑新强; 梁月荣; 吴颖; 李明

    2012-01-01

    A cDNA sequence of the cold-inducible zinc finger protein gene from leaves of the albino tea cultivar "Xiaoxueya" was investigated. The cDNA sequence is 698 bp long and shares 83% and 82% identity with the zinc finger protein mRNAs of Glycine max and Ricinus communis, respectively. It has an open reading frame encoding 230 amino acids. Compared with the corresponding cDNA of the common tea cultivar "Fudingdabai", it has three deleted nucleotides (AT at positions 50-51 and T at position 654) and one substitution (A replaced by G at position 143). Its deduced amino acid sequence shares 99% identity with that of "Fudingdabai", with amino acid substitutions at 3 loci. Expression of the cold-inducible zinc finger protein gene was significantly lower in "Xiaoxueya" than in "Fudingdabai". The differences in gene sequence, acting through lower expression and an altered amino acid sequence of the cold-inducible zinc finger protein, are considered a likely cause of the sensitivity of "Xiaoxueya" to low temperature.

  18. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Full Text Available Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 shares 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  19. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    Science.gov (United States)

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-03

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 shares 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  20. Rapid and Deep Proteomes by Faster Sequencing on a Benchtop Quadrupole Ultra-High-Field Orbitrap Mass Spectrometer

    DEFF Research Database (Denmark)

    Kelstrup, Christian D; Jersie-Christensen, Rosa R; Batth, Tanveer Singh

    2014-01-01

    per second or up to 600 new peptides sequenced per gradient minute. We identify 4400 proteins from one microgram of HeLa digest using a one hour gradient, which is an approximately 30% improvement compared to previous instrumentation. In addition, we show very deep proteome coverage can be achieved...... in less than 24 hours of analysis time by offline high pH reversed-phase peptide fractionation from which we identify more than 140,000 unique peptide sequences. This is comparable to state-of-the-art multi-day, multi-enzyme efforts. Finally the acquisition methods are evaluated for single...

  1. Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE

    DEFF Research Database (Denmark)

    Valen, Eivind; Pascarella, Giovanni; Chalk, Alistair;

    2009-01-01

    in a given tissue. Here, we present a new method for high-throughput sequencing of 5' cDNA tags-DeepCAGE: merging the Cap Analysis of Gene Expression method with ultra-high-throughput sequence technology. We apply DeepCAGE to characterize 1.4 million sequenced TSS from mouse hippocampus and reveal a wealth...... of novel core promoters that are preferentially used in hippocampus: This is the most comprehensive promoter data set for any tissue to date. Using these data, we present evidence indicating a key role for the Arnt2 transcription factor in hippocampus gene regulation. DeepCAGE can also detect promoters...

  2. The mitochondrial genome sequence of a deep-sea, hydrothermal vent limpet, Lepetodrilus nux, presents a novel vetigastropod gene arrangement.

    Science.gov (United States)

    Nakajima, Yuichi; Shinzato, Chuya; Khalturina, Mariia; Nakamura, Masako; Watanabe, Hiromi; Satoh, Noriyuki; Mitarai, Satoshi

    2016-08-01

    While mitochondrial (mt) genomes are used extensively for comparative and evolutionary genomics, few mt genomes of deep-sea species, including hydrothermal vent species, have been determined. The Genus Lepetodrilus is a major deep-sea gastropod taxon that occurs in various deep-sea ecosystems. Using next-generation sequencing, we determined nearly the complete mitochondrial genome sequence of Lepetodrilus nux, which inhabits hydrothermal vents in the Okinawa Trough. The total length of the mitochondrial genome is 16,353bp, excluding the repeat region. It contains 13 protein-coding genes, 22 tRNA genes, two rRNA genes, and a control region, typical of most metazoan genomes. Compared with other vetigastropod mt genome sequences, L. nux employs a novel mt gene arrangement. Other novel arrangements have been identified in the vetigastropod, Fissurella volcano, and in Chrysomallon squamiferum, a neomphaline gastropod; however, all three gene arrangements are different, and Bayesian inference suggests that each lineage diverged independently. Our findings suggest that vetigastropod mt gene arrangements are more diverse than previously realized.

  3. Molecular cloning of lupin leghemoglobin cDNA

    DEFF Research Database (Denmark)

    Konieczny, A; Jensen, E O; Marcker, K A

    1987-01-01

    Poly(A)+ RNA isolated from root nodules of yellow lupin (Lupinus luteus, var. Ventus) has been used as a template for the construction of a cDNA library. The ds cDNA was synthesized and inserted into the Hind III site of plasmid pBR 322 using synthetic Hind III linkers. Clones containing sequences...... its nucleotide sequence was consistent with known amino acid sequence of lupin Lb II. The cloned lupin Lb cDNA hybridized to poly(A)+ RNA from nodules only, which is in accordance with the general concept, that leghemoglobin is expressed exclusively in nodules.

  4. Anchoring a Defined Sequence to the 5' Ends of mRNAs : The Bolt to Clone Rare Full Length mRNAs and Generate cDNA Libraries from a Few Cells.

    Science.gov (United States)

    Baptiste, J; Milne Edwards, D; Delort, J; Mallet, J

    1993-01-01

    Among numerous applications, the polymerase chain reaction (PCR) (1,2) provides a convenient means to clone 5' ends of rare mRNAs and to generate cDNA libraries from tissue available in amounts too low to be processed by conventional methods. Basically, the amplification of cDNAs by the PCR requires the availability of the sequences of two stretches of the molecule to be amplified. A sequence can easily be imposed at the 5' end of the first-strand cDNAs (corresponding to the 3' end of the mRNAs) by priming the reverse transcription with a specific primer (for cloning the 5' end of rare messenger) or with an oligonucleotide tailored with a poly (dT) stretch (for cDNA library construction), taking advantage of the poly (A) sequence that is located at the 3' end of mRNAs. Several strategies have been devised to tag the 3' end of the ss-cDNAs (corresponding to the 5' end of the mRNAs). We (3) and others have described strategies based on the addition of a homopolymeric dG (4,5) or dA (6,7) tail using terminal deoxyribonucleotide transferase (TdT) ("anchor-PCR" [4]). However, this strategy has important limitations. The TdT reaction is difficult to control and has a low efficiency (unpublished observations). But most importantly, the return primers containing a homopolymeric (dC or dT) tail generate nonspecific amplifications, a phenomenon that prevents the isolation of low abundance mRNA species and/or interferes with the relative abundance of primary clones in the library. To circumvent these drawbacks, we have used two approaches. First, we devised a strategy based on a cRNA enrichment procedure, which has been useful to eliminate nonspecific-PCR products and to allow detection and cloning of cDNAs of low abundance (3). More recently, to avoid the nonspecific amplification resulting from the annealing of the homopolymeric tail oligonucleotide, we have developed a novel anchoring strategy that is based on the ligation of an oligonucleotide to the 3' end of ss

  5. cDNA sequence and Fab crystal structure of HL4E10, a hamster IgG lambda light chain antibody stimulatory for γδ T cells.

    Directory of Open Access Journals (Sweden)

    Petra Verdino

    Full Text Available Hamsters are widely used to generate monoclonal antibodies against mouse, rat, and human antigens, but sequence and structural information for hamster immunoglobulins is sparse. To our knowledge, only three hamster IgG sequences have been published, all of which use kappa light chains, and no three-dimensional structure of a hamster antibody has been reported. We generated antibody HL4E10 as a probe to identify novel costimulatory molecules on the surface of γδ T cells which lack the traditional αβ T cell co-receptors CD4, CD8, and the costimulatory molecule CD28. HL4E10 binding to γδ T cell, surface-expressed, Junctional Adhesion Molecule-Like (JAML) protein leads to potent costimulation via activation of MAP kinase pathways and cytokine production, resulting in cell proliferation. The cDNA sequence of HL4E10 is the first example of a hamster lambda light chain and only the second known complete hamster heavy chain sequence. The crystal structure of the HL4E10 Fab at 2.95 Å resolution reveals a rigid combining site with pockets faceted by solvent-exposed tyrosine residues, which are structurally optimized for JAML binding. The characterization of HL4E10 thus comprises a valuable addition to the spartan database of hamster immunoglobulin genes and structures. As the HL4E10 antibody is uniquely costimulatory for γδ T cells, humanized versions thereof may be of clinical relevance in treating γδ T cell dysfunction-associated diseases, such as chronic non-healing wounds and cancer.

  6. The cDNA Cloning and Sequence Analysis of Fragment of CART mRNA in Porcine Hypothalamus

    Institute of Scientific and Technical Information of China (English)

    李鹏飞; 李富禄; 于秀菊; 吕丽华

    2011-01-01

    To obtain the complete coding sequence (CDS) of porcine CART (cocaine- and amphetamine-regulated transcript) and analyze its domains, three pairs of specific primers were designed from the CART cDNA sequences of other species published in GenBank, and part of the CART mRNA sequence was amplified from porcine hypothalamus by RT-PCR. Alignment at NCBI showed high similarity to the sequences of other species, suggesting that the amplified sequence is porcine CART mRNA and that CART mRNA is expressed in the porcine hypothalamus. Analysis of the 3D structure of the CART protein showed that a CART superfamily domain lies between amino acids 40 and 116. This superfamily may act as a ligand for iron ions to carry out a variety of physiological functions in vivo.

  7. cDNA sequence and Fab crystal structure of HL4E10, a hamster IgG lambda light chain antibody stimulatory for γδ T cells.

    Science.gov (United States)

    Verdino, Petra; Witherden, Deborah A; Podshivalova, Katie; Rieder, Stephanie E; Havran, Wendy L; Wilson, Ian A

    2011-01-01

    Hamsters are widely used to generate monoclonal antibodies against mouse, rat, and human antigens, but sequence and structural information for hamster immunoglobulins is sparse. To our knowledge, only three hamster IgG sequences have been published, all of which use kappa light chains, and no three-dimensional structure of a hamster antibody has been reported. We generated antibody HL4E10 as a probe to identify novel costimulatory molecules on the surface of γδ T cells which lack the traditional αβ T cell co-receptors CD4, CD8, and the costimulatory molecule CD28. HL4E10 binding to γδ T cell, surface-expressed, Junctional Adhesion Molecule-Like (JAML) protein leads to potent costimulation via activation of MAP kinase pathways and cytokine production, resulting in cell proliferation. The cDNA sequence of HL4E10 is the first example of a hamster lambda light chain and only the second known complete hamster heavy chain sequence. The crystal structure of the HL4E10 Fab at 2.95 Å resolution reveals a rigid combining site with pockets faceted by solvent-exposed tyrosine residues, which are structurally optimized for JAML binding. The characterization of HL4E10 thus comprises a valuable addition to the spartan database of hamster immunoglobulin genes and structures. As the HL4E10 antibody is uniquely costimulatory for γδ T cells, humanized versions thereof may be of clinical relevance in treating γδ T cell dysfunction-associated diseases, such as chronic non-healing wounds and cancer.

  8. Construction of genome-wide physical BAC contigs using mapped cDNA as probes: Toward an integrated BAC library resource for genome sequencing and analysis. Annual report, July 1995--January 1997

    Energy Technology Data Exchange (ETDEWEB)

    Mitchell, S.C.; Bocskai, D.; Cao, Y. [and others]

    1997-12-31

    The goal of human genome project is to characterize and sequence entire genomes of human and several model organisms, thus providing complete sets of information on the entire structure of transcribed, regulatory and other functional regions for these organisms. In the past years, a number of useful genetic and physical markers on human and mouse genomes have been made available along with the advent of BAC library resources for these organisms. The advances in technology and resource development made it feasible to efficiently construct genome-wide physical BAC contigs for human and other genomes. Currently, over 30,000 mapped STSs and 27,000 mapped Unigenes are available for human genome mapping. ESTs and cDNAs are excellent resources for building contig maps for two reasons. Firstly, they exist in two alternative forms--as both sequence information for PCR primer pairs, and cDoreen genomic libraries efficiently for large number of DNA probes by combining over 100 cDNA probes in each hybridization. Second, the linkage and order of genes are rather conserved among human, mouse and other model organisms. Therefore, gene markers have advantages over random anonymous STSs in building maps for comparative genomic studies.

  9. Next-Generation Sequencing Reveals Deep Intronic Cryptic ABCC8 and HADH Splicing Founder Mutations Causing Hyperinsulinism by Pseudoexon Activation

    Science.gov (United States)

    Flanagan, Sarah E.; Xie, Weijia; Caswell, Richard; Damhuis, Annet; Vianey-Saban, Christine; Akcay, Teoman; Darendeliler, Feyza; Bas, Firdevs; Guven, Ayla; Siklar, Zeynep; Ocal, Gonul; Berberoglu, Merih; Murphy, Nuala; O’Sullivan, Maureen; Green, Andrew; Clayton, Peter E.; Banerjee, Indraneel; Clayton, Peter T.; Hussain, Khalid; Weedon, Michael N.; Ellard, Sian

    2013-01-01

    Next-generation sequencing (NGS) enables analysis of the human genome on a scale previously unachievable by Sanger sequencing. Exome sequencing of the coding regions and conserved splice sites has been very successful in the identification of disease-causing mutations, and targeting of these regions has extended clinical diagnostic testing from analysis of fewer than ten genes per phenotype to more than 100. Noncoding mutations have been less extensively studied despite evidence from mRNA analysis for the existence of deep intronic mutations in >20 genes. We investigated individuals with hyperinsulinaemic hypoglycaemia and biochemical or genetic evidence to suggest noncoding mutations by using NGS to analyze the entire genomic regions of ABCC8 (117 kb) and HADH (94 kb) from overlapping ∼10 kb PCR amplicons. Two deep intronic mutations, c.1333-1013A>G in ABCC8 and c.636+471G>T HADH, were identified. Both are predicted to create a cryptic splice donor site and an out-of-frame pseudoexon. Sequence analysis of mRNA from affected individuals’ fibroblasts or lymphoblastoid cells confirmed mutant transcripts with pseudoexon inclusion and premature termination codons. Testing of additional individuals showed that these are founder mutations in the Irish and Turkish populations, accounting for 14% of focal hyperinsulinism cases and 32% of subjects with HADH mutations in our cohort. The identification of deep intronic mutations has previously focused on the detection of aberrant mRNA transcripts in a subset of disorders for which RNA is readily obtained from the target tissue or ectopically expressed at sufficient levels. Our approach of using NGS to analyze the entire genomic DNA sequence is applicable to any disease. PMID:23273570

  10. Cloning and sequence analysis of squalene synthase gene and cDNA in Glycyrrhiza uralensis

    Institute of Scientific and Technical Information of China (English)

    荣齐仙; 刘春生; 黄璐琦; 张宁; 南博; 呙未

    2011-01-01

    Objective: To clone and sequence the open reading frame and genomic sequence of squalene synthase (SQS) from Glycyrrhiza uralensis. Method: Primers were designed according to the cDNA sequence of SQS from G. glabra reported by Hiroaki Hayashi. Total RNA extracted from roots of G. uralensis was reverse transcribed, and the SQS cDNA was amplified by RT-PCR, cloned and sequenced. The SQS genomic sequence was amplified by PCR from total DNA extracted from roots of G. uralensis, then cloned and sequenced. Result: The coding region of GuSQS1 (GenBank accession number: GQ266154) is 1,242 bp long and encodes a protein of 413 amino acids. A BLASTX search at NCBI showed that GuSQS1 is most similar at the amino acid level to the two previously reported G. uralensis SQS proteins (SQS1 and SQS2), with identities of 98.55% and 88.62%, respectively. The SQS gene (GenBank accession number: GQ180932), 4,484 bp long with 13 exons and 12 introns, was then amplified by PCR from genomic DNA extracted from roots of G. uralensis. Conclusion: Cloning and sequencing of the open reading frame and genomic sequence of SQS from G. uralensis provide new clues for further exploration of SmSQS function in sterol and terpene biosynthesis.

  11. Analysis of full-length cDNA sequence of FAD2 gene in Vernicia fordii seeds

    Institute of Scientific and Technical Information of China (English)

    谢禄山; 谭晓风; 张琳; 龙洪旭

    2012-01-01

    Linoleic acid, formed in Vernicia fordii seeds through catalysis by FAD2, is the direct substrate for eleostearic acid synthesis, so research on the FAD2 gene in V. fordii seeds has practical significance for improving the yield of eleostearic acid. Sixteen FAD2 clones from a cDNA library of nearly mature V. fordii 'Duinian tung' seeds were assembled with CAP3, aligned with BLAST and analyzed with DNAMAN. The results showed that the cloned sequence was a full-length FAD2 cDNA of 1,537 bp. It contained a complete coding sequence of 1,146 bp (106-1,255 bp) encoding 383 amino acids. The relative molecular mass of the enzyme protein was 44,144.4 u and its isoelectric point was 8.57. The amino acid sequence had a signal peptide of 6 residues at the N terminus, 5 transmembrane domains, and 3 strongly hydrophilic segments located at the N terminus, the C terminus and the middle region, respectively; the active center of the enzyme consisted of 3 conserved histidine clusters. Phylogenetically, the V. fordii FAD2 gene was most closely related to that of V. montana, closely related to those of Euphorbiaceae plants such as Ricinus communis, Triadica sebifera, Hevea brasiliensis and Jatropha curcas, more distantly related to those of Olea europaea, Arachis hypogaea and Sesamum indicum, and most distantly related to that of Camellia oleifera.

  12. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jacob;

    2007-01-01

    BACKGROUND: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from ... approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues ... of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. CONCLUSION: This EST ...

  13. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob;

    2007-01-01

    Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from ... approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues ... of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST ...

  14. Soluble forms of tumor necrosis factor receptors (TNF-Rs). The cDNA for the type I TNF-R, cloned using amino acid sequence data of its soluble form, encodes both the cell surface and a soluble form of the receptor

    DEFF Research Database (Denmark)

    Nophar, Y; Kemper, O; Brakebusch, C

    1990-01-01

    found to have effects characteristic of TNF, including stimulating phosphorylation of specific cellular proteins. Oligonucleotide probes designed on the basis of the NH2-terminal amino acid sequence of TBPI were used to clone the cDNA for the structurally related cell surface type 1 TNF-R. It is notable...

  15. cDNA Cloning and Sequence Structure Analysis of Prophenoloxidase from Litopenaeus stylirostris

    Institute of Scientific and Technical Information of China (English)

    许尤厚; 胡超群

    2015-01-01

    Prophenoloxidase (proPO) is one of the important factors in the humoral immunity of shrimp, but it has not previously been reported for Litopenaeus stylirostris. The prophenoloxidase gene of L. stylirostris was cloned using RT-PCR and long-fragment amplification. The results show that there are two proPO genes in the hemocytes of L. stylirostris. The proPO gene 1 cDNA sequence encodes 372 amino acids: the first 190 amino acids form a hemocyanin M family domain, a copper-binding site region, and residues 191-372 form a hemocyanin C family domain, an immunoglobulin-like region. The sequences spanning the 2 functional sites of proPO gene 2 overlap: bp 6-935 of the proPO gene 2 cDNA contain the first functional site, while bp 928-1464 contain the second. Phylogenetic tree analysis showed that the sequences of the two genes differ greatly. proPO gene 2 of L. stylirostris and that of L. vannamei fall in one closely related group, whereas proPO gene 1 groups with the proPO genes of several other shrimp species. Whether proPO gene 2 and proPO gene 1 have different functions in shrimp immunity remains to be studied further.

  16. Cloning and Sequence Analysis of 1-Hydroxy-2-Methyl-2-E-Butenyl-4-Diphosphate Reductase Gene cDNA from Eucommia ulmoides

    Institute of Scientific and Technical Information of China (English)

    刘攀峰; 杜红岩; 乌云塔娜; 杜兰英; 孙志强

    2013-01-01

    1-Hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate reductase (HDR) synthesizes IPP and DMAPP in the last step of the plant 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. A homologous HDR cDNA was isolated from the leaves of Eucommia ulmoides by reverse transcription polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE), and named EuHDR. The full-length EuHDR cDNA was 1,653 bp, including a 5' non-coding region of 82 bp and a 3' non-coding region of 188 bp, and encoded 460 amino acids; its highest sequence similarity (82%) was to the HDR gene of Camptotheca acuminata. A transit peptide sequence (A1-A33) and multiple conserved functional sites of plant HDR proteins (A117, A208, A262, A345) were found in the deduced amino acid sequence of EuHDR. The predicted secondary structure of the EuHDR protein comprised 35.65% α-helix, 19.78% β-sheet and 44.57% loop/coil. The predicted tertiary structure of EuHDR was a monomer with an irregular shamrock-like shape. Phylogenetic analysis revealed that the EuHDR protein was most closely related to the Vitis vinifera HDR protein.
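
    The EuHDR lengths reported above can be cross-checked with simple arithmetic: subtracting the two non-coding regions from the full-length cDNA should leave a coding region whose codon count equals the number of amino acids plus one stop codon. The following is only a minimal sketch; the function name is hypothetical, and it assumes the stop codon is counted as part of the coding region rather than the 3' non-coding region.

```python
def coding_summary(full_len_bp, utr5_bp, utr3_bp):
    """Return (coding-region length in bp, number of encoded amino acids).

    Assumption: the coding region includes one terminal stop codon.
    """
    cds_bp = full_len_bp - utr5_bp - utr3_bp
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("coding region must be a positive multiple of 3 bp")
    codons = cds_bp // 3
    return cds_bp, codons - 1  # drop the stop codon


# Values taken from the EuHDR record: 1,653 bp total, 82 bp 5' and 188 bp 3' non-coding.
cds_bp, residues = coding_summary(1653, 82, 188)
print(cds_bp, residues)
```

    Under that assumption the coding region is 1,383 bp, which corresponds to the 460 amino acids stated in the abstract.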

  17. Deep sequencing extends the diversity of human papillomaviruses in human skin.

    OpenAIRE

    Bzhalava, Davit; Mühr, Laila Sara Arroyo; Lagheden, Camilla; Ekström, Johanna; Forslund, Ola; Dillner, Joakim; Hultin, Emilie

    2014-01-01

    Most viruses in human skin are known to be human papillomaviruses (HPVs). Previous sequencing of skin samples has identified 273 different cutaneous HPV types, including 47 previously unknown types. In the present study, we wished to extend prior studies using deeper sequencing. This deeper sequencing without prior PCR of a pool of 142 whole genome amplified skin lesions identified 23 known HPV types, 3 novel putative HPV types and 4 non-HPV viruses. The complete sequence was obtained for one...

  18. Metavisitor, a Suite of Galaxy Tools for Simple and Rapid Detection and Discovery of Viruses in Deep Sequence Data

    Science.gov (United States)

    Vernick, Kenneth D.

    2017-01-01

    Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions. PMID:28045932

  19. [Whole cDNA sequence cloning and expression of chicken L-FABP gene and its relationship with lipid deposition of hybrid chickens].

    Science.gov (United States)

    Yu, Ying; Wang, Dong; Sun, Dong-Xiao; Xu, Gui-Yun; Li, Jun-Ying; Zhang, Yuan

    2011-07-01

    Liver fatty acid-binding protein (L-FABP) is closely related to the intracellular transport and deposition of lipids. A positive differentially displayed fragment was found in the liver tissue of Silkie (CC), CAU-brown chicken (DD), and their reciprocal hybrids (CD and DC) at 8 weeks of age using differential display RT-PCR (DDRT-PCR). Through recovery, sequencing, and alignment analysis, the fragment was identified as the chicken liver fatty acid-binding protein gene (L-FABP, GenBank accession number AY321365). Reverse Northern dot blot and semi-quantitative RT-PCR revealed that the avian L-FABP gene was over-expressed in the liver tissue of the reciprocal hybrids (CD and DC) compared to their parental lines (CC and DD), which was consistent with the higher abdominal fat weight and wider intermuscular fat width observed in the reciprocal hybrids. Because the higher expression of L-FABP may contribute to the increased lipid deposition in the hybrid chickens, functional study of avian L-FABP is warranted in the future.

  20. Deep sequencing unearths nuclear mitochondrial sequences under Leber's hereditary optic neuropathy-associated false heteroplasmic mitochondrial DNA variants.

    Science.gov (United States)

    Petruzzella, Vittoria; Carrozzo, Rosalba; Calabrese, Claudia; Dell'Aglio, Rosa; Trentadue, Raffaella; Piredda, Roberta; Artuso, Lucia; Rizza, Teresa; Bianchi, Marzia; Porcelli, Anna Maria; Guerriero, Silvana; Gasparre, Giuseppe; Attimonelli, Marcella

    2012-09-01

    Leber's hereditary optic neuropathy (LHON) is associated with mitochondrial DNA (mtDNA) ND mutations that are mostly homoplasmic. However, these mutations are not sufficient to explain the peculiar features of penetrance and the tissue-specific expression of the disease and are believed to be causative in association with unknown environmental or other genetic factors. Discerning between clear-cut pathogenetic variants, such as those that appear to be heteroplasmic, and less penetrant variants, such as the homoplasmic ones, remains a challenging issue that we have addressed here using a next-generation sequencing approach. We set up a protocol to quantify MTND5 heteroplasmy levels in a family in which the proband manifests a LHON phenotype. Furthermore, to study this mtDNA haplotype, we applied the cybridization protocol. The results demonstrate that the mutations are mostly homoplasmic, whereas the suspected heteroplasmic feature of the observed mutations is due to the co-amplification of nuclear mitochondrial sequences.

  1. Comparison of Illumina and 454 deep sequencing in participants failing raltegravir-based antiretroviral therapy.

    Directory of Open Access Journals (Sweden)

    Jonathan Z Li

    Full Text Available The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs. A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser. Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% with 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R2 = 0.92, P<0.001). Illumina sequencing detected 2.4-fold greater nucleotide MVs and 2.9-fold greater amino acid MVs compared to 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant by Illumina sequencing, but not by 454. In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.
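
    To make the 0.095% limit of detection concrete, the sketch below shows how a minority-variant frequency might be compared with that limit at a single position. It is an illustration only: the function names and read counts are hypothetical, and it is not how snp-assess or V-Phaser work, since real variant callers also model sequencing error.

```python
def variant_frequency(variant_reads, total_reads):
    """Fraction of reads at one position that carry the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def is_detectable(variant_reads, total_reads, limit=0.00095):
    """True if the observed frequency reaches the limit (0.095% by default)."""
    return variant_frequency(variant_reads, total_reads) >= limit


# Hypothetical read counts at 10,000x coverage, chosen only for illustration.
print(is_detectable(12, 10_000))  # 0.12% of reads carry the variant
print(is_detectable(5, 10_000))   # 0.05% of reads carry the variant
```

    With these made-up counts, the first variant lies above the stated limit and the second below it.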

  2. The DEEP2 Galaxy Redshift Survey: The Red Sequence AGN Fraction and its Environment and Redshift Dependence

    CERN Document Server

    Montero-Dorta, Antonio D; Yan, Renbin; Cooper, Michael C; Newman, Jeffery A; Georgakakis, Antonis; Prada, Francisco; Davis, Marc; Nandra, Kirpal; Coil, Alison

    2008-01-01

    We measure the dependence of the AGN fraction on local environment at z~1, using spectroscopic data taken from the DEEP2 Galaxy Redshift Survey, and Chandra X-ray data from the All-Wavelength Extended Groth Strip International Survey (AEGIS). To provide a clean sample of AGN we restrict our analysis to the red sequence population; this also reduces additional colour-environment correlations. We find evidence that high redshift LINERs in DEEP2 tend to favour higher density environments relative to the red population from which they are drawn. In contrast, Seyferts and X-ray selected AGN at z~1 show little (or no) environmental dependencies within the same underlying population. We compare these results with a sample of local AGN drawn from the SDSS. Contrary to the high redshift behaviour, we find that both LINERs and Seyferts in the SDSS show a slowly declining red sequence AGN fraction towards high density environments. Interestingly, at z~1 red sequence Seyferts and LINERs are approximately equally abundant...

  3. HPV Population Profiling in Healthy Men by Next-Generation Deep Sequencing Coupled with HPV-QUEST.

    Science.gov (United States)

    Yin, Li; Yao, Jin; Chang, Kaifen; Gardner, Brent P; Yu, Fahong; Giuliano, Anna R; Goodenow, Maureen M

    2016-01-25

    Multiple-type human papillomaviruses (HPV) infection presents a greater risk for persistence in asymptomatic individuals and may accelerate cancer development. To extend the scope of HPV types defined by probe-based assays, multiplexing deep sequencing of HPV L1, coupled with an HPV-QUEST genotyping server and a bioinformatic pipeline, was established and applied to survey the diversity of HPV genotypes among a subset of healthy men from the HPV in Men (HIM) Multinational Study. Twenty-one HPV genotypes (12 high-risk and 9 low-risk) were detected in the genital area from 18 asymptomatic individuals. A single HPV type, either HPV16, HPV6b or HPV83, was detected in 7 individuals, while coinfection by 2 to 5 high-risk and/or low-risk genotypes was identified in the other 11 participants. In two individuals studied for over one year, HPV16 persisted, while fluctuations of coinfecting genotypes occurred. HPV L1 regions were generally identical between query and reference sequences, although nonsynonymous and synonymous nucleotide polymorphisms of HPV16, 18, 31, 35h, 59, 70, 73, cand85, 6b, 62, 81, 83, cand89 or JEB2 L1 genotypes, mostly unidentified by linear array, were evident. Deep sequencing coupled with HPV-QUEST provides efficient and unambiguous classification of HPV genotypes in multiple-type HPV infection in host ecosystems.

  4. Deep sequencing detects very-low-grade somatic mosaicism in the unaffected mother of siblings with nemaline myopathy.

    Science.gov (United States)

    Miyatake, Satoko; Koshimizu, Eriko; Hayashi, Yukiko K; Miya, Kazushi; Shiina, Masaaki; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Miyake, Noriko; Saitsu, Hirotomo; Ogata, Kazuhiro; Nishino, Ichizo; Matsumoto, Naomichi

    2014-07-01

    When an expected mutation in a particular disease-causing gene is not identified in a suspected carrier, it is usually assumed to be due to germline mosaicism. We report here very-low-grade somatic mosaicism in ACTA1 in an unaffected mother of two siblings affected with a neonatal form of nemaline myopathy. The mosaicism was detected by deep resequencing using a next-generation sequencer. We identified a novel heterozygous mutation in ACTA1, c.448A>G (p.Thr150Ala), in the affected siblings. Three-dimensional structural modeling suggested that this mutation may affect polymerization and/or actin's interactions with other proteins. In this family, we expected autosomal dominant inheritance with either parent demonstrating germline or somatic mosaicism. Sanger sequencing identified no mutation. However, further deep resequencing of this mutation on a next-generation sequencer identified very-low-grade somatic mosaicism in the mother: 0.4%, 1.1%, and 8.3% in the saliva, blood leukocytes, and nails, respectively. Our study demonstrates the possibility of very-low-grade somatic mosaicism in suspected carriers, rather than germline mosaicism.

  5. High-resolution deep sequencing reveals biodiversity, population structure, and persistence of HIV-1 quasispecies within host ecosystems

    Directory of Open Access Journals (Sweden)

    Yin Li

    2012-12-01

    Full Text Available Background: Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While extensively applied to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3) within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children. Results: Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later. Conclusions: Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. The bioinformatics pipeline developed for HIV-1 can be applied for biodiversity studies of virome populations in human, animal, or plant ecosystems.

  6. Cloning and Sequence Analysis of Araneus ventricosus Avg1 cDNA

    Institute of Scientific and Technical Information of China (English)

    任洪林; 柳增善; 卢士英; 潘风光

    2004-01-01

    The quantity and activity of cathepsin B (CB) increase during the invasion and metastasis of carcinoma cells, so CB is being applied to the diagnosis of cancers and to inhibiting their spread. The complete Avg1 cDNA was randomly cloned from a cDNA library of the major ampullate gland of Araneus ventricosus. It is 1,253 bp long and encodes 334 amino acids in its 1,002 bp coding region. The molecular weight of the protein is 36,953.00. The GenBank accession number is AY302573. The 3′ non-coding region is composed of 179 bp, with an AATAAA polyadenylation signal at position 129 nt and the poly(A) tail at position 153 nt downstream of the stop codon TAA. The signal peptide cleavage site of the deduced protein lies between codons 16 and 17. Two glycosylation sites, AsnThrThr and AsnValSer, appear at codons 23 and 202, respectively. No highly homologous genes were found among the known genes in NCBI, but the typical conserved peptidase-C1 domain was detected by NCBI BLASTp, and high homology with CB from several organisms was shown. The function of Avg1 in the spider silk gland has not yet been studied.

  7. Dynamics of hepatitis B virus quasispecies in association with nucleos(t)ide analogue treatment determined by ultra-deep sequencing.

    Directory of Open Access Journals (Sweden)

    Norihiro Nishijima

    Full Text Available BACKGROUND AND AIMS: Although the advent of ultra-deep sequencing technology allows for the analysis of heretofore-undetectable minor viral mutants, a limited amount of information is currently available regarding the clinical implications of hepatitis B virus (HBV) genomic heterogeneity. METHODS: To characterize the HBV genetic heterogeneity in association with anti-viral therapy, we performed ultra-deep sequencing of full-genome HBV in the liver and serum of 19 patients with chronic viral infection, including 14 therapy-naïve and 5 nucleos(t)ide analogue (NA)-treated cases. RESULTS: Most genomic changes observed in viral variants were single base substitutions and were widely distributed throughout the HBV genome. Four of eight (50%) chronic therapy-naïve HBeAg-negative patients showed a relatively low prevalence of the G1896A pre-core (pre-C) mutant in the liver tissues, suggesting that other mutations were involved in their HBeAg seroconversion. Interestingly, liver tissues in 4 of 5 (80%) of the chronic NA-treated anti-HBe-positive cases had extremely low levels of the G1896A pre-C mutant (0.0%, 0.0%, 0.1%, and 1.1%), suggesting the high sensitivity of the G1896A pre-C mutant to NA. Moreover, various abundances of clones resistant to NA were common in both the liver and serum of treatment-naïve patients, and the proportion of M204VI mutants resistant to lamivudine and entecavir expanded in response to entecavir treatment in the serum of 35.7% (5/14) of patients, suggesting the putative risk of developing drug resistance to NA. CONCLUSION: Our findings illustrate the strong advantage of deep sequencing on viral genome as a tool for dissecting the pathophysiology of HBV infection.

  8. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong;

    2016-01-01

    ) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus...

  9. Complete nucleotide sequences and construction of full-length infectious cDNA clones of cucumber green mottle mosaic virus (CGMMV) in a versatile newly developed binary vector including both 35S and T7 promoters.

    Science.gov (United States)

    Park, Chan-Hwan; Ju, Hye-Kyoung; Han, Jae-Yeong; Park, Jong-Seo; Kim, Ik-Hyun; Seo, Eun-Young; Kim, Jung-Kyu; Hammond, John; Lim, Hyoun-Sub

    2017-04-01

    Seed-transmitted viruses have caused significant damage to watermelon crops in Korea in recent years, with cucumber green mottle mosaic virus (CGMMV) infection widespread as a result of infected seed lots. To determine the likely origin of CGMMV infection, we collected CGMMV isolates from watermelon and melon fields and generated full-length infectious cDNA clones. The full-length cDNAs were cloned into newly constructed binary vector pJY, which includes both the 35S and T7 promoters for versatile usage (agroinfiltration and in vitro RNA transcription) and a modified hepatitis delta virus ribozyme sequence to precisely cleave RNA transcripts at the 3' end of the tobamovirus genome. Three CGMMV isolates (OMpj, Wpj, and Mpj) were separately evaluated for infectivity in Nicotiana benthamiana, demonstrated by either Agroinfiltration or inoculation with in vitro RNA transcripts. CGMMV nucleotide identities to other tobamoviruses were calculated from pairwise alignments using DNAMAN. CGMMV identities were 49.89% to tobacco mosaic virus; 49.85% to pepper mild mottle virus; 50.47% to tomato mosaic virus; 60.9% to zucchini green mottle mosaic virus; and 60.96% to kyuri green mottle mosaic virus, confirming that CGMMV is a distinct species most similar to other cucurbit-infecting tobamoviruses. We further performed phylogenetic analysis to determine relationships of our new Korean CGMMV isolates to previously characterized isolates from Canada, China, India, Israel, Japan, Korea, Russia, Spain, and Taiwan available from NCBI. Analysis of CGMMV amino acid sequences showed three major clades, broadly typified as 'Russian,' 'Israeli,' and 'Asian' groups. All of our new Korean isolates fell within the 'Asian' clade. Neither the 128 nor 186 kDa RdRps of the three new isolates showed any detectable gene silencing suppressor function.
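
    The nucleotide identities quoted above come from pairwise alignments made with DNAMAN, whose exact formula is not given in the abstract. As a rough illustration, the sketch below uses one common definition, assumed here: identical non-gap columns divided by total alignment length. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumed definition: identical, non-gap columns / alignment length * 100.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    if not aln_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aln_a, aln_b) if a == b and a != "-")
    return 100.0 * matches / len(aln_a)


# Toy 10-column alignment with one gap and one mismatch (not real CGMMV data).
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))
```

    Other tools may exclude gap columns from the denominator, so values computed under a different definition are not directly comparable.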

  10. Construction of a Young Fruit cDNA Library and EST Sequencing in Cucumis sativus

    Institute of Scientific and Technical Information of China (English)

    潘宇; 蒲志群; 肖雅文; 赵名琛; 郑浴; 石士涛; 胡小燕; 张兴国

    2013-01-01

    The growth and development of cucumber (Cucumis sativus L.) fruit are closely related to its yield and quality. Characterizing the gene expression pattern of young fruit just before and after pollination is important for exploring the molecular mechanisms of parthenocarpy and fruit growth initiation. In this study, tissues of young cucumber fruit at different developmental stages before and after pollination were mixed in equal amounts, and total RNA and mRNA were extracted. A cDNA library of young cucumber fruit, with a titer of 1.165 × 10^6 pfu/mL and a recombination rate of about 94.4%, was then constructed using λTriplEx2 as the vector and XL1-Blue as the host strain. One hundred and sixteen EST sequences were obtained, of which 92.2% were over 400 bp long and 19% were contigs. BLAST analysis in GenBank revealed that 71 of the 116 ESTs were similar to genes of known function, 19.83% were similar to genes of unknown function and 18.97% were novel. The cDNA library met the expected criteria, with a high recombination rate and wide representativeness. The results will facilitate the cloning of development-related genes from cucumber fruit.
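
    The EST categories in this record can be checked for internal consistency. The abstract gives one category as a count (71 of 116) and two as percentages, so the sketch below converts the percentages back to counts and confirms that the three categories add up to the total. The counts of 23 and 22 are inferred here, not stated in the abstract, and the variable names are hypothetical.

```python
total_ests = 116
known_function = 71  # stated directly in the abstract

# Convert the reported percentages back to whole EST counts (inferred values).
unknown_function = round(0.1983 * total_ests)  # 19.83% of 116
novel = round(0.1897 * total_ests)             # 18.97% of 116

known_pct = round(100 * known_function / total_ests, 2)

print(unknown_function, novel, known_pct)
print(known_function + unknown_function + novel == total_ests)
```

    On these figures the three categories account for all 116 ESTs, with the known-function group making up about 61% of the library sample.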

  11. Cloning and sequence analysis of cDNA encoding aquaporin (AQP) gene from Anopheles sinensis

    Institute of Scientific and Technical Information of China (English)

    唐建霞; 张超; 白亮; 李菊林; Liu Kun; 周华云; 曹俊; 高琪

    2012-01-01

    Objective: To clone and analyze the full-length cDNA sequence of the aquaporin gene of Anopheles sinensis (AsAQP), so as to provide a molecular basis for studying its biological functions. Methods: Degenerate primers designed from conserved regions of reported insect aquaporin (AQP) amino acid sequences were used to amplify an AsAQP fragment from An. sinensis cDNA. The full-length cDNA of AsAQP was then obtained by rapid amplification of cDNA ends (RACE), and the sequence was analyzed with bioinformatics methods. Results: An AsAQP fragment was isolated from adult An. sinensis cDNA with the degenerate primers, and the full-length cDNA was cloned by RACE. The full-length AsAQP cDNA consisted of 762 bp encoding 253 deduced amino acids, with a predicted molecular mass of 63.2 kD. Bioinformatics analysis demonstrated that AsAQP had a typical structure with six membrane-spanning domains and two asparagine-proline-alanine (NPA) motifs, the characteristic structural feature of the major intrinsic protein (MIP) family. AsAQP shared 76% and 78% homology with the AQP proteins of Culex quinquefasciatus and Aedes aegypti, respectively. Cluster analysis of amino acid sequences showed that AsAQP was genetically close to the aquaporins of other mosquito species. Conclusion: The full-length cDNA encoding AsAQP was obtained for the first time by combining degenerate primers with RACE. The gene belongs to the MIP family and has the typical functional domains, which lays a foundation for further study of the protein's function.

  12. Cloning and sequence analysis of full-length cDNA of α-actin gene from Chelonia mydas

    Institute of Scientific and Technical Information of China (English)

    陶翠花; 刘莹莹; 赵丽媛; 许敏; 祝茜

    2014-01-01

    To explore the sequence and characteristics of the α-actin gene from Chelonia mydas, the full-length cDNA of the α-actin gene was cloned from muscle tissue using RT-PCR and RACE. The cDNA consisted of 1,347 bp (GenBank accession number: JX073650), with a putative open reading frame (ORF) of 1,134 bp encoding a deduced protein of 377 amino acids that contains a glycosylation site (residues 14 to 17) and an actin domain (residues 7 to 377) and has no signal peptide. The predicted molecular weight of the protein was 42.0 kDa and the theoretical isoelectric point (pI) was 5.23. Nucleotide sequence similarity of the α-actin coding region between C. mydas and other species was above 85.4%, while amino acid sequence similarity was above 98.9%, suggesting that the α-actin gene is highly conserved. This study has enriched the actin gene database and provided basic data for further studies on the expression and function of related genes.

  13. Characterization of the Genomic Diversity of Norovirus in Linked Patients Using a Metagenomic Deep Sequencing Approach

    Science.gov (United States)

    Nasheri, Neda; Petronella, Nicholas; Ronholm, Jennifer; Bidawid, Sabah; Corneau, Nathalie

    2017-01-01

    Norovirus (NoV) is the leading cause of gastroenteritis worldwide. A robust cell culture system does not exist for NoV and therefore detailed characterization of outbreak and sporadic strains relies on molecular techniques. In this study, we employed a metagenomic approach that uses non-specific amplification followed by next-generation sequencing to whole genome sequence NoV genomes directly from clinical samples obtained from 8 linked patients. Enough sequencing depth was obtained for each sample to use a de novo assembly of near-complete genome sequences. The resultant consensus sequences were then used to identify inter-host nucleotide variations that occur after direct transmission, analyze amino acid variations in the major capsid protein, and provide evidence of recombination events. The analysis of intra-host quasispecies diversity was possible due to high coverage-depth. We also observed a linear relationship between NoV viral load in the clinical sample and the number of sequence reads that could be attributed to NoV. The method demonstrated here has the potential for future use in whole genome sequence analyses of other RNA viruses isolated from clinical, environmental, and food specimens. PMID:28197136

  14. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    Directory of Open Access Journals (Sweden)

    Peter E Larsen

    Full Text Available BACKGROUND: Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. METHODOLOGY: We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. CONCLUSIONS: 69% of expressed mycorrhizal JGI "best" gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene

  15. Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

    Energy Technology Data Exchange (ETDEWEB)

    Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

    2010-07-06

    Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided

  16. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus......) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus...

  17. Cloning and analysis of the full-length cDNA sequence of the Sepiella maindroni SCD gene

    Institute of Scientific and Technical Information of China (English)

    马明华; 刘慧慧; 迟长凤; 吴常文

    2013-01-01

    Stearoyl-coenzyme A desaturase (SCD) is a key enzyme of fatty acid desaturation in lipid metabolism. In this study, the full-length cDNA of the SCD gene from Sepiella maindroni was cloned with RT-PCR and rapid amplification of cDNA ends (RACE) techniques. The cDNA is 1513 bp long and consists of a 261 bp 5′ untranslated region (UTR), a 921 bp open reading frame (ORF) encoding 306 amino acids, and a 331 bp 3′ UTR. The deduced protein has a theoretical molecular weight of 34.92 kDa and a pI of 8.95. It is a hydrophobic protein, rich in helical structure (45.10%), with four transmembrane regions. Its amino acid sequence shows 91% similarity with those of Octopus vulgaris and Crassostrea gigas and more than 50% similarity with those of non-molluscan animals, indicating that the structure of SCD is relatively conserved. In the phylogenetic tree, S. maindroni SCD clusters most closely with O. vulgaris and C. gigas, is somewhat more distant from fish, and is most distant from mammals such as human and rat. SCD is an important candidate gene for improving the meat quality of S. maindroni, and its successful cloning and analysis are of value for investigating the function and regulation of genes related to fatty acid metabolism in molluscs.

  18. Tibetan antelope cystathionine γ-lyase: complete cDNA sequences

    Institute of Scientific and Technical Information of China (English)

    李肃; 格日力

    2013-01-01

    Objective: To clone the coding region of the cystathionine-γ-lyase (CSE) gene of the Tibetan antelope, examine its expression in adult Tibetan antelope tissues, and explore the molecular mechanisms of hypoxic adaptation in this species. Methods: Total RNA was extracted from Tibetan antelope tissues, cDNA was obtained by reverse transcription polymerase chain reaction (RT-PCR), and the product was identified and sequenced. Results: After cloning, sequencing and BLAST analysis, the partial coding sequence of the target fragment showed 96.47% homology with the cattle CSE gene sequence in GenBank, indicating that the cloned sequence is the CSE gene. Primers for the Tibetan antelope CSE gene were designed from the known human, brown rat, mouse, wild boar, cynomolgus monkey and cattle CSE cDNA sequences with the Pnaman software. Conclusion: CSE mRNA may act in a relatively wide range of tissues in the Tibetan antelope, and this work provides an experimental basis for studying genes related to adaptation to high-altitude hypoxia.

  19. Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

    Science.gov (United States)

    Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N

    2014-01-01

    Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study proved successful in dissecting the genome of octoploid F. x ananassa and appears promising for the analysis of other polyploid plant species.

  20. Gene discovery using mutagen-induced polymorphisms and deep sequencing: application to plant disease resistance.

    Science.gov (United States)

    Zhu, Ying; Mang, Hyung-gon; Sun, Qi; Qian, Jun; Hipps, Ashley; Hua, Jian

    2012-09-01

    Next-generation sequencing technologies are accelerating gene discovery by combining multiple steps of mapping and cloning used in the traditional map-based approach into one step using DNA sequence polymorphisms existing between two different accessions/strains/backgrounds of the same species. The existing next-generation sequencing method, like the traditional one, requires the use of a segregating population from a cross of a mutant organism in one accession with a wild-type (WT) organism in a different accession. It therefore could potentially be limited by modification of mutant phenotypes in different accessions and/or by the lengthy process required to construct a particular mapping parent in a second accession. Here we present mapping and cloning of an enhancer mutation with next-generation sequencing on bulked segregants in the same accession using sequence polymorphisms induced by a chemical mutagen. This method complements the conventional cloning approach and makes forward genetics more feasible and powerful in molecularly dissecting biological processes in any organism. The pipeline developed in this study can be used to clone causal genes in the background of single or higher-order mutants and in species with or without sequence information on multiple accessions.

  1. Mosaic KCNJ2 mutation in Andersen-Tawil syndrome: targeted deep sequencing is useful for the detection of mosaicism.

    Science.gov (United States)

    Hasegawa, K; Ohno, S; Kimura, H; Itoh, H; Makiyama, T; Yoshida, Y; Horie, M

    2015-03-01

    Andersen-Tawil syndrome (ATS) is an inherited disease characterized by ventricular arrhythmias, periodic paralysis, and dysmorphic features. It results from a heterozygous mutation of KCNJ2, but little is known about mosaicism in ATS. We performed genetic analysis of KCNJ2 in 32 ATS probands and their family members and identified KCNJ2 mutations in 25 probands; 20 of these families underwent extensive genetic testing. These tests revealed that seven probands carried de novo mutations while 13 carried mutations inherited from their parents. We then specifically assessed a single proband and the respective family. The proband was a 9-year-old girl who fulfilled the ATS triad and carried an insertion mutation (p.75_76insThr). We determined that the proband's mother carried somatic mosaicism and that the proband's younger brother also showed the ATS phenotype with the same insertion mutation. The mother, who exhibited mosaicism, was asymptomatic, although she exhibited Q(T)U prolongation. Mutant allele frequency was 11% as per TA cloning and 17.3% as per targeted deep sequencing. Our observations suggest that targeted deep sequencing is useful for the detection of mosaicism and that the detection of mosaic mutations in parents of apparently sporadic ATS patients can help in the process of genetic counseling.
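
    The reasoning behind calling a variant mosaic from deep sequencing can be sketched as follows: a germline heterozygous mutation is expected in about half of the reads, so a much lower mutant allele fraction at adequate depth points to mosaicism. The read counts below are hypothetical (the abstract reports only the 17.3% fraction), and the simple binomial model is an assumption that ignores sequencing error and capture bias.

```python
from math import comb

def mutant_allele_fraction(mutant_reads, total_reads):
    """Fraction of reads at the variant site that carry the mutant allele."""
    return mutant_reads / total_reads

def p_at_most(mutant_reads, total_reads, expected=0.5):
    """One-sided binomial probability of seeing this few mutant reads or fewer
    if the true allele fraction were `expected` (0.5 for a heterozygote)."""
    return sum(
        comb(total_reads, i) * expected**i * (1 - expected) ** (total_reads - i)
        for i in range(mutant_reads + 1)
    )

# Hypothetical read counts chosen to give roughly the 17.3% reported above.
mutant, depth = 52, 300
print(round(mutant_allele_fraction(mutant, depth), 3))
print(p_at_most(mutant, depth) < 1e-6)
```

    A very small probability under the heterozygous model is what separates a mosaic carrier from a germline carrier in this simplified picture.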

  2. Characterization and Development of EST-SSRs by Deep Transcriptome Sequencing in Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

    Directory of Open Access Journals (Sweden)

    Qian Ding

    2015-01-01

    Full Text Available Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
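
    As an illustration of how SSRs can be located in assembled unigenes, the sketch below scans a sequence for perfect tandem repeats with a regular expression. It is not the tool used in the study: the function names, the motif lengths of 1 to 6 bp, and the reading of "over 12 bp" as a minimum total length of 12 bp are all assumptions made for the example.

```python
import re

def is_primitive(motif):
    # A motif such as "AGAG" is just "AG" repeated; report it only as "AG".
    return motif not in (motif + motif)[1:-1]

def find_ssrs(seq, min_total_len=12, max_motif_len=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        min_copies = -(-min_total_len // k)  # ceiling division
        # One k-base motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

# Made-up unigene fragment containing an (AG)7 and an (AAG)4 repeat.
unigene = "TTGC" + "AG" * 7 + "CCT" + "AAG" * 4 + "TTC"
print(find_ssrs(unigene))
```

    Scanning each motif length separately keeps the pattern simple; the primitive-motif filter prevents the same dinucleotide repeat from being reported again as a tetra- or hexanucleotide repeat.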

  3. Geochemical features and effects on deep-seated fluids during the May-June 2012 southern Po Valley seismic sequence

    Directory of Open Access Journals (Sweden)

    Francesco Italiano

    2012-10-01

    Full Text Available A periodic sampling of the groundwaters and dissolved and free gases in selected deep wells located in the area affected by the May-June 2012 southern Po Valley seismic sequence has provided insight into seismogenic-induced changes of the local aquifer systems. The results obtained show progressive changes in the fluid geochemistry, allowing it to be established that deep-seated fluids were mobilized during the seismic sequence and reached surface layers along faults and fractures, which generated significant geochemical anomalies. The May-June 2012 seismic swarm (mainshock on May 29, 2012, M 5.8; 7 shocks M > 5; about 200 events 3 < M < 5) induced several modifications in the circulating fluids. This study reports the preliminary results obtained for the geochemical features of the waters and gases collected over the epicentral area from boreholes drilled at different depths, thus intercepting water and gases with different origins and circulation. The aim of the investigations was to improve our knowledge of the fluids circulating over the seismic area (e.g. origin, provenance, interactions, mixing of different components, temporal changes). This was achieved by collecting samples from both shallow and deep-drilled boreholes, and then, after the selection of the relevant sites, we looked for temporal changes with mid-to-long-term monitoring activity following a constant sampling rate. This allowed us to gain better insight into the relationships between the fluid circulation and the faulting activity. The sampling sites are listed in Table 1, along with the analytical results of the gas phase. […

  4. Location and sequence of muscle onset in deep abdominal muscles measured by different modes of ultrasound imaging.

    Science.gov (United States)

    Westad, Christian; Mork, Paul J; Vasseljen, Ottar

    2010-10-01

    Various modes of ultrasound (US) imaging have been introduced as an alternative to electromyography for determining muscle onset. The purpose of this study was to compare the agreement between US motion-mode (US(m-mode)) and US strain rate (US(SR)) derived from tissue velocity imaging in determining latency time, location and sequence of muscle onset in abdominal muscles using the same data set (contractions). Twenty-four subjects performed four rapid arm flexions in response to a light signal while US recordings were made from the abdominal muscles on the contralateral side. The examined muscles were transversus abdominis (TrA), superficial and deep obliquus internus abdominis (OI(deep) and OI(sup)), and obliquus externus abdominis (OE). The results showed that the two methods detected the first muscle onset on average within 0.1 ms (95% CI; +/-1.4 ms) of each other. US(SR) detected the second muscle onset on average 27 ms after US(m-mode). While US(SR) and US(m-mode) can be used interchangeably to detect the first muscle onset, the location of both first onset and subsequent muscle onsets can be reliably detected by US(SR) only. Furthermore, this study indicates that OI may be functionally subdivided into a superficial and deep region, with onset in OI(deep) occurring on average 53 ms before OI(sup). First onset was detected more frequently in OI than in TrA (65% versus 25% of detected onsets, 10% were equal).

  5. Cloning and sequence analysis of Cu/Zn-SOD cDNA from sandworm Perinereis aibuhitensis

    Institute of Scientific and Technical Information of China (English)

    岳宗豪; 樊鑫; 赵欢; 任洪伟; 张旭峰; 周一兵

    2014-01-01

    The full-length cDNA of copper-zinc superoxide dismutase (Cu/Zn-SOD) was cloned for the first time from the sandworm Perinereis aibuhitensis by homology cloning and RACE, using primers designed from the known partial Cu/Zn-SOD gene sequence of the polychaete Alitta succinea. The cDNA is 870 bp long, comprising a 156 bp 5′ untranslated region, a 261 bp 3′ untranslated region and a 453 bp open reading frame encoding 150 amino acids. The deduced protein contains typical Cu2+ and Zn2+ binding sites and two Cu/Zn-SOD family signature sequences. Bioinformatic analysis indicated that it is an intracellular Cu/Zn-SOD with a relative molecular mass of 15 249 900 and an isoelectric point of 5.66. No signal peptide or transmembrane domain was found, suggesting that it is a hydrophilic protein. Multiple sequence alignment revealed that the deduced amino acid sequence is highly similar to Cu/Zn-SOD proteins of some molluscs, fishes and insects. The findings provide a basis for research on the dose-response relationship between gene expression and environmental pollutants and on the defense mechanism of the sandworm.
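
    The segment lengths reported in a cloning abstract like the one above can be cross-checked with simple arithmetic: the 5′ UTR, ORF and 3′ UTR should add up to the full length, and the ORF should contain one codon per amino acid plus a stop codon. The helper below is a hypothetical sketch of that check, and it assumes that the reported total excludes any poly(A) tail.

```python
def check_cdna(utr5, orf, utr3, total, n_amino_acids):
    """Check that reported cDNA segment lengths are mutually consistent."""
    # The ORF must be a whole number of codons; the last codon is the stop.
    codons, remainder = divmod(orf, 3)
    return {
        "segments_sum_to_total": utr5 + orf + utr3 == total,
        "orf_in_frame": remainder == 0,
        "protein_length_matches": codons - 1 == n_amino_acids,
    }

# Values reported for the Perinereis aibuhitensis Cu/Zn-SOD cDNA.
report = check_cdna(utr5=156, orf=453, utr3=261, total=870, n_amino_acids=150)
print(report)
```

    The same check applied to other records in this listing is a quick way to spot transcription errors in reported lengths.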

  6. Cloning and Sequencing of UBA1 Gene Full-length cDNA from Tea Plant

    Institute of Scientific and Technical Information of China (English)

    邓婷婷; 吴扬; 李娟; 李银花; 黄建安; 刘仲华

    2012-01-01

    The cDNA-AFLP technique was applied to analyze gene expression during the periodic albinism of Anji Baicha. Transcript-derived fragments (TDFs) were isolated from leaves at the albinistic and re-greening stages, and one of them, up-regulated at the albinistic stage, showed high similarity to ubiquitin-activating enzyme 1 (UBA1) genes of other species. Based on this fragment, the 3′ and 5′ ends were amplified by SMART-RACE and the full-length UBA1 cDNA of 3 764 bp (GenBank Accession No. JN180299) was obtained and named the Camellia sinensis UBA1 gene. It contains an open reading frame (ORF) encoding a polypeptide of 1 094 amino acid residues with a predicted molecular mass of 121 kD. The deduced amino acid sequence showed 82%, 81%, 79%, 79% and 77% homology with those encoded by UBA1 genes from Nicotiana tabacum, Ricinus communis, Oryza sativa subsp. japonica, Triticum aestivum and Arabidopsis thaliana, respectively. Analysis by qRT-PCR showed that the UBA1 transcript was significantly up-regulated at the albinistic stage, to 2.49-fold the level at the re-greening stage. UBA1 is a key enzyme in the ubiquitin-proteasome mediated protein degradation system. The cloning and analysis of the tea plant UBA1 gene establish a good foundation for further study of the molecular mechanism of periodic albinism in Anji Baicha.

  7. Interleukin-1 stimulates the expression of type I and type II interleukin-1 receptors in the rat insulinoma cell line Rinm5F; sequencing a rat type II interleukin-1 receptor cDNA.

    Science.gov (United States)

    Bristulf, J; Gatti, S; Malinowsky, D; Bjork, L; Sundgren, A K; Bartfai, T

    1994-01-01

    The insulin-secreting rat Rinm5F cells are often used to study the cytotoxic actions of interleukin-1 (IL-1) on pancreatic beta-cells. We demonstrate here that Rinm5F insulinoma cells express both type I and type II interleukin-1 receptor (IL-1R) mRNAs and gene products. IL-1R agonists, recombinant murine IL-1 alpha (rmIL-1 alpha, 10 ng/ml) and recombinant rat IL-1 beta (rrIL-1 beta, 100 pg/ml or 10 ng/ml) induce the upregulation of mRNA expression for both types of IL-1 receptors (IL-1Rs). This effect of rrIL-1 beta is antagonised by preincubation with recombinant human interleukin 1 receptor antagonist protein (rhIL-1ra, 5 micrograms/ml). Furthermore, this rrIL-1 beta induced upregulation of IL-1R mRNAs is blocked by actinomycin D (7.5 micrograms/ml), whereas cycloheximide (20 micrograms/ml) has no effect. The phorbol ester PMA (20 nM) upregulates the expression of mRNAs for both IL-1 receptors, whereas glucose (50 mM) upregulates the expression of the type I IL-1R mRNA only. Pretreatment of cells with pertussis toxin (100 ng/ml) partially blocks the rrIL-1 beta induced expression of mRNA for the type I and, to a lesser extent, the type II IL-1R. Incubation of the cells with rrIL-1 beta also induces a time-dependent expression of c-fos, interleukin-6 (IL-6) and tumour necrosis factor alpha (TNF-alpha) mRNAs. Binding studies with 125I-recombinant human IL-1 beta (125I-rhIL-1 beta) indicate that IL-1R gene products, with the ligand binding characteristics of the type I IL-1R, are constitutively present on Rinm5F cells. Treatment with rrIL-1 beta (6h) increases the number of 125I-rhIL-1 beta binding sites on Rinm5F cells. We have also demonstrated that the number of type II IL-1R binding sites increases after induction with rrIL-1 beta (6h), by indirect immunofluorescence using a monoclonal antibody (ALVA 42) raised against the human type II IL-1R. Furthermore, we have sequenced the type II IL-1R cDNA in the rat insulinoma Rinm5F cells. The comparison of the amino acid

  8. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Full Text Available Background: Members of the Anopheles punctulatus group (AP group) are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence) have evolved. Methods: Fourteen mitochondrial (mt) genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp) were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results: Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion: Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance, will arise independently in each of the AP sibling species studied here.

  9. Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection.

    Directory of Open Access Journals (Sweden)

    Matthew R Henn

    Full Text Available Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. These data support the critical need for vaccine-induced CD8+ T cell responses to target more

  10. rSW-seq: Algorithm for detection of copy number alterations in deep sequencing data

    Directory of Open Access Journals (Sweden)

    Kim Tae-Min

    2010-08-01

    Full Text Available Background: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy. Results: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq) to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results. Conclusion: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
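
    The general idea of applying a Smith-Waterman-style local score to read data can be sketched as below. This is a simplified illustration, not the published rSW-seq implementation: it assumes tumor and normal reads have been merged and sorted by genomic position, uses hypothetical weights of +1 and -1, and finds only the single best-scoring run of tumor-enriched reads without the recursive step. The second helper reproduces the arithmetic in the abstract, where 500 reads of 100 bp at 1X coverage span about 50 kb.

```python
def best_gain_segment(labels, w_tumor=1.0, w_normal=-1.0):
    """Local (Smith-Waterman-style) maximum-scoring run in a read sequence.

    labels: reads sorted by genomic position, 'T' for tumor, 'N' for normal.
    Returns (score, start_index, end_index_exclusive).
    """
    best = (0.0, 0, 0)
    score, start = 0.0, 0
    for i, label in enumerate(labels):
        score += w_tumor if label == "T" else w_normal
        if score <= 0.0:
            # Local alignment rule: never carry a non-positive score forward.
            score, start = 0.0, i + 1
        elif score > best[0]:
            best = (score, start, i + 1)
    return best

def region_length_bp(n_reads, read_length_bp, coverage):
    """Genomic span expected to yield n_reads at the given fold coverage."""
    return n_reads * read_length_bp / coverage

# Made-up read order with an excess of tumor reads in the middle.
reads = "NTNTNTTTTNTTTTNTNNTN"
print(best_gain_segment(list(reads)))
print(region_length_bp(500, 100, 1))
```

    Swapping the signs of the two weights would score tumor-depleted runs instead, which is one way a loss could be searched for in the same framework.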

  11. Deep sequencing of uveal melanoma identifies a recurrent mutation in PLCB4

    DEFF Research Database (Denmark)

    Johansson, Peter; Aoude, Lauren G; Wadt, Karin;

    2016-01-01

    -genome or whole-exome sequencing of 28 tumors or primary cell lines. These samples have a low mutation burden, with a mean of 10.6 protein changing mutations per sample (range 0 to 53). As expected for these sun-shielded melanomas the mutation spectrum was not consistent with an ultraviolet radiation signature...

  12. Reproducibility of Illumina platform deep sequencing errors allows accurate determination of DNA barcodes in cells.

    NARCIS (Netherlands)

    Beltman, J.B.; Urbanus, J.; Velds, A.; Rooij, van N.; Rohr, J.C.; Naik, S.H.; Schumacher, T.N.

    2016-01-01

    BACKGROUND Next generation sequencing (NGS) of amplified DNA is a powerful tool to describe genetic heterogeneity within cell populations that can both be used to investigate the clonal structure of cell populations and to perform genetic lineage tracing. For applications in which both abundant and

  13. High-Throughput Plasmid cDNA Library Screening

    Energy Technology Data Exchange (ETDEWEB)

    Wan, Kenneth H.; Yu, Charles; George, Reed A.; Carlson, Joseph W.; Hoskins, Roger A.; Svirskas, Robert; Stapleton, Mark; Celniker, Susan E.

    2006-05-24

    Libraries of cDNA clones are valuable resources for analysing the expression, structure, and regulation of genes, as well as for studying protein functions and interactions. Full-length cDNA clones provide information about intron and exon structures, splice junctions and 5'- and 3'-untranslated regions (UTRs). Open reading frames (ORFs) derived from cDNA clones can be used to generate constructs allowing expression of native proteins and N- or C-terminally tagged proteins. Thus, obtaining full-length cDNA clones and sequences for most or all genes in an organism is critical for understanding genome functions. Expressed sequence tag (EST) sequencing samples cDNA libraries at random, which is most useful at the beginning of large-scale screening projects. However, as projects progress towards completion, the probability of identifying unique cDNAs via EST sequencing diminishes, resulting in poor recovery of rare transcripts. We describe an adapted, high-throughput protocol intended for recovery of specific, full-length clones from plasmid cDNA libraries in five days.

  14. Nucleotide sequence of cloned cDNA for α-subunit of sika follicle stimulating hormone

    Institute of Scientific and Technical Information of China (English)

    关洪斌; 李庆章; 张莉

    2002-01-01

    Total RNA was extracted from the pituitary gland of a freshly slaughtered female sika deer and reverse-transcribed into cDNA. Using this cDNA as the template, the target fragment was amplified by PCR. The PCR product ran at 380 bp on a 1.2% agarose gel, matching the predicted target fragment of the follicle-stimulating hormone (FSH) α-subunit, and was cloned into the pMD-18-T vector. Three positive recombinants were selected at random and sequenced, and the results were compared with the nucleotide sequences and deduced amino acid sequences of the same gene in sheep, cattle, pig and other mammals. The amino acid sequence encoded by the sika FSH α-subunit gene showed the highest homology with those of sheep and buffalo (97%, with only 4 amino acid differences), 96% homology with cattle, and lower homology with human (75%). The nucleotide sequence showed the highest homology with those of sheep, buffalo and cattle (96%, with only 14-16 base differences) and the lowest with human (84%). Overall, the FSH α-subunit is highly conserved among mammalian species. According to our

  15. Identification of a new enamovirus associated with citrus vein enation disease by deep sequencing of small RNAs.

    Science.gov (United States)

    Vives, Mari Carmen; Velázquez, Karelia; Pina, José Antonio; Moreno, Pedro; Guerri, José; Navarro, Luis

    2013-10-01

    To identify the causal agent of citrus vein enation disease, we examined by deep sequencing (Solexa-Illumina) the small RNA (sRNA) fraction from infected and healthy Etrog citron plants. Our results showed that virus-derived sRNAs (vsRNAs): (i) represent about 14.21% of the total sRNA population, (ii) are predominantly of 21 and 24 nucleotides with a biased distribution of their 5' nucleotide and with a clear prevalence of those of (+) polarity, and (iii) derive from the entire viral genome, although a prominent hotspot is present at a 5'-proximal region. Contigs assembled from vsRNAs showed similarity with luteovirus sequences, particularly with Pea enation mosaic virus, the type member of the genus Enamovirus. The genomic RNA (gRNA) sequence of a new virus, provisionally named Citrus vein enation virus (CVEV), was completed and characterized. The CVEV gRNA was found to be single-stranded, positive-sense, with a size of 5,983 nucleotides and five open reading frames. Phylogenetic comparisons based on amino acid signatures of the RNA polymerase and the coat protein clearly classify CVEV within the genus Enamovirus. Dot-blot hybridization and reverse transcription-polymerase chain reaction tests were developed to detect CVEV in plants affected by vein enation disease. CVEV detection by these methods has already been adopted for use in the Spanish citrus quarantine, sanitation, and certification programs.

  16. Identification of representative genes of the central nervous system of the locust, Locusta migratoria manilensis by deep sequencing.

    Science.gov (United States)

    Zhang, Zhengyi; Peng, Zhi-Yu; Yi, Kang; Cheng, Yanbing; Xia, Yuxian

    2012-01-01

    The shortage of available genomic and transcriptomic data hampers the molecular study on the migratory locust, Locusta migratoria manilensis (L.) (Orthoptera: Acrididae) central nervous system (CNS). In this study, locust CNS RNA was sequenced by deep sequencing. 41,179 unigenes were obtained with an average length of 570 bp, and 5,519 unigenes were longer than 1,000 bp. Compared with an EST database of another locust species Schistocerca gregaria Forsskål, 9,069 unigenes were found conserved, while 32,110 unigenes were differentially expressed. A total of 15,895 unigenes were identified, including 644 nervous system relevant unigenes. Among the 25,284 unknown unigenes, 9,482 were found to be specific to the CNS by filtering out the previous ESTs acquired from locust organs without CNS's. The locust CNS showed the most matches (18%) with Tribolium castaneum (Herbst) (Coleoptera: Tenebrionidae) sequences. Comprehensive assessment reveals that the database generated in this study is broadly representative of the CNS of adult locust, providing comprehensive gene information at the transcriptional level that could facilitate research of the locust CNS, including various physiological aspects and pesticide target finding.

  17. Deep sequencing of the oral microbiome reveals signatures of periodontal disease.

    Directory of Open Access Journals (Sweden)

    Bo Liu

    Full Text Available The oral microbiome, the complex ecosystem of microbes inhabiting the human mouth, harbors several thousands of bacterial types. The proliferation of pathogenic bacteria within the mouth gives rise to periodontitis, an inflammatory disease known to also constitute a risk factor for cardiovascular disease. While much is known about individual species associated with pathogenesis, the system-level mechanisms underlying the transition from health to disease are still poorly understood. Through the sequencing of the 16S rRNA gene and of whole community DNA we provide a glimpse at the global genetic, metabolic, and ecological changes associated with periodontitis in 15 subgingival plaque samples, four from each of two periodontitis patients, and the remaining samples from three healthy individuals. We also demonstrate the power of whole-metagenome sequencing approaches in characterizing the genomes of key players in the oral microbiome, including an unculturable TM7 organism. We reveal the disease microbiome to be enriched in virulence factors, and adapted to a parasitic lifestyle that takes advantage of the disrupted host homeostasis. Furthermore, diseased samples share a common structure that was not found in completely healthy samples, suggesting that the disease state may occupy a narrow region within the space of possible configurations of the oral microbiome. Our pilot study demonstrates the power of high-throughput sequencing as a tool for understanding the role of the oral microbiome in periodontal disease. Despite a modest level of sequencing (~2 lanes Illumina 76 bp PE) and high human DNA contamination (up to ~90%), we were able to partially reconstruct several oral microbes and to preliminarily characterize some systems-level differences between the healthy and diseased oral microbiomes.

  18. High-throughput, high-fidelity HLA genotyping with deep sequencing.

    Science.gov (United States)

    Wang, Chunlin; Krishnakumar, Sujatha; Wilhelmy, Julie; Babrzadeh, Farbod; Stepanyan, Lilit; Su, Laura F; Levinson, Douglas; Fernandez-Viña, Marcelo A; Davis, Ronald W; Davis, Mark M; Mindrinos, Michael

    2012-05-29

    Human leukocyte antigen (HLA) genes are the most polymorphic in the human genome. They play a pivotal role in the immune response and have been implicated in numerous human pathologies, especially autoimmunity and infectious diseases. Despite their importance, however, they are rarely characterized comprehensively because of the prohibitive cost of standard technologies and the technical challenges of accurately discriminating between these highly related genes and their many alleles. Here we demonstrate a high-resolution and cost-effective methodology to type HLA genes by sequencing, which combines the advantage of long-range amplification, the power of high-throughput sequencing platforms, and a unique genotyping algorithm. We calibrated our method for HLA-A, -B, -C, and -DRB1 genes with both reference cell lines and clinical samples and identified several previously undescribed alleles with mismatches, insertions, and deletions. We have further demonstrated the utility of this method in a clinical setting by typing five clinical samples in an Illumina MiSeq instrument with a 5-d turnaround. Overall, this technology has the capacity to deliver low-cost, high-throughput, and accurate HLA typing by multiplexing thousands of samples in a single sequencing run, which will enable comprehensive disease-association studies with large cohorts. Furthermore, this approach can also be extended to include other polymorphic genes.

  19. New mutations in chronic lymphocytic leukemia identified by target enrichment and deep sequencing.

    Directory of Open Access Journals (Sweden)

    Elena Doménech

    Full Text Available Chronic lymphocytic leukemia (CLL) is a heterogeneous disease without a well-defined genetic alteration responsible for the onset of the disease. Several lines of evidence coincide in identifying stimulatory and growth signals delivered by the B-cell receptor (BCR) and co-receptors, together with the NFkB pathway, as being the driving force in B-cell survival in CLL. However, the molecular mechanism responsible for this activation has not been identified. Based on the hypothesis that BCR activation may depend on somatic mutations of the BCR and related pathways we have performed a complete mutational screening of 301 selected genes associated with BCR signaling and related pathways using massive parallel sequencing technology in 10 CLL cases. Four mutated genes in coding regions (KRAS, SMARCA2, NFKBIE and PRKD3) have been confirmed by capillary sequencing. In conclusion, this study identifies new genes mutated in CLL, all of them in cases with progressive disease, and demonstrates that next-generation sequencing technologies applied to selected genes or pathways of interest are powerful tools for identifying novel mutational changes.

  20. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species in aquaculture and occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library using a pooled RNA sample isolated from the different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 are abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of tilapia miRNAs, and the discovery of miRNAs in the tilapia genome contributes to a better understanding of the role of miRNAs in regulating diverse biological processes.

  1. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    Science.gov (United States)

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  2. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    Directory of Open Access Journals (Sweden)

    Salem Mohamed

    2009-11-01

    Full Text Available Abstract Background: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In
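    The coverage and validation figures quoted in this record can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the abstract; the function names are hypothetical, and the naive read-to-fragment-end ratio ignores any reads discarded during filtering, which the abstract does not detail.

```python
def fold_coverage(n_reads, n_targets):
    """Average number of reads per target, assuming reads are spread evenly."""
    return n_reads / n_targets


def validation_rate(n_validated, n_tested):
    """Fraction of tested putative SNPs that were confirmed by genotyping."""
    return n_validated / n_tested


if __name__ == "__main__":
    # 2,000,000 reads over 300,000 fragment ends (2 ends x 150,000 fragments).
    reads_per_end = fold_coverage(2_000_000, 300_000)
    # 183 validated SNPs out of the 384 that were genotyped.
    rate = validation_rate(183, 384)
    print(f"{reads_per_end:.2f} reads per fragment end")
    print(f"{rate:.1%} of tested SNPs validated")
```

    The naive ratio comes out slightly above the reported average 6-fold coverage, and 183 of 384 rounds to the stated 48%.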

  3. Rabbit muscle creatine phosphokinase. cDNA cloning, primary structure and detection of human homologues.

    Science.gov (United States)

    Putney, S; Herlihy, W; Royal, N; Pang, H; Aposhian, H V; Pickering, L; Belagaje, R; Biemann, K; Page, D; Kuby, S

    1984-12-10

    A cDNA library was constructed from rabbit muscle poly(A) RNA. Limited amino acid sequence information was obtained on rabbit muscle creatine phosphokinase and this was the basis for design and synthesis of two oligonucleotide probes complementary to a creatine kinase cDNA sequence which encodes a pentapeptide. Colony hybridizations with the probes and subsequent steps led to isolation of two clones, whose cDNA segments partially overlap and which together encode the entire protein. The primary structure was established from the sequence of two cDNA clones and from independently determined sequences of scattered portions of the polypeptide. The reactive cysteine has been located to position 282 within the 380 amino acid polypeptide. The rabbit cDNA hybridizes to digests of human chromosomal DNA. This reveals a restriction fragment length polymorphism associated with the human homologue(s) which hybridizes to the rabbit cDNA.

  4. On Cloning, Sequence Analysis and Tissue Expression of Ceruloplasmin Gene in Rare Gudgeon

    Institute of Scientific and Technical Information of China (English)

    景致; 彭作刚; 张耀光

    2014-01-01

    Ceruloplasmin (Cp), the major copper-carrying protein, is synthesized in the liver and plays a role in iron metabolism. It is a marker protein for inflammation, infection, poisoning and cancer. The Cp gene has been reported in several teleosts; here it is characterized for the first time in the rare gudgeon (Gobiocypris rarus). The Cp gene was cloned by rapid amplification of cDNA ends (RACE), and real-time quantitative PCR was used to profile its expression in different tissues. Sequence analysis showed that the full-length coding sequence (CDS) of the Cp gene is 3,264 bp long and encodes 1,087 amino acids. BLAST results indicate that the most similar homologue of rare gudgeon Cp is from zebrafish, with a homology of 88.1% (nucleotide) and 90.3% (amino acid). The predicted relative molecular mass of the protein is 124,429.1 Da, with an estimated isoelectric point (pI) of 6.41. Quantitative PCR showed that relative expression was highest in liver and spleen and lowest in muscle and gill. Conserved-domain analysis of the amino acid sequence indicated that the Cp gene is relatively conserved among vertebrates, suggesting that its function is similar to that in other species. These results lay a foundation for further study of the function and application of this gene in the rare gudgeon.

  5. Identification of microRNAs Involved in the Host Response to Enterovirus 71 Infection by a Deep Sequencing Approach

    Directory of Open Access Journals (Sweden)

    Lunbiao Cui

    2010-01-01

    Full Text Available The role of microRNA (miRNA) has been highlighted in pathogen-host interactions recently. To identify cellular miRNAs involved in the host response to enterovirus 71 (EV71) infection, we performed comprehensive miRNA profiling in EV71-infected Hep2 cells through deep sequencing. 64 miRNAs were found whose expression levels changed more than 2-fold in response to EV71 infection. Gene ontology analysis revealed that many of these mRNAs play roles in neurological process, immune response, and cell death pathways, which are known to be associated with the extreme virulence of EV71. To our knowledge, this is the first study on host miRNA expression alteration in response to EV71 infection. Our findings supported the hypothesis that certain miRNAs might be essential in the host-pathogen interactions.
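    The "more than 2-fold" criterion used in this record can be illustrated with a minimal filter over normalized read counts. This is a sketch under stated assumptions rather than the authors' pipeline: the miRNA names and counts are invented toy values, the pseudocount of 1 is an assumed convention to avoid division by zero, and no statistical testing is shown.

```python
import math


def log2_fold_change(infected, control, pseudocount=1.0):
    """log2 ratio of infected to control counts, with an assumed pseudocount."""
    return math.log2((infected + pseudocount) / (control + pseudocount))


def changed_mirnas(counts, min_fold=2.0):
    """Keep miRNAs whose expression changed by more than `min_fold` in either direction.

    `counts` maps a miRNA name to a (control, infected) pair of normalized counts.
    Returns a dict of name -> log2 fold change for the miRNAs that pass.
    """
    threshold = math.log2(min_fold)
    hits = {}
    for name, (control, infected) in counts.items():
        lfc = log2_fold_change(infected, control)
        if abs(lfc) > threshold:
            hits[name] = lfc
    return hits


# Hypothetical toy data: one up-regulated, one unchanged, one down-regulated.
toy_counts = {
    "miR-a": (100, 450),
    "miR-b": (300, 310),
    "miR-c": (800, 150),
}

if __name__ == "__main__":
    for name, lfc in changed_mirnas(toy_counts).items():
        print(f"{name}: log2FC = {lfc:+.2f}")
```

    Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which is why the threshold is applied to the absolute value.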

  6. Deep sequencing of mRNA in CD24− and CD24+ mammary carcinoma Mvt1 cell line

    Directory of Open Access Journals (Sweden)

    Ran Rostoker

    2015-09-01

    Full Text Available CD24 is an anchored cell surface marker that is highly expressed in cancer cells (Lee et al., 2009 and its expression is associated with poorer outcome of cancer patients (Kristiansen et al., 2003. Phenotype comparison between two subpopulations derived from the Mvt1 cell line, CD24− cells (with no CD24 cell surface expression and the CD24+ cells, identified high tumorigenic capacity for the CD24+ cells. In order to reveal the transcripts that support the CD24+ aggressive and invasive phenotype we compared the gene profiles of these two subpopulations. mRNA profiles of CD24− and CD24+ cells were generated by deep sequencing, in triplicate, using an Illumina HiSeq 2500. Here we provide a detailed description of the mRNA-seq analysis from our recent study (Rostoker et al., 2015. The mRNA-seq data have been deposited in the NCBI GEO database (accession number GSE68746.

  7. Discovering novel microRNAs and age-related nonlinear changes in rat brains using deep sequencing.

    Science.gov (United States)

    Yin, Lanxuan; Sun, Yubai; Wu, Jinfeng; Yan, Siyu; Deng, Zhenglu; Wang, Jun; Liao, Shenke; Yin, Dazhong; Li, Guolin

    2015-02-01

    Elucidating the molecular mechanisms of brain aging remains a significant challenge for biogerontologists. The discovery of gene regulation by microRNAs (miRNAs) has added a new dimension for examining this process; however, the full complement of miRNAs involved in brain aging is still not known. In this study, miRNA profiles of young, adult, and old rats were obtained to evaluate molecular changes during aging. High-throughput deep sequencing revealed 547 known and 171 candidate novel miRNAs that were differentially expressed among groups. Unexpectedly, miRNA expression did not decline progressively with advancing age; moreover, genes targeted by age-associated miRNAs were predicted to be involved in biological processes linked to aging and neurodegenerative diseases. These findings provide novel insight into the molecular mechanisms underlying brain aging and a resource for future studies on age-related brain disorders.

  8. Insights into the genetic structure and diversity of 38 South Asian Indians from deep whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Lai-Ping Wong

    2014-05-01

    Full Text Available South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language-speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed a higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP and two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.

  9. Deep sequencing of subseafloor eukaryotic rRNA reveals active Fungi across marine subsurface provinces.

    Directory of Open Access Journals (Sweden)

    William Orsi

    Full Text Available The deep marine subsurface is a vast habitat for microbial life where cells may live on geologic timescales. Because DNA in sediments may be preserved on long timescales, ribosomal RNA (rRNA) is suggested to be a proxy for the active fraction of a microbial community in the subsurface. During an investigation of eukaryotic 18S rRNA by amplicon pyrosequencing, unique profiles of Fungi were found across a range of marine subsurface provinces including ridge flanks, continental margins, and abyssal plains. Subseafloor fungal populations exhibit statistically significant correlations with total organic carbon (TOC), nitrate, sulfide, and dissolved inorganic carbon (DIC). These correlations are supported by terminal restriction length polymorphism (TRFLP) analyses of fungal rRNA. Geochemical correlations with fungal pyrosequencing and TRFLP data from this geographically broad sample set suggest environmental selection of active Fungi in the marine subsurface. Within the same dataset, ancient rRNA signatures were recovered from plants and diatoms in marine sediments ranging from 0.03 to 2.7 million years old, suggesting that rRNA from some eukaryotic taxa may be much more stable than previously considered in the marine subsurface.

  10. [cDNA library construction from panicle meristem of finger millet].

    Science.gov (United States)

    Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B

    2014-01-01

    The protocol for production of full-length cDNA using the SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested, and a high-quality cDNA library was created from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn). The titer of the resulting cDNA library averaged 3.01 × 10^5 CFU/ml. The average cDNA insert length was about 1,070 base pairs, and the insertion efficiency of cDNA fragments was 99.5%. Selected cDNA clones from the library were sequenced, and their sequences were identified by BLAST search. The results of the library analysis and selective sequencing demonstrate the good functionality and full-length character of the inserted cDNA clones. The cDNA library from meristematic tissue of finger millet panicle is a valuable source for isolating and identifying key genes regulating metabolism and meristematic development, and for mining new molecular markers for high-quality genetic investigations and molecular breeding.

  11. RECOGNITION OF CDNA MICROARRAY IMAGE USING FEEDFORWARD ARTIFICIAL NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    R. M. Farouk

    2014-09-01

    Full Text Available The complementary DNA (cDNA) sequence is considered the magic biometric technique for personal identification. Microarray image processing is used for concurrent gene identification. In this paper, we present a new method for cDNA recognition based on the artificial neural network (ANN). We have segmented the location of the spots in a cDNA microarray. Thus, a precise localization and segmenting of a spot are essential to obtain a more exact intensity measurement, leading to a more accurate gene expression measurement. The segmented cDNA microarray image is resized and used as an input for the proposed artificial neural network. For matching and recognition, we have trained the artificial neural network. Recognition results are given for the galleries of cDNA sequences. The numerical results show that the proposed matching technique is effective in the cDNA sequences process. The experimental results of our matching approach using different databases show that the proposed technique achieves effective matching performance.

  12. RECOGNITION OF CDNA MICROARRAY IMAGE USING FEEDFORWARD ARTIFICIAL NEURAL NETWORK

    Directory of Open Access Journals (Sweden)

    R. M. Farouk

    2014-07-01

    Full Text Available The complementary DNA (cDNA) sequence is considered the magic biometric technique for personal identification. Microarray image processing is used for concurrent gene identification. In this paper, we present a new method for cDNA recognition based on the artificial neural network (ANN). We have segmented the location of the spots in a cDNA microarray. Thus, a precise localization and segmenting of a spot are essential to obtain a more exact intensity measurement, leading to a more accurate gene expression measurement. The segmented cDNA microarray image is resized and used as an input for the proposed artificial neural network. For matching and recognition, we have trained the artificial neural network. Recognition results are given for the galleries of cDNA sequences. The numerical results show that the proposed matching technique is effective in the cDNA sequences process. The experimental results of our matching approach using different databases show that the proposed technique achieves effective matching performance.

  13. Evolutionary Relations of Hexanchiformes Deep-Sea Sharks Elucidated by Whole Mitochondrial Genome Sequences

    Directory of Open Access Journals (Sweden)

    Keiko Tanaka

    2013-01-01

    Full Text Available Hexanchiformes is regarded as a monophyletic taxon, but the morphological and genetic relationships between the five extant species within the order are still uncertain. In this study, we determined the whole mitochondrial DNA (mtDNA) sequences of seven sharks including representatives of the five Hexanchiformes, one squaliform, and one carcharhiniform and inferred the phylogenetic relationships among those species and 12 other Chondrichthyes (cartilaginous fishes) species for which the complete mitogenome is available. The monophyly of Hexanchiformes and its close relation with all other Squaliformes sharks were strongly supported by likelihood and Bayesian phylogenetic analysis of 13,749 aligned nucleotides of 13 protein coding genes and two rRNA genes that were derived from the whole mtDNA sequences of the 19 species. The phylogeny suggested that Hexanchiformes is in the superorder Squalomorphi, Chlamydoselachus anguineus (frilled shark) is the sister species to all other Hexanchiformes, and the relations within Hexanchiformes are well resolved as Chlamydoselachus, (Notorynchus, (Heptranchias, (Hexanchus griseus, H. nakamurai))). Based on our phylogeny, we discussed evolutionary scenarios of the jaw suspension mechanism and gill slit numbers that are significant features in the sharks.

  14. Computational approaches for the analysis of ncRNA through Deep Sequencing techniques

    Directory of Open Access Journals (Sweden)

    Dario Veneziano

    2015-06-01

    Full Text Available The majority of the human transcriptome is defined as non-coding RNA (ncRNA), since only a small fraction of human DNA encodes for proteins, as reported by the ENCODE project. Several distinct classes of ncRNAs, such as transfer RNA (tRNA), microRNA (miRNA), and long non-coding RNA (lncRNA), have been classified, each with its own three-dimensional folding and specific function. As ncRNAs are highly abundant in living organisms and have been discovered to play important roles in many biological processes, there has been an ever increasing need to investigate the entire ncRNAome in further unbiased detail. Recently, the advent of Next-Generation Sequencing (NGS) technologies has substantially increased the throughput of transcriptome studies, allowing an unprecedented investigation of ncRNAs, as regulatory pathways and novel functions involving ncRNAs are now also emerging. The huge amount of transcript data produced by NGS has progressively required the development and implementation of suitable bioinformatics workflows, complemented by knowledge-based approaches, to identify, classify, and evaluate the expression of hundreds of ncRNAs in normal and pathological states, such as cancer. In this mini-review, we present and discuss current bioinformatics advances in the development of such computational approaches to analyze and classify the non-coding RNA component of human transcriptome sequence data obtained from NGS technologies.

  15. Functional characterization of a monoclonal antibody epitope using a lambda phage display-deep sequencing platform

    Science.gov (United States)

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; Venza, Mario; Venza, Isabella; Borgogni, Erica; Castellino, Flora; Midiri, Angelina; Galbo, Roberta; Romeo, Letizia; Biondo, Carmelo; Masignani, Vega; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2016-01-01

    We have recently described a method, named PROFILER, for the identification of antigenic regions preferentially targeted by polyclonal antibody responses after vaccination. To test the ability of the technique to provide insights into the functional properties of monoclonal antibody (mAb) epitopes, we used here a well-characterized epitope of meningococcal factor H binding protein (fHbp), which is recognized by mAb 12C1. An fHbp library, engineered on a lambda phage vector enabling surface expression of polypeptides of widely different length, was subjected to massive parallel sequencing of the phage inserts after affinity selection with the 12C1 mAb. We detected dozens of unique antibody-selected sequences, the most enriched of which (designated as FrC) could largely recapitulate the ability of fHbp to bind mAb 12C1. Computational analysis of the cumulative enrichment of single amino acids in the antibody-selected fragments identified two overrepresented stretches of residues (H248-K254 and S140-G154), whose presence was subsequently found to be required for binding of FrC to mAb 12C1. Collectively, these results suggest that the PROFILER technology can rapidly and reliably identify, in the context of complex conformational epitopes, discrete “hot spots” with a crucial role in antigen-antibody interactions, thereby providing useful clues for the functional characterization of the epitope. PMID:27530334
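
    The cumulative per-residue enrichment step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each antibody-selected fragment is available as a hypothetical (start, end, read_count) tuple in protein coordinates; the function name and the toy data are illustrative assumptions and are not taken from the PROFILER software.

```python
from collections import Counter


def residue_enrichment(fragments, protein_length):
    """Return, for each residue, the fraction of all reads whose fragment covers it.

    fragments: iterable of (start, end, read_count), 1-based inclusive coordinates
               (a hypothetical input format chosen for this sketch).
    protein_length: number of residues in the antigen.
    """
    coverage = Counter()
    total_reads = 0
    for start, end, read_count in fragments:
        total_reads += read_count
        for position in range(start, end + 1):
            coverage[position] += read_count
    if total_reads == 0:
        return [0.0] * protein_length
    return [coverage[pos] / total_reads for pos in range(1, protein_length + 1)]


# Hypothetical toy data: two overlapping fragments on an 8-residue protein.
toy_fragments = [(1, 4, 10), (3, 6, 30)]
profile = residue_enrichment(toy_fragments, 8)
```

    Residues covered by many highly enriched fragments receive values close to 1, which is one simple way to make overrepresented stretches visible.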

  16. Deep sequencing uncovers numerous small RNAs on all four replicons of the plant pathogen Agrobacterium tumefaciens.

    Science.gov (United States)

    Wilms, Ina; Overlöper, Aaron; Nowrousian, Minou; Sharma, Cynthia M; Narberhaus, Franz

    2012-04-01

    Agrobacterium species are capable of interkingdom gene transfer between bacteria and plants. The genome of Agrobacterium tumefaciens consists of a circular and a linear chromosome, the At-plasmid and the Ti-plasmid, which harbors bacterial virulence genes required for tumor formation in plants. Little is known about promoter sequences and the small RNA (sRNA) repertoire of this and other α-proteobacteria. We used a differential RNA sequencing (dRNA-seq) approach to map transcriptional start sites of 388 annotated genes and operons. In addition, a total number of 228 sRNAs was revealed from all four Agrobacterium replicons. Twenty-two of these were confirmed by independent RNA gel blot analysis and several sRNAs were differentially expressed in response to growth media, growth phase, temperature or pH. One sRNA from the Ti-plasmid was massively induced under virulence conditions. The presence of 76 cis-antisense sRNAs, two of them on the reverse strand of virulence genes, suggests considerable antisense transcription in Agrobacterium. The information gained from this study provides a valuable reservoir for an in-depth understanding of sRNA-mediated regulation of the complex physiology and infection process of Agrobacterium.

  17. Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O’Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R. (Tumaini); (NIH); (Duke); (Kilimanjaro Repro.); (IAVI)

    2013-03-04

    Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

  18. Genome-wide analysis of SRSF10-regulated alternative splicing by deep sequencing of chicken transcriptome

    Directory of Open Access Journals (Sweden)

    Xuexia Zhou

    2014-12-01

    Full Text Available Splicing factor SRSF10 is known to function as a sequence-specific splicing activator that is capable of regulating alternative splicing both in vitro and in vivo. We recently used an RNA-seq approach coupled with bioinformatics analysis to identify the extensive splicing network regulated by SRSF10 in chicken cells. We found that SRSF10 promoted both exon inclusion and exclusion. Functionally, many of the SRSF10-verified alternative exons are linked to pathways of response to external stimulus. Here we describe in detail the experimental design, bioinformatics analysis and GO/pathway enrichment analysis of SRSF10-regulated genes to correspond with our data in the Gene Expression Omnibus with accession number GSE53354. Our data thus provide a resource for studying regulation of alternative splicing in vivo that underlines biological functions of splicing regulatory proteins in cells.

  19. Deep sequencing of MYC DNA-binding sites in Burkitt lymphoma.

    Directory of Open Access Journals (Sweden)

    Volkhard Seitz

    Full Text Available BACKGROUND: MYC is a key transcription factor involved in central cellular processes such as regulation of the cell cycle, histone acetylation and ribosomal biogenesis. It is overexpressed in the majority of human tumors including aggressive B-cell lymphoma. Especially Burkitt lymphoma (BL) is a highlight example for MYC overexpression due to a chromosomal translocation involving the c-MYC gene. However, no genome-wide analysis of MYC-binding sites by chromatin immunoprecipitation (ChIP) followed by next generation sequencing (ChIP-Seq) has been conducted in BL so far. METHODOLOGY/PRINCIPAL FINDINGS: ChIP-Seq was performed on 5 BL cell lines with a MYC-specific antibody giving rise to 7,054 MYC-binding sites after bioinformatics analysis of a total of approx. 19 million sequence reads. In line with previous findings, binding sites accumulate in gene sets known to be involved in the cell cycle, ribosomal biogenesis, histone acetyltransferase and methyltransferase complexes demonstrating a regulatory role of MYC in these processes. Unexpectedly, MYC-binding sites also accumulate in many B-cell relevant genes. To assess the functional consequences of MYC binding, the ChIP-Seq data were supplemented with siRNA-mediated knock-downs of MYC in BL cell lines followed by gene expression profiling. Interestingly, amongst others, genes involved in the B-cell function were up-regulated in response to MYC silencing. CONCLUSION/SIGNIFICANCE: The 7,054 MYC-binding sites identified by our ChIP-Seq approach greatly extend the knowledge regarding MYC binding in BL and shed further light on the enormous complexity of the MYC regulatory network. Especially our observations that (i) many B-cell relevant genes are targeted by MYC and (ii) MYC down-regulation leads to an up-regulation of B-cell genes highlight an interesting aspect of BL biology.

  20. Ultra-deep sequencing reveals the microRNA expression pattern of the human stomach.

    Directory of Open Access Journals (Sweden)

    Ândrea Ribeiro-dos-Santos

    Full Text Available BACKGROUND: While microRNAs (miRNAs) play important roles in tissue differentiation and in maintaining basal physiology, little is known about the miRNA expression levels in stomach tissue. Alterations in the miRNA profile can lead to cell deregulation, which can induce neoplasia. METHODOLOGY/PRINCIPAL FINDINGS: A small RNA library of stomach tissue was sequenced using high-throughput SOLiD sequencing technology. We obtained 261,274 quality reads with perfect matches to the human miRnome, and 42% of known miRNAs were identified. Digital Gene Expression profiling (DGE) was performed based on read abundance and showed that fifteen miRNAs were highly expressed in gastric tissue. Subsequently, the expression of these miRNAs was validated in 10 healthy individuals by RT-PCR, which showed a significant correlation of 83.97% (P<0.05). Six miRNAs showed a low variable pattern of expression (miR-29b, miR-29c, miR-19b, miR-31, miR-148a, miR-451) and could be considered part of the expression pattern of the healthy gastric tissue. CONCLUSIONS/SIGNIFICANCE: This study aimed to validate normal miRNA profiles of human gastric tissue to establish a reference profile for healthy individuals. Determining the regulatory processes acting in the stomach will be important in the fight against gastric cancer, which is the second-leading cause of cancer mortality worldwide.
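
    Expression profiling based on read abundance, as described in this abstract, is commonly done by scaling raw counts to reads per million (RPM) and ranking the result. The sketch below shows that general idea only; the abstract does not state which normalization the authors used, and the miRNA names and counts are hypothetical placeholders.

```python
def reads_per_million(counts):
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


def top_expressed(counts, n):
    """Return the names of the n most abundant entries after RPM scaling."""
    rpm = reads_per_million(counts)
    return sorted(rpm, key=rpm.get, reverse=True)[:n]


# Hypothetical toy counts, not data from the study.
toy_counts = {"miR-A": 600, "miR-B": 300, "miR-C": 100}
toy_rpm = reads_per_million(toy_counts)
toy_top = top_expressed(toy_counts, 2)
```

    RPM scaling makes libraries of different sequencing depth comparable, which is why it is a frequent first step before ranking.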

  1. Identification of Retinopathy of Prematurity Related miRNAs in Hyperoxia-Induced Neonatal Rats by Deep Sequencing

    Directory of Open Access Journals (Sweden)

    Ruibin Zhao

    2014-12-01

    Full Text Available Retinopathy of prematurity (ROP) remains a major problem for many preterm infants. MicroRNAs (miRNAs) are a class of small noncoding RNAs that regulate gene expression at the posttranscriptional level and have been studied in many diseases. To understand the roles of miRNAs in ROP model rats, we constructed two small RNA libraries from the plasma of hyperoxia-induced rats and normal controls. Deep sequencing identified 44 down-regulated and 22 up-regulated microRNAs in the hyperoxia-induced rats. Some of the differentially expressed miRNAs were confirmed by quantitative reverse transcription-PCR (qRT-PCR). A total of 594 target genes of the differentially expressed microRNAs were identified using a bioinformatics approach. Functional annotation analysis indicated that a number of pathways might be involved in angiogenesis, cell proliferation and cell differentiation, which might be involved in the genesis and development of ROP. The elevated expression level of the vascular endothelial growth factor (VEGF) protein in the hyperoxia-induced neonatal rats was also confirmed by enzyme linked immunosorbent assay (ELISA). This study provides some insights into the molecular mechanisms that underlie ROP development, thereby aiding the diagnosis and treatment of this disease.

  2. Deep sequencing and proteomic analysis of the microRNA-induced silencing complex in human red blood cells.

    Science.gov (United States)

    Azzouzi, Imane; Moest, Hansjoerg; Wollscheid, Bernd; Schmugge, Markus; Eekels, Julia J M; Speer, Oliver

    2015-05-01

    During maturation, erythropoietic cells extrude their nuclei but retain their ability to respond to oxidant stress by tightly regulating protein translation. Several studies have reported microRNA-mediated regulation of translation during terminal stages of erythropoiesis, even after enucleation. In the present study, we performed a detailed examination of the endogenous microRNA machinery in human red blood cells using a combination of deep sequencing analysis of microRNAs and proteomic analysis of the microRNA-induced silencing complex. Among the 197 different microRNAs detected, miR-451a was the most abundant, representing more than 60% of all read sequences. In addition, miR-451a and its known target, 14-3-3ζ mRNA, were bound to the microRNA-induced silencing complex, implying their direct interaction in red blood cells. The proteomic characterization of endogenous Argonaute 2-associated microRNA-induced silencing complex revealed 26 cofactor candidates. Among these cofactors, we identified several RNA-binding proteins, as well as motor proteins and vesicular trafficking proteins. Our results demonstrate that red blood cells contain complex microRNA machinery, which might enable immature red blood cells to control protein translation independent of de novo nuclei information.

  3. Analysis of tumor heterogeneity and cancer gene networks using deep sequencing of MMTV-induced mouse mammary tumors.

    Directory of Open Access Journals (Sweden)

    Christiaan Klijn

    Full Text Available Cancer develops through a multistep process in which normal cells progress to malignant tumors via the evolution of their genomes as a result of the acquisition of mutations in cancer driver genes. The number, identity and mode of action of cancer driver genes, and how they contribute to tumor evolution, are largely unknown. This study deployed the Mouse Mammary Tumor Virus (MMTV) as an insertional mutagen to find both the driver genes and the networks in which they function. Using deep insertion site sequencing we identified around 31,000 retroviral integration sites in 604 MMTV-induced mammary tumors from mice with mammary gland-specific deletion of Trp53, Pten heterozygous knockout mice, or wildtype strains. We identified 18 known common integration sites (CISs) and 12 previously unknown CISs marking new candidate cancer genes. Members of the Wnt, Fgf, Fgfr, Rspo and Pdgfr gene families were commonly mutated in a mutually exclusive fashion. The sequence data we generated also yielded information on the clonality of insertions in individual tumors, allowing us to develop a data-driven model of MMTV-induced tumor development. Insertional mutations near Wnt and Fgf genes mark the earliest "initiating" events in MMTV-induced tumorigenesis, whereas Fgfr genes are targeted later during tumor progression. Our data show that insertional mutagenesis can be used to discover the mutational networks, the timing of mutations, and the genes that initiate and drive tumor evolution.

  4. Deep sequencing and in silico analyses identify MYB-regulated gene networks and signaling pathways in pancreatic cancer.

    Science.gov (United States)

    Azim, Shafquat; Zubair, Haseeb; Srivastava, Sanjeev K; Bhardwaj, Arun; Zubair, Asif; Ahmad, Aamir; Singh, Seema; Khushman, Moh'd; Singh, Ajay P

    2016-06-29

    We have recently demonstrated that the transcription factor MYB can modulate several cancer-associated phenotypes in pancreatic cancer. In order to understand the molecular basis of these MYB-associated changes, we conducted deep-sequencing of transcriptome of MYB-overexpressing and -silenced pancreatic cancer cells, followed by in silico pathway analysis. We identified significant modulation of 774 genes upon MYB-silencing (p networks by in silico analysis. Further analyses placed genes in our RNA sequencing-generated dataset to several canonical signalling pathways, such as cell-cycle control, DNA-damage and -repair responses, p53 and HIF1α. Importantly, we observed downregulation of the pancreatic adenocarcinoma signaling pathway in MYB-silenced pancreatic cancer cells exhibiting suppression of EGFR and NF-κB. Decreased expression of EGFR and RELA was validated by both qPCR and immunoblotting and they were both shown to be under direct transcriptional control of MYB. These observations were further confirmed in a converse approach wherein MYB was overexpressed ectopically in a MYB-null pancreatic cancer cell line. Our findings thus suggest that MYB potentially regulates growth and genomic stability of pancreatic cancer cells via targeting complex gene networks and signaling pathways. Further in-depth functional studies are warranted to fully understand MYB signaling in pancreatic cancer.

  5. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing.

    Science.gov (United States)

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-09-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes have never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations.

  6. Deep sequencing reveals the complex and coordinated transcriptional regulation of genes related to grain quality in rice cultivars

    Directory of Open Access Journals (Sweden)

    An Gynheung

    2011-04-01

    Full Text Available Abstract Background Milling yield and eating quality are two important grain quality traits in rice. To identify the genes involved in these two traits, we performed a deep transcriptional analysis of developing seeds using both massively parallel signature sequencing (MPSS) and sequencing-by-synthesis (SBS). Five MPSS and five SBS libraries were constructed from 6-day-old developing seeds of Cypress (high milling yield), LaGrue (low milling yield), Ilpumbyeo (high eating quality), YR15965 (low eating quality), and Nipponbare (control). Results The transcriptomes revealed by MPSS and SBS had a high correlation coefficient (0.81 to 0.90), and about 70% of the transcripts were commonly identified in both types of the libraries. SBS, however, identified 30% more transcripts than MPSS. Among the highly expressed genes in Cypress and Ilpumbyeo, over 100 conserved cis regulatory elements were identified. Numerous specifically expressed transcription factor (TF) genes were identified in Cypress (282), LaGrue (312), Ilpumbyeo (363), YR15965 (260), and Nipponbare (357). Many key grain quality-related genes (i.e., genes involved in starch metabolism, aspartate amino acid metabolism, storage and allergenic protein synthesis, and seed maturation) that were expressed at high levels underwent alternative splicing and produced antisense transcripts either in Cypress or Ilpumbyeo. Further, a time course RT-PCR analysis confirmed a higher expression level of genes involved in starch metabolism such as those encoding ADP glucose pyrophosphorylase (AGPase) and granule bound starch synthase I (GBSS I) in Cypress than that in LaGrue during early seed development. Conclusion This study represents the most comprehensive analysis of the developing seed transcriptome of rice available to date. Using two high throughput sequencing methods, we identified many differentially expressed genes that may affect milling yield or eating quality in rice. Many of the identified genes are involved

  7. Molecular Cloning and Sequence Analysis of PGC-1α cDNA in Tibetan Antelope

    Institute of Scientific and Technical Information of China (English)

    马燕; 常荣; 祁玉娟; 格日力

    2012-01-01

    Total RNAs were extracted from the myocardium of Tibetan Antelope (Pantholops hodgsonii) and Tibetan Sheep, both inhabiting the Tibetan Plateau (altitude 4,300 m). PGC-1α coding cDNA sequences were cloned with reverse transcription polymerase chain reaction (RT-PCR), and the sequences were confirmed by DNA sequencing. The cloning and sequencing results confirmed that the PGC-1α gene coding sequences of both Tibetan Antelope and Tibetan Sheep showed above 90% identity with other species. In addition, the cloned sequences contained the RNA/DNA binding sites, RRM (RNA recognition motif), the domains involved in the interaction with NRF-1 and MEF2C, the Arg/Ser rich domain, the negative regulatory domain, the LXXLL motif, as well as conserved sequences like TPPTTPP and DHDYCQ, which are present in all PGC-1 family members. Fourteen variable amino acid sites were identified in the functional domains mentioned above. Additionally, analysis of generic phosphorylation sites and kinase specific phosphorylation prediction sites indicated that the 329-threonine amino acid site could be phosphorylated by PKG, which may be unique to Tibetan Antelope. Secondary structures of PGC-1α protein from Tibetan Antelope and Tibetan Sheep were also predicted in this study. In summary, the PGC-1α gene coding regions from Tibetan Antelope and Tibetan Sheep have been successfully cloned, which may provide fundamental data for further investigating high altitude adaptation related to genetics in the future. Using myocardial tissue from Tibetan Antelope (Pantholops hodgsonii) and Tibetan Sheep living at the same altitude, total RNA was extracted and the coding-region cDNA fragment of peroxisome proliferator-activated receptor γ coactivator-1α (PGC-1α) was amplified by RT-PCR, ligated into a vector to construct a recombinant plasmid, and sequenced after transformation, amplification culture and identification. Bioinformatic analysis showed that the PGC-1α coding region is 2,349 bp long in both Tibetan Antelope and Tibetan Sheep, encoding 797 amino

  8. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars

    Directory of Open Access Journals (Sweden)

    Kim Jungeun

    2012-11-01

    Full Text Available Abstract Background Roses (Rosa sp.), which belong to the family Rosaceae, are the most economically important ornamental plants, making up 30% of the floriculture market. However, despite the high demand for roses, rose breeding programs are limited in molecular resources that could greatly enhance and speed breeding efforts. A better understanding of the genes that contribute to floral development and desired phenotypes will lead to improved rose cultivars. For this study, we analyzed rose miRNAs and the rose flower transcriptome in order to generate a database to expound upon current knowledge regarding regulation of important floral characteristics. A rose genetic database will enable comprehensive analysis of gene expression and regulation via miRNA among different Rosa cultivars. Results We produced more than 0.5 million reads from expressed sequences, totalling more than 110 million bp. From these, we generated 35,657, 31,434, 34,725, and 39,722 flower unigenes from Rosa hybrid: ‘Vital’, ‘Maroussia’, and ‘Sympathy’ and Rosa rugosa Thunb., respectively. The unigenes were assigned functional annotations, domains, metabolic pathways, Gene Ontology (GO) terms, Plant Ontology (PO) terms, and MIPS Functional Catalogue (FunCat) terms. Rose flower transcripts were compared with genes from whole genome sequences of Rosaceae members (apple, strawberry, and peach) and grape. We also produced approximately 40 million small RNA reads from flower tissue for Rosa, representing 267 unique miRNA tags. Among the identified miRNAs, 25 were novel and 242 were conserved miRNAs. Statistical analyses of miRNA profiles revealed both shared and species-specific miRNAs, which presumably affect flower development and phenotypes. Conclusions In this study, we constructed a Rose miRNA and transcriptome database, and we analyzed the miRNAs and transcriptome generated from the flower tissues of four Rosa cultivars. The database provides a

  9. The 2007 Nazko, British Columbia, earthquake sequence: Injection of magma deep in the crust beneath the Anahim volcanic belt

    Science.gov (United States)

    Cassidy, J.F.; Balfour, N.; Hickson, C.; Kao, H.; White, Rickie; Caplan-Auerbach, J.; Mazzotti, S.; Rogers, Gary C.; Al-Khoubbi, I.; Bird, A.L.; Esteban, L.; Kelman, M.; Hutchinson, J.; McCormack, D.

    2011-01-01

    On 9 October 2007, an unusual sequence of earthquakes began in central British Columbia about 20 km west of the Nazko cone, the most recent (circa 7200 yr) volcanic center in the Anahim volcanic belt. Within 25 hr, eight earthquakes of magnitude 2.3-2.9 occurred in a region where no earthquakes had previously been recorded. During the next three weeks, more than 800 microearthquakes were located (and many more detected), most at a depth of 25-31 km and within a radius of about 5 km. After about two months, almost all activity ceased. The clear P- and S-wave arrivals indicated that these were high-frequency (volcanic-tectonic) earthquakes and the b value of 1.9 that we calculated is anomalous for crustal earthquakes but consistent with volcanic-related events. Analysis of receiver functions at a station immediately above the seismicity indicated a Moho near 30 km depth. Precise relocation of the seismicity using a double-difference method suggested a horizontal migration at the rate of about 0.5 km/d, with almost all events within the lowermost crust. Neither harmonic tremor nor long-period events were observed; however, some spasmodic bursts were recorded and determined to be colocated with the earthquake hypocenters. These observations are all very similar to a deep earthquake sequence recorded beneath Lake Tahoe, California, in 2003-2004. Based on these remarkable similarities, we interpret the Nazko sequence as an indication of an injection of magma into the lower crust beneath the Anahim volcanic belt. This magma injection fractures rock, producing high-frequency, volcanic-tectonic earthquakes and spasmodic bursts.
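
    The b value quoted in this abstract is the slope of the Gutenberg-Richter frequency-magnitude relation. The abstract does not say how it was estimated; the sketch below uses the widely cited maximum-likelihood estimator b = log10(e) / (mean(M) - (Mc - dM/2)) as an assumption, with a hypothetical completeness magnitude and toy magnitudes that are not data from the study.

```python
import math


def b_value_max_likelihood(magnitudes, completeness_mag, bin_width=0.0):
    """Estimate the Gutenberg-Richter b value by maximum likelihood.

    magnitudes: event magnitudes.
    completeness_mag: magnitude of completeness Mc; smaller events are ignored.
    bin_width: magnitude rounding interval dM (0.0 for unbinned magnitudes).
    """
    selected = [m for m in magnitudes if m >= completeness_mag]
    if not selected:
        raise ValueError("no events at or above the completeness magnitude")
    mean_mag = sum(selected) / len(selected)
    denominator = mean_mag - (completeness_mag - bin_width / 2.0)
    if denominator <= 0:
        raise ValueError("mean magnitude must exceed the corrected threshold")
    return math.log10(math.e) / denominator


# Hypothetical toy catalogue.
toy_magnitudes = [2.0, 2.5, 3.0]
toy_b = b_value_max_likelihood(toy_magnitudes, completeness_mag=2.0)
```

    A larger b value means a higher proportion of small events relative to large ones, which is why values well above 1 are often discussed in connection with volcanic or fluid-driven swarms.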

  10. Deep sequencing reveals microbiota dysbiosis of tongue coat in patients with liver carcinoma

    Science.gov (United States)

    Lu, Haifeng; Ren, Zhigang; Li, Ang; Zhang, Hua; Jiang, Jianwen; Xu, Shaoyan; Luo, Qixia; Zhou, Kai; Sun, Xiaoli; Zheng, Shusen; Li, Lanjuan

    2016-09-01

    Liver carcinoma (LC) is a common malignancy worldwide, associated with high morbidity and mortality. Characterizing microbiome profiles of the tongue coat may provide useful insights and potential diagnostic markers for LC patients. Herein, we investigate for the first time the tongue coat microbiome of LC patients with cirrhosis based on 16S ribosomal RNA (rRNA) gene sequencing. After applying strict inclusion and exclusion criteria, 35 early LC patients with cirrhosis and 25 matched healthy subjects were enrolled. Microbiome diversity of the tongue coat in LC patients was significantly increased, as shown by the Shannon, Simpson and Chao 1 indexes. The tongue coat microbiome significantly distinguished LC patients from healthy subjects by principal component analysis. Tongue coat microbial profiles represented 38 operational taxonomic units assigned to 23 different genera, distinguishing LC patients. Linear discriminant analysis (LDA) effect size (LEfSe) revealed significant microbial dysbiosis of tongue coats in LC patients. Strikingly, Oribacterium and Fusobacterium could distinguish LC patients from healthy subjects. LEfSe outputs showed differences between LC patients and healthy subjects in microbial gene functions related to the categories nickel/iron_transport, amino_acid_transport, energy produced system and metabolism. These findings are the first to identify microbiota dysbiosis of the tongue coat in LC patients and may provide a novel, non-invasive potential diagnostic biomarker of LC.
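
    The three diversity measures named in this abstract can be computed from a vector of per-OTU read counts. The sketch below is a minimal illustration under stated assumptions: it uses the natural-log Shannon index, the Gini-Simpson form (1 minus the sum of squared proportions) and the bias-corrected Chao1 estimator, which may differ from the variants the authors' software applied. The counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon entropy H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty sample")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def gini_simpson_index(counts):
    """Gini-Simpson diversity 1 - sum(p^2)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty sample")
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1_bias_corrected(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2.0 * (doubletons + 1))


# Hypothetical OTU counts for one sample.
toy_otus = [1, 1, 2, 6]
toy_chao1 = chao1_bias_corrected(toy_otus)
```

    Shannon and Simpson weigh evenness as well as richness, whereas Chao1 extrapolates unseen richness from the number of rare taxa, so the three can move differently between groups.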

  11. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    Full Text Available To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  12. Deep near-IR variability survey of pre-main-sequence stars in Rho Ophiuchi

    CERN Document Server

    de Oliveira, Catarina Alves

    2008-01-01

    Variability is a common characteristic of pre-main-sequence stars (PMS). Near-IR variability surveys of young stellar objects (YSOs) can probe stellar and circumstellar environments and provide information about the dynamics of the ongoing magnetic and accretion processes. Furthermore, variability can be used as a tool to uncover new cluster members in star formation regions. We hope to achieve the deepest near-IR variability study of YSOs targeting the Rho Ophiuchi cluster. Fourteen epochs of observations were obtained with the Wide Field Camera (WFCAM) at the UKIRT telescope scheduled in a manner that allowed the study of variability on timescales of days, months, and years. Statistical tools, such as the multi-band cross correlation index and the reduced chi-square, were used to disentangle signals of variability from noise. Variability characteristics are compared to existing models of YSOs in order to relate them to physical processes, and then used to select new candidate members of this star-forming r...

  13. Deep Sequencing of Porphyromonas gingivalis and comparative transcriptome analysis of a LuxS mutant

    Directory of Open Access Journals (Sweden)

    Takanori Hirano

    2012-06-01

    Full Text Available Porphyromonas gingivalis is a major etiological agent of chronic and aggressive forms of periodontal disease. The organism is an asaccharolytic anaerobe and is a constituent of mixed species biofilms in a variety of microenvironments in the oral cavity. P. gingivalis expresses a range of virulence factors over which it exerts tight control. High-throughput sequencing technologies provide the opportunity to relate functional genomics to basic biology. In this study we report qualitative and quantitative RNA-Seq analysis of the transcriptome of P. gingivalis. We have also applied RNA-Seq to the transcriptome of a ΔluxS mutant of P. gingivalis deficient in AI-2-mediated bacterial communication. The transcriptome analysis confirmed the expression of all predicted ORFs for strain ATCC 33277, including 854 hypothetical proteins, and allowed the identification of hitherto unknown transcriptional units. Twelve noncoding RNAs were identified, including 11 small RNAs and one cobalamin riboswitch. Fifty-seven genes were differentially regulated in the LuxS mutant. Addition of exogenous synthetic 4,5-dihydroxy-2,3-pentanedione (DPD, AI-2 precursor) to the ΔluxS mutant culture complemented expression of a subset of genes, indicating that LuxS is involved in both AI-2 signaling and non-signaling dependent systems in P. gingivalis. This work provides an important dataset for future study of P. gingivalis pathophysiology and further defines the LuxS regulon in this oral pathogen.

  14. Deep sequencing whole transcriptome exploration of the σE regulon in Neisseria meningitidis.

    Science.gov (United States)

    Huis in 't Veld, Robert Antonius Gerhardus; Willemsen, Antonius Marcellinus; van Kampen, Antonius Hubertus Cornelis; Bradley, Edward John; Baas, Frank; Pannekoek, Yvonne; van der Ende, Arie

    2011-01-01

    Bacteria live in an ever-changing environment and must alter protein expression promptly to adapt to these changes and survive. Specific response genes that are regulated by a subset of alternative σ(70)-like transcription factors have evolved in order to respond to this changing environment. Recently, we have described the existence of a σ(E) regulon including the anti-σ-factor MseR in the obligate human bacterial pathogen Neisseria meningitidis. To unravel the complete σ(E) regulon in N. meningitidis, we sequenced total RNA transcriptional content of wild type meningococci and compared it with that of mseR mutant cells (ΔmseR) in which σ(E) is highly expressed. Eleven coding genes and one non-coding gene were found to be differentially expressed between H44/76 wildtype and H44/76ΔmseR cells. Five of the 6 genes of the σ(E) operon, msrA/msrB, and the gene encoding a pepSY-associated TM helix family protein showed enhanced transcription, whilst aniA encoding a nitrite reductase and nspA encoding the vaccine candidate Neisserial surface protein A showed decreased transcription. Analysis of differential expression in IGRs showed enhanced transcription of a non-coding RNA molecule, identifying a σ(E) dependent small non-coding RNA. Together this constitutes the first complete exploration of an alternative σ-factor regulon in N. meningitidis. The results direct to a relatively small regulon indicative for a strictly defined response consistent with a relatively stable niche, the human throat, where N. meningitidis resides.

  15. Deep sequencing whole transcriptome exploration of the σE regulon in Neisseria meningitidis.

    Directory of Open Access Journals (Sweden)

    Robert Antonius Gerhardus Huis in 't Veld

    Full Text Available Bacteria live in an ever-changing environment and must alter protein expression promptly to adapt to these changes and survive. Specific response genes that are regulated by a subset of alternative σ(70)-like transcription factors have evolved in order to respond to this changing environment. Recently, we have described the existence of a σ(E) regulon including the anti-σ-factor MseR in the obligate human bacterial pathogen Neisseria meningitidis. To unravel the complete σ(E) regulon in N. meningitidis, we sequenced total RNA transcriptional content of wild type meningococci and compared it with that of mseR mutant cells (ΔmseR) in which σ(E) is highly expressed. Eleven coding genes and one non-coding gene were found to be differentially expressed between H44/76 wildtype and H44/76ΔmseR cells. Five of the 6 genes of the σ(E) operon, msrA/msrB, and the gene encoding a pepSY-associated TM helix family protein showed enhanced transcription, whilst aniA encoding a nitrite reductase and nspA encoding the vaccine candidate Neisserial surface protein A showed decreased transcription. Analysis of differential expression in IGRs showed enhanced transcription of a non-coding RNA molecule, identifying a σ(E) dependent small non-coding RNA. Together this constitutes the first complete exploration of an alternative σ-factor regulon in N. meningitidis. The results direct to a relatively small regulon indicative for a strictly defined response consistent with a relatively stable niche, the human throat, where N. meningitidis resides.

  16. Deep sequencing analyses of low density microbial communities: working at the boundary of accurate microbiota detection.

    Directory of Open Access Journals (Sweden)

    Giske Biesbroek

    Full Text Available INTRODUCTION: Accurate analyses of microbiota composition of low-density communities (10^3-10^4 bacteria/sample) can be challenging. Background DNA from chemicals and consumables, extraction biases as well as differences in PCR efficiency can significantly interfere with microbiota assessment. This study aimed to establish protocols for accurate microbiota analysis at low microbial density. METHODS: To examine possible effects of bacterial density on microbiota analyses we compared microbiota profiles of serially diluted saliva and of low-density (nares, nasopharynx) and high-density (oropharynx) upper airway communities in four healthy individuals. DNA was extracted with four different extraction methods (Epicentre Masterpure, Qiagen DNeasy, Mobio Powersoil and a phenol bead-beating protocol combined with Agowa-Mag-mini). Bacterial DNA recovery was analysed by 16S qPCR and microbiota profiles through GS-FLX-Titanium-Sequencing of 16S rRNA gene amplicons spanning the V5-V7 regions. RESULTS: Lower template concentrations significantly impacted microbiota profiling results. With higher dilutions, low abundant species were overrepresented. In samples of <10^5 bacteria per ml, e.g. DNA <1 pg/µl, microbiota profiling deviated from the original sample and other dilutions, showing a significant increase in the taxa Proteobacteria and a decrease in Bacteroidetes. In similar low density samples, DNA extraction method determined if DNA levels were below or above 1 pg/µl and, together with lysis preferences per method, had profound impact on microbiota analyses in both relative abundance as well as representation of species. CONCLUSION: This study aimed to interpret microbiota analyses of low-density communities. Bacterial density seemed to interfere with microbiota analyses at <10^6 bacteria per ml or DNA <1 pg/µl. We therefore recommend this threshold for working with low density materials. This study underlines that bias reduction is crucial for adequate

  17. Deep sequencing reveals small RNA characterization of invasive micropapillary carcinomas of the breast.

    Science.gov (United States)

    Li, Shuai; Yang, Cuicui; Zhai, Lili; Zhang, Wenwei; Yu, Jing; Gu, Feng; Lang, Ronggang; Fan, Yu; Gong, Meihua; Zhang, Xiuqing; Fu, Li

    2012-11-01

    Invasive micropapillary carcinoma (IMPC) is an uncommon histological type of breast cancer. IMPC has a special growth pattern and a more aggressive behavior than invasive ductal carcinomas of no special types (IDC-NSTs). microRNAs are a large class of non-coding RNAs involved in the regulation of various biological processes. Here, we analyzed the small RNA transcriptomes of five formalin-fixed paraffin-embedded (FFPE) pure IMPC samples and five FFPE IDC-NSTs samples by means of next-generation sequencing, generating a total of >170,000,000 clean reads. In an unsupervised cluster analysis, differently expressed miRNAs generated a tree with clear distinction between IMPC and IDC-NSTs classes. Paired fresh-frozen and FFPE specimens showed very similar miRNA expression profiles. By means of RT-qPCR, we further investigated miRNA expression in more IMPC (n = 22) and IDC-NSTs (n = 24) FFPE samples and found let-7b, miR-30c, miR-148a, miR-181a, miR-181a*, and miR-181b were significantly differently expressed between the two groups. We also elucidated several features of miRNA in these breast cancer tissues including 5' variability, miRNA editing, and 3' untemplated addition. Our findings will lead to further understanding of the invasive potency of IMPC and gain an insight into the diversity and complexity of small RNA molecules in breast cancer tissues.

  18. Rapid amplification of cDNA ends (RACE).

    Science.gov (United States)

    Yeku, Oladapo; Frohman, Michael A

    2011-01-01

    Rapid Amplification of cDNA ends (RACE) provides an inexpensive and powerful tool to quickly obtain full-length cDNA when the sequence is only partially known. Starting with an mRNA mixture, gene-specific primers generated from the known regions of the gene and non-specific anchors, full-length sequences can be identified in as little as 3 days. RACE can also be used to identify alternative transcripts of a gene when the partial or complete sequence of only one transcript is known. In the following sections, we outline details for rapid amplification of 5' and 3' cDNA ends using the "new RACE" technique.

  19. Draft Genome Sequence of Pseudoalteromonas sp. Strain XI10 Isolated from the Brine-Seawater Interface of Erba Deep in the Red Sea

    KAUST Repository

    Zhang, Guishan

    2016-03-10

    Pseudoalteromonas sp. strain XI10 was isolated from the brine-seawater interface of Erba Deep in the Red Sea, Saudi Arabia. Here, we present the draft genome sequence of strain XI10, a gammaproteobacterium that synthesizes polysaccharides for biofilm formation when grown in liquid culture.

  20. Draft Genome Sequences of Two Thiomicrospira Strains Isolated from the Brine-Seawater Interface of Kebrit Deep in the Red Sea

    KAUST Repository

    Zhang, Guishan

    2016-03-11

    Two Thiomicrospira strains, WB1 and XS5, were isolated from the Kebrit Deep brine-seawater interface in the Red Sea, Saudi Arabia. Here, we present the draft genome sequences of these gammaproteobacteria, which both produce sulfuric acid from thiosulfate in culture.

  1. Deep Sea Coral voucher sequence dataset - Identification of deep-sea corals collected during the 2009 - 2014 West Coast Groundfish Bottom Trawl Survey

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Data for this project resides in the West Coast Groundfish Bottom Trawl Survey Database. Deep-sea corals are often components of trawling bycatch, though their...

  2. Isolation and characterization of a full-length cDNA coding for an adipose differentiation-related protein.

    OpenAIRE

    Jiang, H P; Serrero, G

    1992-01-01

    We have previously isolated from a 1246 adipocyte cDNA library a cDNA clone called 154, corresponding to a mRNA that increases abundantly at a very early time during the differentiation of 1246 adipocytes and in adipocyte precursors in primary culture. We show here that the mRNA encoded by this cDNA is expressed abundantly and preferentially in mouse fat pads. A full-length cDNA for clone 154 was isolated by the RACE (rapid amplification of cDNA ends) protocol. Sequence analysis of this cDNA ...

  3. Isolation and characterization of a full-length cDNA coding for an adipose differentiation-related protein

    OpenAIRE

    Jiang, Hui-Ping; Serrero, Ginette

    1992-01-01

    We have previously isolated from a 1246 adipocyte cDNA library a cDNA clone called 154, corresponding to a mRNA that increases abundantly at a very early time during the differentiation of 1246 adipocytes and in adipocyte precursors in primary culture. We show here that the mRNA encoded by this cDNA is expressed abundantly and preferentially in mouse fat pads. A full-length cDNA for clone 154 was isolated by the RACE (rapid amplification of cDNA ends) protocol. Sequence analysis of this cDNA ...

  4. Molecular cloning of growth hormone encoding cDNA of Indian major carps by a modified rapid amplification of cDNA ends strategy.

    Science.gov (United States)

    Venugopal, T; Mathavan, S; Pandian, T J

    2002-06-01

    A modified rapid amplification of cDNA ends (RACE) strategy has been developed for cloning highly conserved cDNA sequences. Using this modified method, the growth hormone (GH) encoding cDNA sequences of Labeo rohita, Cirrhina mrigala and Catla catla have been cloned, characterized and overexpressed in Escherichia coli. These sequences show 96-98% homology to each other and are about 85% homologous to that of common carp. Besides, an attempt has been made for the first time to describe a 3-D model of the fish GH protein.

  5. Molecular cloning of growth hormone encoding cDNA of Indian major carps by a modified rapid amplification of cDNA ends strategy

    Indian Academy of Sciences (India)

    T Venugopal; S Mathavan; T J Pandian

    2002-06-01

    A modified rapid amplification of cDNA ends (RACE) strategy has been developed for cloning highly conserved cDNA sequences. Using this modified method, the growth hormone (GH) encoding cDNA sequences of Labeo rohita, Cirrhina mrigala and Catla catla have been cloned, characterized and overexpressed in Escherichia coli. These sequences show 96–98% homology to each other and are about 85% homologous to that of common carp. Besides, an attempt has been made for the first time to describe a 3-D model of the fish GH protein.

  6. Refining transcriptional programs in kidney development by integration of deep RNA-sequencing and array-based spatial profiling

    Directory of Open Access Journals (Sweden)

    Rumballe Bree A

    2011-09-01

    Full Text Available Background: The developing mouse kidney is currently the best-characterized model of organogenesis at a transcriptional level. Detailed spatial maps have been generated for gene expression profiling combined with systematic in situ screening. These studies, however, fall short of capturing the transcriptional complexity arising from each locus due to the limited scope of microarray-based technology, which is largely based on "gene-centric" models. Results: To address this, the polyadenylated RNA and microRNA transcriptomes of the 15.5 dpc mouse kidney were profiled using strand-specific RNA-sequencing (RNA-Seq) to a depth sufficient to complement spatial maps from pre-existing microarray datasets. The transcriptional complexity of RNAs arising from mouse RefSeq loci was catalogued, including 3568 alternatively spliced transcripts and 532 uncharacterized alternate 3' UTRs. Antisense expression for 60% of RefSeq genes was also detected, including uncharacterized non-coding transcripts overlapping the kidney progenitor markers Six2 and Sall1, which were validated by section in situ hybridization. Analysis of genes known to be involved in kidney development, particularly during mesenchymal-to-epithelial transition, showed an enrichment of non-coding antisense transcripts extended along protein-coding RNAs. Conclusion: The resulting resource further refines the transcriptomic cartography of kidney organogenesis by integrating deep RNA sequencing data with locus-based information from previously published expression atlases. The added resolution of RNA-Seq has provided the basis for a transition from classical gene-centric models of kidney development towards more accurate and detailed "transcript-centric" representations, which highlight the extent of transcriptional complexity of genes that direct complex developmental events.

  7. Identification of microRNAs in two species of tomato, Solanum lycopersicum and Solanum habrochaites, by deep sequencing

    Institute of Scientific and Technical Information of China (English)

    FAN Shan-shan; LI Qian-nan; GUO Guang-jun; GAO Jian-chang; WANG Xiao-xuan; GUO Yan-mei; John C Snyder; DU Yong-chen

    2015-01-01

    MicroRNAs (miRNAs) are ~21 nucleotide (nt), endogenous RNAs that regulate gene expression in plants. Increasing evidence suggests that miRNAs play an important role in species-specific development in plants. However, the detailed miRNA profile divergence has not been performed among tomato species. In this study, the small RNA (sRNA) profiles of Solanum lycopersicum cultivar 9706 and Solanum habrochaites species PI 134417 were obtained by deep sequencing. Sixty-three known miRNA families were identified from these two species, of which 39 were common. Further miRNA profile comparison showed that 24 known non-conserved miRNA families were species-specific between these two tomato species. In addition, six conserved miRNA families displayed an apparent divergent expression pattern between the two tomato species. Our results suggested that species-specific, non-conserved miRNAs and divergent expression of conserved miRNAs might contribute to developmental changes and phenotypic variation between the two tomato species. Twenty new miRNAs were also identified in S. lycopersicum. This research significantly increases the number of known miRNA families in tomato and provides the first set of small RNAs in S. habrochaites. It also suggests that miRNAs have an important role in species-specific plant developmental regulation.

  8. Cloning and bioinformatics analysis of cDNA encoding cattle Smad4 gene

    Institute of Scientific and Technical Information of China (English)

    Xiaohui ZHANG; Shangzhong XU; Xue GAO; Hongyan REN; Jinbao CHEN

    2008-01-01

    The cDNA of the cattle Smad4 gene was cloned by RT-PCR, 3' RACE and 5' RACE, yielding a 3503-bp full-length cDNA sequence. The cloned sequence was submitted to GenBank under accession number DQ494856. The cattle Smad4 gene consists of 12 exons and encodes 553 amino acids. Cattle Smad4 cDNA shares 99%, 96%, 95%, 91% and 91% similarity in nucleic acid sequence, and 99%, 98%, 98%, 99% and 98% similarity in amino acid sequence, with sheep, pig, human, rat and mouse, respectively. Smad4 cDNA was detected in the testes, pancreas, liver, small intestine, ovary, lymph, cardiac muscle, skeletal muscle and thymus gland, indicating that Smad4 is broadly expressed in cattle.
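
    Similarity percentages such as those quoted above are typically derived from a pairwise alignment. The following is a minimal sketch of one common convention, percent identity over aligned columns with gap columns excluded; the record does not state which alignment tool or similarity definition the authors used, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped; this is an
    assumed convention, and other tools divide by the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (not real Smad4 sequence).
print(percent_identity("ATGGACAATA", "ATGGATAATA"))
```

    Because the choice of denominator changes the reported value, figures from different studies are only comparable when the same convention is used.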

  9. Cloning and Sequence Analysis of the Full-length cDNA of Cinnamyl Alcohol Dehydrogenase Gene from Eucommia ulmoides Olive

    Institute of Scientific and Technical Information of China (English)

    赵丹; 李晓毓; 陈建; 赵德刚

    2012-01-01

    Cinnamyl alcohol dehydrogenase (CAD) plays an important role in lignin biosynthesis. Young leaves of Eucommia ulmoides Olive newly grown in April and May were used as material. Starting from a previously cloned fragment of the CAD gene and using E. ulmoides cDNA as template, an 828 bp 5'-end and a 798 bp 3'-end cDNA sequence were cloned by rapid amplification of cDNA ends (RACE). Assembly of the 5' RACE and 3' RACE products yielded a full-length E. ulmoides CAD cDNA of 1243 bp whose open reading frame (ORF) encodes a predicted protein of 243 amino acids; it was named EuCAD (GenBank accession number: DQ142643). A BLAST comparison against GenBank showed 81% homology at the cDNA level with the CAD genes of Malus domestica, Eucalyptus gunnii and Quercus suber, and 73%, 70% and 70% homology, respectively, at the predicted amino acid level, suggesting that the full-length cDNA is an authentic E. ulmoides CAD. This is the first cloning of this gene from E. ulmoides, and it lays a foundation for exploring the regulatory mechanism of lignin synthesis in plants.

  10. Recognition of cDNA microarray image Using Feedforward artificial neural network

    OpenAIRE

    R. M. Farouk; E. M. Badr; M. A. SayedElahl

    2014-01-01

    The complementary DNA (cDNA) sequence is considered to be the magic biometric technique for personal identification. In this paper, we present a new method for cDNA recognition based on the artificial neural network (ANN). Microarray imaging is used for the concurrent identification of thousands of genes. We have segmented the location of the spots in a cDNA microarray. Thus, a precise localization and segmenting of a spot are essential to obtain a more accurate intensity measurement, leading...

  11. Molecular cloning and mammalian expression of human beta 2-glycoprotein I cDNA

    DEFF Research Database (Denmark)

    Kristensen, Torsten; Schousboe, Inger; Boel, Espen;

    1991-01-01

    Human β2-glycoprotein (β2gpI) cDNA was isolated from a liver cDNA library and sequenced. The cDNA encoded a 19-residue hydrophobic signal peptide followed by the mature β2gpI of 326 amino acid residues. In liver and in the hepatoma cell line HepG2 there are two mRNA species of about 1.4 and 4.3 k...

  12. Revolutions in rapid amplification of cDNA ends: new strategies for polymerase chain reaction cloning of full-length cDNA ends.

    Science.gov (United States)

    Schaefer, B C

    1995-05-20

    Rapid amplification of cDNA ends (RACE) is a polymerase chain reaction (PCR)-based technique which was developed to facilitate the cloning of full-length cDNA 5'- and 3'-ends after a partial cDNA sequence has been obtained by other methods. While RACE can yield complete sequences of cDNA ends in only a few days, the RACE procedure frequently results in the exclusive amplification of truncated cDNA ends, undermining efforts to generate full-length clones. Many investigators have suggested modifications to the RACE protocol to improve the effectiveness of the technique. Based on first-hand experience with RACE, a critical review of numerous published variations of the key steps in the RACE method is presented. Also included is a detailed, effective protocol based on RNA ligase-mediated RACE/reverse ligation-mediated PCR, as well as a demonstration of its utility.

  13. THE CLONING OF HRNT-1 USING A COMBINATION OF cDNA LIBRARY SCREENING WITH BIOTIN-LABELED PROBE AND RAPID AMPLIFICATION OF cDNA ENDS

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Objective: To clone the human counterpart of rat ZA73, an EST cloned by mRNA differential display from a rat tracheal epithelial (RTE) cell model of neoplastic transformation induced by α-particle radiation. Methods: Based on the sequence of rat ZA73, a biotin-labeled probe was used to screen a human cDNA library, and the gene sequence was then extended by RACE (rapid amplification of cDNA ends). Result: The human gene HRNT-1 (GenBank Accession Number: AF223393) is 4.256 kb in length, with an ORF located between 254 and 3013 bp. The 5' untranslated sequence (UTS) is 253 bp and the 3' UTS is 1243 bp. Conclusion: The combination of cDNA library screening with biotin-labeled probes and RACE is an effective method to clone full-length cDNA, especially for sequences longer than 2 kb.
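
    The coordinates reported for HRNT-1 can be checked for internal consistency with simple arithmetic. The sketch below assumes 1-based, inclusive ORF coordinates (the abstract does not state the convention) and uses a hypothetical helper name; under that assumption the untranslated lengths it derives match the 253 bp and 1243 bp quoted in the abstract.

```python
def transcript_layout(total_length, orf_start, orf_end):
    """Split a transcript into 5' untranslated, ORF and 3' untranslated lengths.

    Coordinates are assumed to be 1-based and inclusive.
    """
    five_prime = orf_start - 1
    orf_length = orf_end - orf_start + 1
    three_prime = total_length - orf_end
    return five_prime, orf_length, three_prime


# Values taken from the abstract: 4.256 kb transcript, ORF at 254..3013.
five_prime, orf_length, three_prime = transcript_layout(4256, 254, 3013)
print(five_prime, orf_length, three_prime)

# An ORF should be a whole number of codons.
print(orf_length % 3 == 0, orf_length // 3)
```

    The three parts sum to the full transcript length, and the ORF length is divisible by three, so the reported figures agree with one another.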

  14. RNA deep sequencing reveals novel candidate genes and polymorphisms in boar testis and liver tissues with divergent androstenone levels.

    Directory of Open Access Journals (Sweden)

    Asep Gunawan

    Full Text Available Boar taint is an unpleasant smell and taste of pork meat derived from some entire male pigs. The main causes of boar taint are the two compounds androstenone (5α-androst-16-en-3-one) and skatole (3-methylindole). It is crucial to understand the genetic mechanism of boar taint to select pigs for lower androstenone levels and thus reduce boar taint. The aim of the present study was to investigate transcriptome differences in boar testis and liver tissues with divergent androstenone levels using RNA deep sequencing (RNA-Seq). The total number of reads produced for each testis and liver sample ranged from 13,221,550 to 33,206,723 and 12,755,487 to 46,050,468, respectively. In testis samples 46 genes were differentially regulated whereas 25 genes showed differential expression in the liver. The fold change values ranged from -4.68 to 2.90 in testis samples and -2.86 to 3.89 in liver samples. Differentially regulated genes in high androstenone testis and liver samples were enriched in metabolic processes such as lipid metabolism, small molecule biochemistry and molecular transport. This study provides evidence for transcriptome profile and gene polymorphisms of boars with divergent androstenone level using RNA-Seq technology. Digital gene expression analysis identified candidate genes in the flavin monooxygenase family, cytochrome P450 family and hydroxysteroid dehydrogenase family. Moreover, polymorphism and association analysis revealed that mutations in the IRG6, MX1, IFIT2, CYP7A1, FMO5 and KRT18 genes could be potential candidate markers for androstenone levels in boars. Further studies are required to confirm the role of these candidate genes before they are used in genomic selection against boar taint in pig breeding programs.

  15. Deep Sequencing of Suppression Subtractive Hybridisation Drought and Recovery Libraries of the Non-model Crop Trifolium repens L.

    Science.gov (United States)

    Bisaga, Maciej; Lowe, Matthew; Hegarty, Matthew; Abberton, Michael; Ravagnani, Adriana

    2017-01-01

    White clover is a short-lived perennial whose persistence is greatly affected by abiotic stresses, particularly drought. The aim of this work was to characterize its molecular response to water deficit and recovery following re-hydration to identify targets for the breeding of tolerant varieties. We created a white clover reference transcriptome of 16,193 contigs by deep sequencing (mean base coverage 387x) four Suppression Subtractive Hybridization (SSH) libraries (a forward and a reverse library for each treatment) constructed from young leaf tissue of white clover at the onset of the response to drought and recovery. Reads from individual libraries were then mapped to the reference transcriptome and processed comparing expression level data. The pipeline generated four robust sets of transcripts induced and repressed in the leaves of plants subjected to water deficit stress (6,937 and 3,142, respectively) and following re-hydration (6,695 and 4,897, respectively). Semi-quantitative polymerase chain reaction was used to verify the expression pattern of 16 genes. The differentially expressed transcripts were functionally annotated and mapped to biological processes and pathways. In agreement with similar studies in other crops, the majority of transcripts up-regulated in response to drought belonged to metabolic processes, such as amino acid, carbohydrate, and lipid metabolism, while transcripts involved in photosynthesis, such as components of the photosystem and the biosynthesis of photosynthetic pigments, were up-regulated during recovery. The data also highlighted the role of raffinose family oligosaccharides (RFOs) and the possible delayed response of the flavonoid pathways in the initial response of white clover to water withdrawal. The work presented in this paper is to our knowledge the first large scale molecular analysis of the white clover response to drought stress and re-hydration. The data generated provide a valuable genomic resource for marker

  16. A deep sequencing approach to comparatively analyze the transcriptome of lifecycle stages of the filarial worm, Brugia malayi.

    Directory of Open Access Journals (Sweden)

    Young-Jun Choi

    2011-12-01

    Full Text Available BACKGROUND: Developing intervention strategies for the control of parasitic nematodes continues to be a significant challenge. Genomic and post-genomic approaches play an increasingly important role for providing fundamental molecular information about these parasites, thus enhancing basic as well as translational research. Here we report a comprehensive genome-wide survey of the developmental transcriptome of the human filarial parasite Brugia malayi. METHODOLOGY/PRINCIPAL FINDINGS: Using deep sequencing, we profiled the transcriptome of eggs and embryos, immature (≤3 days of age) and mature microfilariae (MF), third- and fourth-stage larvae (L3 and L4), and adult male and female worms. Comparative analysis across these stages provided a detailed overview of the molecular repertoires that define and differentiate distinct lifecycle stages of the parasite. Genome-wide assessment of the overall transcriptional variability indicated that the cuticle collagen family and those implicated in molting exhibit noticeably dynamic stage-dependent patterns. Of particular interest was the identification of genes displaying sex-biased or germline-enriched profiles due to their potential involvement in reproductive processes. The study also revealed discrete transcriptional changes during larval development, namely those accompanying the maturation of MF and the L3 to L4 transition that are vital in establishing successful infection in mosquito vectors and vertebrate hosts, respectively. CONCLUSIONS/SIGNIFICANCE: Characterization of the transcriptional program of the parasite's lifecycle is an important step toward understanding the developmental processes required for the infectious cycle. We find that the transcriptional program has a number of stage-specific pathways activated during worm development. In addition to advancing our understanding of transcriptome dynamics, these data will aid in the study of genome structure and organization by facilitating

  17. Ultra-deep T cell receptor sequencing reveals the complexity and intratumour heterogeneity of T cell clones in renal cell carcinomas.

    Science.gov (United States)

    Gerlinger, Marco; Quezada, Sergio A; Peggs, Karl S; Furness, Andrew J S; Fisher, Rosalie; Marafioti, Teresa; Shende, Vishvesh H; McGranahan, Nicholas; Rowan, Andrew J; Hazell, Steven; Hamm, David; Robins, Harlan S; Pickering, Lisa; Gore, Martin; Nicol, David L; Larkin, James; Swanton, Charles

    2013-12-01

    The recognition of cancer cells by T cells can impact upon prognosis and be exploited for immunotherapeutic approaches. This recognition depends on the specific interaction between antigens displayed on the surface of cancer cells and the T cell receptor (TCR), which is generated by somatic rearrangements of TCR α- and β-chains (TCRb). Our aim was to assess whether ultra-deep sequencing of the rearranged TCRb in DNA extracted from unfractionated clear cell renal cell carcinoma (ccRCC) samples can provide insights into the clonality and heterogeneity of intratumoural T cells in ccRCCs, a tumour type that can display extensive genetic intratumour heterogeneity (ITH). For this purpose, DNA was extracted from two to four tumour regions from each of four primary ccRCCs and was analysed by ultra-deep TCR sequencing. In parallel, tumour infiltration by CD4, CD8 and Foxp3 regulatory T cells was evaluated by immunohistochemistry and correlated with TCR-sequencing data. A polyclonal T cell repertoire with 367-16 289 (median 2394) unique TCRb sequences was identified per tumour region. The frequencies of the 100 most abundant T cell clones/tumour were poorly correlated between most regions (Pearson correlation coefficient, -0.218 to 0.465). 3-93% of these T cell clones were not detectable across all regions. Thus, the clonal composition of T cell populations can be heterogeneous across different regions of the same ccRCC. T cell ITH was higher in tumours pretreated with an mTOR inhibitor, which could suggest that therapy can influence adaptive tumour immunity. These data show that ultra-deep TCR-sequencing technology can be applied directly to DNA extracted from unfractionated tumour samples, allowing novel insights into the clonality of T cell populations in cancers. These were polyclonal and displayed ITH in ccRCC. TCRb sequencing may shed light on mechanisms of cancer immunity and the efficacy of immunotherapy approaches.
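
    The between-region comparison described in this abstract rests on a Pearson correlation of clone frequencies. The following is a minimal sketch of that calculation for two regions, assuming each region is given as a mapping from clone identifier to frequency and that clones missing from a region count as zero; the clone-selection rule, the function names and the example data are illustrative assumptions, not the authors' pipeline.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def region_correlation(region_a, region_b, top_n=100):
    """Correlate frequencies of the top_n clones (ranked by summed frequency)."""
    clones = set(region_a) | set(region_b)
    # Sort by clone id first so that ties in frequency are broken reproducibly.
    ranked = sorted(
        sorted(clones),
        key=lambda c: region_a.get(c, 0.0) + region_b.get(c, 0.0),
        reverse=True,
    )[:top_n]
    freqs_a = [region_a.get(c, 0.0) for c in ranked]
    freqs_b = [region_b.get(c, 0.0) for c in ranked]
    return pearson(freqs_a, freqs_b)


# Hypothetical clone frequencies for two regions of one tumour.
r1 = {"clone1": 0.50, "clone2": 0.30, "clone3": 0.20}
r2 = {"clone1": 0.10, "clone2": 0.30, "clone4": 0.60}
print(region_correlation(r1, r2))
```

    Treating absent clones as zero frequency lets regions with partly disjoint repertoires be compared, at the cost of pulling the coefficient down when many clones are private to one region.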

  18. deepBase v2.0: identification, expression, evolution and function of small RNAs, LncRNAs and circular RNAs from deep-sequencing data.

    Science.gov (United States)

    Zheng, Ling-Ling; Li, Jun-Hao; Wu, Jie; Sun, Wen-Ju; Liu, Shun; Wang, Ze-Lin; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2016-01-04

    Small non-coding RNAs (e.g. miRNAs) and long non-coding RNAs (e.g. lincRNAs and circRNAs) are emerging as key regulators of various cellular processes. However, only a very small fraction of these enigmatic RNAs have been well functionally characterized. In this study, we describe deepBase v2.0 (http://biocenter.sysu.edu.cn/deepBase/), an updated platform, to decode evolution, expression patterns and functions of diverse ncRNAs across 19 species. deepBase v2.0 has been updated to provide the most comprehensive collection of ncRNA-derived small RNAs generated from 588 sRNA-Seq datasets. Moreover, we developed a pipeline named lncSeeker to identify 176 680 high-confidence lncRNAs from 14 species. Temporal and spatial expression patterns of various ncRNAs were profiled. We identified approximately 24 280 primate-specific, 5193 rodent-specific lncRNAs, and 55 highly conserved lncRNA orthologs between human and zebrafish. We annotated 14 867 human circRNAs, 1260 of which are orthologous to mouse circRNAs. By combining expression profiles and functional genomic annotations, we developed lncFunction web-server to predict the function of lncRNAs based on protein-lncRNA co-expression networks. This study is expected to provide considerable resources to facilitate future experimental studies and to uncover ncRNA functions.

  19. Cloning and Sequence Analysis of the cDNA Encoding Protein Interacting with C Kinase 1 (PICK1) in Moniezia expansa

    Institute of Scientific and Technical Information of China (English)

    赵文娟; 康立超; 薄新文; 王新华

    2011-01-01

    [Objective] To clone and identify novel genes from an adult Moniezia expansa (M. expansa) cDNA library and provide a foundation for further research on their function. [Method] A cDNA library was constructed from adult-stage M. expansa. Recombinant positive clones were picked at random and sequenced as expressed sequence tags (ESTs), and the full-length cDNA sequence was obtained by primer walking. The cDNA sequence encoding M. expansa PICK1 was analysed with bioinformatics tools, including a search for the open reading frame (ORF), translation of the nucleotide sequence into the protein sequence, nucleotide and amino acid similarity searches, and a preliminary prediction of secondary structure. [Result] A novel M. expansa gene, PICK1, was cloned and sequenced; it is 1,527 bp long and encodes 447 amino acids, and clear BAR and PDZ domains were predicted in the coding sequence. The sequence was submitted to GenBank under accession number GH291479. The theoretical molecular mass of the encoded protein was 50.1733 ku and its isoelectric point (pI) was 5.22. [Conclusion] The full-length cDNA encoding M. expansa PICK1 was obtained, laying the foundation for experimental characterisation of the function of this gene.
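
    The ORF search and translation steps named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the standard genetic code, scans only the three forward reading frames, and requires an ATG start and an in-frame stop codon; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def longest_orf(cdna):
    """Return (start, end, protein) for the longest forward-strand ORF.

    start and end are 0-based, end is exclusive and includes the stop codon.
    Returns (0, 0, "") when no complete ORF is found.
    """
    seq = cdna.upper().replace("U", "T")
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if i + 3 - start > best[1] - best[0]:
                    protein = "".join(
                        CODON_TABLE.get(seq[j:j + 3], "X")  # X for ambiguous codons
                        for j in range(start, i, 3)
                    )
                    best = (start, i + 3, protein)
                start = None  # look for the next ORF in this frame
    return best
```

    Under this convention an ORF that includes its stop codon is 3 × (number of amino acids + 1) bases long.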

  20. Complete Genome Sequence of the Hyperthermophilic Archaeon Pyrococcus sp. Strain ST04, Isolated from a Deep-Sea Hydrothermal Sulfide Chimney on the Juan de Fuca Ridge

    Science.gov (United States)

    Jung, Jong-Hyun; Lee, Ju-Hoon; Holden, James F.; Seo, Dong-Ho; Shin, Hakdong; Kim, Hae-Yeong; Kim, Wooki; Ryu, Sangryeol

    2012-01-01

    Pyrococcus sp. strain ST04 is a hyperthermophilic, anaerobic, and heterotrophic archaeon isolated from a deep-sea hydrothermal sulfide chimney on the Endeavour Segment of the Juan de Fuca Ridge in the northeastern Pacific Ocean. To further understand the distinct characteristics of this archaeon at the genome level (polysaccharide utilization at high temperature and ATP generation by a Na+ gradient), the genome of strain ST04 was completely sequenced and analyzed. Here, we present the complete genome sequence analysis results of Pyrococcus sp. ST04 and report the major findings from the genome annotation, with a focus on its saccharolytic and metabolite production potential. PMID:22843576

  1. Higher and lower-level relationships of the deep-sea fish order Alepocephaliformes (Teleostei: Otocephala) inferred from whole mitogenome sequences

    DEFF Research Database (Denmark)

    Poulsen, Jan Yde; Møller, Peter Rask; Lavoué, Sébastien

    2009-01-01

    Fishes of the order Alepocephaliformes, slickheads and tubeshoulders, constitute a group of deep-sea fishes poorly known in respect to most areas of their biology and systematics. Morphological studies have found alepocephaliform fishes to display a mosaic of synapomorphic and symplesiomorphic...... are alepocephaliforms and unambiguously aligned sequences were subjected to partitioned maximum likelihood and Bayesian analyses. Results from the present study support Alepocephaliformes as a genetically distinct otocephalan order as sister clade to Ostariophysi (mostly freshwater fishes comprising Gonorynchiformes...

  2. Cloning and Sequence Analysis of the Plasma Membrane Na+/H+ Antiporter cDNA in Recretohalophyte Reaumuria trigyna Maxim

    Institute of Scientific and Technical Information of China (English)

    党振华; 郑琳琳; 冯智; 王迎春

    2013-01-01

    Reaumuria trigyna Maxim. is an endangered small shrub with the features of a recretohalophyte. This strongly xerophytic species is endemic to the Eastern Alxa-Western Ordos area of Inner Mongolia and has developed distinctive strategies to adapt to the semi-desert and salty soil environment. A full-length cDNA of the plasma membrane Na+/H+ antiporter (RtSOS1) was isolated from this species using RT-PCR and RACE. The 3,829 bp sequence comprises a 3,438 bp open reading frame encoding a 1,145-amino-acid protein with a predicted molecular weight of 126.76 kDa. Bioinformatic analysis indicates that RtSOS1 has 11 transmembrane domains in its N-terminal portion and a hydrophilic cytoplasmic tail of approximately 700 amino acids in its C-terminal portion. A phosphorylation domain and an auto-inhibitory domain are found in the C-terminal region. Homology comparison and phylogenetic analysis showed that RtSOS1 is closely related to the plasma membrane Na+/H+ antiporters of other plant species.

  3. An efficient strategy of screening for pathogens in wild-caught ticks and mosquitoes by reusing small RNA deep sequencing data.

    Directory of Open Access Journals (Sweden)

    Lu Zhuang

    Full Text Available This paper explored our hypothesis that the sRNA (18 ∼ 30 bp) deep sequencing technique can be used as an efficient strategy to identify microorganisms other than viruses, such as prokaryotic and eukaryotic pathogens. In the study, the clean reads derived from the sRNA deep sequencing data of wild-caught ticks and mosquitoes were compared against the NCBI nucleotide collection (non-redundant nt) database using Blastn. The blast results were then analyzed with in-house Python scripts. An empirical formula was proposed to identify the putative pathogens. Results showed that not only viruses but also prokaryotic and eukaryotic species of interest can be screened out and were subsequently confirmed with experiments. Specifically, a novel Rickettsia spp. was indicated to exist in Haemaphysalis longicornis ticks collected in Beijing. Our study demonstrated that the reuse of sRNA deep sequencing data has the potential to trace the origin of pathogens or discover novel agents of emerging/re-emerging infectious diseases.
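
    The abstract does not give the in-house scripts or the empirical formula, so the following is only a hedged sketch of one plausible first step: tallying reads per database subject from Blastn tabular output. It assumes the standard 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the identity and E-value cut-offs and the function name are illustrative assumptions, not values from the study.

```python
import csv
from collections import Counter


def count_best_hits(lines, min_identity=95.0, max_evalue=1e-3):
    """Count reads per subject, keeping only the best hit of each read.

    lines: iterable of tab-separated Blastn hits in 12-column tabular format.
    The thresholds are placeholders, not the published criteria.
    """
    best = {}  # read id -> (bitscore, subject id)
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 12 or row[0].startswith("#"):
            continue  # skip comments and malformed lines
        qseqid, sseqid = row[0], row[1]
        pident = float(row[2])
        evalue = float(row[10])
        bitscore = float(row[11])
        if pident < min_identity or evalue > max_evalue:
            continue
        if qseqid not in best or bitscore > best[qseqid][0]:
            best[qseqid] = (bitscore, sseqid)
    return Counter(subject for _, subject in best.values())
```

    Subjects with many supporting reads would then be candidates for the kind of experimental confirmation the abstract describes.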

  4. Construction and identification of the fusion clone of the SCF 5' flanking sequence and its full-length cDNA

    Institute of Scientific and Technical Information of China (English)

    谭运年; 谭文斌; 彭兴华

    2001-01-01

    Objective: To study the effect of different DNA sequences within the 1.42 kb 5' flanking region of human stem cell factor (SCF) on the expression and regulation of its full-length cDNA in eukaryotic cells, a fusion clone of the 1.42 kb SCF 5' flanking sequence and the full-length cDNA was constructed. Methods: The 1.42 kb flanking sequence and the 1.2 kb full-length cDNA were obtained by PCR from human genomic DNA and by RT-PCR from HepG2 mRNA, respectively, then cloned into the pGEM-T cloning vector and identified. Both clones were digested with suitable restriction endonucleases, and three DNA fragments (480 bp, 980 bp and the 1.2 kb cDNA) were recovered. Finally, the three fragments were subcloned in turn into the pUC19 cloning vector. Results: The fusion clone of the 1.42 kb SCF 5' flanking sequence and its full-length cDNA was successfully constructed. Conclusion: T-vectors remain effective for cloning PCR products amplified with a small amount of added polymerase that has 3'→5' exonuclease activity. When cloning larger fragments, subcloning the insert stepwise as smaller fragments, so as to avoid interfering restriction sites, is a useful and effective approach.

  5. Main: Sequences [KOME

    Lifescience Database Archive (English)

    Full Text Available Sequences Amino Acid Sequence Amino Acid sequence of full length cDNA (Longest ORF) kome_ine_full_seq...uence_amino_db.fasta.zip kome_ine_full_sequence_amino_db.zip kome_ine_full_sequence_amino_db ...

  6. Cloning and Sequence Analysis of Mannose-6-phosphate Isomerase cDNA from Metarhizium anisopliae ZJ1109

    Institute of Scientific and Technical Information of China (English)

    李亚

    2011-01-01

    Degenerate primers were designed from the conserved nucleotide region of the mannose-6-phosphate isomerase gene in Metarhizium, and the complete cDNA sequence of the mpi gene of Metarhizium anisopliae was amplified by RT-PCR and RACE-PCR. The complete cDNA sequence was 1513 bp and the complete ORF was 1328 bp, encoding a protein of 441 amino acid residues. BLAST analysis indicated that the deduced amino acid sequence of the mpi gene from M. anisopliae shows high homology with those of other fungi. Protein structure analysis showed that the MPI protein has the structural characteristics of a conserved phosphatase and consists mainly of α-helix and random coil.

  7. Rescue of mumps virus from cDNA.

    Science.gov (United States)

    Clarke, D K; Sidhu, M S; Johnson, J E; Udem, S A

    2000-05-01

    A complete DNA copy of the genome of a Jeryl Lynn strain of mumps virus (15,384 nucleotides) was assembled from cDNA fragments such that an exact antigenome RNA could be generated following transcription by T7 RNA polymerase and cleavage by hepatitis delta virus ribozyme. The plasmid containing the genome sequence, together with support plasmids which express mumps virus NP, P, and L proteins under control of the T7 RNA polymerase promoter, were transfected into A549 cells previously infected with recombinant vaccinia virus (MVA-T7) that expressed T7 RNA polymerase. Rescue of infectious virus from the genome cDNA was demonstrated by amplification of mumps virus from transfected-cell cultures and by subsequent consensus sequencing of reverse transcription-PCR products generated from infected-cell RNA to verify the presence of specific nucleotide tags introduced into the genome cDNA clone. The only coding change (position 8502, A to G) in the cDNA clone relative to the consensus sequence of the Jeryl Lynn plaque isolate from which it was derived, resulting in a lysine-to-arginine substitution at amino acid 22 of the L protein, did not prevent rescue of mumps virus, even though an amino acid alignment for the L proteins of paramyxoviruses indicates that lysine is highly conserved at that position. This system may provide the basis of a safe and effective virus vector for the in vivo expression of immunologically and biologically active proteins, peptides, and RNAs.

  8. Identification and characterization of microRNAs by deep-sequencing in Hyalomma anatolicum anatolicum (Acari: Ixodidae) ticks.

    Science.gov (United States)

    Luo, Jin; Liu, Guang-Yuan; Chen, Ze; Ren, Qiao-Yun; Yin, Hong; Luo, Jian-Xun; Wang, Hui

    2015-06-15

    Hyalomma anatolicum anatolicum (H.a. anatolicum) (Acari: Ixodidae) ticks are globally distributed ectoparasites with veterinary and medical importance. These ticks not only weaken animals by sucking their blood but also transmit different species of parasitic protozoans. Multiple factors influence these parasitic infections including miRNAs, which are non-coding, small regulatory RNA molecules essential for the complex life cycle of parasites. To identify and characterize miRNAs in H.a. anatolicum, we developed an integrative approach combining deep sequencing, bioinformatics and real-time PCR analysis. Here we report the use of this approach to identify miRNA expression, family distribution, and nucleotide characteristics, and discovered novel miRNAs in H.a. anatolicum. The result showed that miR-1-3p, miR-275-3p, and miR-92a were expressed abundantly. There was a strong bias on miRNA, family members, and nucleotide compositions at certain positions in H.a. anatolicum miRNA. Uracil was the dominant nucleotide, particularly at positions 1, 6, 16, and 18, which were located approximately at the beginning, middle, and end of conserved miRNAs. Analysis of the conserved miRNAs indicated that miRNAs in H.a. anatolicum were concentrated along three diverse phylogenetic branches of bilaterians, insects and coelomates. Two possible roles for the use of miRNA in H.a. anatolicum could be presumed based on its parasitic life cycle: to maintain a large category of miRNA families of different animals, and/or to preserve stringent conserved seed regions with active changes in other places of miRNAs mainly in the middle and the end regions. These might help the parasite to undergo its complex life style in different hosts and adapt more readily to the host changes. The present study represents the first large scale characterization of H.a. anatolicum miRNAs, which could further the understanding of the complex biology of this zoonotic parasite, as well as initiate miRNA studies
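
    The positional nucleotide bias described above (for example, uracil dominating certain positions) can be computed with a short tally. This is a minimal sketch assuming the mature miRNA sequences are available as plain strings; the function names are hypothetical and sequences of different lengths simply contribute to the positions they cover.

```python
from collections import Counter


def positional_composition(sequences):
    """Return a list whose entry i maps nucleotide -> fraction at position i + 1."""
    counts = []
    for seq in sequences:
        seq = seq.upper().replace("T", "U")  # report in RNA alphabet
        for i, base in enumerate(seq):
            if i == len(counts):
                counts.append(Counter())
            counts[i][base] += 1
    return [
        {base: n / sum(c.values()) for base, n in c.items()}
        for c in counts
    ]


def dominant_base(sequences):
    """Most frequent nucleotide at each position (ties resolved arbitrarily)."""
    return [max(frac, key=frac.get) for frac in positional_composition(sequences)]
```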

  9. Deep Sequencing Analysis of miRNA Expression in Breast Muscle of Fast-Growing and Slow-Growing Broilers

    Directory of Open Access Journals (Sweden)

    Hongjia Ouyang

    2015-07-01

    Full Text Available Growth performance is an important economic trait in chicken. MicroRNAs (miRNAs) have been shown to play important roles in various biological processes, but their functions in chicken growth are not yet clear. To investigate the function of miRNAs in chicken growth, breast muscle tissues of the two-tail samples (highest and lowest body weight) from Recessive White Rock (WRR) and Xinghua Chickens (XH) were subjected to high-throughput small RNA deep sequencing. In this study, a total of 921 miRNAs were identified, including 733 known mature miRNAs and 188 novel miRNAs. There were 200, 279, 257 and 297 differentially expressed miRNAs in the comparisons of WRRh vs. WRRl, WRRh vs. XHh, WRRl vs. XHl, and XHh vs. XHl group, respectively. A total of 22 highly differentially expressed miRNAs (fold change > 2 or < 0.5; p-value < 0.05; q-value < 0.01), which also have abundant expression (read counts > 1000), were found in our comparisons. As far as two analyses (WRRh vs. WRRl, and XHh vs. XHl) are concerned, we found 80 common differentially expressed miRNAs, while 110 miRNAs were found in WRRh vs. XHh and WRRl vs. XHl. Furthermore, 26 common miRNAs were identified among all four comparisons. Four differentially expressed miRNAs (miR-223, miR-16, miR-205a and miR-222b-5p) were validated by quantitative real-time RT-PCR (qRT-PCR). Regulatory networks of interactions among miRNAs and their targets were constructed using integrative miRNA target-prediction and network-analysis. Growth hormone receptor (GHR) was confirmed as a target of miR-146b-3p by dual-luciferase assay and qPCR, indicating that miR-34c, miR-223, miR-146b-3p, miR-21 and miR-205a are key growth-related target genes in the network. These miRNAs are proposed as candidate miRNAs for future studies concerning miRNA-target function on regulation of chicken growth.
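
    The selection criteria quoted in this abstract (fold change > 2 or < 0.5, p-value < 0.05, q-value < 0.01, read counts > 1000) translate directly into a filter. The sketch below assumes each miRNA comparison has already been reduced to one record; the record layout, the names, and the choice of which sample's read count is tested are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class MirnaResult:
    name: str           # miRNA identifier (hypothetical in the example below)
    fold_change: float  # expression ratio between the two groups
    p_value: float
    q_value: float      # multiple-testing adjusted p-value
    read_count: int     # assumed: reads supporting the miRNA in the comparison


def highly_differential(results, min_reads=1000):
    """Keep miRNAs passing all four thresholds stated in the abstract."""
    return [
        r for r in results
        if (r.fold_change > 2 or r.fold_change < 0.5)
        and r.p_value < 0.05
        and r.q_value < 0.01
        and r.read_count > min_reads
    ]
```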

  10. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation.

    Science.gov (United States)

    Costello, Maura; Pugh, Trevor J; Fennell, Timothy J; Stewart, Chip; Lichtenstein, Lee; Meldrim, James C; Fostel, Jennifer L; Friedrich, Dennis C; Perrin, Danielle; Dionne, Danielle; Kim, Sharon; Gabriel, Stacey B; Lander, Eric S; Fisher, Sheila; Getz, Gad

    2013-04-01

    As researchers begin probing deep coverage sequencing data for increasingly rare mutations and subclonal events, the fidelity of next generation sequencing (NGS) laboratory methods will become increasingly critical. Although error rates for sequencing and polymerase chain reaction (PCR) are well documented, the effects that DNA extraction and other library preparation steps could have on downstream sequence integrity have not been thoroughly evaluated. Here, we describe the discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data. Characteristics such as sequencer read orientation and presence in both tumor and normal samples strongly indicated a non-biological mechanism. We identified the source as oxidation of DNA during acoustic shearing in samples containing reactive contaminants from the extraction process. We show generation of 8-oxoguanine (8-oxoG) lesions during DNA shearing, present analysis tools to detect oxidation in sequencing data and suggest methods to reduce DNA oxidation through the introduction of antioxidants. Further, informatics methods are presented to confidently filter these artifacts from sequencing data sets. Though only seen in a low percentage of reads in affected samples, such artifacts could have profoundly deleterious effects on the ability to confidently call rare mutations, and eliminating other possible sources of artifacts should become a priority for the research community.

  11. Preparation of cDNA libraries from vascular cells.

    Science.gov (United States)

    Lieb, M E; Taubman, M B

    1999-01-01

    The vast majority of past and present efforts in the molecular cloning of expressed sequences involve isolation of clones from cDNA libraries constructed in bacteriophage lambda (1,2). As discussed in Chapter 6, screening these cDNA libraries using labeled probes remains the most straightforward method to isolate full length cDNAs for which some partial sequence information is known. Although the availability of high quality reagents and kits over the past decade has made the process of library construction increasingly straightforward, generation of high-quality libraries is a task that still requires a fair amount of dedicated effort. Because alternative PCR-based cloning strategies have become increasingly popular alternatives to cDNA library screening, it is useful to consider the advantages and disadvantages of each strategy before embarking on a project to construct a cDNA library (Table 1). In our opinion, it is worthwhile to construct a cDNA library when the transcript of interest is not exceedingly rare (i.e., can readily be detected by Northern blot analysis of total RNA), when multiple cDNAs will need to be cloned over a period of time, and in situations where occasional mutations cannot be tolerated (for example, if the cDNA is to be expressed in mammalian cells to examine function). In situations where the transcript of interest is expressed at exceedingly low levels, or when only a single cDNA needs to be cloned, a PCR-based strategy should be considered. When the tissue source is precious (such as a unique clinical specimen), successful construction of a phage library provides a resource that can be amplified and used for multiple cloning projects over many years, but runs the risk of consuming the available RNA if the library construction fails. [Table 1: Comparison of Relative Advantages of cDNA Cloning from Lambda Phage Libraries by Plaque Hybridization Compared to Newer PCR-Based Strategies. Columns: Lambda phage cDNA library; PCR-based strategy.]

  12. Cloning and sequence analysis of cDNA coding for group Ⅰ allergen of Dermatophagoides farinae (Der f Ⅰ)

    Institute of Scientific and Technical Information of China (English)

    郝敏麒; 徐军; 钟南山

    2001-01-01

    Objective: To acquire the cDNA of the group Ⅰ allergen (Der f Ⅰ) of Dermatophagoides farinae from Guangzhou, China, for later use in constructing a DNA vaccine and expressing the recombinant protein. Methods: Live mites from the local area, which had been identified and cultured, were picked and total RNA was extracted. The Der f Ⅰ cDNA was amplified by RT-PCR, then subcloned into a vector and sequenced. Results: A fragment of 632 bases was obtained. Its homology with the published sequence in GenBank (emb|X65196.1|, after removal of introns) was 99% at the nucleotide level, with six bases differing; the homology of the deduced amino acid sequence was 100%. Conclusion: This is the first time the cDNA of Der f Ⅰ from Guangzhou has been obtained, and its sequence is highly homologous to the published one.

  13. Cloning and expression analysis of MBLL cDNA

    Institute of Scientific and Technical Information of China (English)

    2000-01-01

    The mbl (muscleblind) gene of Drosophila encodes a nuclear protein that contains two Cys3His motifs. Mutation of the mbl gene disturbs the differentiation of all Drosophila photoreceptors. Primers were designed from human EST086139, which is highly homologous to the mbl gene. A human fetal brain cDNA library was screened and a novel cDNA clone was obtained. The 2595 bp cDNA, designated MBLL (muscleblind-like), contains an open reading frame that encodes 255 amino acids and has 4 Cys3His motifs (GenBank Acc. AF061261). The amino acid sequence shares high homology with Drosophila mbl. Northern blot and RNA dot blot hybridization of 43 human adult tissues and 7 fetal tissues show that MBLL is widely expressed, but that expression levels differ among these tissues.

  14. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Science.gov (United States)

    Pérez, Ruben; Calleros, Lucía; Marandino, Ana; Sarute, Nicolás; Iraola, Gregorio; Grecco, Sofia; Blanc, Hervé; Vignuzzi, Marco; Isakov, Ofer; Shomron, Noam; Carrau, Lucía; Hernández, Martín; Francia, Lourdes; Sosa, Katia; Tomás, Gonzalo; Panzera, Yanina

    2014-01-01

    Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  15. Phylogenetic and genome-wide deep-sequencing analyses of canine parvovirus reveal co-infection with field variants and emergence of a recent recombinant strain.

    Directory of Open Access Journals (Sweden)

    Ruben Pérez

    Full Text Available Canine parvovirus (CPV), a fast-evolving single-stranded DNA virus, comprises three antigenic variants (2a, 2b, and 2c) with different frequencies and genetic variability among countries. The contribution of co-infection and recombination to the genetic variability of CPV is far from being fully elucidated. Here we took advantage of a natural CPV population, recently formed by the convergence of divergent CPV-2c and CPV-2a strains, to study co-infection and recombination. Complete sequences of the viral coding region of CPV-2a and CPV-2c strains from 40 samples were generated and analyzed using phylogenetic tools. Two samples showed co-infection and were further analyzed by deep sequencing. The sequence profile of one of the samples revealed the presence of CPV-2c and CPV-2a strains that differed at 29 nucleotides. The other sample included a minor CPV-2a strain (13.3% of the viral population) and a major recombinant strain (86.7%). The recombinant strain arose from inter-genotypic recombination between CPV-2c and CPV-2a strains within the VP1/VP2 gene boundary. Our findings highlight the importance of deep-sequencing analysis to provide a better understanding of CPV molecular diversity.

  16. Polymerase reaction without primers throughout for the reconstruction of full-length cDNA from products of rapid amplification of cDNA ends (RACE).

    Science.gov (United States)

    Sunohara, Mitsuhiro; Kawakami, Masanori; Kage, Hidenori; Watanabe, Kousuke; Emoto, Noriko; Nagase, Takahide; Ohishi, Nobuya; Takai, Daiya

    2011-07-01

    Rapid amplification of cDNA ends (RACE) has widely been used to determine both ends of the cDNA from its partial sequence. Conventionally, 5'- and 3'-RACE products were ligated at a restriction site in the overlap region to reconstruct the full-length cDNA; however, reconstruction is difficult if no appropriate restriction enzymes are available. Here, we report a novel method to reconstruct full-length cDNA with DNA polymerase. Instead of usual PCR, chain reactions were avoided and the elongation time was shortened, which enables non-specific products or undesired point mutations to be minimized. We successfully reconstructed and TA-cloned a full-length cDNA of echinoderm microtubule-associated protein-like 4 (EML4)-anaplastic lymphoma kinase (ALK) fusion gene variant 2 from RACE products obtained from a surgically resected lung adenocarcinoma sample. We also evaluated some parameters to provide recommendations for this new method.

  17. Finding the needle in the haystack: differentiating "identical" twins in paternity testing and forensics by ultra-deep next generation sequencing.

    Science.gov (United States)

    Weber-Lehmann, Jacqueline; Schilling, Elmar; Gradl, Georg; Richter, Daniel C; Wiehler, Jens; Rolf, Burkhard

    2014-03-01

    Monozygotic (MZ) twins are considered to be genetically identical and therefore cannot be differentiated using standard forensic DNA testing. Here we describe how identification of extremely rare mutations by ultra-deep next generation sequencing can solve such cases. We sequenced DNA from sperm samples of two twins and from a blood sample of the child of one twin. Bioinformatics analysis revealed five single nucleotide polymorphisms (SNPs) present in the twin father and the child, but not in the twin uncle. The SNPs were confirmed by classical Sanger sequencing. Our results give experimental evidence for the hypothesis that rare mutations will occur early after the human blastocyst has split into two, the origin of twins, and that such mutations will be carried on into somatic tissue and the germline. The method provides a way to solve paternity and forensic cases involving monozygotic twins as alleged fathers or originators of DNA traces.
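
    The discriminating step described above, finding variants present in one twin and the child but absent from the other twin, reduces to set operations once variants have been called. The sketch below assumes each sample's calls are given as (chromosome, position, reference, alternate) tuples; the function name and the example coordinates are hypothetical, and real analyses would also need coverage and error-rate checks that are not shown.

```python
def twin_discriminating_variants(father, child, uncle):
    """Variants seen in both the alleged father and the child but not in the co-twin.

    Each argument is an iterable of (chrom, pos, ref, alt) tuples.
    """
    return (set(father) & set(child)) - set(uncle)
```

    For example, with hypothetical calls:

```python
father = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr3", 42, "G", "A")}
child = {("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr9", 7, "T", "C")}
uncle = {("chr2", 500, "C", "T"), ("chr3", 42, "G", "A")}
print(twin_discriminating_variants(father, child, uncle))
```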

  18. Transcriptional Slippage and RNA Editing Increase the Diversity of Transcripts in Chloroplasts: Insight from Deep Sequencing of Vigna radiata Genome and Transcriptome.

    Directory of Open Access Journals (Sweden)

    Ching-Ping Lin

    Full Text Available We performed deep sequencing of the nuclear and organellar genomes of three mungbean genotypes: Vigna radiata ssp. sublobata TC1966, V. radiata var. radiata NM92 and the recombinant inbred line RIL59 derived from a cross between TC1966 and NM92. Moreover, we performed deep sequencing of the RIL59 transcriptome to investigate transcript variability. The mungbean chloroplast genome has a quadripartite structure including a pair of inverted repeats separated by two single copy regions. A total of 213 simple sequence repeats were identified in the chloroplast genomes of NM92 and RIL59; 78 single nucleotide variants and nine indels were discovered in comparing the chloroplast genomes of TC1966 and NM92. Analysis of the mungbean chloroplast transcriptome revealed mRNAs that were affected by transcriptional slippage and RNA editing. Transcriptional slippage frequency was positively correlated with the length of simple sequence repeats of the mungbean chloroplast genome (R² = 0.9911). In total, 41 C-to-U editing sites were found in 23 chloroplast genes and in one intergenic spacer. No editing site that swapped U to C was found. A combination of bioinformatics and experimental methods revealed that the plastid-encoded RNA polymerase-transcribed genes psbF and ndhA are affected by transcriptional slippage in mungbean and in main lineages of land plants, including three dicots (Glycine max, Brassica rapa, and Nicotiana tabacum), two monocots (Oryza sativa and Zea mays), two gymnosperms (Pinus taeda and Ginkgo biloba) and one moss (Physcomitrella patens). Transcript analysis of the rps2 gene showed that transcriptional slippage could affect transcripts at single sequence repeat regions with poly-A runs. It showed that transcriptional slippage together with incomplete RNA editing may cause sequence diversity of transcripts in chloroplasts of land plants.
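
    The reported relationship between slippage frequency and repeat length is summarised by a coefficient of determination. The abstract does not state which model produced that value, so the sketch below assumes an ordinary least-squares straight line and shows how R² would be computed for it; the function name is hypothetical.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R^2.

    x: repeat lengths, y: slippage frequencies (equal-length sequences).
    Assumption: a simple linear model; the study may have used another fit.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, 1 - ss_res / ss_tot
```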

  19. Arthropod phylogenetics in light of three novel millipede (Myriapoda: Diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Directory of Open Access Journals (Sweden)

    Michael S Brewer

    Full Text Available BACKGROUND: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. RESULTS: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. CONCLUSIONS: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic

  20. Cloning and Sequence Analysis of the Full-Length cDNA of H-FABP Gene in Lanzhou Fat-Tailed Sheep

    Institute of Scientific and Technical Information of China (English)

    徐红伟; 柏家林; 冯玉兰; 曹忻; 蔡勇; 金方圆; 达小强; 杨具田; 臧荣鑫

    2013-01-01

    [Objective] To clone the full-length cDNA of the heart-type fatty acid-binding protein (H-FABP) gene in Lanzhou fat-tailed sheep, providing a theoretical basis for studying its biological function and production applications in sheep. [Method] 5′ and 3′ gene-specific primers were designed from known mammalian H-FABP cDNA sequences, and rapid amplification of cDNA ends (RACE) was used to obtain the full-length cDNA of the H-FABP gene in Lanzhou fat-tailed sheep. [Results] A 425 bp 5′-RACE fragment and a 231 bp 3′-RACE fragment were obtained using cDNA reverse-transcribed from skeletal muscle RNA as template, and nested PCR was used to clone a 177 bp intermediate fragment. Assembling the fragments gave a 748 bp full-length H-FABP cDNA (GenBank accession number: JQ780322). The ORF is 402 bp long and encodes 133 amino acids. Nucleotide sequence analysis showed that the sequence is similar to that of most mammals, but a base substitution at position 66 (T↔G) makes the encoded residue 22 N rather than the lysine (K) found in all other species. Phylogenetic tree analysis showed that Lanzhou fat-tailed sheep is most closely related to goat. The predicted spatial structure of the protein resembles that of goat and cattle H-FABP: it consists of 2 α-helices and 10 antiparallel β-strands, with the 10 strands enclosing a barrel whose interior holds the hydrophobic residues that bind fatty acids. [Conclusion] The H-FABP gene of Lanzhou fat-tailed sheep was cloned, laying a foundation for further study of its function.
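
    The length figures in this record can be cross-checked with simple codon arithmetic. The sketch below is only an illustration, not the authors' method: it assumes the reported 402 bp ORF length includes the stop codon and that nucleotide position 66 is counted from the first base of the ORF; the function names are hypothetical.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption: the ORF length is a multiple of 3 and, by default,
    counts the terminal stop codon, which encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def nucleotide_to_codon(position):
    """Map a 1-based ORF nucleotide position to (codon number, base within codon).

    Both returned values are 1-based. Assumption: position 1 is the
    first base of the start codon.
    """
    if position < 1:
        raise ValueError("position is 1-based")
    return (position - 1) // 3 + 1, (position - 1) % 3 + 1


print(orf_to_residues(402))     # 133
print(nucleotide_to_codon(66))  # (22, 3)
```

    Under these assumptions the 402 bp ORF gives 133 residues, and nucleotide 66 is the third base of codon 22, which agrees with the residue number given in the abstract.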

  1. A Drosophila full-length cDNA resource

    Energy Technology Data Exchange (ETDEWEB)

    Stapleton, Mark; Carlson, Joseph; Brokstein, Peter; Yu, Charles; Champe, Mark; George, Reed; Guarin, Hannibal; Kronmiller, Brent; Pacleb, Joanne; Park, Soo; Rubin, Gerald M.; Celniker, Susan E.

    2003-05-09

    Background: A collection of sequenced full-length cDNAs is an important resource both for functional genomics studies and for the determination of the intron-exon structure of genes. Providing this resource to the Drosophila melanogaster research community has been a long-term goal of the Berkeley Drosophila Genome Project. We have previously described the Drosophila Gene Collection (DGC), a set of putative full-length cDNAs that was produced by generating and analyzing over 250,000 expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages. Results: We have generated high-quality full-insert sequence for 8,921 clones in the DGC. We compared the sequence of these clones to the annotated Release 3 genomic sequence, and identified more than 5,300 cDNAs that contain a complete and accurate protein-coding sequence. This corresponds to at least one splice form for 40 percent of the predicted D. melanogaster genes. We also identified potential new cases of RNA editing. Conclusions: We show that comparison of cDNA sequences to a high-quality annotated genomic sequence is an effective approach to identifying and eliminating defective clones from a cDNA collection and ensuring its utility for experimentation. Clones were eliminated either because they carry single nucleotide discrepancies, which most probably result from reverse transcriptase errors, or because they are truncated and contain only part of the protein-coding sequence.

  2. Full-length cDNA cloning and sequence characteristics analysis of BSP Ⅱ gene from Sika deer antler tissue

    Institute of Scientific and Technical Information of China (English)

    郝丽

    2011-01-01

    The full-length cDNA of the BSP Ⅱ gene, a novel bone growth factor associated with bone formation and bone remodeling, was cloned from a full-length cDNA library of Sika deer velvet antler tip tissue. Bioinformatics methods and real-time quantitative RT-PCR were used to analyze its amino acid sequence and its expression in different tissue layers of the antler tip. The full-length cDNA of the BSP Ⅱ gene was 1,576 bp and encoded a peptide of 311 amino acids. The encoded protein had a relative molecular mass of 34,100 and a theoretical isoelectric point of 4.05, and contained an N-terminal signal peptide and a transmembrane domain; Glu was the most abundant residue in the primary structure, and the secondary structure consisted mainly of α-helix and random coil. Homologous sequence analysis indicated that Sika deer BSP Ⅱ was highly similar to that of Bos taurus (93%); multiple sequence comparison showed that the N-terminus and the Glu-rich regions at positions 68-215 and 265-308 were highly conserved; and the molecular evolutionary tree showed that Sika deer BSP Ⅱ was closely related to that of Equus caballus. Real-time PCR results showed that there was a significant positive correlation between its exp