WorldWideScience

Sample records for solexa sequencing technology

  1. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  2. Identification of microRNA-Like RNAs in the filamentous fungus Trichoderma reesei by solexa sequencing.

    Directory of Open Access Journals (Sweden)

    Kang Kang

    Full Text Available MicroRNAs (miRNAs) are non-coding small RNAs (sRNAs) capable of negatively regulating gene expression. Recently, microRNA-like small RNAs (milRNAs) were discovered in several filamentous fungi but not yet in Trichoderma reesei, an industrial filamentous fungus that can secrete abundant hydrolases. To explore the presence of milRNAs in T. reesei and evaluate their expression under cellulose induction, two T. reesei sRNA libraries, one from cellulose induction (IN) and one from non-induction (CON), were generated and sequenced using Solexa sequencing technology. A total of 726 and 631 sRNAs were obtained from the IN and CON samples, respectively. Global expression analysis showed extensive differential expression of sRNAs in T. reesei under the two conditions. Thirteen predicted milRNAs were identified in T. reesei based on short hairpin structure analysis. The milRNA profiles obtained by deep sequencing were further validated by RT-qPCR assay. Computational analysis predicted a number of potential targets relating to many processes, including regulation of enzyme expression. The presence and differential expression of T. reesei milRNAs imply that milRNAs might play a role in T. reesei growth and cellulase induction. This work lays a foundation for further functional study of fungal milRNAs and their industrial application.

  3. A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

    Directory of Open Access Journals (Sweden)

    Scoté-Blachon Céline

    2008-09-01

    Full Text Available Background: "Open" transcriptome analysis methods allow the study of gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: To analyse the mouse hypothalamus transcriptome in depth while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome that is as reliable as a LongSAGE library sequenced by the Sanger method.

  4. Discovery of cashmere goat (Capra hircus) microRNAs in skin and hair follicles by Solexa sequencing.

    Science.gov (United States)

    Yuan, Chao; Wang, Xiaolong; Geng, Rongqing; He, Xiaolin; Qu, Lei; Chen, Yulin

    2013-07-28

    MicroRNAs (miRNAs) are a large family of endogenous, non-coding RNAs, about 22 nucleotides long, which regulate gene expression through sequence-specific base pairing with target mRNAs. Extensive studies have shown that miRNA expression in the skin changes remarkably during distinct stages of the hair cycle in humans, mice, goats and sheep. In this study, the skin tissues were harvested from the three stages of hair follicle cycling (anagen, catagen and telogen) in a fibre-producing goat breed. In total, 63,109,004 raw reads were obtained by Solexa sequencing and 61,125,752 clean reads remained for the small RNA digitalisation analysis. This resulted in the identification of 399 conserved miRNAs; among these, 326 miRNAs were expressed in all three follicular cycling stages, whereas 3, 12 and 11 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. We also identified 172 potential novel miRNAs by Mireap; 36 miRNAs were expressed in all three cycling stages, whereas 23, 29 and 44 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. The expression level of five arbitrarily selected miRNAs was analyzed by quantitative PCR, and the results indicated that the expression patterns were consistent with the Solexa sequencing results. Gene Ontology and KEGG pathway analyses indicated that five major biological pathways (Metabolic pathways, Pathways in cancer, MAPK signalling pathway, Endocytosis and Focal adhesion) accounted for 23.08% of target genes among 278 biological functions, indicating that these pathways are likely to play significant roles during hair cycling. During all hair cycle stages of cashmere goats, a large number of conserved and novel miRNAs were identified through a high-throughput sequencing approach. This study enriches the Capra hircus miRNA databases and provides a comprehensive miRNA transcriptome profile in the skin of goats during the hair follicle cycle.

  5. Solexa sequencing identification of conserved and novel microRNAs in backfat of Large White and Chinese Meishan pigs.

    Directory of Open Access Journals (Sweden)

    Chen Chen

    Full Text Available The domestic pig (Sus scrofa), an important species in the animal production industry, is a suitable model for studying adipogenesis and fat deposition. To expand the repertoire of porcine miRNAs and further explore potential regulatory miRNAs that influence adipogenesis, a high-throughput Solexa sequencing approach was adopted to identify miRNAs in backfat of Large White (lean-type) and Meishan (Chinese indigenous fatty-type) pigs. We identified 215 unique miRNAs comprising 75 known pre-miRNAs, of which 49 miRNA*s were first identified in our study; 73 miRNAs were shared by both libraries and 140 were newly predicted miRNAs. Collectively, the 215 unique miRNAs corresponded to 235 independent genomic loci. Furthermore, we analyzed the sequence variations, seed edits and phylogenetic development of the miRNAs. Seventeen miRNAs were widely conserved from vertebrates to invertebrates, suggesting that these miRNAs may serve as potential evolutionary biomarkers. Nine conserved miRNAs showed significantly differential expression. The expression of miR-215, miR-135, miR-224 and miR-146b was higher in Large White pigs, opposite to the patterns shown by miR-1a, miR-133a, miR-122, miR-204 and miR-183. Almost all novel miRNAs could be considered pig-specific except ssc-miR-1343, miR-2320, miR-2326, miR-2411 and miR-2483, which had homologs in Bos taurus; among these, ssc-miR-1343, miR-2320, miR-2411 and miR-2483 were validated in backfat tissue by stem-loop qPCR. Our results showed a high level of concordance between qPCR and Solexa sequencing in 9 of 10 miRNA comparisons, the exception being miR-1a. Moreover, we found that two miRNAs, miR-135 and miR-183, may influence porcine backfat development through the WNT signaling pathway. In conclusion, our research expands the repertoire of porcine miRNAs and should benefit the study of adipogenesis and fat deposition in different pig breeds based on miRNAs.

  6. Genome-wide profiling of DNA-binding proteins using barcode-based multiplex Solexa sequencing.

    Science.gov (United States)

    Raghav, Sunil Kumar; Deplancke, Bart

    2012-01-01

    Chromatin immunoprecipitation (ChIP) is a commonly used technique to detect the in vivo binding of proteins to DNA. ChIP is now routinely paired to microarray analysis (ChIP-chip) or next-generation sequencing (ChIP-Seq) to profile the DNA occupancy of proteins of interest on a genome-wide level. Because ChIP-chip introduces several biases, most notably due to the use of a fixed number of probes, ChIP-Seq has quickly become the method of choice as, depending on the sequencing depth, it is more sensitive, quantitative, and provides a greater binding site location resolution. With the ever increasing number of reads that can be generated per sequencing run, it has now become possible to analyze several samples simultaneously while maintaining sufficient sequence coverage, thus significantly reducing the cost per ChIP-Seq experiment. In this chapter, we provide a step-by-step guide on how to perform multiplexed ChIP-Seq analyses. As a proof-of-concept, we focus on the genome-wide profiling of RNA Polymerase II as measuring its DNA occupancy at different stages of any biological process can provide insights into the gene regulatory mechanisms involved. However, the protocol can also be used to perform multiplexed ChIP-Seq analyses of other DNA-binding proteins such as chromatin modifiers and transcription factors.
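    The barcode-based multiplexing described above depends on sorting pooled reads back into their samples before analysis. The following is a minimal sketch of that demultiplexing step, not the chapter's protocol: it assumes each read begins with its sample barcode and that only exact matches are accepted, and the barcode sequences and sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Assign each read to a sample by its leading barcode (exact match only)."""
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                # Strip the barcode so only the insert sequence is kept.
                bins[sample].append(read[len(barcode):])
                break
        else:
            # No barcode matched: keep the read aside instead of guessing.
            bins["unassigned"].append(read)
    return dict(bins)


# Hypothetical 4-nt barcodes and pooled reads, for illustration only.
barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTTTGACC", "TGCAGGATCC", "ACGTAACCGG", "GGGGTTTTAA"]
result = demultiplex(reads, barcodes)
print(result)
```

    Real pipelines usually tolerate one mismatch in the barcode and read it from a dedicated index read; exact prefix matching is used here only to keep the idea visible.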

  7. Identification and differential expression of microRNAs in ovaries of laying and broody geese (Anser cygnoides) by Solexa sequencing.

    Directory of Open Access Journals (Sweden)

    Qi Xu

    Full Text Available BACKGROUND: Recent functional studies have demonstrated that microRNAs (miRNAs) play critical roles in ovarian gonadal development, steroidogenesis, apoptosis, and ovulation in mammals. However, little is known about the involvement of miRNAs in the ovarian function of fowl. The goose (Anser cygnoides) is a commercially important food animal that is raised widely in China, but the goose industry has been hampered by high broodiness and poor egg-laying performance, which are influenced by ovarian function. METHODOLOGY/PRINCIPAL FINDINGS: In this study, the miRNA transcriptomes of ovaries from laying and broody geese were profiled using Solexa deep sequencing, and bioinformatics was used to determine differential expression of the miRNAs. As a result, 11,350,396 and 9,890,887 clean reads were obtained from laying and broody geese, respectively, and 1,328 conserved known miRNAs and 22 novel potential miRNA candidates were identified. A total of 353 conserved microRNAs were significantly differentially expressed between laying and broody ovaries. Compared with miRNA expression in the laying ovary, 127 miRNAs were up-regulated and 126 miRNAs were down-regulated in the ovary of broody birds. A subset of the differentially expressed miRNAs (G-miR-320, G-miR-202, G-miR-146, and G-miR-143*) were validated using real-time quantitative PCR. In addition, 130,458 annotated mRNA transcripts were identified as putative target genes. Gene ontology annotation and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis suggested that the differentially expressed miRNAs are involved in ovarian function, including hormone secretion, reproduction processes and so on. CONCLUSIONS: The present study provides the first global miRNA transcriptome data in A. cygnoides and identifies novel and known miRNAs that are differentially expressed between the ovaries of laying and broody geese. 
These findings contribute to our understanding of the functional involvement of mi

  8. Solexa sequencing and custom microRNA chip reveal repertoire of microRNAs in mammary gland of bovine suffering from natural infectious mastitis.

    Science.gov (United States)

    Ju, Zhihua; Jiang, Qiang; Liu, Gang; Wang, Xiuge; Luo, Guojing; Zhang, Yan; Zhang, Jibin; Zhong, Jifeng; Huang, Jinming

    2018-02-01

    Identification of microRNAs (miRNAs), target genes and regulatory networks associated with innate immune and inflammatory responses and tissue damage is essential to elucidate the molecular and genetic mechanisms for resistance to mastitis. In this study, a combination of Solexa sequencing and custom miRNA chip approaches was used to profile the expression of miRNAs in bovine mammary gland at the late stage of natural infection with Staphylococcus aureus, a widespread mastitis pathogen. We found 383 loci corresponding to 277 known and 49 putative novel miRNAs, two potential mirtrons and 266 differentially expressed miRNAs in the healthy and mastitic cows' mammary glands. Several interaction networks and regulators involved in mastitis susceptibility, such as ALCAM, COL1A1, APOP4, ITIH4, CRP and fibrinogen alpha (FGA), were highlighted. Significant down-regulation and location of bta-miR-26a, which targets FGA in the mastitic mammary glands, were validated using quantitative real-time PCR, in situ hybridization and dual-luciferase reporter assays. We propose that the observed miRNA variations in mammary glands of mastitic cows are related to the maintenance of immune and defense responses, cell proliferation and apoptosis, and tissue injury and healing during the late stage of infection. Furthermore, the effect of bta-miR-26a in mastitis, mediated at least in part by enhancing FGA expression, involves host defense, inflammation and tissue damage. © 2018 Stichting International Foundation for Animal Genetics.

  9. MicroRNA of the fifth-instar posterior silk gland of silkworm identified by Solexa sequencing

    Directory of Open Access Journals (Sweden)

    Jisheng Li

    2014-12-01

    Full Text Available No studies have focused specifically on the microRNAs (miRNAs) in the fifth-instar posterior silk gland of Bombyx mori. Here, using next-generation sequencing, we acquired 93.2 million processed reads from 10 small RNA libraries. In this paper, we describe in detail how we generated our deep-sequencing dataset, which was recently published in BMC Genomics. Our findings substantially enrich the silkworm miRNA repository and may help reveal miRNA functions in the process of silk production.

  10. Using quality scores and longer reads improves accuracy of Solexa read mapping

    Directory of Open Access Journals (Sweden)

    Xuan Zhenyu

    2008-02-01

    Full Text Available Background: Second-generation sequencing has the potential to revolutionize genomics and impact all areas of biomedical science. New technologies will make re-sequencing widely available for such applications as identifying genome variations or interrogating the oligonucleotide content of a large sample (e.g. ChIP-sequencing). The increase in speed, sensitivity and availability of sequencing technology brings demand for advances in computational technology to perform associated analysis tasks. The Solexa/Illumina 1G sequencer can produce tens of millions of reads, ranging in length from ~25–50 nt, in a single experiment. Accurately mapping the reads back to a reference genome is a critical task in almost all applications. Two sources of information that are often ignored when mapping reads from the Solexa technology are the 3' ends of longer reads, which contain a much higher frequency of sequencing errors, and the base-call quality scores. Results: To investigate whether these sources of information can be used to improve accuracy when mapping reads, we developed the RMAP tool, which can map reads having a wide range of lengths and allows base-call quality scores to determine which positions in each read are more important when mapping. We applied RMAP to analyze data re-sequenced from two human BAC regions for varying read lengths, and varying criteria for use of quality scores. RMAP is freely available for downloading at http://rulai.cshl.edu/rmap/. Conclusion: Our results indicate that significant gains in Solexa read mapping performance can be achieved by considering the information in 3' ends of longer reads, and appropriately using the base-call quality scores. The RMAP tool we have developed will enable researchers to effectively exploit this information in targeted re-sequencing projects.
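    To illustrate how base-call quality scores can decide which read positions matter during mapping, here is a small sketch. It is not RMAP's algorithm: it assumes a brute-force scan of the reference and a simple rule in which mismatches are counted only at positions whose quality is at or above a cutoff. The function names, the cutoff of 20 and the toy sequences are all hypothetical.

```python
def hq_mismatches(read, quals, ref_window, min_q=20):
    """Count mismatches only where the base-call quality is >= min_q."""
    return sum(
        1
        for base, q, ref in zip(read, quals, ref_window)
        if q >= min_q and base != ref
    )


def best_hit(read, quals, reference, max_mismatches=1, min_q=20):
    """Return (start, mismatches) of the best placement, or None if none qualifies."""
    best = None
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mm = hq_mismatches(read, quals, window, min_q)
        if mm <= max_mismatches and (best is None or mm < best[1]):
            best = (start, mm)
    return best


# Toy data: the last two bases of the read are wrong but have low quality.
reference = "GGCATTACGGATCCTTAGC"
read = "TTACGGATGA"
quals = [30] * 8 + [5, 5]

with_quality = best_hit(read, quals, reference)              # low-quality 3' errors ignored
without_quality = best_hit(read, quals, reference, min_q=0)  # every mismatch counts
print(with_quality, without_quality)
```

    In this toy case the read maps at position 4 when the low-quality 3' bases are discounted and fails to map when every mismatch is counted, which is the kind of gain the abstract reports.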

  11. Comparison of next generation sequencing technologies for transcriptome characterization

    Directory of Open Access Journals (Sweden)

    Soltis Douglas E

    2009-08-01

    Full Text Available Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, the NG sequencing is a dramatic advance

  12. Allele Re-sequencing Technologies

    DEFF Research Database (Denmark)

    Byrne, Stephen; Farrell, Jacqueline Danielle; Asp, Torben

    2013-01-01

    The development of next-generation sequencing technologies has made sequencing an affordable approach for detection of genetic variations associated with various traits. However, the cost of whole genome re-sequencing still remains too high to be feasible for many plant species with large...... alternative to whole genome re-sequencing to identify causative genetic variations in plants. One challenge, however, will be efficient bioinformatics strategies for data handling and analysis from the increasing amount of sequence information....

  13. An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

    Science.gov (United States)

    Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

    2010-07-02

    An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. It is therefore of significant value not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate the reads. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The output contains the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data

  14. A practical comparison of de novo genome assembly software tools for next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Wenyu Zhang

    Full Text Available The advent of next-generation sequencing technologies has been accompanied by the development of many whole-genome sequence assembly methods and software, especially for de novo fragment assembly. Because little is known about the applicability and performance of these software tools, choosing a suitable assembler is a tough task. Here, we provide information on the adaptivity of each program and, above all, compare the performance of eight distinct tools against eight groups of simulated datasets from the Solexa sequencing platform. Considering the computational time, maximum random access memory (RAM) occupancy, assembly accuracy and integrity, our study indicates that string-based assemblers and overlap-layout-consensus (OLC) assemblers are well suited for very short reads and for longer reads of small genomes, respectively. For large datasets of more than a hundred million short reads, De Bruijn graph-based assemblers would be more appropriate. In terms of software implementation, string-based assemblers are superior to graph-based ones, of which SOAPdenovo is complex for the creation of the configuration file. Our comparison study will assist researchers in selecting a well-suited assembler and offer essential information for the improvement of existing assemblers or the development of novel assemblers.
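    For readers unfamiliar with the De Bruijn graph approach mentioned above, the sketch below shows the core idea on toy data. It is an illustration only, not the method of any assembler compared in the study: reads are assumed error-free, k is fixed at 4, and the contig is spelled out by following unambiguous edges. The reads and function names are hypothetical.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk(graph, start):
    """Extend a contig from start while each node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph.get(node, ())) == 1:
        (nxt,) = graph[node]
        if nxt in seen:  # stop on cycles instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Three overlapping toy reads drawn from the sequence ATGGCGTGCA.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = de_bruijn(reads, k=4)
contig = walk(graph, "ATG")
print(contig)
```

    Production assemblers add error correction, handling of branches and repeats, and paired-end scaffolding; none of that is attempted here.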

  15. "First generation" automated DNA sequencing technology.

    Science.gov (United States)

    Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

    2011-10-01

    Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.

  16. Sequence based polymorphic (SBP) marker technology for targeted genomic regions: its application in generating a molecular map of the Arabidopsis thaliana genome

    Directory of Open Access Journals (Sweden)

    Sahu Binod B

    2012-01-01

    Full Text Available Background: Molecular markers facilitate both genotype identification, essential for modern animal and plant breeding, and the isolation of genes based on their map positions. Advancements in sequencing technology have made possible the identification of single nucleotide polymorphisms (SNPs) for any genomic regions. Here a sequence based polymorphic (SBP) marker technology for generating molecular markers for targeted genomic regions in Arabidopsis is described. Results: A ~3X genome coverage sequence of the Arabidopsis thaliana ecotype Niederzenz (Nd-0) was obtained by applying Illumina's sequencing by synthesis (Solexa) technology. Comparison of the Nd-0 genome sequence with the assembled Columbia-0 (Col-0) genome sequence identified putative single nucleotide polymorphisms (SNPs) throughout the entire genome. Multiple 75 base pair Nd-0 sequence reads containing SNPs and originating from individual genomic DNA molecules were the basis for developing co-dominant SBP markers. SNPs containing Col-0 sequences, supported by transcript sequences or sequences from multiple BAC clones, were compared to the respective Nd-0 sequences to identify possible restriction endonuclease enzyme site variations. Small amplicons, PCR amplified from both ecotypes, were digested with suitable restriction enzymes and resolved on a gel to reveal the sequence based polymorphisms. By applying this technology, 21 SBP markers for the marker-poor regions of the Arabidopsis map representing polymorphisms between Col-0 and Nd-0 ecotypes were generated. Conclusions: The SBP marker technology described here allowed the development of molecular markers for targeted genomic regions of Arabidopsis. It should facilitate isolation of co-dominant molecular markers for targeted genomic regions of any animal or plant species whose genomic sequences have been assembled. 
This technology will particularly facilitate the development of high density molecular marker maps, essential for

  17. Special Issue: Next Generation DNA Sequencing

    Directory of Open Access Journals (Sweden)

    Paul Richardson

    2010-10-01

    Full Text Available Next Generation Sequencing (NGS) refers to technologies that do not rely on traditional dideoxy-nucleotide (Sanger) sequencing, where labeled DNA fragments are physically resolved by electrophoresis. These new technologies rely on different strategies, but essentially all of them make use of real-time data collection of a base-level incorporation event across a massive number of reactions (on the order of millions, versus 96 for capillary electrophoresis, for instance). The major commercial NGS platforms available to researchers are the 454 Genome Sequencer (Roche), the Illumina (formerly Solexa) Genome Analyzer, the SOLiD system (Applied Biosystems/Life Technologies) and the Heliscope (Helicos Corporation). The techniques and different strategies utilized by these platforms are reviewed in a number of the papers in this special issue. These technologies are enabling new applications that take advantage of the massive data produced by this next generation of sequencing instruments. [...

  18. Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats

    Directory of Open Access Journals (Sweden)

    Graner Andreas

    2008-10-01

    Full Text Available Background: Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short-read sequencing technologies such as Illumina/Solexa permits such large genomes to be effectively sampled at relatively low cost. Based on the corresponding sequence reads, a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences. Results: We have generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index, which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation, drawing on the information already available for known repetitive elements, revealed a significant correspondence between the two methods. However, MDR-based annotation also identified dozens of novel repeat sequences that were not recognised by hand annotation. The MDR data were also used to identify gene-containing regions by masking repetitive sequences in eight de novo sequenced bacterial artificial chromosome (BAC) clones. Gene sequences could indeed be identified for half of the candidate gene islands. MDR data were only of limited use when mapped on genomic sequences from the closely related species Triticum monococcum, as only a fraction of the repetitive sequences was recognised. Conclusion: An MDR index for barley, which was obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. Circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for the identification of repetitive and low-copy (i.e. potentially gene-containing) regions in uncharacterised genomic sequences. 
The restriction that a particular

  19. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
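    The two consensus error rates quoted above (about 1 per 2,000 bp for a draft and about 1 per 10,000 bp for a finished genome) can be put on the familiar Phred scale, Q = -10 log10(p), and turned into an expected error count. The short calculation below is a worked illustration; the 5 Mb genome size is a hypothetical example and does not come from the source.

```python
import math


def phred(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(error_rate)


def expected_errors(genome_bp, error_rate):
    """Expected number of consensus errors for a genome of the given length."""
    return genome_bp * error_rate


draft_rate = 1 / 2000        # draft assembly, as quoted in the abstract
finished_rate = 1 / 10_000   # finished genome, as quoted in the abstract
genome_bp = 5_000_000        # hypothetical 5 Mb microbial genome

for label, rate in [("draft", draft_rate), ("finished", finished_rate)]:
    print(f"{label}: Q{phred(rate):.1f}, "
          f"about {expected_errors(genome_bp, rate):.0f} expected errors")
```

    Under these assumptions, finishing moves the consensus from roughly Q33 to Q40, a fivefold drop in expected errors for a genome of the same size.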

  20. Large-Scale Isolation of Microsatellites from Chinese Mitten Crab Eriocheir sinensis via a Solexa Genomic Survey

    Directory of Open Access Journals (Sweden)

    Qun Wang

    2012-12-01

    Full Text Available Microsatellites are simple sequence repeats with a high degree of polymorphism in the genome; they are used as DNA markers in many molecular genetic studies. Using traditional methods such as the magnetic beads enrichment method, only a few microsatellite markers have been isolated from the Chinese mitten crab Eriocheir sinensis, as the crab genome sequence information is unavailable. Here, we have identified a large number of microsatellites from the Chinese mitten crab by taking advantage of Solexa genomic surveying. A total of 141,737 SSR (simple sequence repeat) motifs were identified via analysis of 883 Mb of the crab genomic DNA information, including mono-, di-, tri-, tetra-, penta- and hexa-nucleotide repeat motifs. The number of di-nucleotide repeat motifs was 82,979, making this the most abundant type of repeat motif (58.54%); the second most abundant were the tri-nucleotide repeats (42,657, 30.11%). Among di-nucleotide repeats, the most frequent repeats were AC motifs, accounting for 67.55% of the total number. AGG motifs were the most frequent (59.32%) of the tri-nucleotide motifs. A total of 15,125 microsatellite loci had a flanking sequence suitable for setting the primer of a polymerase chain reaction (PCR). To verify the identified SSRs, a subset of 100 primer pairs was randomly selected for PCR. Eighty-two primer sets (82%) produced strong PCR products matching expected sizes, and 78% were polymorphic. In an analysis of 30 wild individuals from the Yangtze River with 20 primer sets, the number of alleles per locus ranged from 2 to 14 and the mean allelic richness was 7.4. No linkage disequilibrium was found between any pair of loci, indicating that the markers were independent. The Hardy-Weinberg equilibrium test showed significant deviation in four of the 20 microsatellite loci after sequential Bonferroni corrections. This method is cost- and time-effective in comparison to traditional approaches for the isolation of microsatellites.
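
    A minimal sketch of how perfect SSR motifs of 1 to 6 bp can be located in survey sequence is given below. The record does not name the software or thresholds used, so the regular-expression approach, the function name and the minimum repeat counts are all assumptions made for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length; the record does not
# state the thresholds used in the study.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "ACAC").
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Toy sequence with one (AC)8 and one (AGG)5 repeat.
toy_sequence = "GGT" + "AC" * 8 + "TTC" + "AGG" * 5 + "CT"
toy_hits = find_ssrs(toy_sequence)
```

    Tallying the motifs returned over all reads would give per-class counts of the kind reported above; primer design additionally requires checking that enough flanking sequence is present on both sides of each hit.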

  1. [Sequencing technology in gene diagnosis and its application].

    Science.gov (United States)

    Yibin, Guo

    2014-11-01

    The study of gene mutation is one of the hot topics in the field of life science nowadays, and the related detection methods and diagnostic technology have developed rapidly. Sequencing technology plays an indispensable role in the definite diagnosis and classification of genetic diseases. In this review, we summarize the research progress in sequencing technology, evaluate the advantages and disadvantages of first- to third-generation sequencing technologies, and describe their application in gene diagnosis. We also offer forecasts of development trends in the field.

  2. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  3. Next generation sequencing (NGS) technologies and applications

    Energy Technology Data Exchange (ETDEWEB)

    Vuyisich, Momchilo [Los Alamos National Laboratory

    2012-09-11

    NGS technology overview: (1) NGS library preparation - Nucleic acids extraction, Sample quality control, RNA conversion to cDNA, Addition of sequencing adapters, Quality control of library; (2) Sequencing - Clonal amplification of library fragments (except PacBio), Sequencing by synthesis, Data output (reads and quality); and (3) Data analysis - Read mapping, Genome assembly, Gene expression, Operon structure, sRNA discovery, and Epigenetic analyses.

  4. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    Science.gov (United States)

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  5. Sequencing of BAC pools by different next generation sequencing platforms and strategies

    Directory of Open Access Journals (Sweden)

    Scholz Uwe

    2011-10-01

    Full Text Available Abstract Background Next generation sequencing of BACs is a viable option for deciphering the sequence of even large and highly repetitive genomes. In order to optimize this strategy, we examined the influence of read length on the quality of Roche/454 sequence assemblies, to what extent Illumina/Solexa mate pairs (MPs) improve the assemblies by scaffolding and whether barcoding of BACs is dispensable. Results Sequencing four BACs with both FLX and Titanium technologies revealed similar sequencing accuracy, but showed that the longer Titanium reads produce considerably fewer misassemblies and gaps. The 454 assemblies of 96 barcoded BACs were improved by scaffolding 79% of the total contig length with MPs from a non-barcoded library. Assembly of the unmasked 454 sequences without separation by barcodes revealed chimeric contig formation to be a major problem, encompassing 47% of the total contig length. Masking the sequences reduced this fraction to 24%. Conclusion Optimal BAC pool sequencing should be based on the longest available reads, with barcoding essential for a comprehensive assessment of both repetitive and non-repetitive sequence information. When interest is restricted to non-repetitive regions and repeats are masked prior to assembly, barcoding is non-essential. In any case, the assemblies can be improved considerably by scaffolding with non-barcoded BAC pool MPs.

  6. Management of High-Throughput DNA Sequencing Projects: Alpheus.

    Science.gov (United States)

    Miller, Neil A; Kingsmore, Stephen F; Farmer, Andrew; Langley, Raymond J; Mudge, Joann; Crow, John A; Gonzalez, Alvaro J; Schilkey, Faye D; Kim, Ryan J; van Velkinburgh, Jennifer; May, Gregory D; Black, C Forrest; Myers, M Kathy; Utsey, John P; Frost, Nicholas S; Sugarbaker, David J; Bueno, Raphael; Gullans, Stephen R; Baxter, Susan M; Day, Steve W; Retzel, Ernest F

    2008-12-26

    High-throughput DNA sequencing has enabled systems biology to begin to address areas in health, agricultural and basic biological research. Concomitant with the opportunities is an absolute necessity to manage significant volumes of high-dimensional and inter-related data and analysis. Alpheus is an analysis pipeline, database and visualization software for use with massively parallel DNA sequencing technologies that feature multi-gigabase throughput characterized by relatively short reads, such as Illumina-Solexa (sequencing-by-synthesis), Roche-454 (pyrosequencing) and Applied Biosystems' SOLiD (sequencing-by-ligation). Alpheus enables alignment to reference sequence(s), detection of variants and enumeration of sequence abundance, including expression levels in transcriptome sequence. Alpheus is able to detect several types of variants, including non-synonymous and synonymous single nucleotide polymorphisms (SNPs), insertions/deletions (indels), premature stop codons, and splice isoforms. Variant detection is aided by the ability to filter variant calls based on consistency, expected allele frequency, sequence quality, coverage, and variant type in order to minimize false positives while maximizing the identification of true positives. Alpheus also enables comparisons of genes with variants between cases and controls or bulk segregant pools. Sequence-based differential expression comparisons can be developed, with data export to SAS JMP Genomics for statistical analysis.

  7. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log2-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
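
    The first two steps listed in the DSAP record, cleanup and clustering of a tab-delimited tag file, can be sketched as follows. This is not DSAP's code: the adapter sequence is a placeholder, and the function names, the minimum adapter overlap and the minimum tag length are assumptions made for illustration. Adapter removal here is a simple exact-prefix search, which is cruder than what a production pipeline would use.

```python
import io
from collections import Counter

PLACEHOLDER_ADAPTER = "CTGTAGGCAC"  # illustrative only, not taken from the record


def parse_tags(handle):
    """Yield (sequence, copy_count) pairs from 'sequence<TAB>count' lines."""
    for line in handle:
        line = line.strip()
        if not line:
            continue
        seq, count = line.split("\t")
        yield seq.upper(), int(count)


def trim_adapter(seq, adapter, min_overlap=6):
    """Cut the tag at the first exact match of the adapter's leading bases."""
    pos = seq.find(adapter[:min_overlap])
    return seq if pos == -1 else seq[:pos]


def is_homopolymer(seq):
    """True for poly-A/T/C/G/N tags (a single repeated character)."""
    return len(set(seq)) == 1


def clean_and_cluster(pairs, adapter, min_len=15):
    """Trim adapters, drop short or homopolymer tags, and sum counts per sequence."""
    clusters = Counter()
    for seq, count in pairs:
        cleaned = trim_adapter(seq, adapter)
        if len(cleaned) < min_len or is_homopolymer(cleaned):
            continue
        clusters[cleaned] += count
    return clusters


# Toy input: two tags collapse to one cluster, two are discarded.
toy_input = io.StringIO(
    "TGAGGTAGTAGGTTGTATAGTTCTGTAGGCAC\t10\n"
    "TGAGGTAGTAGGTTGTATAGTTCTGTAG\t5\n"
    "AAAAAAAAAAAAAAAAAAAACTGTAGGCAC\t7\n"
    "ACGTCTGTAGGCAC\t3\n"
)
toy_clusters = clean_and_cluster(parse_tags(toy_input), PLACEHOLDER_ADAPTER)
```

    The resulting cluster table is the natural input for the homology searches against Rfam and miRBase described in steps (iii) and (iv).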

  8. FDA's Activities Supporting Regulatory Application of "Next Gen" Sequencing Technologies.

    Science.gov (United States)

    Wilson, Carolyn A; Simonyan, Vahan

    2014-01-01

    Applications of next-generation sequencing (NGS) technologies require availability and access to an information technology (IT) infrastructure and bioinformatics tools for large amounts of data storage and analyses. The U.S. Food and Drug Administration (FDA) anticipates that the use of NGS data to support regulatory submissions will continue to increase as the scientific and clinical communities become more familiar with the technologies and identify more ways to apply these advanced methods to support development and evaluation of new biomedical products. FDA laboratories are conducting research on different NGS platforms and developing the IT infrastructure and bioinformatics tools needed to enable regulatory evaluation of the technologies and the data sponsors will submit. A High-performance Integrated Virtual Environment, or HIVE, has been launched, and development and refinement continues as a collaborative effort between the FDA and George Washington University to provide the tools to support these needs. The use of a highly parallelized environment facilitated by use of distributed cloud storage and computation has resulted in a platform that is both rapid and responsive to changing scientific needs. The FDA plans to further develop in-house capacity in this area, while also supporting engagement by the external community, by sponsoring an open, public workshop to discuss NGS technologies and data formats standardization, and to promote the adoption of interoperability protocols in September 2014. Next-generation sequencing (NGS) technologies are enabling breakthroughs in how the biomedical community is developing and evaluating medical products. One example is the potential application of this method to the detection and identification of microbial contaminants in biologic products. In order for the U.S. Food and Drug Administration (FDA) to be able to evaluate the utility of this technology, we need to have the information technology infrastructure and

  9. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  10. High-throughput sequence alignment using Graphics Processing Units

    Directory of Open Access Journals (Sweden)

    Trapnell Cole

    2007-12-01

    Full Text Available Abstract Background The recent availability of new, less expensive high-throughput DNA sequencing technologies has yielded a dramatic increase in the volume of sequence data that must be analyzed. These data are being generated for several purposes, including genotyping, genome resequencing, metagenomics, and de novo genome assembly projects. Sequence alignment programs such as MUMmer have proven essential for analysis of these data, but researchers will need ever faster, high-throughput alignment tools running on inexpensive hardware to keep up with new sequence technologies. Results This paper describes MUMmerGPU, an open-source high-throughput parallel pairwise local sequence alignment program that runs on commodity Graphics Processing Units (GPUs) in common workstations. MUMmerGPU uses the new Compute Unified Device Architecture (CUDA) from nVidia to align multiple query sequences against a single reference sequence stored as a suffix tree. By processing the queries in parallel on the highly parallel graphics card, MUMmerGPU achieves more than a 10-fold speedup over a serial CPU version of the sequence alignment kernel, and outperforms the exact alignment component of MUMmer on a high end CPU by 3.5-fold in total application time when aligning reads from recent sequencing projects using Solexa/Illumina, 454, and Sanger sequencing technologies. Conclusion MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by new, high-throughput sequencing technologies. MUMmerGPU demonstrates that even memory-intensive applications can run significantly faster on the relatively low-cost GPU than on the CPU.

  11. SEQUENCING BATCH REACTOR: A PROMISING TECHNOLOGY IN WASTEWATER TREATMENT

    Directory of Open Access Journals (Sweden)

    A. H. Mahvi

    2008-04-01

    Full Text Available Discharge of domestic and industrial wastewater to surface or groundwater is very dangerous to the environment. Therefore treatment of any kind of wastewater to produce effluent of good quality is necessary. In this regard choosing an effective treatment system is important. The sequencing batch reactor is a modification of the activated sludge process which has been successfully used to treat municipal and industrial wastewater. The process can be applied to nutrient removal, industrial wastewater with high biochemical oxygen demand, wastewater containing toxic materials such as cyanide, copper, chromium, lead and nickel, food industry effluents, landfill leachates and tannery wastewater. Advantages of the process include a single-tank configuration, small footprint, easy expandability, simple operation and low capital costs. Much research has been conducted on this treatment technology. The authors have conducted investigations on a modification of the sequencing batch reactor; their studies achieved very high percentage removal of biochemical oxygen demand, chemical oxygen demand, total Kjeldahl nitrogen, total nitrogen, total phosphorus and total suspended solids. This paper reviews some of the published works in addition to the experiences of the authors.

  12. Applications and Case Studies of the Next-Generation Sequencing Technologies in Food, Nutrition and Agriculture.

    Science.gov (United States)

    Next-generation sequencing technologies are able to produce high-throughput short sequence reads in a cost-effective fashion. The emergence of these technologies has not only facilitated genome sequencing but also changed the landscape of life sciences. Here I survey their major applications ranging...

  13. Biomolecule Sequencer: Next-Generation DNA Sequencing Technology for In-Flight Environmental Monitoring, Research, and Beyond

    Science.gov (United States)

    Smith, David J.; Burton, Aaron; Castro-Wallace, Sarah; John, Kristen; Stahl, Sarah E.; Dworkin, Jason Peter; Lupisella, Mark L.

    2016-01-01

    On the International Space Station (ISS), technologies capable of rapid microbial identification and disease diagnostics are not currently available. NASA still relies upon sample return for comprehensive, molecular-based sample characterization. Next-generation DNA sequencing is a powerful approach for identifying microorganisms in air, water, and surfaces onboard spacecraft. The Biomolecule Sequencer payload, manifested to SpaceX-9 and scheduled on the Increment 4748 research plan (June 2016), will assess the functionality of a commercially-available next-generation DNA sequencer in the microgravity environment of ISS. The MinION device from Oxford Nanopore Technologies (Oxford, UK) measures picoamp changes in electrical current dependent on nucleotide sequences of the DNA strand migrating through nanopores in the system. The hardware is exceptionally small (9.5 x 3.2 x 1.6 cm), lightweight (120 grams), and powered only by a USB connection. For the ISS technology demonstration, the Biomolecule Sequencer will be powered by a Microsoft Surface Pro3. Ground-prepared samples containing lambda bacteriophage, Escherichia coli, and mouse genomic DNA, will be launched and stored frozen on the ISS until experiment initiation. Immediately prior to sequencing, a crew member will collect and thaw frozen DNA samples, connect the sequencer to the Surface Pro3, inject thawed samples into a MinION flow cell, and initiate sequencing. At the completion of the sequencing run, data will be downlinked for ground analysis. Identical, synchronous ground controls will be used for data comparisons to determine sequencer functionality, run-time sequence, current dynamics, and overall accuracy. We will present our latest results from the ISS flight experiment, the first time DNA has ever been sequenced in space, and discuss the many potential applications of the Biomolecule Sequencer for environmental monitoring, medical diagnostics, higher fidelity and more adaptable Space Biology Human

  14. Applications of Next-Generation Sequencing Technologies to Diagnostic Virology

    Directory of Open Access Journals (Sweden)

    Giorgio Palù

    2011-11-01

    Full Text Available Novel DNA sequencing techniques, referred to as “next-generation” sequencing (NGS), provide high speed and throughput that can produce an enormous volume of sequences with many possible applications in research and diagnostic settings. In this article, we provide an overview of the many applications of NGS in diagnostic virology. NGS techniques have been used for high-throughput whole viral genome sequencing, such as sequencing of new influenza viruses, for detection of viral genome variability and evolution within the host, such as investigation of human immunodeficiency virus and human hepatitis C virus quasispecies, and monitoring of low-abundance antiviral drug-resistance mutations. NGS techniques have been applied to metagenomics-based strategies for the detection of unexpected disease-associated viruses and for the discovery of novel human viruses, including cancer-related viruses. Finally, the human virome in healthy and disease conditions has been described by NGS-based metagenomics.

  15. Digital PCR provides sensitive and absolute calibration for high throughput sequencing

    Directory of Open Access Journals (Sweden)

    Fan H Christina

    2009-03-01

    Full Text Available Abstract Background Next-generation DNA sequencing on the 454, Solexa, and SOLiD platforms requires absolute calibration of the number of molecules to be sequenced. This requirement has two unfavorable consequences. First, large amounts of sample (typically micrograms) are needed for library preparation, thereby limiting the scope of samples which can be sequenced. For many applications, including metagenomics and the sequencing of ancient, forensic, and clinical samples, the quantity of input DNA can be critically limiting. Second, each library requires a titration sequencing run, thereby increasing the cost and lowering the throughput of sequencing. Results We demonstrate the use of digital PCR to accurately quantify 454 and Solexa sequencing libraries, enabling the preparation of sequencing libraries from nanogram quantities of input material while eliminating costly and time-consuming titration runs of the sequencer. We successfully sequenced low-nanogram scale bacterial and mammalian DNA samples on the 454 FLX and Solexa DNA sequencing platforms. This study is the first to definitively demonstrate the successful sequencing of picogram quantities of input DNA on the 454 platform, reducing the sample requirement more than 1000-fold without pre-amplification and the associated bias and reduction in library depth. Conclusion The digital PCR assay allows absolute quantification of sequencing libraries, eliminates uncertainties associated with the construction and application of standard curves to PCR-based quantification, and with a coefficient of variation close to 10%, is sufficiently precise to enable direct sequencing without titration runs.
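
    Digital PCR gives an absolute count without a standard curve because the fraction of positive partitions can be converted to a mean number of template molecules per partition using Poisson statistics: lambda = -ln(1 - p), where p is the positive fraction. The sketch below illustrates that general relation only; it is not the assay described in the record, and the function names and example numbers are assumptions.

```python
import math


def copies_per_partition(positive, total):
    """Mean template copies per partition from the fraction of positive partitions."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def total_copies(positive, total):
    """Estimated number of template molecules loaded across all partitions."""
    return total * copies_per_partition(positive, total)


# Hypothetical run: 500 of 1000 partitions are positive.
example_lambda = copies_per_partition(500, 1000)
example_total = total_copies(500, 1000)
```

    Because some partitions receive more than one molecule, the estimated total for this hypothetical run (about 693) is higher than the 500 positive partitions actually observed.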

  16. The fast changing landscape of sequencing technologies and their impact on microbial genome assemblies and annotation.

    Science.gov (United States)

    Mavromatis, Konstantinos; Land, Miriam L; Brettin, Thomas S; Quest, Daniel J; Copeland, Alex; Clum, Alicia; Goodwin, Lynne; Woyke, Tanja; Lapidus, Alla; Klenk, Hans Peter; Cottingham, Robert W; Kyrpides, Nikos C

    2012-01-01

    The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation. In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although on average of high quality, need to be viewed critically, and sources of error in them should be considered prior to analysis. These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

  17. Discovery of precursor and mature microRNAs and their putative gene targets using high-throughput sequencing in pineapple (Ananas comosus var. comosus).

    Science.gov (United States)

    Yusuf, Noor Hydayaty Md; Ong, Wen Dee; Redwan, Raimi Mohamed; Latip, Mariam Abd; Kumar, S Vijay

    2015-10-15

    MicroRNAs (miRNAs) are a class of small, endogenous non-coding RNAs that negatively regulate gene expression, resulting in the silencing of target mRNA transcripts through mRNA cleavage or translational inhibition. MiRNAs play significant roles in various biological and physiological processes in plants. However, the miRNA-mediated gene regulatory network in pineapple, the model tropical non-climacteric fruit, remains largely unexplored. Here, we report a complete list of pineapple mature miRNAs obtained from high-throughput small RNA sequencing and precursor miRNAs (pre-miRNAs) obtained from ESTs. Two small RNA libraries were constructed from pineapple fruits and leaves, respectively, using Illumina's Solexa technology. Sequence similarity analysis using miRBase revealed 579,179 reads homologous to 153 miRNAs from 41 miRNA families. In addition, a pineapple fruit transcriptome library consisting of approximately 30,000 EST contigs constructed using Solexa sequencing was used for the discovery of pre-miRNAs. In all, four pre-miRNAs were identified (MIR156, MIR399, MIR444 and MIR2673). Furthermore, the same pineapple transcriptome was used to dissect the function of the miRNAs in pineapple by predicting their putative targets in conjunction with their regulatory networks. In total, 23 metabolic pathways were found to be regulated by miRNAs in pineapple. The use of high-throughput sequencing in pineapples to unveil the presence of miRNAs and their regulatory pathways provides insight into the repertoire of miRNA regulation used exclusively in this non-climacteric model plant. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao eChen

    2014-06-01

    Full Text Available Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific genome sequence information accessible via public databases. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third-generation, real-time single-molecule sequencing platforms require slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, the φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore- and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  19. The application of the high throughput sequencing technology in the transposable elements.

    Science.gov (United States)

    Liu, Zhen; Xu, Jian-hong

    2015-09-01

    High throughput sequencing technology has dramatically improved the efficiency of DNA sequencing, and decreased the costs to a great extent. Meanwhile, this technology usually has advantages of better specificity, higher sensitivity and accuracy. Therefore, it has been applied to the research on genetic variations, transcriptomics and epigenomics. Recently, this technology has been widely employed in the studies of transposable elements and has achieved fruitful results. In this review, we summarize the application of high throughput sequencing technology in the fields of transposable elements, including the estimation of transposon content, preference of target sites and distribution, insertion polymorphism and population frequency, identification of rare copies, transposon horizontal transfers as well as transposon tagging. We also briefly introduce the major common sequencing strategies and algorithms, their advantages and disadvantages, and the corresponding solutions. Finally, we envision the developing trends of high throughput sequencing technology, especially the third generation sequencing technology, and its application in transposon studies in the future, hopefully providing a comprehensive understanding and reference for related scientific researchers.

  20. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Science.gov (United States)

    Greub, Gilbert; Kebbi-Beghdadi, Carole; Bertelli, Claire; Collyn, François; Riederer, Beat M; Yersin, Camille; Croxatto, Antony; Raoult, Didier

    2009-12-23

    With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  1. The history and advances of reversible terminators used in new generations of sequencing technology.

    Science.gov (United States)

    Chen, Fei; Dong, Mengxing; Ge, Meng; Zhu, Lingxiang; Ren, Lufeng; Liu, Guocheng; Mu, Rong

    2013-02-01

    DNA sequencing using reversible terminators, one of the sequencing-by-synthesis strategies, has garnered a great deal of interest because of its widespread use in second-generation high-throughput DNA sequencing technology. In this review, we describe the history, classification, and working mechanism of this technology. We also outline the screening strategies for DNA polymerases that accept reversible terminators as substrates during polymerization; in particular, we introduce the "REAP" method developed by us. At the end of this review, we discuss current limitations of this approach and propose potential solutions to extend its application. Copyright © 2013. Production and hosting by Elsevier Ltd.

  2. Treatment of Laboratory Wastewater by Sequence Batch reactor technology

    International Nuclear Information System (INIS)

    Imtiaz, N.; Butt, M.; Khan, R.A.; Saeed, M.T.; Irfan, M.

    2012-01-01

    These studies were conducted on the characterization and treatment of sewage mixed with waste-water from a research and testing laboratory (PCSIR Laboratories Lahore). In this study, the parameters COD, BOD, TSS, etc. of the influent (untreated waste-water) and effluent (treated waste-water) were characterized using the standard methods of examination for water and waste-water. All of the analyzed waste-water parameters were above the National Environmental Quality Standards (NEQS) set at national level. Treatment of the waste-water was carried out at laboratory scale by the conventional sequencing batch reactor (SBR) technique, using aeration and settling in the same treatment reactor. After treatment, COD was reduced by 90-95 %, BOD by 95-97 % and TSS by 96-99 %, and the reclaimed effluent quality was suitable for gardening purposes. (author)
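
    The removal percentages quoted in this abstract correspond to the usual removal-efficiency ratio, (influent - effluent) / influent x 100. The following is a minimal Python sketch of that calculation; the function name and the influent/effluent concentrations are hypothetical and chosen only for illustration, they are not data from the study.

```python
def removal_efficiency(influent_mg_l: float, effluent_mg_l: float) -> float:
    """Percentage of a pollutant removed between influent and effluent."""
    if influent_mg_l <= 0:
        raise ValueError("influent concentration must be positive")
    return (influent_mg_l - effluent_mg_l) / influent_mg_l * 100.0


# Hypothetical concentrations in mg/L (NOT measurements from the study).
samples = {"COD": (400.0, 30.0), "BOD": (200.0, 8.0), "TSS": (150.0, 3.0)}

for name, (c_in, c_out) in samples.items():
    print(f"{name}: {removal_efficiency(c_in, c_out):.1f} % removed")
```

    With real monitoring data, the same function would be applied to each paired influent/effluent measurement before comparing the effluent against the applicable standard.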

  3. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.

    Directory of Open Access Journals (Sweden)

    Jiang Du

    2009-07-01

    Full Text Available The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mapability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of

  4. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  5. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul; Martinelli, Axel; Modrzynska, Katarzyna; Borges, Sofia; Creasey, Alison; Rodrigues, Louise; Beraldi, Dario; Loewe, Laurence; Fawcett, Richard; Kumar, Sujai; Thomson, Marian; Trivedi, Urmi; Otto, Thomas D; Pain, Arnab; Blaxter, Mark; Cravo, Pedro

    2010-01-01

    was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina/Solexa) defined the point mutations, insertions, deletions and copy-number variations arising

  6. Application of genotyping by sequencing technology to a variety of crop breeding programs.

    Science.gov (United States)

    Kim, Changsoo; Guo, Hui; Kong, Wenqian; Chandnani, Rahul; Shuang, Lan-Shuan; Paterson, Andrew H

    2016-01-01

    Since the Arabidopsis genome was completed, draft sequences or pseudomolecules have been published for more than 100 plant genomes including green algae, in large part due to advances in sequencing technologies. Advanced DNA sequencing technologies have also conferred new opportunities for high-throughput low-cost crop genotyping, based on single-nucleotide polymorphisms (SNPs). However, a recurring complication in crop genotyping that differs from other taxa is a higher level of DNA sequence duplication, noting that all angiosperms are thought to have polyploidy in their evolutionary history. In the current article, we briefly review current genotyping methods using next-generation sequencing (NGS) technologies. We also explore case studies of genotyping-by-sequencing (GBS) applications to several crops differing in genome size, organization and breeding system (paleopolyploids, neo-allopolyploids, neo-autopolyploids). GBS typically shows good results when it is applied to an inbred diploid species with a well-established reference genome. However, we have also made some progress toward GBS of outcrossing species lacking reference genomes and of polyploid populations, which still need much improvement. Regardless of some limitations, low-cost and multiplexed genotyping offered by GBS will be beneficial to breed superior cultivars in many crop species. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  7. Stepwise threshold clustering: a new method for genotyping MHC loci using next-generation sequencing technology.

    Directory of Open Access Journals (Sweden)

    William E Stutz

    Full Text Available Genes of the vertebrate major histocompatibility complex (MHC) are of great interest to biologists because of their important role in immunity and disease, and their extremely high levels of genetic diversity. Next generation sequencing (NGS) technologies are quickly becoming the method of choice for high-throughput genotyping of multi-locus templates like MHC in non-model organisms. Previous approaches to genotyping MHC genes using NGS technologies suffer from two problems: 1) a "gray zone" where low-frequency alleles and high-frequency artifacts can be difficult to disentangle and 2) a similar-sequence problem, where very similar alleles can be difficult to distinguish as two distinct alleles. Here we present a new method for genotyping MHC loci, Stepwise Threshold Clustering (STC), that addresses these problems by taking full advantage of the increase in sequence data provided by NGS technologies. Unlike previous approaches for genotyping MHC with NGS data that attempt to classify individual sequences as alleles or artifacts, STC uses a quasi-Dirichlet clustering algorithm to cluster similar sequences at increasing levels of sequence similarity. By applying frequency- and similarity-based criteria to clusters rather than individual sequences, STC is able to successfully identify clusters of sequences that correspond to individual or similar alleles present in the genomes of individual samples. Furthermore, STC does not require duplicate runs of all samples, increasing the number of samples that can be genotyped in a given project. We show how the STC method works using a single sample library. We then apply STC to 295 threespine stickleback (Gasterosteus aculeatus) samples from four populations and show that neighboring populations differ significantly in MHC allele pools. We show that STC is a reliable, accurate, efficient, and flexible method for genotyping MHC that will be of use to biologists interested in a variety of downstream applications.

  8. Students' Guided Reinvention of Definition of Limit of a Sequence with Interactive Technology

    Science.gov (United States)

    Flores, Alfinio; Park, Jungeun

    2016-01-01

    In a course emphasizing interactive technology, 19 students, including 18 mathematics education majors, mostly in their first year, reinvented the definition of limit of a sequence while working in small cooperative groups. The class spent four sessions of 75 minutes each on a cyclical process of guided reinvention of the definition of limit of a…

  9. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology.

    Science.gov (United States)

    Tanase, Koji; Nishitani, Chikako; Hirakawa, Hideki; Isobe, Sachiko; Tabata, Satoshi; Ohmiya, Akemi; Onozaki, Takashi

    2012-07-02

    Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. We constructed a normalized cDNA library and a 3'-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.
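
    The assembly figures in this abstract can be cross-checked with simple arithmetic. The short Python sketch below uses only the counts quoted above; the variable names are illustrative.

```python
# Counts quoted in the abstract.
contigs = 37_844
singlets = 262_896
unigenes = 300_740
contigs_with_blastx_hit = 23_380

# Unigenes are reported as the sum of contigs and singlets.
assert contigs + singlets == unigenes

# Share of contigs with at least one BLASTX hit against Arabidopsis.
hit_fraction = contigs_with_blastx_hit / contigs * 100
print(f"{hit_fraction:.1f} % of contigs have a BLASTX hit")
```

    The computed share rounds to the 61.8% stated in the abstract, so the percentage refers to contigs rather than to all unigenes.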

  10. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Tanase Koji

    2012-07-01

    Full Text Available Abstract Background Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers, are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, we generated an expressed sequence tag (EST) database for a carnation cultivar important in horticulture by high-throughput sequencing using 454 pyrosequencing technology. Results We constructed a normalized cDNA library and a 3’-UTR library of carnation, obtaining a total of 1,162,126 high-quality reads. These reads were assembled into 300,740 unigenes consisting of 37,844 contigs and 262,896 singlets. The contigs were searched against an Arabidopsis sequence database, and 61.8% (23,380) of them had at least one BLASTX hit. These contigs were also annotated with Gene Ontology (GO) and were found to cover a broad range of GO categories. Furthermore, we identified 17,362 potential simple sequence repeats (SSRs) in 14,291 of the unigenes. We focused on gene discovery in the areas of flower color and ethylene biosynthesis. Transcripts were identified for almost every gene involved in flower chlorophyll and carotenoid metabolism and in anthocyanin biosynthesis. Transcripts were also identified for every step in the ethylene biosynthesis pathway. Conclusions We present the first large-scale sequence data set for carnation, generated using next-generation sequencing technology. The large EST database generated from these sequences is an informative resource for identifying genes involved in various biological processes in carnation and provides an EST resource for understanding the genetic diversity of this plant.

  11. MicroRNA repertoire for functional genome research in tilapia identified by deep sequencing.

    Science.gov (United States)

    Yan, Biao; Wang, Zhen-Hua; Zhu, Chang-Dong; Guo, Jin-Tao; Zhao, Jin-Liang

    2014-08-01

    The Nile tilapia (Oreochromis niloticus; Cichlidae) is an economically important species that occupies a prominent position in the aquaculture industry. MicroRNAs (miRNAs) are a class of noncoding RNAs that post-transcriptionally regulate gene expression involved in diverse biological and metabolic processes. To increase the repertoire of miRNAs characterized in tilapia, we used the Illumina/Solexa sequencing technology to sequence a small RNA library prepared from a pooled RNA sample isolated from different developmental stages of tilapia. Bioinformatic analyses suggest that 197 conserved and 27 novel miRNAs are expressed in tilapia. Sequence alignments indicate that all tested miRNAs and miRNAs* are highly conserved across many species. In addition, we characterized the tissue expression patterns of five miRNAs using real-time quantitative PCR. We found that miR-1/206, miR-7/9, and miR-122 are abundantly expressed in muscle, brain, and liver, respectively, implying a potential role in the regulation of tissue differentiation or the maintenance of tissue identity. Overall, our results expand the number of known tilapia miRNAs, and the discovery of miRNAs in the tilapia genome contributes to a better understanding of the role of miRNAs in regulating diverse biological processes.

  12. Comparing microarrays and next-generation sequencing technologies for microbial ecology research.

    Science.gov (United States)

    Roh, Seong Woon; Abell, Guy C J; Kim, Kyoung-Ho; Nam, Young-Do; Bae, Jin-Woo

    2010-06-01

    Recent advances in molecular biology have resulted in the application of DNA microarrays and next-generation sequencing (NGS) technologies to the field of microbial ecology. This review aims to examine the strengths and weaknesses of each of the methodologies, including depth and ease of analysis, throughput and cost-effectiveness. It also intends to highlight the optimal application of each of the individual technologies toward the study of a particular environment and identify potential synergies between the two main technologies, whereby both sample number and coverage can be maximized. We suggest that the efficient use of microarray and NGS technologies will allow researchers to advance the field of microbial ecology, and importantly, improve our understanding of the role of microorganisms in their various environments.

  13. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  14. [Application of next-generation semiconductor sequencing technologies in genetic diagnosis of inherited cardiomyopathies].

    Science.gov (United States)

    Zhao, Yue; Zhang, Hong; Xia, Xue-shan

    2015-07-01

    Inherited cardiomyopathy is the most common hereditary cardiac disease. It also causes a significant proportion of sudden cardiac deaths in young adults and athletes. So far, approximately one hundred genes have been reported to be involved in cardiomyopathies through different mechanisms. Therefore, identifying the genetic basis and disease mechanisms of cardiomyopathies is important for establishing a clinical diagnosis and for genetic testing. The next-generation semiconductor sequencing (NGSS) technology platform is a high-throughput sequencer capable of analyzing clinically derived genomes with high productivity, sensitivity and specificity. It was launched in 2010 by Life Technologies of the USA, and it is based on a high-density semiconductor chip covered with tens of thousands of wells. NGSS has been successfully used in candidate gene mutation screening to identify hereditary disease. In this review, we summarize these genetic variations, the challenges and applications of NGSS in inherited cardiomyopathy, and its value in disease diagnosis, prevention and treatment.

  15. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.
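
    To illustrate the general idea of relating read length to unresolvable repeats, the Python sketch below marks the regions of a sequence that are covered by exact repeats at least as long as the read. This is a simplified illustration under our own assumptions (forward strand only, exact matches, unpaired reads); it is not the model used in the paper, and the function name and toy sequence are hypothetical.

```python
from collections import defaultdict


def repeated_regions(genome: str, read_length: int) -> list[tuple[int, int]]:
    """Return merged (start, end) intervals covered by read_length-mers that
    occur more than once on the forward strand. A read drawn entirely from
    such an interval cannot be placed uniquely."""
    if read_length <= 0 or read_length > len(genome):
        return []
    positions = defaultdict(list)
    for i in range(len(genome) - read_length + 1):
        positions[genome[i:i + read_length]].append(i)
    # Start positions of every k-mer that has at least two occurrences.
    starts = sorted(i for occ in positions.values() if len(occ) > 1 for i in occ)
    regions: list[tuple[int, int]] = []
    for s in starts:
        end = s + read_length
        if regions and s <= regions[-1][1]:
            # Overlapping or touching the previous interval: extend it.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((s, end))
    return regions


# Toy sequence containing one exact 8-base repeat (hypothetical example).
toy = "ACGTACGTTTGCAACGTACGTAGG"
for k in (4, 8, 12):
    print(k, repeated_regions(toy, k))
```

    In the toy sequence the repeat disappears once the read length exceeds the repeat length, which is the qualitative behavior the abstract describes: longer reads leave fewer ambiguous regions. A realistic estimate would also need to consider reverse complements, inexact repeats and paired-end information.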

  16. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.

  17. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.; Köser, Claudio U.; Ross, Nicholas E.; Archer, John A.C.

    2010-01-01

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.

  18. Transcriptome sequencing and differential gene expression analysis in Viola yedoensis Makino (Fam. Violaceae) responsive to cadmium (Cd) pollution

    Energy Technology Data Exchange (ETDEWEB)

    Gao, Jian [Key Laboratory of Biology and Genetic Improvement of Maize in Southwest Region, Ministry of Agriculture, Maize Research Institute of Sichuan Agricultural University, Wenjiang, Sichuan (China)]; Luo, Mao [Drug Discovery Research Center of Luzhou Medical College, Luzhou, Sichuan (China)]; Zhu, Ye; He, Ying; Wang, Qin [Department of Pharmacy of Luzhou Medical College, Luzhou, Sichuan (China)]; Zhang, Chun, E-mail: zc83good@126.com [Department of Pharmacy of Luzhou Medical College, Luzhou, Sichuan (China)]

    2015-03-27

    Viola yedoensis Makino is an important traditional Chinese medicinal plant adapted to cadmium (Cd)-polluted regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high-quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were identified between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR, and the results were highly consistent with the Solexa analysis. This study provides the first global analysis of the V. yedoensis transcriptome and a basis for further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 54,479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR.

  19. Transcriptome sequencing and differential gene expression analysis in Viola yedoensis Makino (Fam. Violaceae) responsive to cadmium (Cd) pollution

    International Nuclear Information System (INIS)

    Gao, Jian; Luo, Mao; Zhu, Ye; He, Ying; Wang, Qin; Zhang, Chun

    2015-01-01

    Viola yedoensis Makino is an important traditional Chinese medicinal plant adapted to cadmium (Cd)-polluted regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high-quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were identified between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR, and the results were highly consistent with the Solexa analysis. This study provides the first global analysis of the V. yedoensis transcriptome and a basis for further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 54,479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR

  20. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Full Text Available Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  1. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  2. Academic performance in a pharmacotherapeutics course sequence taught synchronously on two campuses using distance education technology.

    Science.gov (United States)

    Steinberg, Michael; Morin, Anna K

    2011-10-10

    To compare the academic performance of campus-based students in a pharmacotherapeutics course with that of students at a distant campus taught via synchronous teleconferencing. Examination scores and final course grades for campus-based and distant students completing the case-based pharmacotherapeutics course sequence over a 5-year period were collected and analyzed. The mean examination scores and final course grades were not significantly different between students on the 2 campuses. The use of synchronous distance education technology to teach students does not affect students' academic performance when used in an active-learning, case-based pharmacotherapeutics course.

  3. A genome-wide analysis of lentivector integration sites using targeted sequence capture and next generation sequencing technology.

    Science.gov (United States)

    Ustek, Duran; Sirma, Sema; Gumus, Ergun; Arikan, Muzaffer; Cakiris, Aris; Abaci, Neslihan; Mathew, Jaicy; Emrence, Zeliha; Azakli, Hulya; Cosan, Fulya; Cakar, Atilla; Parlak, Mahmut; Kursun, Olcay

    2012-10-01

    One application of next-generation sequencing (NGS) is the targeted resequencing of genes of interest, which has not been used in viral integration site analysis for gene therapy applications. Here, we combined a targeted sequence capture array and next-generation sequencing to address whole-genome profiling of viral integration sites. Human 293T and K562 cells were transduced with an HIV-1-derived vector. A custom-made DNA probe set targeting the pLVTHM vector was used to capture lentiviral vector/human genome junctions. The captured DNA was sequenced using the GS FLX platform. Seven thousand four hundred and eighty-four human genome sequences flanking the long terminal repeats (LTR) of pLVTHM fragment sequences matched with an identity of at least 98% and a minimum of 50 bp in both cells. In total, 203 unique integration sites were identified. The integrations in both cell lines were distant from CpG islands and transcription start sites and were preferentially located in introns. A comparison between the two cell lines showed that the lentiviral-transduced DNA does not have the same preferred regions in the two different cell lines. Copyright © 2012 Elsevier B.V. All rights reserved.

  4. Characterizing ncRNAs in human pathogenic protists using high-throughput sequencing technology

    Directory of Open Access Journals (Sweden)

    Lesley Joan Collins

    2011-12-01

    Full Text Available ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, snoRNAs and long ncRNAs on a genomic scale making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases.

  5. Characterizing ncRNAs in Human Pathogenic Protists Using High-Throughput Sequencing Technology

    Science.gov (United States)

    Collins, Lesley Joan

    2011-01-01

    ncRNAs are key genes in many human diseases including cancer and viral infection, as well as providing critical functions in pathogenic organisms such as fungi, bacteria, viruses, and protists. Until now the identification and characterization of ncRNAs associated with disease has been slow or inaccurate requiring many years of testing to understand complicated RNA and protein gene relationships. High-throughput sequencing now offers the opportunity to characterize miRNAs, siRNAs, small nucleolar RNAs (snoRNAs), and long ncRNAs on a genomic scale, making it faster and easier to clarify how these ncRNAs contribute to the disease state. However, this technology is still relatively new, and ncRNA discovery is not an application of high priority for streamlined bioinformatics. Here we summarize background concepts and practical approaches for ncRNA analysis using high-throughput sequencing, and how it relates to understanding human disease. As a case study, we focus on the parasitic protists Giardia lamblia and Trichomonas vaginalis, where large evolutionary distance has meant difficulties in comparing ncRNAs with those from model eukaryotes. A combination of biological, computational, and sequencing approaches has enabled easier classification of ncRNA classes such as snoRNAs, but has also aided the identification of novel classes. It is hoped that a higher level of understanding of ncRNA expression and interaction may aid in the development of less harsh treatment for protist-based diseases. PMID:22303390

  6. Model-based quality assessment and base-calling for second-generation sequencing data.

    Science.gov (United States)

    Bravo, Héctor Corrada; Irizarry, Rafael A

    2010-09-01

    Second-generation sequencing (sec-gen) technology can sequence millions of short fragments of DNA in parallel, making it capable of assembling complex genomes for a small fraction of the price and time of previous technologies. In fact, a recently formed international consortium, the 1000 Genomes Project, plans to fully sequence the genomes of approximately 1200 people. The prospect of comparative analysis at the sequence level of a large number of samples across multiple populations may be achieved within the next five years. These data present unprecedented challenges in statistical analysis. For instance, analysis operates on millions of short nucleotide sequences, or reads (strings of A, C, G, or T between 30 and 100 characters long), which are the result of complex processing of noisy continuous fluorescence intensity measurements known as base-calling. The complexity of the base-calling discretization process results in reads of widely varying quality within and across sequence samples. This variation in processing quality results in infrequent but systematic errors that we have found to mislead downstream analysis of the discretized sequence read data. For instance, a central goal of the 1000 Genomes Project is to quantify across-sample variation at the single nucleotide level. At this resolution, small error rates in sequencing prove significant, especially for rare variants. Sec-gen sequencing is a relatively new technology for which potential biases and sources of obscuring variation are not yet fully understood. Therefore, modeling and quantifying the uncertainty inherent in the generation of sequence reads is of utmost importance. In this article, we present a simple model to capture uncertainty arising in the base-calling procedure of the Illumina/Solexa GA platform. Model parameters have a straightforward interpretation in terms of the chemistry of base-calling allowing for informative and easily interpretable metrics that capture the variability in

  7. High throughput sequencing and proteomics to identify immunogenic proteins of a new pathogen: the dirty genome approach.

    Directory of Open Access Journals (Sweden)

    Gilbert Greub

    Full Text Available BACKGROUND: With the availability of new generation sequencing technologies, bacterial genome projects have undergone a major boost. Still, chromosome completion needs a costly and time-consuming gap closure, especially when containing highly repetitive elements. However, incomplete genome data may be sufficiently informative to derive the pursued information. For emerging pathogens, i.e. newly identified pathogens, lack of release of genome data during gap closure stage is clearly medically counterproductive. METHODS/PRINCIPAL FINDINGS: We thus investigated the feasibility of a dirty genome approach, i.e. the release of unfinished genome sequences to develop serological diagnostic tools. We showed that almost the whole genome sequence of the emerging pathogen Parachlamydia acanthamoebae was retrieved even with relatively short reads from Genome Sequencer 20 and Solexa. The bacterial proteome was analyzed to select immunogenic proteins, which were then expressed and used to elaborate the first steps of an ELISA. CONCLUSIONS/SIGNIFICANCE: This work constitutes the proof of principle for a dirty genome approach, i.e. the use of unfinished genome sequences of pathogenic bacteria, coupled with proteomics to rapidly identify new immunogenic proteins useful to develop in the future specific diagnostic tests such as ELISA, immunohistochemistry and direct antigen detection. Although applied here to an emerging pathogen, this combined dirty genome sequencing/proteomic approach may be used for any pathogen for which better diagnostics are needed. These genome sequences may also be very useful to develop DNA based diagnostic tests. All these diagnostic tools will allow further evaluations of the pathogenic potential of this obligate intracellular bacterium.

  8. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Science.gov (United States)

    Gimode, Davis; Odeny, Damaris A; de Villiers, Etienne P; Wanyonyi, Solomon; Dida, Mathews M; Mneney, Emmarold E; Muchugi, Alice; Machuka, Jesse; de Villiers, Santie M

    2016-01-01

    Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet. 
Our results also reveal an unexploited finger millet genetic resource that can be included in the regional
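
    The abstract above reports polymorphism information content (PIC) values without saying how they were computed. As a minimal sketch, assuming the commonly used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over pairs of 2 * p_i^2 * p_j^2, a marker's PIC can be derived from its allele frequencies as follows. The function name `pic` and the example frequencies are illustrative and are not taken from the study.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein-style formula; `allele_freqs` must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Expected homozygosity term: sum of squared allele frequencies.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term over every unordered pair of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical biallelic SNP with equal allele frequencies.
print(round(pic([0.5, 0.5]), 3))
```

    Under this definition a biallelic marker cannot exceed a PIC of 0.375, whereas multi-allelic SSRs can score higher, which fits the pattern of a higher mean PIC for SSRs than for SNPs.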

  9. Identification of SNP and SSR Markers in Finger Millet Using Next Generation Sequencing Technologies.

    Directory of Open Access Journals (Sweden)

    Davis Gimode

    Full Text Available Finger millet is an important cereal crop in eastern Africa and southern India with excellent grain storage quality and unique ability to thrive in extreme environmental conditions. Since negligible attention has been paid to improving this crop to date, the current study used Next Generation Sequencing (NGS) technologies to develop both Simple Sequence Repeat (SSR) and Single Nucleotide Polymorphism (SNP) markers. Genomic DNA from cultivated finger millet genotypes KNE755 and KNE796 was sequenced using both Roche 454 and Illumina technologies. Non-organelle sequencing reads were assembled into 207 Mbp representing approximately 13% of the finger millet genome. We identified 10,327 SSRs and 23,285 non-homeologous SNPs and tested 101 of each for polymorphism across a diverse set of wild and cultivated finger millet germplasm. For the 49 polymorphic SSRs, the mean polymorphism information content (PIC) was 0.42, ranging from 0.16 to 0.77. We also validated 92 SNP markers, 80 of which were polymorphic with a mean PIC of 0.29 across 30 wild and 59 cultivated accessions. Seventy-six of the 80 SNPs were polymorphic across 30 wild germplasm with a mean PIC of 0.30 while only 22 of the SNP markers showed polymorphism among the 59 cultivated accessions with an average PIC value of 0.15. Genetic diversity analysis using the polymorphic SNP markers revealed two major clusters; one of wild and another of cultivated accessions. Detailed STRUCTURE analysis confirmed this grouping pattern and further revealed 2 sub-populations within wild E. coracana subsp. africana. Both STRUCTURE and genetic diversity analysis assisted with the correct identification of the new germplasm collections. These polymorphic SSR and SNP markers are a significant addition to the existing 82 published SSRs, especially with regard to the previously reported low polymorphism levels in finger millet.
Our results also reveal an unexploited finger millet genetic resource that can be included

  10. The first FDA marketing authorizations of next-generation sequencing technology and tests: challenges, solutions and impact for future assays.

    Science.gov (United States)

    Bijwaard, Karen; Dickey, Jennifer S; Kelm, Kellie; Težak, Živana

    2015-01-01

    The rapid emergence and clinical translation of novel high-throughput sequencing technologies created a need to clarify the regulatory pathway for the evaluation and authorization of these unique technologies. Recently, the US FDA authorized for marketing four next generation sequencing (NGS)-based diagnostic devices which consisted of two heritable disease-specific assays, library preparation reagents and a NGS platform that are intended for human germline targeted sequencing from whole blood. These first authorizations can serve as a case study in how different types of NGS-based technology are reviewed by the FDA. In this manuscript we describe challenges associated with the evaluation of these novel technologies and provide an overview of what was reviewed. Besides making validated NGS-based devices available for in vitro diagnostic use, these first authorizations create a regulatory path for similar future instruments and assays.

  11. Automatic start-up system of nuclear reactor based on sequence control technology

    International Nuclear Information System (INIS)

    Zhang Yao; Zhang Dafa; Peng Huaqing

    2009-01-01

    This paper presents a conceptual design of an automatic start-up system for nuclear reactors based on sequence control, intended to address problems in the start-up process such as long operation time, a low level of automatic control and a high accident rate. The start-up process and its requirements are first analyzed in detail. Then, the principle, the architecture and the key technologies of the automatic start-up system are designed and discussed. With the designed system, automatic start-up of the nuclear reactor can be realized, the operator's workload can be reduced, and the safety and efficiency of the nuclear power plant during start-up can be improved. (authors)

  12. Identification of microRNAs from Eugenia uniflora by high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Guzman, Frank; Almerão, Mauricio P; Körbes, Ana P; Loss-Morais, Guilherme; Margis, Rogerio

    2012-01-01

    microRNAs or miRNAs are small non-coding regulatory RNAs that play important functions in the regulation of gene expression at the post-transcriptional level by targeting mRNAs for degradation or inhibiting protein translation. Eugenia uniflora is a plant native to tropical America with pharmacological and ecological importance, and there have been no previous studies concerning its gene expression and regulation. To date, no miRNAs have been reported in Myrtaceae species. Small RNA and RNA-seq libraries were constructed to identify miRNAs and pre-miRNAs in Eugenia uniflora. Solexa technology was used to perform high throughput sequencing of the library, and the data obtained were analyzed using bioinformatics tools. From 14,489,131 small RNA clean reads, we obtained 1,852,722 mature miRNA sequences representing 45 conserved families that have been identified in other plant species. Further analysis using contigs assembled from RNA-seq allowed the prediction of secondary structures of 25 known and 17 novel pre-miRNAs. The expression of twenty-seven identified miRNAs was also validated using RT-PCR assays. Potential targets were predicted for the most abundant mature miRNAs in the identified pre-miRNAs based on sequence homology. This study is the first large scale identification of miRNAs and their potential targets from a species of the Myrtaceae family without genomic sequence resources. Our study provides more information about the evolutionary conservation of the regulatory network of miRNAs in plants and highlights species-specific miRNAs.

  13. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Science.gov (United States)

    Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

    2010-04-08

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for
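
    The N50 values quoted in the abstract above (117 kb, rising to 498 kb) summarize assembly contiguity. As a minimal sketch of the usual definition, N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative and are not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Stop at the first contig that brings the sum to >= 50% of the total.
        if 2 * running >= total:
            return length


# Toy assembly: total 54, so the half-way point is 27 (10 + 9 + 8).
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

    Because N50 depends only on the length distribution, joining contigs into longer scaffolds raises it without adding any sequence.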

  14. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Directory of Open Access Journals (Sweden)

    Minou Nowrousian

    2010-04-01

    Full Text Available Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group.
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data

  15. Killer Immunoglobulin-Like Receptor Allele Determination Using Next-Generation Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Bercelin Maniangou

    2017-05-01

    Full Text Available The impact of natural killer (NK) cell alloreactivity on hematopoietic stem cell transplantation (HSCT) outcome is still debated due to the complexity of graft parameters, HLA class I environment, the nature of killer cell immunoglobulin-like receptor (KIR)/KIR ligand genetic combinations studied, and KIR+ NK cell repertoire size. KIR genes are known to be polymorphic in terms of gene content, copy number variation, and number of alleles. These allelic polymorphisms may impact both the phenotype and function of KIR+ NK cells. We, therefore, speculate that polymorphisms may alter donor KIR+ NK cell phenotype/function thus modulating post-HSCT KIR+ NK cell alloreactivity. To investigate KIR allele polymorphisms of all KIR genes, we developed a next-generation sequencing (NGS) technology on a MiSeq platform. To ensure the reliability and specificity of our method, genomic DNA from well-characterized cell lines were used; high-resolution KIR typing results obtained were then compared to those previously reported. Two different bioinformatic pipelines were used allowing the attribution of sequencing reads to specific KIR genes and the assignment of KIR alleles for each KIR gene. Our results demonstrated successful long-range KIR gene amplifications of all reference samples using intergenic KIR primers. The alignment of reads to the human genome reference (hg19) using BiRD pipeline or visualization of data using Profiler software demonstrated that all KIR genes were completely sequenced with a sufficient read depth (mean 317× for all loci) and a high percentage of mapping (mean 93% for all loci). Comparison of high-resolution KIR typing obtained to those published data using exome capture resulted in a reported concordance rate of 95% for centromeric and telomeric KIR genes. Overall, our results suggest that NGS can be used to investigate the broad KIR allelic polymorphism.
Hence, these data improve our knowledge, not only on KIR+ NK cell alloreactivity in

  16. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions.

    Science.gov (United States)

    Senol Cali, Damla; Kim, Jeremie S; Ghose, Saugata; Alkan, Can; Mutlu, Onur

    2018-04-02

    Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious

  17. Next generation sequencing based transcriptome analysis of septic-injury responsive genes in the beetle Tribolium castaneum.

    Directory of Open Access Journals (Sweden)

    Boran Altincicek

    Full Text Available Beetles (Coleoptera) are the most diverse animal group on earth and interact with numerous symbiotic or pathogenic microbes in their environments. The red flour beetle Tribolium castaneum is a genetically tractable model beetle species and its whole genome sequence has recently been determined. To advance our understanding of the molecular basis of beetle immunity here we analyzed the whole transcriptome of T. castaneum by high-throughput next generation sequencing technology. Here, we demonstrate that the Illumina/Solexa sequencing approach of cDNA samples from T. castaneum including over 9.7 million reads with 72 base pairs (bp) length (approximately 700 million bp sequence information with about 30× transcriptome coverage) confirms the expression of most predicted genes and enabled subsequent qualitative and quantitative transcriptome analysis. This approach recapitulates our recent quantitative real-time PCR studies of immune-challenged and naïve T. castaneum beetles, validating our approach. Furthermore, this sequencing analysis resulted in the identification of 73 differentially expressed genes upon immune-challenge with statistical significance by comparing expression data to calculated values derived by fitting to generalized linear models. We identified up regulation of diverse immune-related genes (e.g. Toll receptor, serine proteinases, DOPA decarboxylase and thaumatin) and of numerous genes encoding proteins with yet unknown functions. Of note, septic-injury resulted also in the elevated expression of genes encoding heat-shock proteins or cytochrome P450s supporting the view that there is crosstalk between immune and stress responses in T. castaneum. The present study provides a first comprehensive overview of septic-injury responsive genes in T. castaneum beetles. Identified genes advance our understanding of T. castaneum specific gene expression alteration upon immune-challenge in particular and may help to understand beetle immunity
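
    The yield figures in the abstract above can be checked with simple arithmetic: the number of reads times the read length gives the total bases, and dividing by the fold coverage gives the size of the target that coverage implies. The sketch below uses the abstract's rounded figures (9.7 million reads, 72 bp, 30×). The function names are illustrative, and the implied transcriptome size is a derived estimate that the abstract does not state.

```python
def total_bases(n_reads, read_length_bp):
    """Total sequenced bases for fixed-length reads."""
    return n_reads * read_length_bp


def implied_target_size(total_bp, fold_coverage):
    """Target size (bp) implied by a yield and a fold coverage."""
    return total_bp / fold_coverage


yield_bp = total_bases(9_700_000, 72)
print(yield_bp)  # 698,400,000 bp, i.e. roughly 700 million bp
print(implied_target_size(yield_bp, 30))  # about 23.3 million bp
```

    Since the abstract reports "over 9.7 million reads" and "about 30×", these numbers are best read as approximate lower bounds rather than exact values.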

  18. Detecting novel genetic mutations in Chinese Usher syndrome families using next-generation sequencing technology.

    Science.gov (United States)

    Qu, Ling-Hui; Jin, Xin; Xu, Hai-Wei; Li, Shi-Ying; Yin, Zheng-Qin

    2015-02-01

    Usher syndrome (USH) is the most common cause of combined blindness and deafness inherited in an autosomal recessive mode. Molecular diagnosis is of great significance in revealing the molecular pathogenesis and aiding the clinical diagnosis of this disease. However, molecular diagnosis remains a challenge due to high phenotypic and genetic heterogeneity in USH. This study explored an approach for detecting disease-causing genetic mutations in candidate genes in five index cases from unrelated USH families based on targeted next-generation sequencing (NGS) technology. Through systematic data analysis using an established bioinformatics pipeline and segregation analysis, 10 pathogenic mutations in the USH disease genes were identified in the five USH families. Six of these mutations were novel: c.4398G > A and EX38-49del in MYO7A, c.988_989delAT in USH1C, c.15104_15105delCA and c.6875_6876insG in USH2A. All novel variations segregated with the disease phenotypes in their respective families and were absent from ethnically matched control individuals. This study expanded the mutation spectrum of USH and revealed the genotype-phenotype relationships of the novel USH mutations in Chinese patients. Moreover, this study proved that targeted NGS is an accurate and effective method for detecting genetic mutations related to USH. The identification of pathogenic mutations is of great significance for elucidating the underlying pathophysiology of USH.

  19. Integration of microbiological, epidemiological and next generation sequencing technologies data for the managing of nosocomial infections

    Directory of Open Access Journals (Sweden)

    Matteo Brilli

    2018-02-01

    At its core, the work of clinical microbiologists consists of retrieving a few bytes of information (species identification; metabolic capacities; staining and antigenic properties; antibiotic resistance profiles, etc.) from pathogenic agents. The development of next generation sequencing technologies (NGS), and the possibility to determine the entire genome for bacterial pathogens, fungi and protozoans, will likely introduce a breakthrough in the amount of information generated by clinical microbiology laboratories: from bytes to megabytes of information for a single isolate. In parallel, the development of novel informatics tools, designed for the management and analysis of so-called Big Data, offers the possibility to search for patterns in databases collecting genomic and microbiological information on the pathogens, as well as epidemiological data and information on the clinical parameters of the patients. Nosocomial infections and antibiotic resistance will likely represent major challenges for clinical microbiologists in the next decades. In this paper, we describe how bacterial genomics based on NGS, integrated with novel informatic tools, could contribute to the control of hospital infections and multi-drug resistant pathogens.

  20. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Directory of Open Access Journals (Sweden)

    Yu Cao

    2017-09-01

    The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes, each of which is in its genomic essence the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods.

  1. A Review on the Applications of Next Generation Sequencing Technologies as Applied to Food-Related Microbiome Studies

    Science.gov (United States)

    Cao, Yu; Fanning, Séamus; Proos, Sinéad; Jordan, Kieran; Srikumar, Shabarinath

    2017-01-01

    The development of next generation sequencing (NGS) techniques has enabled researchers to study and understand the world of microorganisms from broader and deeper perspectives. The contemporary advances in DNA sequencing technologies have not only enabled finer characterization of bacterial genomes but also provided deeper taxonomic identification of complex microbiomes, each of which is in its genomic essence the combined genetic material of the microorganisms inhabiting an environment, whether the environment be a particular body econiche (e.g., human intestinal contents) or a food manufacturing facility econiche (e.g., floor drain). To date, 16S rDNA sequencing, metagenomics and metatranscriptomics are the three basic sequencing strategies used in the taxonomic identification and characterization of food-related microbiomes. These sequencing strategies have used different NGS platforms for DNA and RNA sequence identification. Traditionally, 16S rDNA sequencing has played a key role in understanding the taxonomic composition of a food-related microbiome. Recently, metagenomic approaches have resulted in improved understanding of a microbiome by providing a species-level/strain-level characterization. Further, metatranscriptomic approaches have contributed to the functional characterization of the complex interactions between different microbial communities within a single microbiome. Many studies have highlighted the use of NGS techniques in investigating the microbiome of fermented foods. However, the utilization of NGS techniques in studying the microbiome of non-fermented foods is limited. This review provides a brief overview of the advances in DNA sequencing chemistries as the technology progressed from first, next and third generations and highlights how NGS provided a deeper understanding of food-related microbiomes with special focus on non-fermented foods. PMID:29033905

  2. Sim3C: simulation of Hi-C and Meta3C proximity ligation sequencing technologies.

    Science.gov (United States)

    DeMaere, Matthew Z; Darling, Aaron E

    2018-02-01

    Chromosome conformation capture (3C) and Hi-C DNA sequencing methods have rapidly advanced our understanding of the spatial organization of genomes and metagenomes. Many variants of these protocols have been developed, each with their own strengths. Currently there is no systematic means for simulating sequence data from this family of sequencing protocols, potentially hindering the advancement of algorithms to exploit this new datatype. We describe a computational simulator that, given simple parameters and reference genome sequences, will simulate Hi-C sequencing on those sequences. The simulator models the basic spatial structure in genomes that is commonly observed in Hi-C and 3C datasets, including the distance-decay relationship in proximity ligation, differences in the frequency of interaction within and across chromosomes, and the structure imposed by cells. A means to model the 3D structure of randomly generated topologically associating domains is provided. The simulator considers several sources of error common to 3C and Hi-C library preparation and sequencing methods, including spurious proximity ligation events and sequencing error. We have introduced the first comprehensive simulator for 3C and Hi-C sequencing protocols. We expect the simulator to have use in testing of Hi-C data analysis algorithms, as well as more general value for experimental design, where questions such as the required depth of sequencing, enzyme choice, and other decisions can be made in advance in order to ensure adequate statistical power with respect to experimental hypothesis testing.
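
    The distance-decay relationship mentioned in this abstract can be illustrated with a small sketch. This is not Sim3C's model or interface: the function name `sample_ligation_pair`, the exponential decay, the circular genome and every parameter value are assumptions chosen purely for illustration.

```python
import random


def sample_ligation_pair(genome_length, mean_separation, rng):
    """Draw one hypothetical proximity-ligation pair on a circular genome.

    Assumption: the separation between the two ligated sites follows an
    exponential distribution, so nearby sites pair far more often than
    distant ones. Real simulators may use other decay functions.
    """
    first = rng.randrange(genome_length)
    separation = int(rng.expovariate(1.0 / mean_separation))
    second = (first + separation) % genome_length  # wrap around the circle
    return first, second


GENOME_LENGTH = 1_000_000   # illustrative genome size in bp
MEAN_SEPARATION = 10_000    # illustrative mean ligation distance in bp

rng = random.Random(42)  # fixed seed so the run is repeatable
pairs = [
    sample_ligation_pair(GENOME_LENGTH, MEAN_SEPARATION, rng)
    for _ in range(10_000)
]

# Count pairs closer than the mean separation versus all the rest.
separations = [(b - a) % GENOME_LENGTH for a, b in pairs]
short_range = sum(s < MEAN_SEPARATION for s in separations)
long_range = len(separations) - short_range
print(short_range, long_range)
```

    Under these assumptions short-range pairs outnumber long-range ones, which is the qualitative pattern a distance-decay model is meant to reproduce.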

  3. Is Whole Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients with Solid Tumors

    Science.gov (United States)

    McCullough, Laurence B.; Slashinski, Melody J.; McGuire, Amy L.; Street, Richard L.; Eng, Christine M.; Gibbs, Richard A.; Parsons, D. Williams; Plon, Sharon E.

    2016-01-01

    Background: Some anticipate that physicians and parents will be ill-prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. Procedure: As part of the Baylor Advancing Sequencing in Childhood Cancer Care (BASIC3) study, we conducted semi-structured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision-making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Results: Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about the impact on parents. For parents, there is an urgency to protect their child's health and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Conclusion: Our data do not support concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, i.e., leave physicians or parents ill-prepared or unprepared to make responsible decisions about patient care. PMID:26505993

  4. Deep sequencing discovery of novel and conserved microRNAs in trifoliate orange (Citrus trifoliata)

    Directory of Open Access Journals (Sweden)

    Yu Huaping

    2010-07-01

    Background: MicroRNAs (miRNAs) play a critical role in post-transcriptional gene regulation and have been shown to control many genes involved in various biological and metabolic processes. There have been extensive studies to discover miRNAs and analyze their functions in model plant species, such as Arabidopsis and rice. Deep sequencing technologies have facilitated identification of species-specific or lowly expressed as well as conserved or highly expressed miRNAs in plants. Results: In this research, we used Solexa sequencing to discover new microRNAs in trifoliate orange (Citrus trifoliata), which is an important rootstock of citrus. A total of 13,106,753 reads representing 4,876,395 distinct sequences were obtained from a short RNA library generated from small RNA extracted from C. trifoliata flower and fruit tissues. Based on sequence similarity and hairpin structure prediction, we found that 156,639 reads representing 63 sequences from 42 highly conserved miRNA families have perfect matches to known miRNAs. We also identified 10 novel miRNA candidates whose precursors were all potentially generated from citrus ESTs. In addition, five miRNA* sequences were also sequenced. These sequences had not been earlier described in other plant species and accumulation of the 10 novel miRNAs was confirmed by qRT-PCR analysis. Potential target genes were predicted for most conserved and novel miRNAs. Moreover, four target genes, including one encoding IRX12 copper ion binding/oxidoreductase and three genes encoding NB-LRR disease resistance protein, have been experimentally verified by detection of the miRNA-mediated mRNA cleavage in C. trifoliata. Conclusion: Deep sequencing of short RNAs from C. trifoliata flowers and fruits identified 10 new potential miRNAs and 42 highly conserved miRNA families, indicating that specific miRNAs exist in C. trifoliata. These results show that regulatory miRNAs exist in agronomically important trifoliate orange

  5. Technology trajectories and the selection of optimal R and D project sequences

    NARCIS (Netherlands)

    van Bommel, Ties; Mahieu, R.J.; Nijssen, E.J.

    2014-01-01

    Given a set of R&D projects drawing on the same underlying technology, a technology trajectory refers to the order in which projects are executed. Due to their technological interdependence, the successful execution of one project can increase a firm's technological capability, and help to

  6. Rapid sequencing of the bamboo mitochondrial genome using Illumina technology and parallel episodic evolution of organelle genomes in grasses.

    Science.gov (United States)

    Ma, Peng-Fei; Guo, Zhen-Hua; Li, De-Zhu

    2012-01-01

    Compared to their counterparts in animals, the mitochondrial (mt) genomes of angiosperms exhibit a number of unique features. However, unravelling their evolution is hindered by the small number of completed genomes, essentially all of which were Sanger sequenced. While next-generation sequencing technologies have revolutionized chloroplast genome sequencing, they are just beginning to be applied to angiosperm mt genomes. Chloroplast genomes of grasses (Poaceae) have undergone episodic evolution and the evolutionary rate was suggested to be correlated between chloroplast and mt genomes in Poaceae. It is interesting to investigate whether correlated rate change also occurred in grass mt genomes as expected under lineage effects. A time-calibrated phylogenetic tree is needed to examine rate change. We determined a largely completed mt genome from a bamboo, Ferrocalamus rimosivaginus (Poaceae), through Illumina sequencing of total DNA. With a combination of de novo and reference-guided assembly, 39.5-fold coverage Illumina reads were finally assembled into scaffolds totalling 432,839 bp. The assembled genome contains nearly the same genes as the completed mt genomes in Poaceae. For examining evolutionary rate in grass mt genomes, we reconstructed a phylogenetic tree including 22 taxa based on 31 mt genes. The topology of the well-resolved tree was almost identical to that inferred from the chloroplast genome, with only minor differences. The inconsistency possibly derived from long branch attraction in the mtDNA tree. By calculating absolute substitution rates, we found significant rate change (∼4-fold) in the mt genome before and after the diversification of Poaceae in both synonymous and nonsynonymous terms. Furthermore, the rate change was correlated with that of chloroplast genomes in grasses. Our result demonstrates that Illumina sequencing technology is a rapid and efficient approach to obtaining angiosperm mt genome sequences. The parallel episodic evolution of mt and chloroplast

  7. Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

    Directory of Open Access Journals (Sweden)

    Materne Michael

    2011-05-01

    Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed, corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

  8. Use of Metagenomic Shotgun Sequencing Technology To Detect Foodborne Pathogens within the Microbiome of the Beef Production Chain

    OpenAIRE

    Yang, Xiang; Noyes, Noelle R.; Doster, Enrique; Martin, Jennifer N.; Linke, Lyndsey M.; Magnuson, Roberta J.; Yang, Hua; Geornaras, Ifigenia; Woerner, Dale R.; Jones, Kenneth L.; Ruiz, Jaime; Boucher, Christina; Morley, Paul S.; Belk, Keith E.

    2016-01-01

    Foodborne illnesses associated with pathogenic bacteria are a global public health and economic challenge. The diversity of microorganisms (pathogenic and nonpathogenic) that exists within the food and meat industries complicates efforts to understand pathogen ecology. Further, little is known about the interaction of pathogens within the microbiome throughout the meat production chain. Here, a metagenomic approach and shotgun sequencing technology were used as tools to detect pathogenic bact...

  9. Peripheral blood transcriptome sequencing reveals rejection-relevant genes in long-term heart transplantation.

    Science.gov (United States)

    Chen, Yan; Zhang, Haibo; Xiao, Xue; Jia, Yixin; Wu, Weili; Liu, Licheng; Jiang, Jun; Zhu, Baoli; Meng, Xu; Chen, Weijun

    2013-10-03

    Peripheral blood-based gene expression patterns have been investigated as biomarkers to monitor the immune system and rule out rejection after heart transplantation. Recent advances in the high-throughput deep sequencing (HTS) technologies provide new leads in transcriptome analysis. By performing Solexa/Illumina's digital gene expression (DGE) profiling, we analyzed gene expression profiles of PBMCs from 6 quiescent (grade 0) and 6 rejection (grade 2R&3R) heart transplant recipients at more than 6 months after transplantation. Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was carried out in an independent validation cohort of 47 individuals from three rejection groups (ISHLT, grade 0,1R, 2R&3R). Through DGE sequencing and qPCR validation, 10 genes were identified as informative genes for detection of cardiac transplant rejection. A further clustering analysis showed that the 10 genes were not only effective for distinguishing patients with acute cardiac allograft rejection, but also informative for discriminating patients with renal allograft rejection based on both blood and biopsy samples. Moreover, PPI network analysis revealed that the 10 genes were connected to each other within a short interaction distance. We proposed a 10-gene signature for heart transplant patients at high-risk of developing severe rejection, which was found to be effective as well in other organ transplant. Moreover, we supposed that these genes function systematically as biomarkers in long-time allograft rejection. Further validation in broad transplant population would be required before the non-invasive biomarkers can be generally utilized to predict the risk of transplant rejection. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. Molecular-Sized DNA or RNA Sequencing Machine | NCI Technology Transfer Center | TTC

    Science.gov (United States)

    The National Cancer Institute's Gene Regulation and Chromosome Biology Laboratory is seeking statements of capability or interest from parties interested in collaborative research to co-develop a molecular-sized DNA or RNA sequencing machine.

  11. Generation of expressed sequence tags for discovery of genes responsible for floral traits of Chrysanthemum morifolium by next-generation sequencing technology.

    Science.gov (United States)

    Sasaki, Katsutomo; Mitsuda, Nobutaka; Nashima, Kenji; Kishimoto, Kyutaro; Katayose, Yuichi; Kanamori, Hiroyuki; Ohmiya, Akemi

    2017-09-04

    Chrysanthemum morifolium is one of the most economically valuable ornamental plants worldwide. Chrysanthemum is an allohexaploid plant with a large genome that is commercially propagated by vegetative reproduction. New cultivars with different floral traits, such as color, morphology, and scent, have been generated mainly by classical cross-breeding and mutation breeding. However, only limited genetic resources and their genome information are available for the generation of new floral traits. To obtain useful information about molecular bases for floral traits of chrysanthemums, we read expressed sequence tags (ESTs) of chrysanthemums by high-throughput sequencing using the 454 pyrosequencing technology. We constructed normalized cDNA libraries, consisting of full-length, 3'-UTR, and 5'-UTR cDNAs derived from various tissues of chrysanthemums. These libraries produced a total number of 3,772,677 high-quality reads, which were assembled into 213,204 contigs. By comparing the data obtained with those of full genome-sequenced species, we confirmed that our chrysanthemum contig set contained the majority of all expressed genes, which was sufficient for further molecular analysis in chrysanthemums. We confirmed that our chrysanthemum EST set (contigs) contained a number of contigs that encoded transcription factors and enzymes involved in pigment and aroma compound metabolism that was comparable to that of other species. This information can serve as an informative resource for identifying genes involved in various biological processes in chrysanthemums. Moreover, the findings of our study will contribute to a better understanding of the floral characteristics of chrysanthemums including the myriad cultivars at the molecular level.

  12. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Directory of Open Access Journals (Sweden)

    Chengwei Luo

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether or not different NGS platforms recover the same diversity from a sample and their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.

  13. Direct comparisons of Illumina vs. Roche 454 sequencing technologies on the same microbial community DNA sample.

    Science.gov (United States)

    Luo, Chengwei; Tsementzi, Despina; Kyrpides, Nikos; Read, Timothy; Konstantinidis, Konstantinos T

    2012-01-01

    Next-generation sequencing (NGS) is commonly used in metagenomic studies of complex microbial communities, but whether or not different NGS platforms recover the same diversity from a sample and their assembled sequences are of comparable quality remains unclear. We compared the two most frequently used platforms, the Roche 454 FLX Titanium and the Illumina Genome Analyzer (GA) II, on the same DNA sample obtained from a complex freshwater planktonic community. Despite the substantial differences in read length and sequencing protocols, the platforms provided a comparable view of the community sampled. For instance, derived assemblies overlapped in ~90% of their total sequences and in situ abundances of genes and genotypes (estimated based on sequence coverage) correlated highly between the two platforms (R² > 0.9). Evaluation of base-call error, frameshift frequency, and contig length suggested that Illumina offered equivalent, if not better, assemblies than Roche 454. The results from metagenomic samples were further validated against DNA samples of eighteen isolate genomes, which showed a range of genome sizes and G+C% content. We also provide quantitative estimates of the errors in gene and contig sequences assembled from datasets characterized by different levels of complexity and G+C% content. For instance, we noted that homopolymer-associated, single-base errors affected ~1% of the protein sequences recovered in Illumina contigs of 10× coverage and 50% G+C; this frequency increased to ~3% when non-homopolymer errors were also considered. Collectively, our results should serve as a useful practical guide for choosing proper sampling strategies and data processing protocols for future metagenomic studies.

  14. MicroRNA and piRNA profiles in normal human testis detected by next generation sequencing.

    Directory of Open Access Journals (Sweden)

    Qingling Yang

    BACKGROUND: MicroRNAs (miRNAs) are a class of small endogenous RNAs that play an important regulatory role in cells by negatively affecting gene expression at transcriptional and post-transcriptional levels. There have been extensive studies aiming to discover miRNAs and to analyze their functions in cells from a variety of species. However, there are no published studies of miRNA profiles in human testis using next generation sequencing (NGS) technology. RESULTS: We employed Solexa sequencing technology to profile miRNAs in normal human testis. In total, 770 known and 5 novel human miRNAs, and 20121 piRNAs, were detected, indicating that the human testis has a complex population of small RNAs. The expression of 15 known and 5 novel detected miRNAs was validated by qRT-PCR. We have also predicted the potential target genes of the abundant known and novel miRNAs, and subjected them to GO and pathway analysis, revealing the involvement of miRNAs in many important biological phenomena including meiosis and p53-related pathways that are implicated in the regulation of spermatogenesis. CONCLUSIONS: This study reports the first genome-wide miRNA profiles in human testis using an NGS approach. The presence of a large number of miRNAs and the nature of their target genes suggest that miRNAs play important roles in spermatogenesis. Here we provide a useful resource for further elucidation of the regulatory role of miRNAs and piRNAs in spermatogenesis. It may also facilitate the development of prophylactic strategies for male infertility.

  15. Nanopore sequencing technology: a new route for the fast detection of unauthorized GMO.

    Science.gov (United States)

    Fraiture, Marie-Alice; Saltykova, Assia; Hoffman, Stefan; Winand, Raf; Deforce, Dieter; Vanneste, Kevin; De Keersmaecker, Sigrid C J; Roosens, Nancy H C

    2018-05-21

    In order to strengthen the current genetically modified organism (GMO) detection system for unauthorized GMO, we have recently developed a new workflow based on DNA walking to amplify unknown sequences surrounding a known DNA region. This DNA walking is performed on transgenic elements, commonly found in GMO, that were earlier detected by real-time PCR (qPCR) screening. Previously, we have demonstrated the ability of this approach to detect unauthorized GMO via the identification of unique transgene flanking regions and the unnatural associations of elements from the transgenic cassette. In the present study, we investigate the feasibility to integrate the described workflow with the MinION Next-Generation-Sequencing (NGS). The MinION sequencing platform can provide long read-lengths and deal with heterogenic DNA libraries, allowing for rapid and efficient delivery of sequences of interest. In addition, the ability of this NGS platform to characterize unauthorized and unknown GMO without any a priori knowledge has been assessed.

  16. Technological sequence of creating components of the training system of the future officers to the management of physical training

    Directory of Open Access Journals (Sweden)

    Olkhovy O.M.

    2012-09-01

    The goal is to determine a constructive sequence for building the components of a system that trains future officers to manage physical training during their subsequent military careers. Through analysis of more than thirty documentary and scientific sources, a structural-logical scheme was determined that links the stages of an optimal management cycle to the technological sequence for constructing the components of this training system. The scheme provides for: defining requirements for typical professional tasks concerning the leadership, organization and conduct of physical training; creating a phased model of the cadet training system; training in the curriculum discipline "Physical education, special physical training and sport"; and creating a model and defining criteria for the integral evaluation of the readiness of future officers to manage physical training.

  17. New Approaches and Technologies to Sequence de novo Plant reference Genomes (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy

    2013-03-01

    Jeremy Schmutz of the HudsonAlpha Institute for Biotechnology presents "New approaches and technologies to sequence de novo plant reference genomes" at the 8th Annual Genomics of Energy and Environment Meeting on March 27, 2013 in Walnut Creek, CA.

  18. CRISPR-Cas9 technology: applications in genome engineering, development of sequence-specific antimicrobials, and future prospects.

    Science.gov (United States)

    de la Fuente-Núñez, César; Lu, Timothy K

    2017-02-20

    The development of CRISPR-Cas9 technology has revolutionized our ability to edit DNA and to modulate expression levels of genes of interest, thus providing powerful tools to accelerate the precise engineering of a wide range of organisms. In addition, the CRISPR-Cas system can be harnessed to design "precision" antimicrobials that target bacterial pathogens in a DNA sequence-specific manner. This capability will enable killing of drug-resistant microbes by selectively targeting genes involved in antibiotic resistance, biofilm formation and virulence. Here, we review the origins and mechanistic basis of CRISPR-Cas systems, discuss how this technology can be leveraged to provide a range of applications in both eukaryotic and prokaryotic systems, and finish by outlining limitations and future prospects.

  19. Identification and Characterization of MicroRNAs in Small Brown Planthopper (Laodephax striatellus) by Next-Generation Sequencing

    Science.gov (United States)

    Lou, Yonggen; Cheng, Jia'an; Zhang, Hengmu; Xu, Jian-Hong

    2014-01-01

    MicroRNAs (miRNAs) are endogenous non-coding small RNAs that regulate gene expression at the post-transcriptional level and are thought to play critical roles in many metabolic activities in eukaryotes. The small brown planthopper (Laodephax striatellus Fallén), one of the most destructive agricultural pests, causes great damage to crops including rice, wheat, and maize. However, information about the genome of L. striatellus is limited. In this study, a small RNA library was constructed from a mixed L. striatellus population and sequenced by Solexa sequencing technology. A total of 501 mature miRNAs were identified, including 227 conserved and 274 novel miRNAs belonging to 125 and 250 families, respectively. Sixty-nine conserved miRNAs that are included in 38 families are predicted to have an RNA secondary structure typically found in miRNAs. Many miRNAs were validated by stem-loop RT-PCR. Comparison with the miRNAs in 84 animal species from miRBase showed that the conserved miRNA families we identified are highly conserved in the Arthropoda phylum. Furthermore, miRanda predicted 2701 target genes for 378 miRNAs, which could be categorized into 52 functional groups annotated by gene ontology. The function of miRNA target genes was found to be very similar between conserved and novel miRNAs. This study of miRNAs in L. striatellus will provide new information and enhance the understanding of the role of miRNAs in the regulation of L. striatellus metabolism and development. PMID:25057821

  20. Identification and characterization of microRNAs in small brown planthopper (Laodephax striatellus) by next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Guoyan Zhou

    Full Text Available MicroRNAs (miRNAs) are endogenous non-coding small RNAs that regulate gene expression at the post-transcriptional level and are thought to play critical roles in many metabolic activities in eukaryotes. The small brown planthopper (Laodephax striatellus Fallén), one of the most destructive agricultural pests, causes great damage to crops including rice, wheat, and maize. However, information about the genome of L. striatellus is limited. In this study, a small RNA library was constructed from a mixed L. striatellus population and sequenced by Solexa sequencing technology. A total of 501 mature miRNAs were identified, including 227 conserved and 274 novel miRNAs belonging to 125 and 250 families, respectively. Sixty-nine conserved miRNAs that are included in 38 families are predicted to have an RNA secondary structure typically found in miRNAs. Many miRNAs were validated by stem-loop RT-PCR. Comparison with the miRNAs in 84 animal species from miRBase showed that the conserved miRNA families we identified are highly conserved in the Arthropoda phylum. Furthermore, miRanda predicted 2701 target genes for 378 miRNAs, which could be categorized into 52 functional groups annotated by gene ontology. The function of miRNA target genes was found to be very similar between conserved and novel miRNAs. This study of miRNAs in L. striatellus will provide new information and enhance the understanding of the role of miRNAs in the regulation of L. striatellus metabolism and development.

  1. The Application of Next Generation Sequencing Technology on Noninvasive Prenatal Test

    DEFF Research Database (Denmark)

    Jiang, Hui

    There are nearly 7000 rare diseases that have been reported in the world. Although most of them occur with a frequency of less than one in 2000, in total about 6% of the population suffers from rare diseases. These rare diseases are often caused by changes in genes, which is currently lack of eff...... diseases and monogenic diseases in a noninvasive manner. The new approach has great potential to be widely used worldwide as sequencing costs decrease, and could therefore play a major role in preventing rare diseases....

  2. Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology

    Science.gov (United States)

    Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming

    2018-01-01

    Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were attained from three PacBio sequencing runs and reads >500 bp with a quality value of >0.75 were merged together into a single dataset. This resultant highly-contiguous de novo assembly has no genome gaps, and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNA, transposon and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNA and transposon, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.
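
    The read-selection step described above (keep reads longer than 500 bp with a quality value above 0.75, then merge the runs into one dataset) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Read` class, the function names, and the toy reads are all hypothetical, and the quality value is assumed to be a single per-read number on a 0 to 1 scale.

```python
from dataclasses import dataclass


@dataclass
class Read:
    # Hypothetical record type; real pipelines read these fields from sequencing output.
    name: str
    sequence: str
    quality: float  # assumed per-read quality value on a 0-1 scale


def filter_reads(reads, min_length=500, min_quality=0.75):
    """Keep reads strictly longer than min_length with quality strictly above min_quality."""
    return [r for r in reads if len(r.sequence) > min_length and r.quality > min_quality]


def merge_runs(*runs, min_length=500, min_quality=0.75):
    """Filter each run separately, then concatenate the survivors into one dataset."""
    merged = []
    for run in runs:
        merged.extend(filter_reads(run, min_length, min_quality))
    return merged


# Toy data standing in for three sequencing runs (invented for illustration).
run1 = [Read("r1", "A" * 800, 0.85), Read("r2", "C" * 300, 0.90)]
run2 = [Read("r3", "G" * 1200, 0.70)]
run3 = [Read("r4", "T" * 501, 0.76)]

dataset = merge_runs(run1, run2, run3)
print([r.name for r in dataset])  # r2 is too short, r3 is below the quality threshold
```

    Both thresholds are applied as strict inequalities to mirror the ">" signs in the abstract; whether the original analysis treated the boundary values this way is not stated.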

  3. Introduction of the hybcell-based compact sequencing technology and comparison to state-of-the-art methodologies for KRAS mutation detection.

    Science.gov (United States)

    Zopf, Agnes; Raim, Roman; Danzer, Martin; Niklas, Norbert; Spilka, Rita; Pröll, Johannes; Gabriel, Christian; Nechansky, Andreas; Roucka, Markus

    2015-03-01

    The detection of KRAS mutations in codons 12 and 13 is critical for anti-EGFR therapy strategies; however, only those methodologies with high sensitivity, specificity, and accuracy as well as the best cost and turnaround balance are suitable for routine daily testing. Here we compared the performance of compact sequencing using the novel hybcell technology with 454 next-generation sequencing (454-NGS), Sanger sequencing, and pyrosequencing, using an evaluation panel of 35 specimens. A total of 32 mutations and 10 wild-type cases were reported using 454-NGS as the reference method. Specificity ranged from 100% for Sanger sequencing to 80% for pyrosequencing. Sanger sequencing and hybcell-based compact sequencing achieved a sensitivity of 96%, whereas pyrosequencing had a sensitivity of 88%. Accuracy was 97% for Sanger sequencing, 85% for pyrosequencing, and 94% for hybcell-based compact sequencing. Quantitative results were obtained for 454-NGS and hybcell-based compact sequencing data, resulting in a significant correlation (r = 0.914). Whereas pyrosequencing and Sanger sequencing were not able to detect multiple mutated cell clones within one tumor specimen, 454-NGS and the hybcell-based compact sequencing detected multiple mutations in two specimens. Our comparison shows that the hybcell-based compact sequencing is a valuable alternative to state-of-the-art methodologies used for detection of clinically relevant point mutations.
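
    The sensitivity, specificity, and accuracy figures compared above are all derived from the same four counts measured against a reference method. A minimal sketch of that arithmetic follows; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying per-method counts.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic metrics from counts against a reference method.

    tp: mutations the test detected        fn: mutations the test missed
    tn: wild-type cases called wild-type   fp: wild-type cases called mutated
    """
    return {
        "sensitivity": tp / (tp + fn),                # share of true mutations detected
        "specificity": tn / (tn + fp),                # share of wild-type cases called correctly
        "accuracy": (tp + tn) / (tp + fn + tn + fp),  # share of all calls that were correct
    }


# Illustrative counts only; they are not taken from the study.
metrics = diagnostic_metrics(tp=45, fn=5, tn=18, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

    Because accuracy pools mutated and wild-type cases, it depends on the mix of the evaluation panel, whereas sensitivity and specificity do not.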

  4. Understanding invasion history and predicting invasive niches using genetic sequencing technology in Australia: case studies from Cucurbitaceae and Boraginaceae.

    Science.gov (United States)

    Shaik, Razia S; Zhu, Xiaocheng; Clements, David R; Weston, Leslie A

    2016-01-01

    Part of the challenge in dealing with invasive plant species is that they seldom represent a uniform, static entity. Often, an accurate understanding of the history of plant introduction and knowledge of the real levels of genetic diversity present in species and populations of importance is lacking. Currently, the role of genetic diversity in promoting the successful establishment of invasive plants is not well defined. Genetic profiling of invasive plants should enhance our understanding of the dynamics of colonization in the invaded range. Recent advances in DNA sequencing technology have greatly facilitated the rapid and complete assessment of plant population genetics. Here, we apply our current understanding of the genetics and ecophysiology of plant invasions to recent work on Australian plant invaders from the Cucurbitaceae and Boraginaceae. The Cucurbitaceae study showed that both prickly paddy melon (Cucumis myriocarpus) and camel melon (Citrullus lanatus) were represented by only a single genotype in Australia, implying that each was probably introduced as a single introduction event. In contrast, a third invasive melon, Citrullus colocynthis, possessed a moderate level of genetic diversity in Australia and was potentially introduced to the continent at least twice. The Boraginaceae study demonstrated the value of comparing two similar congeneric species; one, Echium plantagineum, is highly invasive and genetically diverse, whereas the other, Echium vulgare, exhibits less genetic diversity and occupies a more limited ecological niche. Sequence analysis provided precise identification of invasive plant species, as well as information on genetic diversity and phylogeographic history. Improved sequencing technologies will continue to allow greater resolution of genetic relationships among invasive plant populations, thereby potentially improving our ability to predict the impact of these relationships upon future spread and better manage invaders.

  5. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul

    2010-09-16

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina/Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. © 2010 Hunt et al; licensee BioMed Central Ltd.

  6. Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune-relevant genes in marine fish

    Directory of Open Access Journals (Sweden)

    Xiang Li-xin

    2010-08-01

    Full Text Available Background: Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results: RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion: This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host

  7. Next generation DNA sequencing technology delivers valuable genetic markers for the genomic orphan legume species, Bituminaria bituminosa

    Directory of Open Access Journals (Sweden)

    Pazos-Navarro María

    2011-12-01

    Full Text Available Background: Bituminaria bituminosa is a perennial legume species from the Canary Islands and Mediterranean region that has potential as a drought-tolerant pasture species and as a source of pharmaceutical compounds. Three botanical varieties have previously been identified in this species: albomarginata, bituminosa and crassiuscula. B. bituminosa can be considered a genomic 'orphan' species with very few genomic resources available. New DNA sequencing technologies provide an opportunity to develop high quality molecular markers for such orphan species. Results: 432,306 mRNA molecules were sampled from a leaf transcriptome of a single B. bituminosa plant using Roche 454 pyrosequencing, resulting in an average read length of 345 bp (149.1 Mbp in total). Sequences were assembled into 3,838 isotigs/contigs representing putatively unique gene transcripts. Gene ontology descriptors were identified for 3,419 sequences. Raw sequence reads containing simple sequence repeat (SSR) motifs were identified, and 240 primer pairs flanking these motifs were designed. Of 87 primer pairs developed this way, 75 (86.2%) successfully amplified primarily single fragments by PCR. Fragment analysis using 20 primer pairs in 79 accessions of B. bituminosa detected 130 alleles at 21 SSR loci. Genetic diversity analyses confirmed that variation at these SSR loci accurately reflected known taxonomic relationships in original collections of B. bituminosa and provided additional evidence that a division of the botanical variety bituminosa into two according to geographical origin (Mediterranean region and Canary Islands) may be appropriate. Evidence of cross-pollination was also found between botanical varieties within a B. bituminosa breeding programme. Conclusions: B. bituminosa can no longer be considered a genomic orphan species, having now a large (albeit incomplete) repertoire of expressed gene sequences that can serve as a resource for future genetic studies. This

  8. High-precision, whole-genome sequencing of laboratory strains facilitates genetic studies.

    Directory of Open Access Journals (Sweden)

    Anjana Srivatsan

    2008-08-01

    Full Text Available Whole-genome sequencing is a powerful technique for obtaining the reference sequence information of multiple organisms. Its use can be dramatically expanded to rapidly identify genomic variations, which can be linked with phenotypes to obtain biological insights. We explored these potential applications using the emerging next-generation sequencing platform Solexa Genome Analyzer, and the well-characterized model bacterium Bacillus subtilis. Combining sequencing with experimental verification, we first improved the accuracy of the published sequence of the B. subtilis reference strain 168, then obtained sequences of multiple related laboratory strains and different isolates of each strain. This provides a framework for comparing the divergence between different laboratory strains and between their individual isolates. We also demonstrated the power of Solexa sequencing by using its results to predict a defect in the citrate signal transduction pathway of a common laboratory strain, which we verified experimentally. Finally, we examined the molecular nature of spontaneously generated mutations that suppress the growth defect caused by deletion of the stringent response mediator relA. Using whole-genome sequencing, we rapidly mapped these suppressor mutations to two small homologs of relA. Interestingly, stable suppressor strains had mutations in both genes, with each mutation alone partially relieving the relA growth defect. This supports an intriguing three-locus interaction module that is not easily identifiable through traditional suppressor mapping. We conclude that whole-genome sequencing can drastically accelerate the identification of suppressor mutations and complex genetic interactions, and it can be applied as a standard tool to investigate the genetic traits of model organisms.

  9. Characterization of microflora in Latin-style cheeses by next-generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Lusk Tina S

    2012-11-01

    Full Text Available Background: Cheese contamination can occur at numerous stages in the manufacturing process including the use of improperly pasteurized or raw milk. Of concern is the potential contamination by Listeria monocytogenes and other pathogenic bacteria that find the high moisture levels and moderate pH of popular Latin-style cheeses like queso fresco a hospitable environment. In the investigation of a foodborne outbreak, samples typically undergo enrichment in broth for 24 hours followed by selective agar plating to isolate bacterial colonies for confirmatory testing. The broth enrichment step may also enable background microflora to proliferate, which can confound subsequent analysis if not inhibited by effective broth or agar additives. We used 16S rRNA gene sequencing to provide a preliminary survey of bacterial species associated with three brands of Latin-style cheeses after 24-hour broth enrichment. Results: Brand A showed a greater diversity than the other two cheese brands (Brands B and C) at nearly every taxonomic level except phylum. Brand B showed the least diversity and was dominated by a single bacterial taxon, Exiguobacterium, not previously reported in cheese. This genus was also found in Brand C, although Lactococcus was prominent, an expected finding since this genus belongs to the group of lactic acid bacteria (LAB) commonly found in fermented foods. Conclusions: The contrasting diversity observed in Latin-style cheese was surprising, demonstrating that despite similarity of cheese type, raw materials and cheese making conditions appear to play a critical role in the microflora composition of the final product. The high bacterial diversity associated with Brand A suggests it may have been prepared with raw materials of high bacterial diversity or influenced by the ecology of the processing environment. Additionally, the presence of Exiguobacterium in high proportions (96%) in Brand B and, to a lesser extent, Brand C (46%), may

  10. A New Targeted CFTR Mutation Panel Based on Next-Generation Sequencing Technology.

    Science.gov (United States)

    Lucarelli, Marco; Porcaro, Luigi; Biffignandi, Alice; Costantino, Lucy; Giannone, Valentina; Alberti, Luisella; Bruno, Sabina Maria; Corbetta, Carlo; Torresani, Erminio; Colombo, Carla; Seia, Manuela

    2017-09-01

    Searching for mutations in the cystic fibrosis transmembrane conductance regulator gene (CFTR) is a key step in the diagnosis of and neonatal and carrier screening for cystic fibrosis (CF), and it has implications for prognosis and personalized therapy. The large number of mutations and genetic and phenotypic variability make this search a complex task. Herein, we developed, validated, and tested a laboratory assay for an extended search for mutations in CFTR using a next-generation sequencing-based method, with a panel of 188 CFTR mutations customized for the Italian population. Overall, 1426 dried blood spots from neonatal screening, 402 genomic DNA samples from various origins, and 1138 genomic DNA samples from patients with CF were analyzed. The assay showed excellent analytical and diagnostic operative characteristics. We identified and experimentally validated 159 (of 188) CFTR mutations. The assay achieved detection rates of 95.0% and 95.6% in two large-scale case series of CF patients from central and northern Italy, respectively. These detection rates are among the highest reported so far with a genetic test for CF based on a mutation panel. This assay appears to be well suited for diagnostics, neonatal and carrier screening, and assisted reproduction, and it represents a considerable advantage in CF genetic counseling. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  11. Sequence assembly

    DEFF Research Database (Denmark)

    Scheibye-Alsing, Karsten; Hoffmann, S.; Frankel, Annett Maria

    2009-01-01

    Despite the rapidly increasing number of sequenced and re-sequenced genomes, many issues regarding the computational assembly of large-scale sequencing data remain unresolved. Computational assembly is crucial in large genome projects as well as for the evolving high-throughput technologies and...... in genomic DNA, highly expressed genes and alternative transcripts in EST sequences. We summarize existing comparisons of different assemblers and provide detailed descriptions and directions for download of assembly programs at: http://genome.ku.dk/resources/assembly/methods.html....

  12. Investigating the mechanisms of glyphosate resistance in goosegrass (Eleusine indica (L.) Gaertn.) by RNA sequencing technology.

    Science.gov (United States)

    Chen, Jingchao; Huang, Hongjuan; Wei, Shouhui; Huang, Zhaofeng; Wang, Xu; Zhang, Chaoxian

    2017-01-01

    Glyphosate is an important non-selective herbicide that is in common use worldwide. However, evolved glyphosate-resistant (GR) weeds significantly affect crop yields. Unfortunately, the mechanisms underlying resistance in GR weeds, such as goosegrass (Eleusine indica (L.) Gaertn.), an annual weed found worldwide, have not been fully elucidated. In this study, transcriptome analysis was conducted to further assess the potential mechanisms of glyphosate resistance in goosegrass. The RNA sequencing libraries generated 24 597 462 clean reads. De novo assembly analysis produced 48 852 UniGenes with an average length of 847 bp. All UniGenes were annotated using seven databases. Sixteen candidate differentially expressed genes selected by digital gene expression analysis were validated by quantitative real-time PCR (qRT-PCR). Among these UniGenes, the EPSPS and PFK genes were constitutively up-regulated in resistant (R) individuals and showed a higher copy number than that in susceptible (S) individuals. The expressions of four UniGenes relevant to photosynthesis were inhibited by glyphosate in S individuals, and this toxic response was confirmed by gas exchange analysis. Two UniGenes annotated as glutathione transferase (GST) were constitutively up-regulated in R individuals, and were induced by glyphosate both in R and S. In addition, the GST activities in R individuals were higher than in S. Our research confirmed that two UniGenes (PFK, EPSPS) were strongly associated with target resistance, and two GST-annotated UniGenes may play a role in metabolic glyphosate resistance in goosegrass. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.

  13. A high-throughput method to detect RNA profiling by integration of RT-MLPA with next generation sequencing technology.

    Science.gov (United States)

    Wang, Jing; Yang, Xue; Chen, Haofeng; Wang, Xuewei; Wang, Xiangyu; Fang, Yi; Jia, Zhenyu; Gao, Jidong

    2017-07-11

    RNA in formalin-fixed and paraffin-embedded (FFPE) tissues provides a large amount of information indicating disease stages, histological tumor types and grades, as well as clinical outcomes. However, detection of RNA expression levels in FFPE samples is extremely difficult due to poor RNA quality. Here we developed a high-throughput method, Reverse Transcription-Multiple Ligation-dependent Probe Sequencing (RT-MLPSeq), to determine expression levels of multiple transcripts in FFPE samples. By combining the Reverse Transcription-Multiple Ligation-dependent Amplification method with next generation sequencing technology, RT-MLPSeq overcomes the limit on probe length in the multiplex ligation-dependent probe amplification assay and thus can detect expression levels of transcripts without quantitative limitations. We showed that different RT-MLPSeq probes targeting the same transcripts give highly consistent results and that the starting RNA/cDNA input can be as little as 1 ng. The relative RNA levels that RT-MLPSeq reported for 13 selected genes were also consistent with those obtained by reverse transcription quantitative PCR. Finally, we demonstrated the application of the new RT-MLPSeq method by measuring the mRNA expression levels of 21 genes which can be used for accurate calculation of the breast cancer recurrence score - an index that has been widely used for managing breast cancer patients.

  14. [Research on soil bacteria under the impact of sealed CO2 leakage by high-throughput sequencing technology].

    Science.gov (United States)

    Tian, Di; Ma, Xin; Li, Yu-E; Zha, Liang-Song; Wu, Yang; Zou, Xiao-Xia; Liu, Shuang

    2013-10-01

    Carbon dioxide capture and storage, with its unique advantages, has provided a new option for mitigating global anthropogenic CO2 emissions. However, there is a risk that the sealed CO2 will leak, posing a serious threat to the ecosystem. Soil microorganisms are widely known to be closely related to soil health, yet the impact of sequestered CO2 leakage on soil microorganisms has been little studied. In this study, leakage scenarios of sealed CO2 were constructed and the 16S rRNA genes of soil bacteria were sequenced by Illumina high-throughput sequencing technology on the MiSeq platform, and related biological analysis was conducted to explore the changes in soil bacterial abundance, diversity and structure. A total of 486,645 reads representing 43,017 OTUs were obtained from 15 soil samples. The analysis showed differences in the abundance, diversity and community structure of the soil bacterial community under different CO2 leakage scenarios: the abundance and diversity of the bacterial community declined as the quantity and duration of CO2 leakage increased, and some bacterial species became dominant in the community. The increase of Acidobacteria species could therefore serve as a biological indicator of the impact of sealed CO2 leakage on the soil ecosystem.
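
    Diversity comparisons of this kind are usually summarized with an index computed from per-sample OTU counts. The abstract does not say which index was used, so the sketch below assumes the Shannon index as one common choice; the function name and the two count vectors are hypothetical and are not data from the study.

```python
import math


def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total <= 0:
        raise ValueError("at least one read is required")
    h = 0.0
    for count in otu_counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# Invented read counts per OTU for two samples of equal sequencing depth.
even_community = [25, 25, 25, 25]    # reads spread evenly over four OTUs
dominated_community = [97, 1, 1, 1]  # one OTU accounts for almost all reads

print(f"even:      H = {shannon_index(even_community):.3f}")
print(f"dominated: H = {shannon_index(dominated_community):.3f}")
```

    A community in which a few taxa become dominant scores lower than an even one with the same number of OTUs, which is the pattern the abstract describes under heavier leakage.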

  15. Shedding light on the Early Pleistocene of TD6 (Gran Dolina, Atapuerca, Spain): The technological sequence and occupational inferences.

    Science.gov (United States)

    Mosquera, Marina; Ollé, Andreu; Rodríguez-Álvarez, Xose Pedro; Carbonell, Eudald

    2018-01-01

    This paper aims to update the information available on the lithic assemblage from the entire sequence of TD6 now that the most recent excavations have been completed, and to explore possible changes in both occupational patterns and technological strategies evidenced in the unit. This is the first study to analyse the entire TD6 sequence, including subunits TD6.3 and TD6.1, which have never been studied, along with the better-known TD6.2 Homo antecessor-bearing subunit. We also present an analysis of several lithic refits found in TD6, as well as certain technical features that may help characterise the hominin occupations. The archaeo-palaeontological record from TD6 consists of 9,452 faunal remains, 443 coprolites, 1,046 lithic pieces, 170 hominin remains and 91 Celtis seeds. The characteristics of this record seem to indicate two main stages of occupation. In the oldest subunit, TD6.3, the lithic assemblage points to the light and limited hominin occupation of the cave, which does, however, grow over the course of the level. In contrast, the lithic assemblages from TD6.2 and TD6.1 are rich and varied, which may reflect Gran Dolina cave's establishment as a landmark in the region. Despite the occupational differences between the lowermost subunit and the rest of the deposit, technologically the TD6 lithic assemblage is extremely homogeneous throughout. In addition, the composition and spatial distribution of the 12 groups of lithic refits found in unit TD6, as well as the in situ nature of the assemblage demonstrate the high degree of preservation at the site. This may help clarify the nature of the Early Pleistocene hominin occupations of TD6, and raise reasonable doubt about the latest interpretations that support the ex situ character of the assemblage as a whole.

  16. Discrimination of the Lactobacillus acidophilus group using sequencing, species-specific PCR and SNaPshot mini-sequencing technology based on the recA gene.

    Science.gov (United States)

    Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Mu-Chiou; Wang, Li-Tin; Huang, Lina; Lee, Fwu-Ling

    2012-10-01

    To clearly identify specific species and subspecies of the Lactobacillus acidophilus group using phenotypic and genotypic (16S rDNA sequence analysis) techniques alone is difficult. The aim of this study was to use the recA gene for species discrimination in the L. acidophilus group, as well as to develop a species-specific primer and single nucleotide polymorphism primer based on the recA gene sequence for species and subspecies identification. The average sequence similarity for the recA gene among type strains was 80.0%, and most members of the L. acidophilus group could be clearly distinguished. The species-specific primer was designed according to the recA gene sequencing, which was employed for polymerase chain reaction with the template DNA of Lactobacillus strains. A single 231-bp species-specific band was found only in L. delbrueckii. A SNaPshot mini-sequencing assay using recA as a target gene was also developed. The specificity of the mini-sequencing assay was evaluated using 31 strains of L. delbrueckii species and was able to unambiguously discriminate strains belonging to the subspecies L. delbrueckii subsp. bulgaricus. The phylogenetic relationships of most strains in the L. acidophilus group can be resolved using recA gene sequencing, and a novel method to identify the species and subspecies of the L. delbrueckii and L. delbrueckii subsp. bulgaricus was developed by species-specific polymerase chain reaction combined with SNaPshot mini-sequencing. Copyright © 2012 Society of Chemical Industry.

  17. Integrated mRNA and microRNA transcriptome sequencing characterizes sequence variants and mRNA–microRNA regulatory network in nasopharyngeal carcinoma model systems

    Directory of Open Access Journals (Sweden)

    Carol Ying-Ying Szeto

    2014-01-01

    Full Text Available Nasopharyngeal carcinoma (NPC) is a prevalent malignancy in Southeast Asia among the Chinese population. Aberrant regulation of transcripts has been implicated in many types of cancers including NPC. Herein, we characterized mRNA and miRNA transcriptomes by RNA sequencing (RNASeq) of NPC model systems. Matched total mRNA and small RNA of undifferentiated Epstein–Barr virus (EBV)-positive NPC xenograft X666 and its derived cell line C666, well-differentiated NPC cell line HK1, and the immortalized nasopharyngeal epithelial cell line NP460 were sequenced by Solexa technology. We found 2812 genes and 149 miRNAs (human and EBV) to be differentially expressed in NP460, HK1, C666 and X666 with RNASeq; 533 miRNA–mRNA target pairs were inversely regulated in the three NPC cell lines compared to NP460. Integrated mRNA/miRNA expression profiling and pathway analysis show extracellular matrix organization, Beta-1 integrin cell surface interactions, and the PI3K/AKT, EGFR, ErbB, and Wnt pathways were potentially deregulated in NPC. Real-time quantitative PCR was performed on selected mRNA/miRNAs in order to validate their expression. Transcript sequence variants such as short insertions and deletions (INDEL), single nucleotide variants (SNV), and isomiRs were characterized in the NPC model systems. A novel TP53 transcript variant was identified in NP460, HK1, and C666. Three previously reported novel EBV-encoded BART miRNAs and their isomiRs were also detected. Meta-analysis of a model system to a clinical system aids the choice of different cell lines in NPC studies. This comprehensive characterization of mRNA and miRNA transcriptomes in NPC cell lines and the xenograft provides insights on miRNA regulation of mRNA and valuable resources on transcript variation and regulation in NPC, which are potentially useful for mechanistic and preclinical studies.

  18. Use of Metagenomic Shotgun Sequencing Technology To Detect Foodborne Pathogens within the Microbiome of the Beef Production Chain.

    Science.gov (United States)

    Yang, Xiang; Noyes, Noelle R; Doster, Enrique; Martin, Jennifer N; Linke, Lyndsey M; Magnuson, Roberta J; Yang, Hua; Geornaras, Ifigenia; Woerner, Dale R; Jones, Kenneth L; Ruiz, Jaime; Boucher, Christina; Morley, Paul S; Belk, Keith E

    2016-04-01

    Foodborne illnesses associated with pathogenic bacteria are a global public health and economic challenge. The diversity of microorganisms (pathogenic and nonpathogenic) that exists within the food and meat industries complicates efforts to understand pathogen ecology. Further, little is known about the interaction of pathogens within the microbiome throughout the meat production chain. Here, a metagenomic approach and shotgun sequencing technology were used as tools to detect pathogenic bacteria in environmental samples collected from the same groups of cattle at different longitudinal processing steps of the beef production chain: cattle entry to feedlot, exit from feedlot, cattle transport trucks, abattoir holding pens, and the end of the fabrication system. The log read counts classified as pathogens per million reads for Salmonella enterica, Listeria monocytogenes, Escherichia coli, Staphylococcus aureus, Clostridium spp. (C. botulinum and C. perfringens), and Campylobacter spp. (C. jejuni, C. coli, and C. fetus) decreased over subsequential processing steps. Furthermore, the normalized read counts for S. enterica, E. coli, and C. botulinum were greater in the final product than at the feedlots, indicating that the proportion of these bacteria increased (the effect on absolute numbers was unknown) within the remaining microbiome. From an ecological perspective, data indicated that shotgun metagenomics can be used to evaluate not only the microbiome but also shifts in pathogen populations during beef production. Nonetheless, there were several challenges in this analysis approach, one of the main ones being the identification of the specific pathogen from which the sequence reads originated, which makes this approach impractical for use in pathogen identification for regulatory and confirmation purposes. Copyright © 2016 Yang et al.
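
    The "log read counts classified as pathogens per million reads" mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state the logarithm base or how zero counts were handled, so base 10 and the rejection of zero counts are assumptions here, and the function name and sample counts are hypothetical rather than taken from the study.

```python
import math


def log_reads_per_million(pathogen_reads: int, total_reads: int) -> float:
    """Return log10 of pathogen-classified reads per million total reads.

    Base 10 is an assumption; the abstract does not specify the base.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if pathogen_reads <= 0:
        # A logarithm is undefined for zero; a real pipeline might add a
        # pseudocount instead, but that choice is not given in the abstract.
        raise ValueError("pathogen_reads must be positive to take a logarithm")
    reads_per_million = pathogen_reads * 1_000_000 / total_reads
    return math.log10(reads_per_million)


# Hypothetical counts, for illustration only (not data from the study).
samples = {
    "feedlot_entry": (5_000, 10_000_000),
    "final_product": (50, 1_000_000),
}
for name, (hits, total) in samples.items():
    print(name, round(log_reads_per_million(hits, total), 2))
```

    Normalizing to reads per million makes samples of different sequencing depth comparable, which is why the abstract can report a change in proportion while leaving the effect on absolute numbers open.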

  19. In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip® technology

    Directory of Open Access Journals (Sweden)

    Ye Shui Q

    2005-05-01

    Full Text Available Abstract Background: Genomic approaches in large animal models (canine, ovine, etc.) are challenging due to insufficient genomic information for these species and the lack of availability of corresponding microarray platforms. To address this problem, we speculated that conserved interspecies genetic sequences can be experimentally detected by cross-species hybridization. The Affymetrix platform probe redundancy offers flexibility in selecting individual probes with high sequence similarities between related species for gene expression analysis. Results: Gene expression profiles of 40 canine samples were generated using the human HG-U133A GeneChip (U133A). Due to interspecies genetic differences, only 14 ± 2% of canine transcripts were detected by U133A probe sets whereas profiling of 40 human samples detected 49 ± 6% of human transcripts. However, when these probe sets were deconstructed into individual probes and the performance of each probe was examined, we found that 47% of human probes were able to find their targets in canine tissues and generate a detectable hybridization signal. Therefore, we restricted gene expression analysis to these probes and observed a 60% increase in the number of identified canine transcripts. These results were validated by comparison of transcripts identified by our restricted analysis of cross-species hybridization with transcripts identified by hybridization of total lung canine mRNA to the new Affymetrix Canine GeneChip®. Conclusion: The experimental identification and restriction of gene expression analysis to probes with detectable hybridization signal drastically increases transcript detection of canine-human hybridization, suggesting the possibility of broad utilization of cross-hybridizations of related species using GeneChip technology.

  20. Description and pilot results from a novel method for evaluating return of incidental findings from next-generation sequencing technologies.

    Science.gov (United States)

    Goddard, Katrina A B; Whitlock, Evelyn P; Berg, Jonathan S; Williams, Marc S; Webber, Elizabeth M; Webster, Jennifer A; Lin, Jennifer S; Schrader, Kasmintan A; Campos-Outcalt, Doug; Offit, Kenneth; Feigelson, Heather Spencer; Hollombe, Celine

    2013-09-01

    The aim of this study was to develop, operationalize, and pilot test a transparent, reproducible, and evidence-informed method to determine when to report incidental findings from next-generation sequencing technologies. Using evidence-based principles, we proposed a three-stage process. Stage I "rules out" incidental findings below a minimal threshold of evidence and is evaluated using inter-rater agreement and comparison with an expert-based approach. Stage II documents criteria for clinical actionability using a standardized approach to allow experts to consistently consider and recommend whether results should be routinely reported (stage III). We used expert opinion to determine the face validity of stages II and III using three case studies. We evaluated the time and effort for stages I and II. For stage I, we assessed 99 conditions and found high inter-rater agreement (89%), and strong agreement with a separate expert-based method. Case studies for familial adenomatous polyposis, hereditary hemochromatosis, and α1-antitrypsin deficiency were all recommended for routine reporting as incidental findings. The method requires definition of clinically actionable incidental findings and provides documentation and pilot testing of a feasible method that is scalable to the whole genome.

  1. First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card

    OpenAIRE

    Goharriz, H.; Marston, D. A.; Sharifzoda, F.; Ellis, R. J.; Horton, D. L.; Khakimov, T.; Whatmore, A.; Khamroev, K.; Makhmadshoev, A. N.; Bazarov, M.; Fooks, A. R.; Banyard, A. C.

    2017-01-01

    ABSTRACT A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data.

  2. First Complete Genomic Sequence of a Rabies Virus from the Republic of Tajikistan Obtained Directly from a Flinders Technology Associates Card.

    Science.gov (United States)

    Goharriz, H; Marston, D A; Sharifzoda, F; Ellis, R J; Horton, D L; Khakimov, T; Whatmore, A; Khamroev, K; Makhmadshoev, A N; Bazarov, M; Fooks, A R; Banyard, A C

    2017-07-06

    A brain homogenate derived from a rabid dog in the district of Tojikobod, Republic of Tajikistan, was applied to a Flinders Technology Associates (FTA) card. A full-genome sequence of rabies virus (RABV) was generated from the FTA card directly without extraction, demonstrating the utility of these cards for readily obtaining genetic data. © Crown copyright 2017.

  3. Is Whole-Exome Sequencing an Ethically Disruptive Technology? Perspectives of Pediatric Oncologists and Parents of Pediatric Patients With Solid Tumors.

    Science.gov (United States)

    McCullough, Laurence B; Slashinski, Melody J; McGuire, Amy L; Street, Richard L; Eng, Christine M; Gibbs, Richard A; Parsons, D William; Plon, Sharon E

    2016-03-01

    It has been anticipated that physicians and parents will be ill prepared or unprepared for the clinical introduction of genome sequencing, making it ethically disruptive. As a part of the Baylor Advancing Sequencing in Childhood Cancer Care study, we conducted semistructured interviews with 16 pediatric oncologists and 40 parents of pediatric patients with cancer prior to the return of sequencing results. We elicited expectations and attitudes concerning the impact of sequencing on clinical decision making, clinical utility, and treatment expectations from both groups. Using accepted methods of qualitative research to analyze interview transcripts, we completed a thematic analysis to provide inductive insights into their views of sequencing. Our major findings reveal that neither pediatric oncologists nor parents anticipate sequencing to be an ethically disruptive technology, because they expect to be prepared to integrate sequencing results into their existing approaches to learning and using new clinical information for care. Pediatric oncologists do not expect sequencing results to be more complex than other diagnostic information and plan simply to incorporate these data into their evidence-based approach to clinical practice, although they were concerned about impact on parents. For parents, there is an urgency to protect their child's health and in this context they expect genomic information to better prepare them to participate in decisions about their child's care. Our data do not support the concern that introducing genome sequencing into childhood cancer care will be ethically disruptive, that is, leave physicians or parents ill prepared or unprepared to make responsible decisions about patient care. © 2015 Wiley Periodicals, Inc.

  4. MicroRNA discovery and analysis of pinewood nematode Bursaphelenchus xylophilus by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Qi-Xing Huang

    Full Text Available BACKGROUND: MicroRNAs (miRNAs) are considered to be very important in regulating the growth, development, behavior and stress response in animals and plants in post-transcriptional gene regulation. Pinewood nematode, Bursaphelenchus xylophilus, is an important invasive plant parasitic nematode in Asia. To have a comprehensive knowledge about miRNAs of the nematode is necessary for further in-depth study on roles of miRNAs in the ecological adaptation of the invasive species. METHODS AND FINDINGS: Five small RNA libraries were constructed and sequenced by Illumina/Solexa deep-sequencing technology. A total of 810 miRNA candidates (49 conserved and 761 novel) were predicted by a computational pipeline, of which 57 miRNAs (20 conserved and 37 novel) encoded by 53 miRNA precursors were identified by experimental methods. Ten novel miRNAs were considered to be species-specific miRNAs of B. xylophilus. Comparison of expression profiles of miRNAs in the five small RNA libraries showed that many miRNAs exhibited obviously different expression levels in the third-stage dispersal juvenile and at a cold-stressed status. Most of the miRNAs exhibited obviously down-regulated expression in the dispersal stage. But differences among the three geographic libraries were not prominent. A total of 979 genes were predicted to be targets of these authentic miRNAs. Among them, seven heat shock protein genes were targeted by 14 miRNAs, and six FMRFamide-like neuropeptides genes were targeted by 17 miRNAs. A real-time quantitative polymerase chain reaction was used to quantify the mRNA expression levels of target genes.
CONCLUSIONS: Based on the fact that a negative correlation existed between the expression profiles of miRNAs and the mRNA expression profiles of their target genes (hsp, flp) by comparing those of the nematodes at a cold stressed status and a normal status, we suggested that miRNAs might participate in ecological adaptation and behavior regulation of the

  5. The objective of this program is to develop innovative DNA detection technologies to achieve fast microbial community assessment. The specific approaches are (1) to develop inexpensive and reliable sequence-proof hybridization DNA detection technology (2) to develop quantitative DNA hybridization technology for microbial community assessment and (3) to study the microbes which have demonstrated the potential to have nuclear waste bioremediation

    International Nuclear Information System (INIS)

    Chen, Chung H.

    2004-01-01

    The objective of this program is to develop innovative DNA detection technologies to achieve fast microbial community assessment. The specific approaches are (1) to develop inexpensive and reliable sequence-proof hybridization DNA detection technology (2) to develop quantitative DNA hybridization technology for microbial community assessment and (3) to study the microbes which have demonstrated the potential to have nuclear waste bioremediation

  6. Deep sequencing-based transcriptome analysis of chicken spleen in response to avian pathogenic Escherichia coli (APEC) infection.

    Directory of Open Access Journals (Sweden)

    Qinghua Nie

    Full Text Available Avian pathogenic Escherichia coli (APEC) leads to economic losses in poultry production and is also a threat to human health. The goal of this study was to characterize the chicken spleen transcriptome and to identify candidate genes for response and resistance to APEC infection using Solexa sequencing. We obtained 14422935, 14104324, and 14954692 Solexa read pairs for non-challenged (NC), challenged-mild pathology (MD), and challenged-severe pathology (SV), respectively. A total of 148197 contigs and 98461 unigenes were assembled, of which 134949 contigs and 91890 unigenes match the chicken genome. In total, 12272 annotated unigenes take part in biological processes (11664), cellular components (11927), and molecular functions (11963). Summing three specific contrasts, 13650 significantly differentially expressed unigenes were found in NC vs. MD (6844), NC vs. SV (7764), and MD vs. SV (2320). Some unigenes (e.g. CD148, CD45 and LCK) were involved in crucial pathways, such as the T cell receptor (TCR) signaling pathway and microbial metabolism in diverse environments. This study facilitates understanding of the genetic architecture of the chicken spleen transcriptome, and has identified candidate genes for host response to APEC infection.

  7. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics.

    Science.gov (United States)

    Wang, Chen; Han, Jian; Liu, Chonghuai; Kibet, Korir Nicholas; Kayesh, Emrul; Shangguan, Lingfei; Li, Xiaoying; Fang, Jinggui

    2012-03-29

    MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism. 
Deep sequencing of short RNAs from Amur grape flowers and berries identified 72 new potential miRNAs and 34 known but non-conserved mi

  8. Identification of microRNAs from Amur grape (Vitis amurensis Rupr.) by deep sequencing and analysis of microRNA variations with bioinformatics

    Directory of Open Access Journals (Sweden)

    Wang Chen

    2012-03-01

    Full Text Available Abstract Background: MicroRNA (miRNA) is a class of functional non-coding small RNA with 19-25 nucleotides in length while Amur grape (Vitis amurensis Rupr.) is an important wild fruit crop with the strongest cold resistance among the Vitis species, is used as an excellent breeding parent for grapevine, and has elicited growing interest in wine production. To date, there is a relatively large number of grapevine miRNAs (vv-miRNAs) from cultivated grapevine varieties such as Vitis vinifera L. and hybrids of V. vinifera and V. labrusca, but there is no report on miRNAs from Vitis amurensis Rupr, a wild grapevine species. Results: A small RNA library from Amur grape was constructed and Solexa technology used to perform deep sequencing of the library followed by subsequent bioinformatics analysis to identify new miRNAs. In total, 126 conserved miRNAs belonging to 27 miRNA families were identified, and 34 known but non-conserved miRNAs were also found. Significantly, 72 new potential Amur grape-specific miRNAs were discovered. The sequences of these new potential va-miRNAs were further validated through miR-RACE, and accumulation of 18 new va-miRNAs in seven tissues of grapevines confirmed by real time RT-PCR (qRT-PCR) analysis. The expression levels of va-miRNAs in flowers and berries were found to be basically consistent in identity to those from deep sequenced sRNAs libraries of combined corresponding tissues. We also describe the conservation and variation of va-miRNAs using miR-SNPs and miR-LDs during plant evolution based on comparison of orthologous sequences, and further reveal that the number and sites of miR-SNP in diverse miRNA families exhibit distinct divergence. Finally, 346 target genes for the new miRNAs were predicted and they include a number of Amur grape stress tolerance genes and many genes regulating anthocyanin synthesis and sugar metabolism.
Conclusions: Deep sequencing of short RNAs from Amur grape flowers and berries identified 72

  9. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry.

    Science.gov (United States)

    Babrak, Lmar; McGarvey, Jeffery A; Stanker, Larry H; Hnasko, Robert

    2017-10-01

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties, and proper sequence identification is essential for functional evaluation and performance of recombinant antibodies (rAb). This determination can be achieved by sequence analysis of immunoglobulin (Ig) transcripts obtained from a monoclonal antibody (MAb) producing hybridoma and subsequent expression of a rAb. However, the polyploid nature of a hybridoma cell often results in the added expression of aberrant immunoglobulin-like transcripts or even production of anomalous antibodies, which can confound production of rAb. An incorrect VR sequence will result in a non-functional rAb, and de novo assembly of Ig primary structure without a sequence map is challenging. To address these problems, we have developed a methodology which combines: 1) selective PCR amplification of VR from both the heavy and light chain IgG from hybridoma, 2) molecular cloning and DNA sequence analysis and 3) tandem mass spectrometry (MS/MS) on enzyme digests obtained from the purified IgG. Peptide analysis proceeds by evaluating coverage of the predicted primary protein sequence provided by the initial DNA maps for the VR. This methodology serves to both identify and verify the primary structure of the MAb VR for production as rAb. Published by Elsevier Ltd.

  10. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    Science.gov (United States)

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
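
    The idea of measuring the frequency of mutations at each target position and applying a detection threshold, as described in the abstract above, can be sketched as follows. This is not the VirMut algorithm, which the abstract does not detail. It assumes reads that are already aligned to the target and equal to it in length, and the function names and the 5% default threshold are hypothetical choices for illustration.

```python
from __future__ import annotations


def mutation_frequencies(reference: str, reads: list[str]) -> list[float]:
    """Fraction of reads that differ from the reference at each position.

    Assumes every read is pre-aligned and has the same length as the reference.
    """
    if not reads:
        raise ValueError("at least one read is required")
    mismatches = [0] * len(reference)
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must have the same length as the reference")
        for i, (ref_base, read_base) in enumerate(zip(reference, read)):
            if read_base != ref_base:
                mismatches[i] += 1
    return [count / len(reads) for count in mismatches]


def variable_positions(
    reference: str, reads: list[str], threshold: float = 0.05
) -> list[int]:
    """Return 0-based positions whose mismatch frequency exceeds the threshold.

    The threshold stands in for a sequencing-error cutoff; 0.05 is an
    arbitrary illustrative value, not one taken from the paper.
    """
    frequencies = mutation_frequencies(reference, reads)
    return [i for i, freq in enumerate(frequencies) if freq > threshold]


# Toy example with made-up reads.
target = "ACGT"
toy_reads = ["ACGT", "ACGA", "ACGA", "TCGT"]
print(mutation_frequencies(target, toy_reads))
print(variable_positions(target, toy_reads, threshold=0.3))
```

    In practice the threshold would be derived from the platform's error rate and read depth, which is the role of the statistical criteria the abstract mentions.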

  11. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Yuri Kravatsky

    2017-11-01

    Full Text Available The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.

  12. High throughput resistance profiling of Plasmodium falciparum infections based on custom dual indexing and Illumina next generation sequencing-technology

    DEFF Research Database (Denmark)

    Nag, Sidsel; Dalgaard, Marlene Danner; Kofoed, Poul-Erik

    2017-01-01

    Genetic polymorphisms in P. falciparum can be used to indicate the parasite's susceptibility to antimalarial drugs as well as its geographical origin. Both of these factors are key to monitoring development and spread of antimalarial drug resistance. In this study, we combine multiplex PCR, custom...... designed dual indexing and Miseq sequencing for high throughput SNP-profiling of 457 malaria infections from Guinea-Bissau, at the cost of 10 USD per sample. By amplifying and sequencing 15 genetic fragments, we cover 20 resistance-conferring SNPs occurring in pfcrt, pfmdr1, pfdhfr, pfdhps, as well...

  13. Is the extraction by Whatman FTA filter matrix technology and sequencing of large ribosomal subunit D1-D2 region sufficient for identification of clinical fungi?

    Science.gov (United States)

    Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Erturan, Zayre; Ener, Beyza; Akdagli, Sevtap Arikan; Muslumanoglu, Hamza; Cetinkaya, Zafer

    2015-10-01

    Although conventional identification of pathogenic fungi is based on a combination of tests evaluating their morphological and biochemical characteristics, these tests can fail to identify less common species or to differentiate closely related species. In addition, these tests are time consuming, labour-intensive and require experienced personnel. We evaluated the feasibility and sufficiency of DNA extraction by Whatman FTA filter matrix technology and DNA sequencing of the D1-D2 region of the large ribosomal subunit gene for identification of clinical isolates of 21 yeasts and 160 moulds in our clinical mycology laboratory. While the yeast isolates were identified at species level with 100% homology, 102 (63.75%) clinically important mould isolates were identified at species level, 56 (35%) isolates at genus level against fungal sequences existing in DNA databases and two (1.25%) isolates could not be identified. Consequently, Whatman FTA filter matrix technology was a useful method for extraction of fungal DNA; extremely rapid, practical and successful. Sequence analysis of the D1-D2 region of the large ribosomal subunit gene was found sufficient for identification to genus level for most clinical fungi. However, the identification to species level and especially discrimination of closely related species may require additional analysis. © 2015 Blackwell Verlag GmbH.
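
    The percentages reported for the mould isolates in the abstract above can be reproduced from the stated counts. The short sketch below is only an arithmetic check: the counts come from the abstract, while the dictionary keys and the helper name are hypothetical.

```python
# Counts reported in the abstract for the 160 mould isolates.
mould_counts = {
    "species level": 102,
    "genus level": 56,
    "not identified": 2,
}
total_moulds = sum(mould_counts.values())


def percentage(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


mould_percentages = {
    outcome: percentage(count, total_moulds)
    for outcome, count in mould_counts.items()
}
for outcome, value in mould_percentages.items():
    print(f"{outcome}: {value:.2f}%")
```

    The three categories sum to the 160 moulds stated in the abstract, so the reported 63.75%, 35% and 1.25% are mutually consistent.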

  14. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry

    Science.gov (United States)

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties and proper sequence identification is essential for functional evaluation and performance of recombinant antibo...

  15. Massively parallel sequencing, aCGH, and RNA-Seq technologies provide a comprehensive molecular diagnosis of Fanconi anemia.

    Science.gov (United States)

    Chandrasekharappa, Settara C; Lach, Francis P; Kimble, Danielle C; Kamat, Aparna; Teer, Jamie K; Donovan, Frank X; Flynn, Elizabeth; Sen, Shurjo K; Thongthip, Supawat; Sanborn, Erica; Smogorzewska, Agata; Auerbach, Arleen D; Ostrander, Elaine A

    2013-05-30

    Current methods for detecting mutations in Fanconi anemia (FA)-suspected patients are inefficient and often miss mutations. We have applied recent advances in DNA sequencing and genomic capture to the diagnosis of FA. Specifically, we used custom molecular inversion probes or TruSeq-enrichment oligos to capture and sequence FA and related genes, including introns, from 27 samples from the International Fanconi Anemia Registry at The Rockefeller University. DNA sequencing was complemented with custom array comparative genomic hybridization (aCGH) and RNA sequencing (RNA-seq) analysis. aCGH identified deletions/duplications in 4 different FA genes. RNA-seq analysis revealed lack of allele specific expression associated with a deletion and splicing defects caused by missense, synonymous, and deep-in-intron variants. The combination of TruSeq-targeted capture, aCGH, and RNA-seq enabled us to identify the complementation group and biallelic germline mutations in all 27 families: FANCA (7), FANCB (3), FANCC (3), FANCD1 (1), FANCD2 (3), FANCF (2), FANCG (2), FANCI (1), FANCJ (2), and FANCL (3). FANCC mutations are often the cause of FA in patients of Ashkenazi Jewish (AJ) ancestry, and we identified 2 novel FANCC mutations in 2 patients of AJ ancestry. We describe here a strategy for efficient molecular diagnosis of FA.

  16. Learning with Technology: Video Modeling with Concrete-Representational-Abstract Sequencing for Students with Autism Spectrum Disorder

    Science.gov (United States)

    Yakubova, Gulnoza; Hughes, Elizabeth M.; Shinaberry, Megan

    2016-01-01

    The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the…

  17. Model SNP development for complex genomes based on hexaploid oat using high-throughput 454 sequencing technology

    Directory of Open Access Journals (Sweden)

    Chao Shiaoman

    2011-01-01

    Full Text Available Abstract Background: Genetic markers are pivotal to modern genomics research; however, discovery and genotyping of molecular markers in oat has been hindered by the size and complexity of the genome, and by a scarcity of sequence data. The purpose of this study was to generate oat expressed sequence tag (EST) information, develop a bioinformatics pipeline for SNP discovery, and establish a method for rapid, cost-effective, and straightforward genotyping of SNP markers in complex polyploid genomes such as oat. Results: Based on cDNA libraries of four cultivated oat genotypes, approximately 127,000 contigs were assembled from approximately one million Roche 454 sequence reads. Contigs were filtered through a novel bioinformatics pipeline to eliminate ambiguous polymorphism caused by subgenome homology, and 96 in silico SNPs were selected from 9,448 candidate loci for validation using high-resolution melting (HRM) analysis. Of these, 52 (54%) were polymorphic between parents of the Ogle1040 × TAM O-301 (OT) mapping population, with 48 segregating as single Mendelian loci, and 44 being placed on the existing OT linkage map. Ogle and TAM amplicons from 12 primers were sequenced for SNP validation, revealing complex polymorphism in seven amplicons but general sequence conservation within SNP loci. Whole-amplicon interrogation with HRM revealed insertions, deletions, and heterozygotes in secondary oat germplasm pools, generating multiple alleles at some primer targets. To validate marker utility, 36 SNP assays were used to evaluate the genetic diversity of 34 diverse oat genotypes. Dendrogram clusters corresponded generally to known genome composition and genetic ancestry. Conclusions: The high-throughput SNP discovery pipeline presented here is a rapid and effective method for identification of polymorphic SNP alleles in the oat genome. The current-generation HRM system is a simple and highly-informative platform for SNP genotyping.
These techniques provide

  18. GenHtr: a tool for comparative assessment of genetic heterogeneity in microbial genomes generated by massive short-read sequencing

    Directory of Open Access Journals (Sweden)

    Yu GongXin

    2010-10-01

    Full Text Available Abstract Background Microevolution is the study of short-term changes of alleles within a population and their effects on the phenotype of organisms. The result of the below-species-level evolution is heterogeneity, where populations consist of subpopulations with a large number of structural variations. Heterogeneity analysis is thus essential to our understanding of how selective and neutral forces shape bacterial populations over a short period of time. The Solexa Genome Analyzer, a next-generation sequencing platform, allows millions of short sequencing reads to be obtained with great accuracy, making it possible to study the dynamics of the bacterial population at the whole genome level. The tool referred to as GenHtr was developed for genome-wide heterogeneity analysis. Results For particular bacterial strains, GenHtr relies on a set of Solexa short reads on given bacterial pathogens and their isogenic reference genome to identify heterogeneity sites, the chromosomal positions with multiple variants of genes in the bacterial population, and variations that occur in large gene families. GenHtr accomplishes this by building and comparatively analyzing genome-wide heterogeneity genotypes for both the newly sequenced genomes (using massive short-read sequencing) and their isogenic reference (using simulated data). As proof of the concept, this approach was applied to SRX007711, the Solexa sequencing data for a newly sequenced Staphylococcus aureus subsp. USA300 cell line, and demonstrated that it could predict such multiple variants. They include multiple variants of genes critical in pathogenesis, e.g. genes encoding a LysR family transcriptional regulator, 23S ribosomal RNA, and DNA mismatch repair protein MutS. The heterogeneity results in non-synonymous and nonsense mutations, leading to truncated proteins for both LysR and MutS. Conclusion GenHtr was developed for genome-wide heterogeneity analysis. Although it is much more time

  19. Discovery of MicroRNAs associated with myogenesis by deep sequencing of serial developmental skeletal muscles in pigs.

    Directory of Open Access Journals (Sweden)

    Xinhua Hou

    Full Text Available MicroRNAs (miRNAs) are short, single-stranded non-coding RNAs that repress their target genes by binding their 3' UTRs. These RNAs play critical roles in myogenesis. To gain knowledge about miRNAs involved in the regulation of myogenesis, porcine longissimus muscles were collected from 18 developmental stages (33-, 40-, 45-, 50-, 55-, 60-, 65-, 70-, 75-, 80-, 85-, 90-, 95-, 100- and 105-day post-gestation fetuses, 0 and 10-day postnatal piglets and adult pigs) to identify miRNAs using Solexa sequencing technology. We detected 197 known miRNAs and 78 novel miRNAs according to comparison with known miRNAs in the miRBase (release 17.0) database. Moreover, variations in sequence length and single nucleotide polymorphisms were also observed in 110 known miRNAs. Expression analysis of the 11 most abundant miRNAs was conducted using quantitative PCR (qPCR) in eleven tissues (longissimus muscles, leg muscles, heart, liver, spleen, lung, kidney, stomach, small intestine and colon), and the results revealed that ssc-miR-378, ssc-miR-1 and ssc-miR-206 were abundantly expressed in skeletal muscles. During skeletal muscle development, the expression level of ssc-miR-378 was low at 33 days post-coitus (dpc), increased at 65 and 90 dpc, peaked at postnatal day 0, and finally declined and maintained a comparatively stable level. This expression profile suggested that ssc-miR-378 was a new candidate miRNA for myogenesis and participated in skeletal muscle development in pigs. Target prediction and KEGG pathway analysis suggested that bone morphogenetic protein 2 (BMP2) and mitogen-activated protein kinase 1 (MAPK1), both of which were relevant to proliferation and differentiation, might be the potential targets of miR-378. Luciferase activities of reporter vectors containing the 3'UTR of porcine BMP2 or MAPK1 were downregulated by miR-378, which suggested that miR-378 probably regulated myogenesis through the regulation of these two genes.

  20. Design of a High Density SNP Genotyping Assay in the Pig Using SNPs Identified and Characterized by Next Generation Sequencing Technology

    Science.gov (United States)

    Ramos, Antonio M.; Crooijmans, Richard P. M. A.; Affara, Nabeel A.; Amaral, Andreia J.; Archibald, Alan L.; Beever, Jonathan E.; Bendixen, Christian; Churcher, Carol; Clark, Richard; Dehais, Patrick; Hansen, Mark S.; Hedegaard, Jakob; Hu, Zhi-Liang; Kerstens, Hindrik H.; Law, Andy S.; Megens, Hendrik-Jan; Milan, Denis; Nonneman, Danny J.; Rohrer, Gary A.; Rothschild, Max F.; Smith, Tim P. L.; Schnabel, Robert D.; Van Tassell, Curt P.; Taylor, Jeremy F.; Wiedmann, Ralph T.; Schook, Lawrence B.; Groenen, Martien A. M.

    2009-01-01

    Background The dissection of complex traits of economic importance to the pig industry requires the availability of a significant number of genetic markers, such as single nucleotide polymorphisms (SNPs). This study was conducted to discover several hundreds of thousands of porcine SNPs using next generation sequencing technologies and use these SNPs, as well as others from different public sources, to design a high-density SNP genotyping assay. Methodology/Principal Findings A total of 19 reduced representation libraries derived from four swine breeds (Duroc, Landrace, Large White, Pietrain) and a Wild Boar population and three restriction enzymes (AluI, HaeIII and MspI) were sequenced using Illumina's Genome Analyzer (GA). The SNP discovery effort resulted in the de novo identification of over 372K SNPs. More than 549K SNPs were used to design the Illumina Porcine 60K+SNP iSelect Beadchip, now commercially available as the PorcineSNP60. A total of 64,232 SNPs were included on the Beadchip. Results from genotyping the 158 individuals used for sequencing showed a high overall SNP call rate (97.5%). Of the 62,621 loci that could be reliably scored, 58,994 were polymorphic yielding a SNP conversion success rate of 94%. The average minor allele frequency (MAF) for all scorable SNPs was 0.274. Conclusions/Significance Overall, the results of this study indicate the utility of using next generation sequencing technologies to identify large numbers of reliable SNPs. In addition, the validation of the PorcineSNP60 Beadchip demonstrated that the assay is an excellent tool that will likely be used in a variety of future studies in pigs. PMID:19654876
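
    The SNP conversion success rate quoted above can be re-derived from the two locus counts given in the abstract. The snippet below is only an illustrative sketch: the helper name `percentage` and the variable names are assumptions introduced here, not part of the study's own workflow.

```python
def percentage(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the abstract above.
scorable_loci = 62_621      # loci that could be reliably scored
polymorphic_loci = 58_994   # scorable loci that were polymorphic

# Conversion success rate = polymorphic loci / reliably scored loci.
conversion_rate = percentage(polymorphic_loci, scorable_loci)
print(f"SNP conversion success rate: {conversion_rate:.1f}%")
```

    Rounded to the nearest whole percent, this agrees with the 94% reported in the abstract.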

  1. Transcriptomic SNP discovery for custom genotyping arrays: impacts of sequence data, SNP calling method and genotyping technology on the probability of validation success.

    Science.gov (United States)

    Humble, Emily; Thorne, Michael A S; Forcada, Jaume; Hoffman, Joseph I

    2016-08-26

    Single nucleotide polymorphism (SNP) discovery is an important goal of many studies. However, the number of 'putative' SNPs discovered from a sequence resource may not provide a reliable indication of the number that will successfully validate with a given genotyping technology. For this it may be necessary to account for factors such as the method used for SNP discovery and the type of sequence data from which it originates, suitability of the SNP flanking sequences for probe design, and genomic context. To explore the relative importance of these and other factors, we used Illumina sequencing to augment an existing Roche 454 transcriptome assembly for the Antarctic fur seal (Arctocephalus gazella). We then mapped the raw Illumina reads to the new hybrid transcriptome using BWA and BOWTIE2 before calling SNPs with GATK. The resulting markers were pooled with two existing sets of SNPs called from the original 454 assembly using NEWBLER and SWAP454. Finally, we explored the extent to which SNPs discovered using these four methods overlapped and predicted the corresponding validation outcomes for both Illumina Infinium iSelect HD and Affymetrix Axiom arrays. Collating markers across all discovery methods resulted in a global list of 34,718 SNPs. However, concordance between the methods was surprisingly poor, with only 51.0 % of SNPs being discovered by more than one method and 13.5 % being called from both the 454 and Illumina datasets. Using a predictive modeling approach, we could also show that SNPs called from the Illumina data were on average more likely to successfully validate, as were SNPs called by more than one method. Above and beyond this pattern, predicted validation outcomes were also consistently better for Affymetrix Axiom arrays. Our results suggest that focusing on SNPs called by more than one method could potentially improve validation outcomes. They also highlight possible differences between alternative genotyping technologies that could be

  2. Transcriptome analysis of carnation (Dianthus caryophyllus L.) based on next-generation sequencing technology

    OpenAIRE

    Tanase Koji; Nishitani Chikako; Hirakawa Hideki; Isobe Sachiko; Tabata Satoshi; Ohmiya Akemi; Onozaki Takashi

    2012-01-01

    Abstract Background Carnation (Dianthus caryophyllus L.), in the family Caryophyllaceae, can be found in a wide range of colors and is a model system for studies of flower senescence. In addition, it is one of the most important flowers in the global floriculture industry. However, few genomics resources, such as sequences and markers are available for carnation or other members of the Caryophyllaceae. To increase our understanding of the genetic control of important characters in carnation, ...

  3. Learning with Technology: Video Modeling with Concrete-Representational-Abstract Sequencing for Students with Autism Spectrum Disorder.

    Science.gov (United States)

    Yakubova, Gulnoza; Hughes, Elizabeth M; Shinaberry, Megan

    2016-07-01

    The purpose of this study was to determine the effectiveness of a video modeling intervention with concrete-representational-abstract instructional sequence in teaching mathematics concepts to students with autism spectrum disorder (ASD). A multiple baseline across skills design of single-case experimental methodology was used to determine the effectiveness of the intervention on the acquisition and maintenance of addition, subtraction, and number comparison skills for four elementary school students with ASD. Findings supported the effectiveness of the intervention in improving skill acquisition and maintenance at a 3-week follow-up. Implications for practice and future research are discussed.

  4. Application of next-generation sequencing for rapid marker development in molecular plant breeding: a case study on anthracnose disease resistance in Lupinus angustifolius L.

    Directory of Open Access Journals (Sweden)

    Yang Huaan

    2012-07-01

    Full Text Available Abstract Background In the last 30 years, a number of DNA fingerprinting methods such as RFLP, RAPD, AFLP, SSR and DArT have been extensively used in marker development for molecular plant breeding. However, it remains a daunting task to identify highly polymorphic and closely linked molecular markers for a target trait for molecular marker-assisted selection. The next-generation sequencing (NGS) technology is far more powerful than any existing generic DNA fingerprinting methods in generating DNA markers. In this study, we employed a grain legume crop Lupinus angustifolius (lupin) as a test case, and examined the utility of an NGS-based method of RAD (restriction-site associated DNA) sequencing as DNA fingerprinting for rapid, cost-effective marker development tagging a disease resistance gene for molecular breeding. Results Twenty informative plants from a cross of RxS (disease resistant x susceptible) in lupin were subjected to RAD single-end sequencing by multiplex identifiers. The entire RAD sequencing products were resolved in two lanes of the 16-lanes per run sequencing platform Solexa HiSeq2000. A total of 185 million raw reads, approximately 17 Gb of sequencing data, were collected. Sequence comparison among the 20 test plants discovered 8207 SNP markers. Filtration of DNA sequencing data with marker identification parameters resulted in the discovery of 38 molecular markers linked to the disease resistance gene Lanr1. Five randomly selected markers were converted into cost-effective, simple PCR-based markers. Linkage analysis using marker genotyping data and disease resistance phenotyping data on an F8 population consisting of 186 individual plants confirmed that all these five markers were linked to the R gene. Two of these newly developed sequence-specific PCR markers, AnSeq3 and AnSeq4, flanked the target R gene at a genetic distance of 0.9 centiMorgan (cM), and are now replacing the markers previously developed by a traditional DNA

  5. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin; Wong, Yue Him; Tsang, Ling Ming; Chu, Ka Hou; Qian, Pei Yuan; Chan, Benny K K

    2013-01-01

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes, of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  6. First study on gene expression of cement proteins and potential adhesion-related genes of a membranous-based barnacle as revealed from Next-Generation Sequencing technology

    KAUST Repository

    Lin, Hsiu Chin

    2013-12-12

    This is the first study applying Next-Generation Sequencing (NGS) technology to survey the kinds, expression location, and pattern of adhesion-related genes in a membranous-based barnacle. A total of 77,528,326 and 59,244,468 raw sequence reads of total RNA were generated from the prosoma and the basis of Tetraclita japonica formosana, respectively. In addition, 55,441 and 67,774 genes were further assembled and analyzed. The combined sequence data from both body parts generated a total of 79,833 genes, of which 47.7% were shared. Homologues of barnacle cement proteins - CP-19K, -52K, and -100K - were found and all were dominantly expressed at the basis where the cement gland complex is located. This is the main area where transcripts of cement proteins and other potential adhesion-related genes were detected. The absence of another common barnacle cement protein, CP-20K, in the adult transcriptome suggested a possible life-stage restricted gene function and/or a different mechanism in adhesion between membranous-based and calcareous-based barnacles. © 2013 Taylor & Francis.

  7. The usefulness of DNA sequencing after extraction by Whatman FTA filter matrix technology and phenotypic tests for differentiation of Candida albicans and Candida dubliniensis.

    Science.gov (United States)

    Kiraz, Nuri; Oz, Yasemin; Aslan, Huseyin; Muslumanoglu, Hamza

    2014-02-01

    Since C. dubliniensis is phenotypically similar to C. albicans, it can be misidentified as C. albicans. We aimed to investigate the prevalence of C. dubliniensis among isolates previously identified as C. albicans in our stocks and to compare the phenotypic methods with DNA sequencing of the D1/D2 region of the ribosomal large subunit (rLSU) gene. A total of 850 isolates were included in this study. Phenotypic identification was performed based on germ tube formation, chlamydospore production, colony colors on chromogenic agar, inability to grow at 45 °C and growth on hypertonic Sabouraud dextrose agar. Eighty isolates compatible with C. dubliniensis by at least one phenotypic test were included in the sequence analysis. Nested PCR amplification of the D1/D2 region of the rLSU gene was performed after fungal DNA extraction by Whatman FTA filter paper technology. The sequencing analysis of PCR products was carried out by an automated capillary gel electrophoresis device. The rate of C. dubliniensis was 2.35 % (n = 20) among isolates previously described as C. albicans. Consequently, none of the phenotypic tests provided satisfactory performance alone in our study, and molecular methods required special equipment and high cost. Thus, at least two phenotypic methods can be used for identification of C. dubliniensis, and molecular methods can be used for confirmation.

  8. Identification and Characterization of Epstein-Barr Virus Genomes in Lung Carcinoma Biopsy Samples by Next-Generation Sequencing Technology.

    Science.gov (United States)

    Wang, Shanshan; Xiong, Hongchao; Yan, Shi; Wu, Nan; Lu, Zheming

    2016-05-18

    Epstein-Barr virus (EBV) has been detected in the tumor cells of several cancers, including some cases of lung carcinoma (LC). However, the genomic characteristics and diversity of EBV strains associated with LC are poorly understood. In this study, we sequenced the EBV genomes isolated from four primary LC tumor biopsy samples, designated LC1 to LC4. Comparative analysis demonstrated that the LC strains were more closely related to the GD1 strain. Compared to the GD1 reference genome, a total of 520 variations, including 498 substitutions, 12 insertions, and 10 deletions, were found. Latent genes were found to harbor the largest number of nonsynonymous mutations. Phylogenetic analysis showed that all LC strains were closely related to Asian EBV strains and distinct from African/American strains. The LC2 genome was distinct from the other three LC genomes, suggesting that at least two parental lineages of EBV may exist among the LC genomes. All LC strains could be classified as China 1 and V-val subtype according to the amino acid sequence of LMP1 and EBNA1, respectively. In conclusion, our results showed the genomic diversity among EBV genomes isolated from LC, which might help uncover previously unknown variations of pathogenic significance.

  9. Existing and emerging detection technologies for DNA (Deoxyribonucleic Acid) finger printing, sequencing, bio- and analytical chips: a multidisciplinary development unifying molecular biology, chemical and electronics engineering.

    Science.gov (United States)

    Kumar Khanna, Vinod

    2007-01-01

    The current status and research trends of detection techniques for DNA-based analysis such as DNA finger printing, sequencing, biochips and allied fields are examined. An overview of main detectors is presented vis-à-vis these DNA operations. The biochip method is explained, the role of micro- and nanoelectronic technologies in biochip realization is highlighted, various optical and electrical detection principles employed in biochips are indicated, and the operational mechanisms of these detection devices are described. Although a diversity of biochips for diagnostic and therapeutic applications has been demonstrated in research laboratories worldwide, only some of these chips have entered the clinical market, and more chips are awaiting commercialization. The necessity of tagging is eliminated in refractive-index change based devices, but the basic flaw of indirect nature of most detection methodologies can only be overcome by generic and/or reagentless DNA sensors such as the conductance-based approach and the DNA-single electron transistor (DNA-SET) structure. Devices of the electrical detection-based category are expected to pave the pathway for the next-generation DNA chips. The review provides a comprehensive coverage of the detection technologies for DNA finger printing, sequencing and related techniques, encompassing a variety of methods from the primitive art to the state-of-the-art scenario as well as promising methods for the future.

  10. Genetic diagnosis of Duchenne and Becker muscular dystrophy using next-generation sequencing technology: comprehensive mutational search in a single platform.

    Science.gov (United States)

    Lim, Byung Chan; Lee, Seungbok; Shin, Jong-Yeon; Kim, Jong-Il; Hwang, Hee; Kim, Ki Joong; Hwang, Yong Seung; Seo, Jeong-Sun; Chae, Jong Hee

    2011-11-01

    Duchenne muscular dystrophy or Becker muscular dystrophy might be a suitable candidate disease for application of next-generation sequencing in the genetic diagnosis because the complex mutational spectrum and the large size of the dystrophin gene require two or more analytical methods and have a high cost. The authors tested whether large deletions/duplications or small mutations, such as point mutations or short insertions/deletions of the dystrophin gene, could be predicted accurately in a single platform using next-generation sequencing technology. A custom solution-based target enrichment kit was designed to capture whole genomic regions of the dystrophin gene and other muscular-dystrophy-related genes. A multiplexing strategy, wherein four differently bar-coded samples were captured and sequenced together in a single lane of the Illumina Genome Analyser, was applied. The study subjects were 25 patients: 16 with deficient dystrophin expression without a large deletion/duplication and 9 with a known large deletion/duplication. Nearly 100% of the exonic region of the dystrophin gene was covered by at least eight reads with a mean read depth of 107. Pathogenic small mutations were identified in 15 of the 16 patients without a large deletion/duplication. Using these 16 patients as the standard, the authors' method accurately predicted the deleted or duplicated exons in the 9 patients with known mutations. Inclusion of non-coding regions and paired-end sequence analysis enabled accurate identification by increasing the read depth and providing information about the breakpoint junction. The current method has an advantage for the genetic diagnosis of Duchenne muscular dystrophy and Becker muscular dystrophy wherein a comprehensive mutational search may be feasible using a single platform.
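
    The two coverage figures in this abstract (the share of the exonic region covered by at least eight reads, and the mean read depth) are simple summaries of per-base depth. As a minimal sketch, assuming per-base depths are already available as a list of integers, they could be computed as follows; the function name `coverage_summary` and the toy depths are hypothetical and are not data from the study.

```python
from statistics import mean


def coverage_summary(depths, min_depth=8):
    """Return (fraction of positions with depth >= min_depth, mean depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return covered / len(depths), mean(depths)


# Hypothetical per-base read depths for a short target region (not study data).
toy_depths = [120, 95, 110, 7, 101, 130, 99, 8, 105, 115]

fraction, mean_depth = coverage_summary(toy_depths)
print(f"{fraction:.0%} of positions at >= 8 reads, mean depth {mean_depth:.1f}")
```

    In practice the depth list would come from an alignment tool's per-position output; which tool the authors used for this step is not stated in the abstract.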

  11. The advantages of SMRT sequencing

    OpenAIRE

    Roberts, Richard J; Carneiro, Mauricio O; Schatz, Michael C

    2013-01-01

    Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.

  12. Development and characterization of 26 novel microsatellite loci for the trochid gastropod Gibbula divaricata (Linnaeus, 1758), using Illumina MiSeq next generation sequencing technology

    Directory of Open Access Journals (Sweden)

    Violeta López-Márquez

    2016-03-01

    Full Text Available In the present study we used the high-throughput sequencing technology Illumina MiSeq to develop 26 polymorphic microsatellite loci for the marine snail Gibbula divaricata. Four to 32 alleles were detected per locus across 30 samples analyzed. Observed and expected heterozygosities ranged from 0.130 to 0.933 and from 0.294 to 0.956, respectively. No significant linkage disequilibrium existed. Seven loci deviated from Hardy-Weinberg equilibrium, a deviation that could not be fully explained by the presence of null alleles. Sympatric distribution with other species of the genus Gibbula, such as G. rarilineata and G. varia, led us to test the cross-species utility of the developed markers in these two species, which could be useful to test common biogeographic patterns or potential hybridization phenomena, since morphologically intermediate specimens were found.
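
    For readers unfamiliar with the two heterozygosity measures reported here, the sketch below shows one common way to compute them for a single locus. It assumes diploid genotypes given as allele pairs and uses the uncorrected formula for expected heterozygosity (one minus the sum of squared allele frequencies); the abstract does not say which software or estimator the authors used, and the toy genotypes are hypothetical.

```python
from collections import Counter


def heterozygosities(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Expected heterozygosity is 1 - sum(p_i ** 2), with no small-sample correction.
    """
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    n = len(genotypes)
    # Observed: share of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n
    # Expected: derived from allele frequencies across all 2n allele copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Hypothetical microsatellite genotypes (allele sizes in bp), not study data.
toy_genotypes = [(150, 154), (150, 150), (154, 158), (150, 154)]

h_obs, h_exp = heterozygosities(toy_genotypes)
print(f"Ho = {h_obs:.3f}, He = {h_exp:.3f}")
```

    A large gap between the two values at a locus is one of the signals examined when testing for departure from Hardy-Weinberg equilibrium.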

  13. Clinical Application of Picodroplet Digital PCR Technology for Rapid Detection of EGFR T790M in Next-Generation Sequencing Libraries and DNA from Limited Tumor Samples.

    Science.gov (United States)

    Borsu, Laetitia; Intrieri, Julie; Thampi, Linta; Yu, Helena; Riely, Gregory; Nafa, Khedoudja; Chandramohan, Raghu; Ladanyi, Marc; Arcila, Maria E

    2016-11-01

    Although next-generation sequencing (NGS) is a robust technology for comprehensive assessment of EGFR-mutant lung adenocarcinomas with acquired resistance to tyrosine kinase inhibitors, it may not provide sufficiently rapid and sensitive detection of the EGFR T790M mutation, the most clinically relevant resistance biomarker. Here, we describe a digital PCR (dPCR) assay for rapid T790M detection on aliquots of NGS libraries prepared for comprehensive profiling, fully maximizing broad genomic analysis on limited samples. Tumor DNAs from patients with EGFR-mutant lung adenocarcinomas and acquired resistance to epidermal growth factor receptor inhibitors were prepared for Memorial Sloan-Kettering-Integrated Mutation Profiling of Actionable Cancer Targets sequencing, a hybrid capture-based assay interrogating 410 cancer-related genes. Precapture library aliquots were used for rapid EGFR T790M testing by dPCR, and results were compared with NGS and locked nucleic acid-PCR Sanger sequencing (reference high sensitivity method). Seventy resistance samples showed 99% concordance with the reference high sensitivity method in accuracy studies. Input as low as 2.5 ng provided a sensitivity of 1% and improved further with increasing DNA input. dPCR on libraries required less DNA and showed better performance than direct genomic DNA. dPCR on NGS libraries is a robust and rapid approach to EGFR T790M testing, allowing most economical utilization of limited material for comprehensive assessment. The same assay can also be performed directly on any limited DNA source and cell-free DNA. Copyright © 2016 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  14. De novo transcriptome sequencing of the Octopus vulgaris hemocytes using Illumina RNA-Seq technology: response to the infection by the gastrointestinal parasite Aggregata octopiana.

    Science.gov (United States)

    Castellanos-Martínez, Sheila; Arteta, David; Catarino, Susana; Gestal, Camino

    2014-01-01

    Octopus vulgaris is a highly valuable species of great commercial interest and an excellent candidate for aquaculture diversification; however, the octopus' well-being is impaired by pathogens, of which the gastrointestinal coccidian parasite Aggregata octopiana is one of the most important. The knowledge of the molecular mechanisms of the immune response in cephalopods, especially in octopus, is scarce. The transcriptome of the hemocytes of O. vulgaris was de novo sequenced using the high-throughput paired-end Illumina technology to identify genes involved in immune defense and to understand the molecular basis of octopus tolerance/resistance to coccidiosis. A bi-directional mRNA library was constructed from hemocytes of two groups of octopus defined by A. octopiana infection status (sick octopus suffering from coccidiosis, and healthy octopus), and reads were de novo assembled together. The differential expression of transcripts was analysed using the general assembly as a reference for mapping the reads from each condition. After sequencing, a total of 75,571,280 high quality reads were obtained from the sick octopus group and 74,731,646 from the healthy group. The general transcriptome of the O. vulgaris hemocytes was assembled in 254,506 contigs. A total of 48,225 contigs were successfully identified, and 538 transcripts exhibited differential expression between infection groups. The general transcriptome revealed genes involved in pathways like NF-kB, TLR and Complement. Differential expression of TLR-2, PGRP, C1q and PRDX genes due to infection was validated using RT-qPCR. In sick octopuses, only TLR-2 was up-regulated in hemocytes, but all of them were up-regulated in caecum and gills. The transcriptome reported here de novo establishes the first molecular clues to understand how the octopus immune system works and interacts with a highly pathogenic coccidian. The data provided here will contribute to identification of biomarkers for octopus resistance against

  15. De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

    Science.gov (United States)

    2013-01-01

    Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
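
    The N50 statistic quoted in this abstract is a standard assembly summary: the length L such that sequences of length L or longer contain at least half of all assembled bases. A minimal sketch of the usual calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions and are unrelated to the axolotl data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length of
    the sequence at which the running total first reaches half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bases (not study data).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    For the toy lengths, the two longest sequences (500 and 400 bases) together hold 900 of 1,500 bases, so the function returns 400.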

  16. Identification and characterization of novel and differentially expressed microRNAs in peripheral blood from healthy and mastitis Holstein cattle by deep sequencing.

    Science.gov (United States)

    Li, Zhixiong; Wang, Hongliang; Chen, Ling; Wang, Lijun; Liu, Xiaolin; Ru, Caixia; Song, Ailong

    2014-02-01

    MicroRNA (miRNA) mediates post-transcriptional gene regulation and plays an important role in regulating the development of immune cells and in modulating innate and adaptive immune responses in mammals, including cattle. In the present study, we identified novel and differentially expressed miRNAs in peripheral blood from healthy and mastitis Holstein cattle by Solexa sequencing and bioinformatics. In total, 608 precursor hairpins (pre-miRNAs) encoding for 753 mature miRNAs were detected. Statistically, 173 unique miRNAs (of 753, 22.98%) were identified that had significant differential expression between healthy and mastitis Holstein cattle (P mastitis Holstein cattle, which provide important information on mastitis in miRNAs expression. Diverse miRNAs may play an important role in the treatment of mastitis in Holstein cattle. © 2013 Stichting International Foundation for Animal Genetics.

  17. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.

  18. Genetic mapping using the Diversity Arrays Technology (DArT) : application and validation using the whole-genome sequences of Arabidopsis thaliana and the fungal wheat pathogen Mycosphaerella graminicola

    NARCIS (Netherlands)

    Wittenberg, A.H.J.

    2007-01-01

    Diversity Arrays Technology (DArT) is a microarray-based DNA marker technique for genome-wide discovery and genotyping of genetic variation. DArT allows simultaneous scoring of hundreds- to thousands of restriction site based polymorphisms between genotypes and does not require DNA sequence

  19. Technology.

    Science.gov (United States)

    Online-Offline, 1998

    1998-01-01

    Focuses on technology, on advances in such areas as aeronautics, electronics, physics, the space sciences, as well as computers and the attendant progress in medicine, robotics, and artificial intelligence. Describes educational resources for elementary and middle school students, including Web sites, CD-ROMs and software, videotapes, books,…

  20. Technology

    Directory of Open Access Journals (Sweden)

    Xu Jing

    2016-01-01

    Full Text Available The traditional method of reading answer cards relies on an OMR (Optical Mark Reader), which typically requires purpose-made cards, lacks versatility, and is costly. To address these problems, an answer card identification method based on pattern recognition is proposed. A Line Segment Detector is used to detect whether the image is tilted; tilted images are corrected by rotation, and the answers on the answer sheet are then located and detected. Automatic reading based on pattern recognition achieves high accuracy and faster detection.

  1. Characterization and Development of EST-SSRs by Deep Transcriptome Sequencing in Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

    Directory of Open Access Journals (Sweden)

    Qian Ding

    2015-01-01

    Full Text Available Simple sequence repeats (SSRs) are among the most important markers for population analysis and have been widely used in plant genetic mapping and molecular breeding. Expressed sequence tag-SSR (EST-SSR) markers, located in the coding regions, are potentially more efficient for QTL mapping, gene targeting, and marker-assisted breeding. In this study, we investigated 51,694 nonredundant unigenes, assembled from clean reads from deep transcriptome sequencing with a Solexa/Illumina platform, for identification and development of EST-SSRs in Chinese cabbage. In total, 10,420 EST-SSRs with over 12 bp were identified and characterized, among which 2744 EST-SSRs are new and 2317 are known ones showing polymorphism with previously reported SSRs. A total of 7877 PCR primer pairs for 1561 EST-SSR loci were designed, and primer pairs for twenty-four EST-SSRs were selected for primer evaluation. In nineteen EST-SSR loci (79.2%), amplicons were successfully generated with high quality. Seventeen (89.5%) showed polymorphism in twenty-four cultivars of Chinese cabbage. The polymorphic alleles of each polymorphic locus were sequenced, and the results showed that most polymorphisms were due to variations of SSR repeat motifs. The EST-SSRs identified and characterized in this study have important implications for developing new tools for genetics and molecular breeding in Chinese cabbage.
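
    The abstract above describes screening unigenes for SSRs of at least about 12 bp. The following Python sketch shows one simple way such a screen could work, using regular expressions to find perfect tandem repeats. It is not the tool used in the study: the motif sizes (1 to 6 bp), the restriction to perfect repeats, the 12 bp minimum, and the names `find_ssrs` and `is_primitive` are all assumptions made for illustration.

```python
import math
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif[:d] * (k // d) == motif for d in range(1, k)
    )


def find_ssrs(seq, min_total=12, max_motif=6):
    """Find perfect SSRs as (start, motif, repeat_count) tuples.

    Assumed rule: a motif of 1..max_motif bases repeated in tandem so that
    the whole repeat spans at least min_total bases.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif + 1):
        min_repeats = math.ceil(min_total / k)
        # One captured motif followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Toy sequence for illustration only: an (AT)6 and an (AAG)4 repeat.
toy = "GGC" + "AT" * 6 + "CCGT" + "AAG" * 4 + "TTC"
for start, motif, count in find_ssrs(toy):
    print(start, motif, count)
```

    Real SSR-mining tools usually apply motif-specific repeat thresholds and can handle compound or imperfect repeats, which this sketch deliberately leaves out.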

  2. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  3. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    Full Text Available BACKGROUND: Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. METHODS: Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. RESULTS: A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. CONCLUSIONS: This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data

  4. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Science.gov (United States)

    Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed

    2014-01-01

    Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data will add to our current understanding of the

  5. Simultaneous discrimination of species and strains in Lactobacillus rhamnosus using species-specific PCR combined with multiplex mini-sequencing technology.

    Science.gov (United States)

    Huang, Chien-Hsun; Chang, Mu-Tzu; Huang, Lina; Chu, Wen-Shen

    2015-12-01

    This study described the use of species-specific PCR in combination with SNaPshot mini-sequencing to achieve species identification and strain differentiation in Lactobacillus rhamnosus. To develop species-specific PCR and strain subtyping primers, the dnaJ gene was used as a target, and its corresponding sequences were analyzed both in Lb. rhamnosus and in a subset of its phylogenetically closest species. The results indicated that the species-specific primer pair was indeed specific for Lb. rhamnosus, and the mini-sequencing assay was able to unambiguously distinguish Lb. rhamnosus strains into different haplotypes. In conclusion, we have successfully developed a rapid, accurate and cost-effective assay for inter- and intraspecies discrimination of Lb. rhamnosus, which can be applied to achieve efficient quality control of probiotic products. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Genomic sequencing in clinical trials

    OpenAIRE

    Mestan, Karen K; Ilkhanoff, Leonard; Mouli, Samdeep; Lin, Simon

    2011-01-01

    Abstract Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to fin...

  7. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  8. Graphene nanodevices for DNA sequencing

    NARCIS (Netherlands)

    Heerema, S.J.; Dekker, C.

    2016-01-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with

  9. Whole-Genome Sequencing of Sordaria macrospora Mutants Identifies Developmental Genes.

    Science.gov (United States)

    Nowrousian, Minou; Teichert, Ines; Masloff, Sandra; Kück, Ulrich

    2012-02-01

    The study of mutants to elucidate gene functions has a long and successful history; however, to discover causative mutations in mutants that were generated by random mutagenesis often takes years of laboratory work and requires previously generated genetic and/or physical markers, or resources like DNA libraries for complementation. Here, we present an alternative method to identify defective genes in developmental mutants of the filamentous fungus Sordaria macrospora through Illumina/Solexa whole-genome sequencing. We sequenced pooled DNA from progeny of crosses of three mutants and the wild type and were able to pinpoint the causative mutations in the mutant strains through bioinformatics analysis. One mutant is a spore color mutant, and the mutated gene encodes a melanin biosynthesis enzyme. The causative mutation is a G to A change in the first base of an intron, leading to a splice defect. The second mutant carries an allelic mutation in the pro41 gene encoding a protein essential for sexual development. In the mutant, we detected a complex pattern of deletion/rearrangements at the pro41 locus. In the third mutant, a point mutation in the stop codon of a transcription factor-encoding gene leads to the production of immature fruiting bodies. For all mutants, transformation with a wild type-copy of the affected gene restored the wild-type phenotype. Our data demonstrate that whole-genome sequencing of mutant strains is a rapid method to identify developmental genes in an organism that can be genetically crossed and where a reference genome sequence is available, even without prior mapping information.

  10. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique.

    Directory of Open Access Journals (Sweden)

    Chaozheng Li

    Full Text Available BACKGROUND: Pacific white shrimp (Litopenaeus vannamei), the major species of farmed shrimps in the world, has been attracting extensive studies, which require more and more genome background knowledge. The now available transcriptome data of L. vannamei are insufficient for research requirements, and have not been adequately assembled and annotated. METHODOLOGY/PRINCIPAL FINDINGS: This is the first study that used a next-generation high-throughput DNA sequencing technique, the Solexa/Illumina GA II method, to analyze the transcriptome from whole bodies of L. vannamei larvae. More than 2.4 Gb of raw data were generated, and 109,169 unigenes with a mean length of 396 bp were assembled using the SOAPdenovo software. 73,505 unigenes (>200 bp) with good quality sequences were selected and subjected to annotation analysis, among which 37.80% can be matched in NCBI Nr database, 37.3% matched in Swissprot, and 44.1% matched in TrEMBL. Using the BLAST and Blast2GO software, 11,153 unigenes were classified into 25 Clusters of Orthologous Groups of proteins (COG) categories, 8171 unigenes were assigned into 51 Gene ontology (GO) functional groups, and 18,154 unigenes were divided into 220 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. To preliminarily verify part of the results of assembly and annotations, 12 assembled unigenes that are homologous to many embryo development-related genes were chosen and subjected to RT-PCR for electrophoresis and Sanger sequencing analyses, and to real-time PCR for expression profile analyses during embryo development. CONCLUSIONS/SIGNIFICANCE: The L. vannamei transcriptome analyzed using the next-generation sequencing technique enriches the information of L. vannamei genes, which will facilitate our understanding of the genome background of crustaceans, and promote the studies on L. vannamei.

  11. Identification and characterization of novel serum microRNA candidates from deep sequencing in cervical cancer patients.

    Science.gov (United States)

    Juan, Li; Tong, Hong-li; Zhang, Pengjun; Guo, Guanghong; Wang, Zi; Wen, Xinyu; Dong, Zhennan; Tian, Ya-ping

    2014-09-03

    Small non-coding microRNAs (miRNAs) are involved in cancer development and progression, and serum profiles of cervical cancer patients may be useful for identifying novel miRNAs. We performed deep sequencing on serum pools of cervical cancer patients and healthy controls with 3 replicates and constructed a small RNA library. We used MIREAP to predict novel miRNAs and identified 2 putative novel miRNAs between serum pools of cervical cancer patients and healthy controls after filtering out pseudo-pre-miRNAs using Triplet-SVM analysis. The 2 putative novel miRNAs were validated by real time PCR and were significantly decreased in cervical cancer patients compared with healthy controls. One novel miRNA had an area under curve (AUC) of 0.921 (95% CI: 0.883, 0.959) with a sensitivity of 85.7% and a specificity of 88.2% when discriminating between cervical cancer patients and healthy controls. Our results suggest that characterizing serum profiles of cervical cancers by Solexa sequencing may be a good method for identifying novel miRNAs and that the validated novel miRNAs described here may be cervical cancer-associated biomarkers.
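
    The abstract above quotes a sensitivity of 85.7% and a specificity of 88.2% for one novel miRNA. As a reminder of how those two figures are defined, here is a short Python sketch. The patient and control counts are hypothetical: the abstract does not report sample sizes, and the numbers below were chosen only because they reproduce the rounded percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual cases that the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual controls that the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts (NOT from the study), picked to match the
# rounded percentages quoted in the abstract.
tp, fn = 30, 5   # patients: correctly detected / missed
tn, fp = 30, 4   # controls: correctly cleared / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

    Both values depend on the decision threshold applied to the miRNA level, whereas the AUC of 0.921 summarises performance across all thresholds.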

  12. Quantum-Sequencing: Fast electronic single DNA molecule sequencing

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free, high-throughput and cost-effective, single-molecule sequencing method. Here, we present the first demonstration of unique "electronic fingerprint" of all nucleotides (A, G, T, C), with single-molecule DNA sequencing, using Quantum-tunneling Sequencing (Q-Seq) at room temperature. We show that the electronic state of the nucleobases shifts depending on the pH, with most distinct states identified at acidic pH. We also demonstrate identification of single nucleotide modifications (methylation here). Using these unique electronic fingerprints (or tunneling data), we report a partial sequence of beta lactamase (bla) gene, which encodes resistance to beta-lactam antibiotics, with over 95% success rate. These results highlight the potential of Q-Seq as a robust technique for next-generation sequencing.

  13. Development of severe accident evaluation technology (level 2 PSA) for sodium-cooled fast reactors. (5) Identification of dominant factors in ex-vessel accident sequences

    International Nuclear Information System (INIS)

    Ohno, Shuji; Seino, Hiroshi; Miyahara, Shinya

    2009-01-01

    The evaluation of accident progression outside of a reactor vessel (ex-vessel) and subsequent transfer behavior of radioactive materials is of great importance from the viewpoint of Level 2 PSA. Hence typical ex-vessel accident sequences in the JAEA Sodium-cooled Fast Reactor are qualitatively discussed in this paper and dominant behaviors or factors in the sequences are investigated through parametric calculations using the CONTAIN/LMR code. Scenarios to be focused on are, 1) sodium vapor leakage from the reactor vessel and 2) sodium-concrete reaction, which are both to be considered in the accident category of LOHRS (loss of heat removal system) and might be followed by an early containment failure due to the thermal effect of sodium combustion and hydrogen burning respectively. The calculated results clarify that the sodium vapor leak rate and the scale of sodium-concrete reaction are the important factors to dominate the ex-vessel accident progression. In addition to the understandings of the dominant factors, the analyzed results also provide the specific information such as pressure loading value to the containment and the timing of pressurization, which is indispensable as technical base in Level 2 PSA for developing event trees and for quantifying the accident consequences. (author)

  14. Construction of an SNP-based high-density linkage map for flax (Linum usitatissimum L.) using specific length amplified fragment sequencing (SLAF-seq) technology.

    Science.gov (United States)

    Yi, Liuxi; Gao, Fengyun; Siqin, Bateer; Zhou, Yu; Li, Qiang; Zhao, Xiaoqing; Jia, Xiaoyun; Zhang, Hui

    2017-01-01

    Flax is an important crop for oil and fiber, however, no high-density genetic maps have been reported for this species. Specific length amplified fragment sequencing (SLAF-seq) is a high-resolution strategy for large scale de novo discovery and genotyping of single nucleotide polymorphisms. In this study, SLAF-seq was employed to develop SNP markers in an F2 population to construct a high-density genetic map for flax. In total, 196.29 million paired-end reads were obtained. The average sequencing depth was 25.08 in male parent, 32.17 in the female parent, and 9.64 in each F2 progeny. In total, 389,288 polymorphic SLAFs were detected, from which 260,380 polymorphic SNPs were developed. After filtering, 4,638 SNPs were found suitable for genetic map construction. The final genetic map included 4,145 SNP markers on 15 linkage groups and was 2,632.94 cM in length, with an average distance of 0.64 cM between adjacent markers. To our knowledge, this map is the densest SNP-based genetic map for flax. The SNP markers and genetic map reported in here will serve as a foundation for the fine mapping of quantitative trait loci (QTLs), map-based gene cloning and marker assisted selection (MAS) for flax.
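
    The average marker spacing quoted in the abstract above can be checked from the other reported figures. The Python sketch below does that arithmetic in two ways, since the abstract does not state which denominator was used: dividing by the number of marker intervals (markers minus linkage groups) and dividing by the number of markers. The variable names are illustrative.

```python
# Figures quoted in the flax abstract.
total_length_cm = 2632.94   # total map length in centimorgans
n_markers = 4145            # SNP markers on the final map
n_linkage_groups = 15

# Each linkage group with m markers has m - 1 intervals between them.
n_intervals = n_markers - n_linkage_groups

spacing_per_interval = total_length_cm / n_intervals
spacing_per_marker = total_length_cm / n_markers

print(f"per interval: {spacing_per_interval:.2f} cM")
print(f"per marker:   {spacing_per_marker:.2f} cM")
```

    Either convention rounds to the 0.64 cM reported, so the stated map length, marker count and average spacing are mutually consistent.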

  15. RNA sequencing reveals differential expression of mitochondrial and oxidation reduction genes in the long-lived naked mole-rat when compared to mice.

    Science.gov (United States)

    Yu, Chuanfei; Li, Yang; Holmes, Andrew; Szafranski, Karol; Faulkes, Chris G; Coen, Clive W; Buffenstein, Rochelle; Platzer, Matthias; de Magalhães, João Pedro; Church, George M

    2011-01-01

    The naked mole-rat (Heterocephalus glaber) is a long-lived, cancer resistant rodent and there is a great interest in identifying the adaptations responsible for these and other of its unique traits. We employed RNA sequencing to compare liver gene expression profiles between naked mole-rats and wild-derived mice. Our results indicate that genes associated with oxidoreduction and mitochondria were expressed at higher relative levels in naked mole-rats. The largest effect is nearly 300-fold higher expression of epithelial cell adhesion molecule (Epcam), a tumour-associated protein. Also of interest are the protease inhibitor, alpha2-macroglobulin (A2m), and the mitochondrial complex II subunit Sdhc, both ageing-related genes found strongly over-expressed in the naked mole-rat. These results hint at possible candidates for specifying species differences in ageing and cancer, and in particular suggest complex alterations in mitochondrial and oxidation reduction pathways in the naked mole-rat. Our differential gene expression analysis obviated the need for a reference naked mole-rat genome by employing a combination of Illumina/Solexa and 454 platforms for transcriptome sequencing and assembling transcriptome contigs of the non-sequenced species. Overall, our work provides new research foci and methods for studying the naked mole-rat's fascinating characteristics.

  16. RNA sequencing reveals differential expression of mitochondrial and oxidation reduction genes in the long-lived naked mole-rat when compared to mice.

    Directory of Open Access Journals (Sweden)

    Chuanfei Yu

    Full Text Available The naked mole-rat (Heterocephalus glaber) is a long-lived, cancer resistant rodent and there is a great interest in identifying the adaptations responsible for these and other of its unique traits. We employed RNA sequencing to compare liver gene expression profiles between naked mole-rats and wild-derived mice. Our results indicate that genes associated with oxidoreduction and mitochondria were expressed at higher relative levels in naked mole-rats. The largest effect is nearly 300-fold higher expression of epithelial cell adhesion molecule (Epcam), a tumour-associated protein. Also of interest are the protease inhibitor, alpha2-macroglobulin (A2m), and the mitochondrial complex II subunit Sdhc, both ageing-related genes found strongly over-expressed in the naked mole-rat. These results hint at possible candidates for specifying species differences in ageing and cancer, and in particular suggest complex alterations in mitochondrial and oxidation reduction pathways in the naked mole-rat. Our differential gene expression analysis obviated the need for a reference naked mole-rat genome by employing a combination of Illumina/Solexa and 454 platforms for transcriptome sequencing and assembling transcriptome contigs of the non-sequenced species. Overall, our work provides new research foci and methods for studying the naked mole-rat's fascinating characteristics.

  17. Wind turbine sequence modeling based on secondary parametric technology

    Institute of Scientific and Technical Information of China (English)

    高青风; 孙振兴; 滕伟; 柳亦兵

    2012-01-01

    To address the shortcomings of conventional modeling methods for large and complex wind turbines, a highly efficient modeling method is proposed in this paper. Building on a fully parametric model of a single wind turbine, the method takes the whole sequence of wind turbines as its research object and achieves efficient model creation and management through secondary parametric technology and parameter serialization, following the way the design parameters of components vary between turbines of different power ratings in the same sequence. A sequence modeling system for wind turbines was also developed by combining this approach with parameter-driven 3D modeling technology. The system can effectively reduce the modeling workload and the design error rate and improve design efficiency, which supports the correctness and rationality of the modeling method.

  18. DNA Sequencing by Capillary Electrophoresis

    Science.gov (United States)

    Karger, Barry L.; Guttman, Andras

    2009-01-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA sequencing methods have evolved from the labor intensive slab gel electrophoresis, through automated multicapillary electrophoresis systems using fluorophore labeling with multispectral imaging, to the “next generation” technologies of cyclic array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes was only possible with the advent of modern sequencing technologies, which resulted from step-by-step advances contributed by academics, medical personnel and instrument companies. While next generation sequencing is moving ahead at break-neck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this perspective, we give an overview of the role of capillary electrophoresis in DNA sequencing, based in part on several of our articles in this journal. PMID:19517496

  19. Next-generation sequencing technology a new tool for killer cell immunoglobulin-like receptor allele typing in hematopoietic stem cell transplantation.

    Science.gov (United States)

    Maniangou, B; Retière, C; Gagne, K

    2018-02-01

    Killer cell Immunoglobulin-like Receptor (KIR) genes are a family of genes located together within the leukocyte receptor cluster on human chromosome 19q13.4. To date, 17 KIR genes have been identified including nine inhibitory genes (2DL1/L2/L3/L4/L5A/L5B, 3DL1/L2/L3), six activating genes (2DS1/S2/S3/S4/S5, 3DS1) and two pseudogenes (2DP1, 3DP1) classified into group A (KIR A) and group B (KIR B) haplotypes. The number and the nature of KIR genes vary between the individuals. In addition, these KIR genes are known to be polymorphic at allelic level (907 alleles described in July 2017). KIR genes encode for receptors which are predominantly expressed by Natural Killer (NK) cells. KIR receptors recognize HLA class I molecules and are able to kill residual recipient leukemia cells, and thus reduce the likelihood of relapse. KIR alleles of Hematopoietic Stem Cell (HSC) donor would require to be known (Alicata et al. Eur J Immunol 2016) because the KIR allele polymorphism may affect both the KIR + NK cell phenotype and function (Gagne et al. Eur J Immunol 2013; Bari R, et al. Sci Rep 2016) as well as HSCT outcome (Boudreau et al. JCO 2017). The introduction of the Next Generation Sequencing (NGS) has overcome current conventional DNA sequencing method limitations, known to be time consuming. Recently, a novel NGS KIR allele typing approach of all KIR genes was developed by our team in Nantes from 30 reference DNAs (Maniangou et al. Front in Immunol 2017). This NGS KIR allele typing approach is simple, fast, reliable, specific and showed a concordance rate of 95% for centromeric and telomeric KIR genes in comparison with high-resolution KIR typing obtained to those published data using exome capture (Norman PJ et al. Am J Hum Genet 2016). This NGS KIR allele typing approach may also be used in reproduction and to better study KIR + NK cell implication in the control of viral infections. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  20. Inhibition of expression in Escherichia coli of a virulence regulator MglB of Francisella tularensis using external guide sequence technology.

    Directory of Open Access Journals (Sweden)

    Gaoping Xiao

    Full Text Available External guide sequences (EGSs) have successfully been used to inhibit expression of target genes at the post-transcriptional level in both prokaryotes and eukaryotes. We previously reported that EGS accessible and cleavable sites in the target RNAs can rapidly be identified by screening random EGS (rEGS) libraries. Here the method of screening rEGS libraries and a partial RNase T1 digestion assay were used to identify sites accessible to EGSs in the mRNA of a global virulence regulator MglB from Francisella tularensis, a Gram-negative pathogenic bacterium. Specific EGSs were subsequently designed and their activities in terms of the cleavage of mglB mRNA by RNase P were tested in vitro and in vivo. EGS73, EGS148, and EGS155 in both stem and M1 EGS constructs induced mglB mRNA cleavage in vitro. Expression of stem EGS73 and EGS155 in Escherichia coli resulted in significant reduction of the mglB mRNA level coded for the F. tularensis mglB gene inserted in those cells.

  1. High Density Linkage Map Construction and Mapping of Yield Trait QTLs in Maize (Zea mays) Using the Genotyping-by-Sequencing (GBS) Technology

    Science.gov (United States)

    Su, Chengfu; Wang, Wei; Gong, Shunliang; Zuo, Jinghui; Li, Shujiang; Xu, Shizhong

    2017-01-01

    Increasing grain yield is the ultimate goal for maize breeding. High resolution quantitative trait loci (QTL) mapping can help us understand the molecular basis of phenotypic variation of yield and thus facilitate marker assisted breeding. The aim of this study is to use genotyping-by-sequencing (GBS) for large-scale SNP discovery and simultaneous genotyping of all F2 individuals from a cross between two varieties of maize that are in clear contrast in yield and related traits. A set of 199 F2 progeny derived from the cross of varieties SG-5 and SG-7 were generated and genotyped by GBS. A total of 1,046,524,604 reads with an average of 5,258,918 reads per F2 individual were generated. This number of reads represents an approximately 0.36-fold coverage of the maize reference genome Zea_mays.AGPv3.29 for each F2 individual. A total of 68,882 raw SNPs were discovered in the F2 population, which, after stringent filtering, led to a total of 29,927 high quality SNPs. Comparative analysis using these physically mapped marker loci revealed a higher degree of synteny with the reference genome. The SNP genotype data were utilized to construct an intra-specific genetic linkage map of maize consisting of 3,305 bins on 10 linkage groups spanning 2,236.66 cM at an average distance of 0.68 cM between consecutive markers. From this map, we identified 28 QTLs associated with yield traits (100-kernel weight, ear length, ear diameter, cob diameter, kernel row number, corn grains per row, ear weight, and grain weight per plant) using the composite interval mapping (CIM) method and 29 QTLs using the least absolute shrinkage selection operator (LASSO) method. QTLs identified by the CIM method account for 6.4% to 19.7% of the phenotypic variation. 
Small intervals of three QTLs (qCGR-1, qKW-2, and qGWP-4) contain several genes, including one gene (GRMZM2G139872) encoding the F-box protein, three genes (GRMZM2G180811, GRMZM5G828139, and GRMZM5G873194) encoding the WD40-repeat protein, and

  2. Identification of Differentially Expressed miRNAs between White and Black Hair Follicles by RNA-Sequencing in the Goat (Capra hircus)

    Science.gov (United States)

    Wu, Zhenyang; Fu, Yuhua; Cao, Jianhua; Yu, Mei; Tang, Xiaohui; Zhao, Shuhong

    2014-01-01

    MicroRNAs (miRNAs) play a key role in many biological processes by regulating gene expression at the post-transcriptional level. A number of miRNAs have been identified from livestock species. However, compared with other animals, such as pigs and cows, the number of miRNAs identified in goats is quite low, particularly in hair follicles. In this study, to investigate the functional roles of miRNAs in hair follicles of goats with different coat colors, we sequenced miRNAs from two hair follicle samples (white and black) using Solexa sequencing. A total of 35,604,016 reads were obtained, which included 30,878,637 clean reads (86.73%). MiRDeep2 software identified 214 miRNAs. Among them, 205 were conserved among species and nine were novel miRNAs. Furthermore, DESeq software identified six differentially expressed miRNAs. Quantitative PCR confirmed differential expression of two miRNAs, miR-10b and miR-211. KEGG pathways were analyzed using the DAVID website for the predicted target genes of the differentially expressed miRNAs. Several signaling pathways including Notch and MAPK pathways may affect the process of coat color formation. Our study showed that the identified miRNAs might play an essential role in black and white follicle formation in goats. PMID:24879525

  3. Identification of Differentially Expressed miRNAs between White and Black Hair Follicles by RNA-Sequencing in the Goat (Capra hircus)

    Directory of Open Access Journals (Sweden)

    Zhenyang Wu

    2014-05-01

    Full Text Available MicroRNAs (miRNAs) play a key role in many biological processes by regulating gene expression at the post-transcriptional level. A number of miRNAs have been identified from livestock species. However, compared with other animals, such as pigs and cows, the number of miRNAs identified in goats is quite low, particularly in hair follicles. In this study, to investigate the functional roles of miRNAs in hair follicles of goats with different coat colors, we sequenced miRNAs from two hair follicle samples (white and black) using Solexa sequencing. A total of 35,604,016 reads were obtained, which included 30,878,637 clean reads (86.73%). MiRDeep2 software identified 214 miRNAs. Among them, 205 were conserved among species and nine were novel miRNAs. Furthermore, DESeq software identified six differentially expressed miRNAs. Quantitative PCR confirmed differential expression of two miRNAs, miR-10b and miR-211. KEGG pathways were analyzed using the DAVID website for the predicted target genes of the differentially expressed miRNAs. Several signaling pathways including Notch and MAPK pathways may affect the process of coat color formation. Our study showed that the identified miRNAs might play an essential role in black and white follicle formation in goats.

  4. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applications including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Analysis and Machine Intelligence was published in November 1980. The IEEE Computer magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  5. Genome-wide massively parallel sequencing of formaldehyde fixed-paraffin embedded (FFPE) tumor tissues for copy-number- and mutation-analysis.

    Directory of Open Access Journals (Sweden)

    Michal R Schweiger

    Full Text Available BACKGROUND: Cancer re-sequencing programs rely on DNA isolated from fresh snap frozen tissues, the preparation of which is combined with additional preservation efforts. Tissue samples at pathology departments are routinely stored as formalin-fixed and paraffin-embedded (FFPE) samples and their use would open up access to a variety of clinical trials. However, FFPE preparation is incompatible with many down-stream molecular biology techniques such as PCR based amplification methods and gene expression studies. METHODOLOGY/PRINCIPAL FINDINGS: Here we investigated the sample quality requirements of FFPE tissues for massively parallel short-read sequencing approaches. We evaluated key variables of pre-fixation, fixation related and post-fixation processes that occur in routine medical service (e.g. degree of autolysis, duration of fixation and of storage). We also investigated the influence of tissue storage time on sequencing quality by using material that was up to 18 years old. Finally, we analyzed normal and tumor breast tissues using the Sequencing by Synthesis technique (Illumina Genome Analyzer, Solexa) to simultaneously localize genome-wide copy number alterations and to detect genomic variations such as substitutions and point-deletions and/or insertions in FFPE tissue samples. CONCLUSIONS/SIGNIFICANCE: The application of second generation sequencing techniques on small amounts of FFPE material opens up the possibility to analyze tissue samples which have been collected during routine clinical work as well as in the context of clinical trials. This is in particular important since FFPE samples are amply available from surgical tumor resections and histopathological diagnosis, and comprise tissue from precursor lesions, primary tumors, lymphogenic and/or hematogenic metastases. Large-scale studies using this tissue material will result in a better prediction of the prognosis of cancer patients and the early identification of patients which

  6. Genetic sequences derived from suppression subtractive ...

    African Journals Online (AJOL)

    2008-06-17

    Jun 17, 2008 ... their possible roles in Xanthomonas albilineans ... Technology, P. O. Box 1334, Durban 4000, Republic of South Africa. Accepted 4 ... Clones selected were sequenced (using a Perkin Elmer ABI PRISM Dye terminator cycle.

  7. Shotgun protein sequencing.

    Energy Technology Data Exchange (ETDEWEB)

    Faulon, Jean-Loup Michel; Heffelfinger, Grant S.

    2009-06-01

    A novel experimental and computational technique based on multiple enzymatic digestion of a protein or protein mixture that reconstructs protein sequences from sequences of overlapping peptides is described in this SAND report. This approach, analogous to shotgun sequencing of DNA, is to be used to sequence alternative spliced proteins, to identify post-translational modifications, and to sequence genetically engineered proteins.

  8. Clinical applications of sequencing take center stage

    OpenAIRE

    Glusman, Gustavo

    2013-01-01

    A report on the Advances in Genome Biology and Technology (AGBT) meeting, Marco Island, Florida, USA, February 20-23, 2013. This year's Advances in Genome Biology and Technology (AGBT) meeting reflected the current state of 'next generation' sequencing (NGS) technologies: significantly reduced competition and innovation, and a strong focus on standardization and application. Announcements of technological breakthroughs - a hallmark of previous AGBT meetings - were markedly absent, but existin...

  9. Massively parallel sequencing of forensic STRs

    DEFF Research Database (Denmark)

    Parson, Walther; Ballard, David; Budowle, Bruce

    2016-01-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that...

  10. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technology) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  11. Adaptive Basis Selection for Exponential Family Smoothing Splines with Application in Joint Modeling of Multiple Sequencing Samples

    OpenAIRE

    Ma, Ping; Zhang, Nan; Huang, Jianhua Z.; Zhong, Wenxuan

    2017-01-01

    Second-generation sequencing technologies have replaced array-based technologies and become the default method for genomics and epigenomics analysis. Second-generation sequencing technologies sequence tens of millions of DNA/cDNA fragments in parallel. After the resulting sequences (short reads) are mapped to the genome, one gets a sequence of short read counts along the genome. Effective extraction of signals in these short read counts is the key to the success of sequencing technologies. No...

  12. Multimodal sequence learning.

    Science.gov (United States)

    Kemény, Ferenc; Meier, Beat

    2016-02-01

    While sequence learning research models complex phenomena, previous studies have mostly focused on unimodal sequences. The goal of the current experiment is to put implicit sequence learning into a multimodal context: to test whether it can operate across different modalities. We used the Task Sequence Learning paradigm to test whether sequence learning varies across modalities, and whether participants are able to learn multimodal sequences. Our results show that implicit sequence learning is very similar regardless of the source modality. However, the presence of correlated task and response sequences was required for learning to take place. The experiment provides new evidence for implicit sequence learning of abstract conceptual representations. In general, the results suggest that correlated sequences are necessary for implicit sequence learning to occur. Moreover, they show that elements from different modalities can be automatically integrated into one unitary multimodal sequence. Copyright © 2015 Elsevier B.V. All rights reserved.

  13. Sequence Read Archive (SRA)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Sequence Read Archive (SRA) stores raw sequencing data from the next generation of sequencing platforms including Roche 454 GS System®, Illumina Genome...

  14. High-Throughput Sequencing Reveals Hypothalamic MicroRNAs as Novel Partners Involved in Timing the Rapid Development of Chicken (Gallus gallus) Gonads.

    Science.gov (United States)

    Han, Wei; Zou, Jianmin; Wang, Kehua; Su, Yijun; Zhu, Yunfen; Song, Chi; Li, Guohui; Qu, Liang; Zhang, Huiyong; Liu, Honglin

    2015-01-01

    Onset of the rapid gonad growth is a milestone in sexual development that comprises many genes and regulatory factors. The observations in model organisms and mammals including humans have shown a potential link between miRNAs and development timing. To determine whether miRNAs play roles in this process in the chicken (Gallus gallus), Solexa deep sequencing was performed to analyze the profiles of miRNA expression in the hypothalamus of hens from two different pubertal stages, before onset of the rapid gonad development (BO) and after onset of the rapid gonad development (AO). 374 conserved and 46 novel miRNAs were identified as hypothalamus-expressed miRNAs in the chicken. 144 conserved miRNAs were shown to be differentially expressed (reads > 10, P time quantitative RT-PCR (qRT-PCR) method. 2013 putative genes were predicted as the targets of the 15 most differentially expressed miRNAs (fold-change > 4.0, P times by the miRNAs. qRT-PCR revealed the basic transcription levels of these clock genes were much higher (P development of chicken gonads. Considering the characteristics of miRNA functional conservation, the results will contribute to the research on puberty onset in humans.

  15. Next-generation sequencing-based transcriptome analysis of Helicoverpa armigera Larvae immune-primed with Photorhabdus luminescens TT01.

    Directory of Open Access Journals (Sweden)

    Zengyang Zhao

    Full Text Available Although invertebrates are incapable of adaptive immunity, immune reactions which are functionally similar to the adaptive immunity of vertebrates have been described in many studies of invertebrates including insects. The phenomenon was termed immune priming. In order to understand the molecular mechanism of immune priming, we employed the Illumina/Solexa platform to investigate the transcriptional changes of the hemocytes and fat body of Helicoverpa armigera larvae immune-primed with the pathogenic bacteria Photorhabdus luminescens TT01. A total of 43.6 and 65.1 million clean reads with 4.4 and 6.5 gigabase sequence data were obtained from the TT01 (the immune-primed) and PBS (non-primed) cDNA libraries and assembled into 35,707 all-unigenes (non-redundant transcripts), which varied in length from 201 to 16,947 bp and had an N50 length of 1,997 bp. Of the 35,707 all-unigenes, 20,438 were functionally annotated and 2,494 were differentially expressed after immune priming. The differentially expressed genes (DEGs) are mainly related to immunity, detoxification, development and metabolism of the host insect. Analysis of the annotated immune related DEGs supported a hypothesis that we proposed previously: the immune priming phenomenon observed in H. armigera larvae was achieved by regulation of key innate immune elements. The transcriptome profiling data sets (especially the sequences of 1,022 unannotated DEGs) and the clues (such as those on immune-related signal and regulatory pathways) obtained from this study will facilitate immune-related novel gene discovery and provide valuable information for further exploring the molecular mechanism of immune priming of invertebrates. All these will increase our understanding of invertebrate immunity which may provide new approaches to control insect pests or prevent epidemics of infectious diseases in economic invertebrates in the future.
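
    The N50 length quoted in this record is a standard assembly statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative assumptions and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    account for at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths, for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

    Because N50 is weighted by bases rather than by sequence count, a few long transcripts can dominate it, which is why it is usually reported together with the length range, as in the abstract.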

  16. Complete genome sequence of the fire blight pathogen Erwinia pyrifoliae DSM 12163T and comparative genomic insights into plant pathogenicity

    Directory of Open Access Journals (Sweden)

    Frey Jürg E

    2010-01-01

    Full Text Available BACKGROUND: Erwinia pyrifoliae is a newly described necrotrophic pathogen, which causes fire blight on Asian (Nashi) pear and is geographically restricted to Eastern Asia. Relatively little is known about its genetics compared to the closely related main fire blight pathogen E. amylovora. RESULTS: The genome of the type strain of E. pyrifoliae, DSM 12163T, was sequenced using both 454 and Solexa pyrosequencing and annotated. The genome contains a circular chromosome of 4.026 Mb and four small plasmids. Based on their respective role in virulence in E. amylovora or related organisms, we identified several putative virulence factors, including type III and type VI secretion systems and their effectors, flagellar genes, sorbitol metabolism, iron uptake determinants, and quorum-sensing components. A deletion in the rpoS gene covering the most conserved region of the protein was identified which may contribute to the difference in virulence/host-range compared to E. amylovora. Comparative genomics with the pome fruit epiphyte Erwinia tasmaniensis Et1/99 showed that both species are overall highly similar, although specific differences were identified, for example the presence of some phage gene-containing regions and a high number of putative genomic islands containing transposases in the E. pyrifoliae DSM 12163T genome. CONCLUSIONS: The E. pyrifoliae genome is an important addition to the published genome of E. tasmaniensis and the unfinished genome of E. amylovora, providing a foundation for re-sequencing additional strains that may shed light on the evolution of the host-range and virulence/pathogenicity of this important group of plant-associated bacteria.

  17. Next Generation DNA Sequencing and the Future of Genomic Medicine

    OpenAIRE

    Anderson, Matthew W.; Schrijver, Iris

    2010-01-01

    In the years since the first complete human genome sequence was reported, there has been a rapid development of technologies to facilitate high-throughput sequence analysis of DNA (termed “next-generation” sequencing). These novel approaches to DNA sequencing offer the promise of complete genomic analysis at a cost feasible for routine clinical diagnostics. However, the ability to more thoroughly interrogate genomic sequence raises a number of important issues with regard to result interpreta...

  18. Hardware Accelerated Sequence Alignment with Traceback

    Directory of Open Access Journals (Sweden)

    Scott Lloyd

    2009-01-01

    in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
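
    For readers unfamiliar with the forward scan and traceback mentioned in this record, the following is a plain software sketch of textbook global alignment (Needleman-Wunsch with a linear gap penalty). It is not the space-efficient hardware algorithm the record describes; it keeps the full score matrix in memory, and the scoring values and function names are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). The default scoring values
    are illustrative assumptions.
    """
    def sub(x, y):
        # Substitution score for aligning characters x and y.
        return match if x == y else mismatch

    rows, cols = len(a) + 1, len(b) + 1

    # Forward scan: fill the dynamic-programming score matrix.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Traceback: walk from the bottom-right cell back to the origin.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_align("ACGT", "AGT"))
```

    The full matrix costs memory proportional to the product of the two sequence lengths, which is the kind of memory pressure the record says its space-efficient design avoids.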

  19. Comparison of two Next Generation sequencing platforms for full genome sequencing of Classical Swine Fever Virus

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Pedersen, Anders Gorm; Höper, Dirk

    2013-01-01

    to the consensus sequence. Additionally, we got an average sequence depth for the genome of 4000 for the Iontorrent PGM and 400 for the FLX platform making the mapping suitable for single nucleotide variant (SNV) detection. The analysis revealed a single non-silent SNV A10665G leading to the amino acid change D......Next Generation Sequencing (NGS) is becoming more adopted into viral research and will be the preferred technology in the years to come. We have recently sequenced several strains of Classical Swine Fever Virus (CSFV) by NGS on both Genome Sequencer FLX (GS FLX) and Iontorrent PGM platforms...

  20. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  1. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  2. Sequence Factorization with Multiple References.

    Directory of Open Access Journals (Sweden)

    Sebastian Wandelt

    Full Text Available The success of high-throughput sequencing has led to an increasing number of projects which sequence large populations of a species. Storage and analysis of sequence data is a key challenge in these projects, because of the sheer size of the datasets. Compression is one simple technology to deal with this challenge. Referential factorization and compression schemes, which store only the differences between input sequence and a reference sequence, gained lots of interest in this field. Highly-similar sequences, e.g., Human genomes, can be compressed with a compression ratio of 1,000:1 and more, up to two orders of magnitude better than with standard compression techniques. Recently, it was shown that the compression against multiple references from the same species can boost the compression ratio up to 4,000:1. However, a detailed analysis of using multiple references is lacking, e.g., for main memory consumption and optimality. In this paper, we describe one key technique for the referential compression against multiple references: The factorization of sequences. Based on the notion of an optimal factorization, we propose optimization heuristics and identify parameter settings which greatly influence (1) the size of the factorization, (2) the time for factorization, and (3) the required amount of main memory. We evaluate a total of 30 setups with a varying number of references on data from three different species. Our results show a wide range of factorization sizes (optimal to an overhead of up to 300%), factorization speed (0.01 MB/s to more than 600 MB/s), and main memory usage (few dozen MB to dozens of GB). Based on our evaluation, we identify the best configurations for common use cases. Our evaluation shows that multi-reference factorization is much better than single-reference factorization.
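
    The idea of factorizing a sequence against several references can be sketched in a few lines of Python. The sketch below is a naive greedy heuristic written for illustration only: it is not the optimal factorization or any of the heuristics evaluated in the paper, and the function names, the tuple encoding of factors and the `min_len` parameter are all assumptions.

```python
def longest_match(text, pos, references):
    """Find the longest prefix of text[pos:] that occurs in any reference.

    Returns (ref_id, start, length); length is 0 if nothing matches.
    Naive substring search, chosen for clarity rather than speed.
    """
    best = (0, 0, 0)
    for ref_id, ref in enumerate(references):
        # Only look for matches longer than the best one found so far.
        length = best[2] + 1
        while pos + length <= len(text):
            start = ref.find(text[pos:pos + length])
            if start == -1:
                break
            best = (ref_id, start, length)
            length += 1
    return best


def factorize(text, references, min_len=4):
    """Greedily split text into copy factors and single-character literals."""
    factors = []
    pos = 0
    while pos < len(text):
        ref_id, start, length = longest_match(text, pos, references)
        if length >= min_len:
            factors.append(("copy", ref_id, start, length))
            pos += length
        else:
            # Short matches are cheaper to store as raw characters.
            factors.append(("literal", text[pos]))
            pos += 1
    return factors


def restore(factors, references):
    """Rebuild the original text from its factors."""
    parts = []
    for factor in factors:
        if factor[0] == "copy":
            _, ref_id, start, length = factor
            parts.append(references[ref_id][start:start + length])
        else:
            parts.append(factor[1])
    return "".join(parts)


# Hypothetical toy sequences, for illustration only.
refs = ["ACGTACGTTTGA", "GGGCCCAAATTT"]
seq = "ACGTACGTAGGGCCCAAA"
print(factorize(seq, refs))
```

    On this toy input the second reference lets the tail of the sequence be stored as one copy factor instead of a run of literals, which is the intuition behind the multi-reference gains reported in the abstract.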

  3. SOAP

    DEFF Research Database (Denmark)

    Li, Ruiqiang; Li, Yingrui; Kristiansen, Karsten

    2008-01-01

    MOTIVATION: We have developed a program SOAP for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The program is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing...... technology. SOAP is compatible with numerous applications, including single-read or pair-end resequencing, small RNA discovery, and mRNA tag sequence mapping. SOAP is a command-driven program, which supports multithreaded parallel computing, and has a batch module for multiple query sets. AVAILABILITY: http://soap.......genomics.org.cn CONTACT: soap@genomics.org.cn ....
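
    As a rough illustration of what ungapped short-read alignment against a reference involves, here is a small Python sketch using a k-mer seed index followed by a mismatch count. This is not SOAP's implementation or its command-line interface; the function names, parameters and toy sequences are assumptions made for the example.

```python
from collections import defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def align_ungapped(read, reference, index, k, max_mismatches=2):
    """Return sorted (start, mismatches) hits for ungapped placements of read.

    Non-overlapping seeds of length k are looked up exactly, then the whole
    read is compared base by base. If len(read) // k > max_mismatches, at
    least one seed of any valid hit is error-free (pigeonhole principle),
    so no hit within the mismatch limit is missed.
    """
    hits = {}
    for offset in range(0, len(read) - k + 1, k):
        for seed_pos in index.get(read[offset:offset + k], ()):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(1 for x, y in zip(read, window) if x != y)
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


# Hypothetical toy reference and read, for illustration only.
reference = "ACGTTGCATGCCGATAGCTAGGCTA"
index = build_index(reference, k=4)
print(align_ungapped("GGATGCCGATAG", reference, index, k=4))
```

    Gapped alignment, paired-end handling and multithreading, which the abstract lists as features of the program, are outside the scope of this sketch.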

  4. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  5. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  6. Nonparametric combinatorial sequence models.

    Science.gov (United States)

    Wauthier, Fabian L; Jordan, Michael I; Jojic, Nebojsa

    2011-11-01

    This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This article presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three biological sequence families which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution over sequence representations induced by the prior. By integrating out the posterior, our method compares favorably to leading binding predictors.

  7. FRESCO: Referential compression of highly similar sequences.

    Science.gov (United States)

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios well beyond the state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
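
    The idea of referential compression described in this record (storing only the differences between an input and a known reference) can be illustrated with a minimal sketch. This is not FRESCO's algorithm: the greedy longest-match search, the function names `compress` and `decompress`, the operation tuples, and the `min_match` threshold are all assumptions made for illustration.

```python
def compress(reference: str, target: str, min_match: int = 4):
    """Greedy referential encoding (illustrative only).

    Emits ("M", ref_pos, length) for a run copied from the reference
    and ("L", char) for a literal character not worth referencing.
    """
    ops = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Naive search for the longest reference substring matching target[i:].
        for j in range(len(reference)):
            k = 0
            while (i + k < len(target) and j + k < len(reference)
                   and target[i + k] == reference[j + k]):
                k += 1
            if k > best_len:
                best_pos, best_len = j, k
        if best_len >= min_match:
            ops.append(("M", best_pos, best_len))
            i += best_len
        else:
            ops.append(("L", target[i]))
            i += 1
    return ops


def decompress(reference: str, ops) -> str:
    """Rebuild the target from the reference and the operation list."""
    out = []
    for op in ops:
        if op[0] == "M":
            _, pos, length = op
            out.append(reference[pos:pos + length])
        else:
            out.append(op[1])
    return "".join(out)
```

    For two sequences that differ by a single substitution, the encoding collapses to two reference matches and one literal, which is where the high ratios on highly similar inputs come from. The quadratic search here is for clarity only; a practical tool would index the reference.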

  8. The contribution of next generation sequencing to epilepsy genetics

    DEFF Research Database (Denmark)

    Møller, Rikke S.; Dahl, Hans A.; Helbig, Ingo

    2015-01-01

    During the last decade, next generation sequencing technologies such as targeted gene panels, whole exome sequencing and whole genome sequencing have led to an explosion of gene identifications in monogenic epilepsies including both familial epilepsies and severe epilepsies, often referred to as ...

  9. From Genome Sequence to Taxonomy - A Skeptic’s View

    DEFF Research Database (Denmark)

    Özen, Asli Ismihan; Vesth, Tammi Camilla; Ussery, David

    2012-01-01

    The relative ease of sequencing bacterial genomes has resulted in thousands of sequenced bacterial genomes available in the public databases. This same technology now allows for using the entire genome sequence as an identifier for an organism. There are many methods available which attempt to us...

  10. Deep-sequencing protocols influence the results obtained in small-RNA sequencing.

    Directory of Open Access Journals (Sweden)

    Joern Toedling

    Full Text Available Second-generation sequencing is a powerful method for identifying and quantifying small-RNA components of cells. However, little attention has been paid to the effects of the choice of sequencing platform and library preparation protocol on the results obtained. We present a thorough comparison of small-RNA sequencing libraries generated from the same embryonic stem cell lines, using different sequencing platforms, which represent the three major second-generation sequencing technologies, and protocols. We have analysed and compared the expression of microRNAs, as well as populations of small RNAs derived from repetitive elements. Despite the fact that different libraries display a good correlation between sequencing platforms, qualitative and quantitative variations in the results were found, depending on the protocol used. Thus, when comparing libraries from different biological samples, it is strongly recommended to use the same sequencing platform and protocol in order to ensure the biological relevance of the comparisons.

  11. Long sequence correlation coprocessor

    Science.gov (United States)

    Gage, Douglas W.

    1994-09-01

    A long sequence correlation coprocessor (LSCC) accelerates the bitwise correlation of arbitrarily long digital sequences by calculating in parallel the correlation score for 16, for example, adjacent bit alignments between two binary sequences. The LSCC integrated circuit is incorporated into a computer system with memory storage buffers and a separate general purpose computer processor which serves as its controller. Each of the LSCC's set of sequential counters simultaneously tallies a separate correlation coefficient. During each LSCC clock cycle, computer enable logic associated with each counter compares one bit of a first sequence with one bit of a second sequence to increment the counter if the bits are the same. A shift register assures that the same bit of the first sequence is simultaneously compared to different bits of the second sequence to simultaneously calculate the correlation coefficient by the different counters to represent different alignments of the two sequences.
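
    The counting scheme in this record can be emulated in software as a rough sketch. The function name `correlate_alignments` and its parameters are hypothetical, and the inner loop runs sequentially where the hardware described above would update all counters in the same clock cycle.

```python
def correlate_alignments(first: list[int], second: list[int],
                         n_counters: int = 16) -> list[int]:
    """Return one match count per alignment offset 0..n_counters-1.

    counters[k] tallies the positions i where first[i] == second[i + k],
    i.e. the bitwise agreement of the two sequences at alignment k.
    """
    counters = [0] * n_counters
    for i, bit in enumerate(first):       # one "clock cycle" per bit of the first sequence
        for k in range(n_counters):       # parallel counters in hardware; a loop here
            j = i + k
            if j < len(second) and bit == second[j]:
                counters[k] += 1          # increment only when the two bits are the same
    return counters
```

    With `first = [1, 0, 1, 1]` and `second = [0, 1, 0, 1, 1, 0]`, offset 1 lines the first sequence up exactly with a stretch of the second, so its counter reaches the full length of 4.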

  12. Roles of repetitive sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bell, G.I.

    1991-12-31

    The DNA of higher eukaryotes contains many repetitive sequences. The study of repetitive sequences is important, not only because many have important biological function, but also because they provide information on genome organization, evolution and dynamics. In this paper, I will first discuss some generic effects that repetitive sequences will have upon genome dynamics and evolution. In particular, it will be shown that repetitive sequences foster recombination among, and turnover of, the elements of a genome. I will then consider some examples of repetitive sequences, notably minisatellite sequences and telomere sequences as examples of tandem repeats, without and with respectively known function, and Alu sequences as an example of interspersed repeats. Some other examples will also be considered in less detail.

  13. Anomaly Detection in Sequences

    Data.gov (United States)

    National Aeronautics and Space Administration — We present a set of novel algorithms which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that...

  14. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  15. sequenceMiner algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — Detecting and describing anomalies in large repositories of discrete symbol sequences. sequenceMiner has been open-sourced! Download the file below to try it out....

  16. Identification of novel and conserved microRNAs related to drought stress in potato by deep sequencing.

    Science.gov (United States)

    Zhang, Ning; Yang, Jiangwei; Wang, Zemin; Wen, Yikai; Wang, Jie; He, Wenhui; Liu, Bailin; Si, Huaijun; Wang, Di

    2014-01-01

    MicroRNAs (miRNAs) are a group of small, non-coding RNAs that play important roles in plant growth, development and stress response. There have been an increasing number of investigations aimed at discovering miRNAs and analyzing their functions in model plants (such as Arabidopsis thaliana and rice). In this research, we constructed small RNA libraries from both polyethylene glycol (PEG 6,000) treated and control potato samples, and a large number of known and novel miRNAs were identified. Differential expression analysis showed that 100 of the known miRNAs were down-regulated and 99 were up-regulated as a result of PEG stress, while 119 of the novel miRNAs were up-regulated and 151 were down-regulated. Based on target prediction, annotation and expression analysis of the miRNAs and their putative target genes, 4 miRNAs were identified as regulating drought-related genes (miR811, miR814, miR835, miR4398). Their target genes were MYB transcription factor (CV431094), hydroxyproline-rich glycoprotein (TC225721), aquaporin (TC223412) and WRKY transcription factor (TC199112), respectively. Relative expression trends of those miRNAs were the same as those predicted by Solexa sequencing and they showed a negative correlation with the expression of the target genes. The results provide molecular evidence for the possible involvement of miRNAs in the process of drought response and/or tolerance in the potato plant.

  17. High-Throughput Sequencing Reveals Circulating miRNAs as Potential Biomarkers for Measuring Puberty Onset in Chicken (Gallus gallus).

    Science.gov (United States)

    Han, Wei; Zhu, Yunfen; Su, Yijun; Li, Guohui; Qu, Liang; Zhang, Huiyong; Wang, Kehua; Zou, Jianmin; Liu, Honglin

    2016-01-01

    There are still no highly sensitive and unique biomarkers for measurement of puberty onset. Circulating miRNAs have been shown to be promising biomarkers for diagnosis of various diseases. To identify circulating miRNAs that could serve as biomarkers for measuring chicken (Gallus gallus) puberty onset, Solexa deep sequencing was performed to analyze the miRNA expression profiles in serum and plasma of hens from two different pubertal stages, before puberty onset (BO) and after puberty onset (AO). 197 conserved and 19 novel miRNAs (reads > 10) were identified as serum/plasma-expressed miRNAs in the chicken. The common miRNA amounts and their expression changes from BO to AO between serum and plasma were very similar, indicating that the different treatments used to generate serum and plasma had quite a small influence on the miRNAs. 130 conserved serum-miRNAs were shown to be differentially expressed (reads > 10, P 1.0, P puberty onset. Further quantitative real-time PCR (RT-qPCR) tests found that a seven-miRNA panel, including miR-29c, miR-375, miR-215, miR-217, miR-19b, miR-133a and let-7a, had great potential to serve as novel biomarkers for measuring puberty onset in chicken. Due to the highly conserved nature of miRNAs, the findings could provide cues for measurement of puberty onset in other animals as well as humans.

  18. Applying Next Generation Sequencing to Skeletal Development and Disease

    OpenAIRE

    Bowen, Margot Elizabeth

    2013-01-01

    Next Generation Sequencing (NGS) technologies have dramatically increased the throughput and lowered the cost of DNA sequencing. In this thesis, I apply these technologies to unresolved questions in skeletal development and disease. Firstly, I use targeted re-sequencing of genomic DNA to identify the genetic cause of the cartilage tumor syndrome, metachondromatosis (MC). I show that the majority of MC patients carry heterozygous loss-of-function mutations in the PTPN11 gene, which encodes a p...

  19. Enhanced Dynamic Algorithm of Genome Sequence Alignments

    OpenAIRE

    Arabi E. keshk

    2014-01-01

    The merging of biology and computer science has created a new field called computational biology that explores the capacities of computers to gain knowledge from biological data, bioinformatics. Computational biology is rooted in life sciences as well as computers, information sciences, and technologies. The main problem in computational biology is sequence alignment, which is a way of arranging the sequences of DNA, RNA or protein to identify the region of similarity and relationship between se...

  20. Complete Genome Sequence of Ikoma Lyssavirus

    OpenAIRE

    Marston, Denise A.; Ellis, Richard J.; Horton, Daniel L.; Kuzmin, Ivan V.; Wise, Emma L.; McElhinney, Lorraine M.; Banyard, Ashley C.; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E.; Fooks, Anthony R.

    2012-01-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isol...

  1. Enrichment of target sequences for next-generation sequencing applications in research and diagnostics.

    Science.gov (United States)

    Altmüller, Janine; Budde, Birgit S; Nürnberg, Peter

    2014-02-01

    Targeted re-sequencing such as gene panel sequencing (GPS) has become very popular in medical genetics, both for research projects and in diagnostic settings. The technical principles of the different enrichment methods have been reviewed several times before; however, new enrichment products are constantly entering the market, and researchers are often puzzled about the requirement to make decisions about long-term commitments, both for the enrichment product and the sequencing technology. This review summarizes important considerations for the experimental design and provides helpful recommendations in choosing the best sequencing strategy for various research projects and diagnostic applications.

  2. Quantitative phenotyping via deep barcode sequencing.

    Science.gov (United States)

    Smith, Andrew M; Heisler, Lawrence E; Mellor, Joseph; Kaper, Fiona; Thompson, Michael J; Chee, Mark; Roth, Frederick P; Giaever, Guri; Nislow, Corey

    2009-10-01

    Next-generation DNA sequencing technologies have revolutionized diverse genomics applications, including de novo genome sequencing, SNP detection, chromatin immunoprecipitation, and transcriptome analysis. Here we apply deep sequencing to genome-scale fitness profiling to evaluate yeast strain collections in parallel. This method, Barcode analysis by Sequencing, or "Bar-seq," outperforms the current benchmark barcode microarray assay in terms of both dynamic range and throughput. When applied to a complex chemogenomic assay, Bar-seq quantitatively identifies drug targets, with performance superior to the benchmark microarray assay. We also show that Bar-seq is well-suited for a multiplex format. We completely re-sequenced and re-annotated the yeast deletion collection using deep sequencing, found that approximately 20% of the barcodes and common priming sequences varied from expectation, and used this revised list of barcode sequences to improve data quality. Together, this new assay and analysis routine provide a deep-sequencing-based toolkit for identifying gene-environment interactions on a genome-wide scale.

  3. Snake Genome Sequencing: Results and Future Prospects.

    Science.gov (United States)

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy, very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  4. Snake Genome Sequencing: Results and Future Prospects

    Directory of Open Access Journals (Sweden)

    Harald M. I. Kerkkamp

    2016-12-01

    Full Text Available Snake genome sequencing is in its infancy—very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  5. Enhanced throughput for infrared automated DNA sequencing

    Science.gov (United States)

    Middendorf, Lyle R.; Gartside, Bill O.; Humphrey, Pat G.; Roemer, Stephen C.; Sorensen, David R.; Steffens, David L.; Sutter, Scott L.

    1995-04-01

    Several enhancements have been developed and applied to infrared automated DNA sequencing, resulting in significantly higher throughput. A 41 cm sequencing gel (31 cm well-to-read distance) combines high resolution of DNA sequencing fragments with optimized run times, yielding two runs per day of 500 bases per sample. A 66 cm sequencing gel (56 cm well-to-read distance) produces sequence read lengths of up to 1000 bases for ds and ss templates using either T7 polymerase or cycle-sequencing protocols. Using a multichannel syringe to load 64 lanes allows 16 samples (compatible with 96-well format) to be visualized for each run. The 41 cm gel configuration allows 16,000 bases per day (16 samples × 500 bases/sample × 2 ten-hour runs/day) to be sequenced with the advantages of infrared technology. Enhancements to internal labeling techniques using an infrared-labeled dATP molecule (Boehringer Mannheim GmbH, Penzberg, Germany; Sequenase (U.S. Biochemical)) have also been made. The inclusion of glycerol in the sequencing reactions yields greatly improved results for some primer and template combinations. The inclusion of α-thio-dNTPs in the labeling reaction increases signal intensity two- to three-fold.
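
    The daily throughput figure quoted in this record follows from simple arithmetic, reproduced here as a short check. The variable names are illustrative, and the value of four lanes per sample is inferred from 64 lanes carrying 16 samples rather than stated in the abstract.

```python
lanes_per_run = 64
lanes_per_sample = 4        # inferred: 64 lanes / 16 samples (one lane per base reaction is an assumption)
samples_per_run = lanes_per_run // lanes_per_sample

bases_per_sample = 500      # read length on the 41 cm gel
runs_per_day = 2            # two ten-hour runs per day

bases_per_day = samples_per_run * bases_per_sample * runs_per_day
print(samples_per_run, bases_per_day)
```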

  6. Protecting genomic sequence anonymity with generalization lattices.

    Science.gov (United States)

    Malin, B A

    2005-01-01

    Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual's identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
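
    The generalization step described in this record (two residues are replaced by their most similar common concept, such as "purine" for adenine and guanine) might look roughly like the sketch below. The three-level lattice (base, purine/pyrimidine, any base), the use of the IUPAC codes R, Y and N, the distance values, and the function names are assumptions for illustration and are not taken from the DNALA paper.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def generalize(a: str, b: str) -> tuple[str, int]:
    """Return the lowest common concept for two bases and its lattice level.

    Level 0: identical base; level 1: same chemical class (R = purine,
    Y = pyrimidine); level 2: any base (N). Inputs must be A, C, G or T.
    """
    if a == b:
        return a, 0
    if {a, b} <= PURINES:
        return "R", 1
    if {a, b} <= PYRIMIDINES:
        return "Y", 1
    return "N", 2


def generalize_sequences(s1: str, s2: str) -> tuple[str, int]:
    """Generalize two aligned, equal-length sequences position by position."""
    if len(s1) != len(s2):
        raise ValueError("sequences must have equal length")
    pairs = [generalize(a, b) for a, b in zip(s1, s2)]
    merged = "".join(symbol for symbol, _ in pairs)
    distance = sum(level for _, level in pairs)
    return merged, distance
```

    After generalization both input sequences map to the same string, so neither can be told apart from the other, which corresponds to the k = 2 case of k-anonymity. The summed level serves as a crude measure of how much information was given up.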

  7. Sequences for Student Investigation

    Science.gov (United States)

    Barton, Jeffrey; Feil, David; Lartigue, David; Mullins, Bernadette

    2004-01-01

    We describe two classes of sequences that give rise to accessible problems for undergraduate research. These problems may be understood with virtually no prerequisites and are well suited for computer-aided investigation. The first sequence is a variation of one introduced by Stephen Wolfram in connection with his study of cellular automata. The…

  8. Sequence History Update Tool

    Science.gov (United States)

    Khanampompan, Teerapat; Gladden, Roy; Fisher, Forest; DelGuercio, Chris

    2008-01-01

    The Sequence History Update Tool performs Web-based sequence statistics archiving for Mars Reconnaissance Orbiter (MRO). Using a single UNIX command, the software takes advantage of sequencing conventions to automatically extract the needed statistics from multiple files. This information is then used to populate a PHP database, which is then seamlessly formatted into a dynamic Web page. This tool replaces a previous tedious and error-prone process of manually editing HTML code to construct a Web-based table. Because the tool manages all of the statistics gathering and file delivery to and from multiple data sources spread across multiple servers, there are also considerable time and effort savings. With the use of the Sequence History Update Tool, what previously took minutes is now done in less than 30 seconds, and the tool now provides a more accurate archival record of the sequence commanding for MRO.

  9. Genome sequencing for obstetricians & gynaecologists | Kent ...

    African Journals Online (AJOL)

    The medical profession has been waiting for a decade to be invigorated by the sequencing of the human genome, arguably the greatest scientific project ever. The technology has been spectacular but the results of the project have yielded more unexpected results than definitive answers – many about the very nature of our ...

  10. Oxford Nanopore MinION Sequencing and Genome Assembly

    Directory of Open Access Journals (Sweden)

    Hengyun Lu

    2016-10-01

    Full Text Available The revolution of genome sequencing is continuing after the successful second-generation sequencing (SGS) technology. The third-generation sequencing (TGS) technology, led by Pacific Biosciences (PacBio), is progressing rapidly, moving from a technology once only capable of providing data for small genome analysis, or for performing targeted screening, to one that promises high quality de novo assembly and structural variation detection for human-sized genomes. In 2014, the MinION, the first commercial sequencer using nanopore technology, was released by Oxford Nanopore Technologies (ONT). MinION identifies DNA bases by measuring the changes in electrical conductivity generated as DNA strands pass through a biological pore. Its portability, affordability, and speed in data production make it suitable for real-time applications; the release of the long-read sequencer MinION has thus generated much excitement and interest in the genomics community. While de novo genome assemblies can be cheaply produced from SGS data, assembly continuity is often relatively poor, due to the limited ability of short reads to handle long repeats. Assembly quality can be greatly improved by using TGS long reads, since longer reads can span repetitive regions, despite having higher error rates at the base level. The potential of nanopore sequencing has been demonstrated by various studies in genome surveillance at locations where rapid and reliable sequencing is needed, but where resources are limited.

  11. JVM: Java Visual Mapping tool for next generation sequencing read.

    Science.gov (United States)

    Yang, Ye; Liu, Juan

    2015-01-01

    We developed a program, JVM (Java Visual Mapping), for mapping next generation sequencing reads to a reference sequence. The program is implemented in Java and is designed to deal with millions of short reads generated by Illumina sequencing technology. It employs a seed index strategy and octal encoding operations for sequence alignments. JVM is useful for DNA-Seq and RNA-Seq when dealing with single-end resequencing. JVM is a desktop application, which supports read data sets from 1 MB to 10 GB.
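
    A seed index strategy of the general kind mentioned in this record can be sketched as follows. This is a generic seed-and-verify mapper, not JVM's implementation: the function names, the seed length `k`, and the mismatch threshold are assumptions, the sketch is in Python rather than Java, and the octal encoding step is not reproduced.

```python
from collections import defaultdict


def build_seed_index(reference: str, k: int) -> dict:
    """Map every k-mer (seed) of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read: str, reference: str, index: dict, k: int,
             max_mismatches: int = 1) -> list:
    """Return sorted (start, mismatches) pairs for ungapped placements of the read.

    Each seed of the read is looked up in the index; every hit proposes a
    candidate start, which is then verified by counting mismatches.
    """
    hits = []
    seen = set()
    for offset in range(len(read) - k + 1):
        for pos in index.get(read[offset:offset + k], []):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference) or start in seen:
                continue
            seen.add(start)
            window = reference[start:start + len(read)]
            mismatches = sum(1 for a, b in zip(read, window) if a != b)
            if mismatches <= max_mismatches:
                hits.append((start, mismatches))
    return sorted(hits)
```

    The index is built once per reference and reused for every read, so each read only needs verification at the few positions its seeds point to instead of a scan of the whole reference.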

  12. The International Nucleotide Sequence Database Collaboration.

    Science.gov (United States)

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Nakamura, Yasukazu

    2011-01-01

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes comes no longer as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

  13. Application of the Just-in-Time and Just-in-Sequence logistics technologies at Robert Bosch spol. s r.o.

    OpenAIRE

    ŠTEFKOVÁ, Iveta

    2009-01-01

    This work focuses on defining the theory and application of the Just in Time (JIT) and Just in Sequence (JIS) principles and related methods at Robert Bosch Ltd. The partial goals are to define the theoretical basis of JIT and JIS from Czech and foreign literature and to analyze the particular methods in detail. Each method is evaluated with its pros and cons. Further observations address the Kanban, Heijunka, MRPI and MRPII methods, which are closely related to JIT and JIS. This work precisely d...

  14. HIV Sequence Compendium 2015

    Energy Technology Data Exchange (ETDEWEB)

    Foley, Brian Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas Kenneth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Cristian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Pennsylvania, Philadelphia, PA (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette Tina Marie [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2015-10-05

    This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  15. Mapping sequences by parts

    Directory of Open Access Journals (Sweden)

    Guziolowski Carito

    2007-09-01

    Full Text Available Background: We present the N-map method, a pairwise and asymmetrical approach which allows us to compare sequences by taking into account evolutionary events that produce shuffled, reversed or repeated elements. Basically, the optimal N-map of a sequence s over a sequence t is the best way of partitioning the first sequence into N parts and placing them, possibly complementary reversed, over the second sequence in order to maximize the sum of their gapless alignment scores. Results: We introduce an algorithm computing an optimal N-map with time complexity O(|s| × |t| × N) using O(|s| × |t| × N) memory space. Among all the numbers of parts taken in a reasonable range, we select the value N for which the optimal N-map has the most significant score. To evaluate this significance, we study the empirical distributions of the scores of optimal N-maps and show that they can be approximated by normal distributions with a reasonable accuracy. We test the functionality of the approach over random sequences on which we apply artificial evolutionary events. Practical Application: The method is illustrated with four case studies of pairs of sequences involving non-standard evolutionary events.
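
    One building block of the N-map definition in this record is the gapless alignment score of a single part placed over t, in either orientation. The sketch below covers only that building block, not the optimal partitioning into N parts. The +1/-1 scoring scheme and the function names are assumptions and may differ from the scoring used by the authors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def gapless_score(part: str, t: str, pos: int,
                  match: int = 1, mismatch: int = -1) -> int:
    """Score `part` laid over t starting at `pos`, with no gaps allowed."""
    window = t[pos:pos + len(part)]
    return sum(match if a == b else mismatch for a, b in zip(part, window))


def best_placement(part: str, t: str):
    """Return (score, position, reversed_flag) of the best gapless placement.

    Both the part and its reverse complement are tried at every position;
    ties keep the first placement found. Returns None if the part is longer than t.
    """
    best = None
    for oriented, is_reversed in ((part, False), (reverse_complement(part), True)):
        for pos in range(len(t) - len(part) + 1):
            score = gapless_score(oriented, t, pos)
            if best is None or score > best[0]:
                best = (score, pos, is_reversed)
    return best
```

    An optimal N-map would choose the cut points of s and the placements of all N parts jointly, which is what the dynamic programme with the stated O(|s| × |t| × N) cost does. This sketch scores one fixed part in isolation.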

  16. On site DNA barcoding by nanopore sequencing.

    Directory of Open Access Journals (Sweden)

    Michele Menegon

    Full Text Available Biodiversity research is becoming increasingly dependent on genomics, which allows the unprecedented digitization and understanding of the planet's biological heritage. The use of genetic markers, i.e. DNA barcoding, has proved to be a powerful tool in species identification. However, full exploitation of this approach is hampered by the high sequencing costs and the absence of equipped facilities in biodiversity-rich countries. In the present work, we developed a portable sequencing laboratory based on the portable DNA sequencer from Oxford Nanopore Technologies, the MinION. Complementary laboratory equipment and reagents were selected to be used in remote and tough environmental conditions. The performance of the MinION sequencer and the portable laboratory was tested for DNA barcoding in a simulated tropical environment, as well as in a remote rainforest of Tanzania lacking electricity. Despite the relatively high sequencing error rate of the MinION, the development of a suitable pipeline for data analysis allowed the accurate identification of different species of vertebrates including amphibians, reptiles and mammals. In situ sequencing of a wild frog allowed us to rapidly identify the species captured, thus confirming that effective DNA barcoding in the field is possible. These results open new perspectives for real-time, on-site DNA sequencing, thus potentially increasing opportunities for the understanding of biodiversity in areas lacking conventional laboratory facilities.

  17. The Colliding Beams Sequencer

    International Nuclear Information System (INIS)

    Johnson, D.E.; Johnson, R.P.

    1989-01-01

    The Colliding Beam Sequencer (CBS) is a computer program used to operate the pbar-p Collider by synchronizing the applications programs and simulating the activities of the accelerator operators during filling and storage. The Sequencer acts as a meta-program, running otherwise stand alone applications programs, to do the set-up, beam transfers, acceleration, low beta turn on, and diagnostics for the transfers and storage. The Sequencer and its operational performance will be described along with its special features which include a periodic scheduler and command logger. 14 refs., 3 figs

  18. Phylogenetic Trees From Sequences

    Science.gov (United States)

    Ryvkin, Paul; Wang, Li-San

    In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data. We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
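
    As a rough illustration of what "measuring sequence divergence" can mean for a distance-based method, the sketch below computes the observed proportion of differing sites and the Jukes-Cantor corrected distance for two aligned sequences. The choice of the Jukes-Cantor model and the function names are assumptions made for this example; the chapter itself may use other models.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(a: str, b: str) -> float:
    """Jukes-Cantor distance: expected substitutions per site.

    Corrects the observed proportion p for multiple hits at the same site:
    d = -(3/4) * ln(1 - 4p/3). The correction is undefined for p >= 0.75,
    in which case infinity is returned.
    """
    p = p_distance(a, b)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    s1 = "ACGTACGT"
    s2 = "ACGTACGA"
    print(p_distance(s1, s2), jukes_cantor(s1, s2))
```

    A matrix of such pairwise distances is the usual input to distance-based tree builders such as neighbor joining.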

  19. Application of Next-generation Sequencing in Clinical Molecular Diagnostics

    Directory of Open Access Journals (Sweden)

    Morteza Seifi

    2017-05-01

    Full Text Available ABSTRACT Next-generation sequencing (NGS) is the catch-all term used to describe several modern sequencing technologies that allow nucleic acids to be sequenced much more rapidly and cheaply than the formerly used Sanger sequencing, and that have as such revolutionized the study of molecular biology and genomics with excellent resolution and accuracy. Over the past years, many companies and academic institutions have continued to make technological advances to expand NGS applications from research to the clinic. In this review, the performance and technical features of current NGS platforms are described. Furthermore, advances in applying NGS technologies to clinical molecular diagnostics are emphasized. General advantages and disadvantages of each sequencing system are summarized and compared to guide the selection of NGS platforms for specific research aims.

  20. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  1. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  2. Dynamic Sequence Assignment.

    Science.gov (United States)

    1983-12-01

    D-136 548 DYNAMIC SEQUENCE ASSIGNMENT(U) ADVANCED INFORMATION AND 1/2 DECISION SYSTEMS MOUNTAIN VIEW CA C A 0 REILLY ET AL. UNCLASSIFIED DEC 83 AI/DS...I ADVANCED INFORMATION & DECISION SYSTEMS Mountain View. CA 94040 ... Unclassified SECURITY CLASSIFICATION OF THIS PAGE REPORT...reviews some important heuristic algorithms developed for faster solution of the sequence assignment problem. 3.1. DYNAMIC PROGRAMMING FORMULATION FOR

  3. HIV Sequence Compendium 2010

    Energy Technology Data Exchange (ETDEWEB)

    Kuiken, Carla [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Foley, Brian [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Leitner, Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Apetrei, Christian [Univ. of Pittsburgh, PA (United States); Hahn, Beatrice [Univ. of Alabama, Tuscaloosa, AL (United States); Mizrachi, Ilene [National Center for Biotechnology Information, Bethesda, MD (United States); Mullins, James [Univ. of Washington, Seattle, WA (United States); Rambaut, Andrew [Univ. of Edinburgh, Scotland (United Kingdom); Wolinsky, Steven [Northwestern Univ., Evanston, IL (United States); Korber, Bette [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2010-12-31

    This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.

  4. General LTE Sequence

    OpenAIRE

    Billal, Masum

    2015-01-01

    In this paper, we characterize sequences that satisfy the same property as the one described in the Lifting the Exponent Lemma. The Lifting the Exponent Lemma is a very powerful tool in olympiad number theory and has recently become very popular. We generalize it to all sequences that maintain a similar property, i.e. if p^{\alpha} || a_k and p^{\beta} || n, then p^{\alpha+\beta} || a_{nk}.
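
    The property in this abstract can be checked numerically for a concrete sequence. The sketch below is only an illustration, not the paper's method: it takes a_k = x^k - y^k with an odd prime p dividing x - y but neither x nor y (the classical setting of the lemma) and verifies that the p-adic valuation of a_{nk} equals that of a_k plus that of n over a finite range. The function names and the sample values x = 10, y = 3, p = 7 are assumptions chosen for the example.

```python
from typing import Callable


def p_adic_valuation(n: int, p: int) -> int:
    """Largest e such that p**e divides n (n must be non-zero)."""
    if n == 0:
        raise ValueError("valuation of 0 is undefined here")
    n = abs(n)
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e


def has_lte_property(a: Callable[[int], int], p: int,
                     max_k: int, max_n: int) -> bool:
    """Check v_p(a(n*k)) == v_p(a(k)) + v_p(n) for 1 <= k <= max_k, 1 <= n <= max_n.

    This is a finite spot check, not a proof.
    """
    for k in range(1, max_k + 1):
        alpha = p_adic_valuation(a(k), p)
        for n in range(1, max_n + 1):
            beta = p_adic_valuation(n, p)
            if p_adic_valuation(a(n * k), p) != alpha + beta:
                return False
    return True


# Illustrative choice: p = 7 divides x - y = 7, and 7 divides neither 10 nor 3.
x, y, p = 10, 3, 7


def a(k: int) -> int:
    return x ** k - y ** k


if __name__ == "__main__":
    print(has_lte_property(a, p, max_k=6, max_n=50))
```

    The sequence a_k = k satisfies the same identity trivially, which gives a sense of how broad the class of such sequences can be.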

  5. Identification and comparative profiling of miRNAs in an early flowering mutant of trifoliate orange and its wild type by genome-wide deep sequencing.

    Directory of Open Access Journals (Sweden)

    Lei-Ming Sun

    Full Text Available MicroRNAs (miRNAs) are a new class of small, endogenous RNAs that play a regulatory role in various biological and metabolic processes by negatively affecting gene expression at the post-transcriptional level. While the number of known Arabidopsis and rice miRNAs is continuously increasing, information regarding miRNAs from woody plants such as citrus remains limited. In this study, Solexa sequencing was performed at different developmental stages on both an early flowering mutant of trifoliate orange (precocious trifoliate orange, Poncirus trifoliata L. Raf.) and its wild type, yielding 141 known miRNAs belonging to 99 families and 75 novel miRNAs in four libraries. A total of 317 potential target genes were predicted based on the 51 novel miRNA families. GO and KEGG annotation revealed that high-ranked miRNA target genes are those implicated in diverse cellular processes in plants, including development, transcription, protein degradation and cross adaptation. To characterize those miRNAs expressed at the juvenile and adult development stages of the mutant and its wild type, further analysis of the expression profiles of several miRNAs was performed through real-time PCR. The results revealed that most miRNAs were down-regulated at the adult stage compared with the juvenile stage for both the mutant and its wild type. These results indicate that both conserved and novel miRNAs may play important roles in citrus growth and development, stress responses and other physiological processes.

  6. Pairwise Sequence Alignment Library

    Energy Technology Data Exchange (ETDEWEB)

    2015-05-20

    Vector extensions, such as SSE, have been part of the x86 CPU since the 1990s, with applications in graphics, signal processing, and scientific applications. Although many algorithms and applications can naturally benefit from automatic vectorization techniques, there are still many that are difficult to vectorize due to their dependence on irregular data structures, dense branch operations, or data dependencies. Sequence alignment, one of the most widely used operations in bioinformatics workflows, has a computational footprint that features complex data dependencies. The trend of widening vector registers adversely affects the state-of-the-art sequence alignment algorithm based on striped data layouts. Therefore, a novel SIMD implementation of a parallel scan-based sequence alignment algorithm that can better exploit wider SIMD units was implemented as part of the Parallel Sequence Alignment Library (parasail). Parasail features: Reference implementations of all known vectorized sequence alignment approaches. Implementations of Smith Waterman (SW), semi-global (SG), and Needleman Wunsch (NW) sequence alignment algorithms. Implementations across all modern CPU instruction sets including AVX2 and KNC. Language interfaces for C/C++ and Python.
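
    For orientation, the data dependencies that make alignment hard to vectorize are visible even in a plain scalar version. The sketch below is a minimal, unvectorized Smith-Waterman local alignment score with a linear gap penalty, written in pure Python. It does not use parasail or reproduce its API, and the scoring values are arbitrary assumptions for the example.

```python
def smith_waterman_score(s: str, t: str,
                         match: int = 2, mismatch: int = -1,
                         gap: int = -2) -> int:
    """Best local alignment score of s and t (linear gap penalty).

    Each cell depends on its upper, left and upper-left neighbours,
    which is the dependency pattern SIMD implementations must work around.
    Only two rows are kept in memory at a time.
    """
    prev = [0] * (len(t) + 1)
    best = 0
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            up = prev[j] + gap        # gap in t
            left = curr[j - 1] + gap  # gap in s
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

    Replacing the `max(0, ...)` floor with proper boundary initialization turns the same recurrence into the global (Needleman-Wunsch) variant mentioned in the abstract.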

  7. Deciphering the distance to antibiotic resistance for the pneumococcus using genome sequencing data

    NARCIS (Netherlands)

    Mobegi, Fredrick M; Cremers, Amelieke J H; de Jonge, Marien I; Bentley, Stephen D; van Hijum, Sacha A F T; Zomer, Aldert|info:eu-repo/dai/nl/304642754

    2017-01-01

    Advances in genome sequencing technologies and genome-wide association studies (GWAS) have provided unprecedented insights into the molecular basis of microbial phenotypes and enabled the identification of the underlying genetic variants in real populations. However, utilization of genome sequencing

  8. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called
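
    The abstract is cut off here, so the following only illustrates the distance named in the title. It is a minimal sketch of the Levenshtein (edit) distance between two barcodes and of the smallest pairwise distance within a barcode set, the quantity a barcode design would typically try to keep large. The function names and the sample barcodes are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion from a
                            curr[j - 1] + 1,             # insertion into a
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]


def min_pairwise_distance(barcodes: Sequence[str]) -> int:
    """Smallest Levenshtein distance between any two barcodes in the set."""
    if len(barcodes) < 2:
        raise ValueError("need at least two barcodes")
    return min(levenshtein(a, b) for a, b in combinations(barcodes, 2))


if __name__ == "__main__":
    print(min_pairwise_distance(["AAAA", "AATT", "TTTT"]))
```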

  9. Identification of Meconopsis species by a DNA barcode sequence ...

    African Journals Online (AJOL)

    Deoxyribonucleic acid (DNA) barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Species identification is necessary for the authentication of traditional plant based medicines. Although a consensus has not been agreed regarding which DNA sequences can be used as ...

  10. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  11. Genome sequence of Stachybotrys chartarum Strain 51-11

    Science.gov (United States)

    Stachybotrys chartarum strain 51-11 genome was sequenced by shotgun sequencing utilizing Illumina Hiseq 2000 and PacBio long read technology. Since Stachybotrys chartarum has been implicated in health impacts within water-damaged buildings, any information extracted from the geno...

  12. What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research

    OpenAIRE

    Bräutigam, Andrea; Gowik, Udo

    2010-01-01

    Next generation sequencing (NGS) technologies have opened fascinating opportunities for the analysis of plants with and without a sequenced genome on a genomic scale. During the last few years, NGS methods have become widely available and cost effective. They can be applied to a wide variety of biological questions, from the sequencing of complete eukaryotic genomes and transcriptomes, to the genome-scale analysis of DNA-protein interactions. In this review, we focus on the use of NGS for pla...

  13. Complete genome sequence of Ikoma lyssavirus.

    Science.gov (United States)

    Marston, Denise A; Ellis, Richard J; Horton, Daniel L; Kuzmin, Ivan V; Wise, Emma L; McElhinney, Lorraine M; Banyard, Ashley C; Ngeleja, Chanasa; Keyyu, Julius; Cleaveland, Sarah; Lembo, Tiziana; Rupprecht, Charles E; Fooks, Anthony R

    2012-09-01

    Lyssaviruses (family Rhabdoviridae) constitute one of the most important groups of viral zoonoses globally. All lyssaviruses cause the disease rabies, an acute progressive encephalitis for which, once symptoms occur, there is no effective cure. Currently available vaccines are highly protective against the predominantly circulating lyssavirus species. Using next-generation sequencing technologies, we have obtained the whole-genome sequence for a novel lyssavirus, Ikoma lyssavirus (IKOV), isolated from an African civet in Tanzania displaying clinical signs of rabies. Genetically, this virus is the most divergent within the genus Lyssavirus. Characterization of the genome will help to improve our understanding of lyssavirus diversity and enable investigation into vaccine-induced immunity and protection.

  14. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.; Bonny, Talal; Salama, Khaled N.

    2012-01-01

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.
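
    The length-based split described above can be sketched in a few lines. This is a hypothetical illustration under stated assumptions, not the disclosed implementation: the splitting ratio is taken to be the fraction of database sequences (by count) sent to the CPU, and the function name `split_by_length` is invented for the example.

```python
from typing import List, Sequence, Tuple


def split_by_length(database: Sequence[str],
                    cpu_fraction: float) -> Tuple[List[str], List[str]]:
    """Split database sequences into a CPU part and a GPU part.

    Sequences are sorted by length so that every sequence in the CPU part
    is no longer than any sequence in the GPU part. `cpu_fraction` is the
    assumed splitting ratio, interpreted here as a fraction of the count.
    """
    if not 0.0 <= cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be between 0 and 1")
    ordered = sorted(database, key=len)
    cut = round(len(ordered) * cpu_fraction)
    return ordered[:cut], ordered[cut:]


if __name__ == "__main__":
    cpu_part, gpu_part = split_by_length(["ACGTACGT", "AC", "ACGT", "A"], 0.5)
    print(cpu_part, gpu_part)
```

    Each part would then be scored against the query on its own device, and the two sets of alignment scores combined afterwards.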

  15. Adaptive Processing for Sequence Alignment

    KAUST Repository

    Zidan, Mohammed A.

    2012-01-26

    Disclosed are various embodiments for adaptive processing for sequence alignment. In one embodiment, among others, a method includes obtaining a query sequence and a plurality of database sequences. A first portion of the plurality of database sequences is distributed to a central processing unit (CPU) and a second portion of the plurality of database sequences is distributed to a graphical processing unit (GPU) based upon a predetermined splitting ratio associated with the plurality of database sequences, where the database sequences of the first portion are shorter than the database sequences of the second portion. A first alignment score for the query sequence is determined with the CPU based upon the first portion of the plurality of database sequences and a second alignment score for the query sequence is determined with the GPU based upon the second portion of the plurality of database sequences.

  16. SeqCompress: an algorithm for biological sequence compression.

    Science.gov (United States)

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz; Bajwa, Hassan

    2014-10-01

    The growth of Next Generation Sequencing technologies presents significant research challenges, specifically to design bioinformatics tools that handle massive amounts of data efficiently. The cost of storing biological sequence data has become a noticeable proportion of the total cost of its generation and analysis. In particular, the increase in DNA sequencing rate is significantly outstripping the rate of increase in disk storage capacity, and may exceed the limit of storage capacity. It is essential to develop algorithms that handle large data sets via better memory management. This article presents a DNA sequence compression algorithm, SeqCompress, that copes with the space complexity of biological sequences. The algorithm is based on lossless data compression and uses a statistical model as well as arithmetic coding to compress DNA sequences. The proposed algorithm is compared with recent specialized compression tools for biological sequences. Experimental results show that the proposed algorithm has better compression gain than other existing algorithms. Copyright © 2014 Elsevier Inc. All rights reserved.
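
    To give a feel for what a statistical model buys an arithmetic coder, the sketch below estimates the order-0 (per-symbol frequency) entropy of a DNA string, which is roughly the number of bits per base an ideal arithmetic coder driven by that simple model would need. This is not the SeqCompress algorithm, whose model is not described in the abstract; the function names are assumptions for the example.

```python
import math
from collections import Counter


def order0_entropy_bits(seq: str) -> float:
    """Shannon entropy in bits per symbol under a frequency-only model."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    n = len(seq)
    total = 0.0
    for count in Counter(seq).values():
        prob = count / n
        total -= prob * math.log2(prob)
    return total


def estimated_size_bytes(seq: str) -> float:
    """Approximate compressed size if each base cost its order-0 entropy."""
    return order0_entropy_bits(seq) * len(seq) / 8.0


if __name__ == "__main__":
    dna = "ACGT" * 10
    print(order0_entropy_bits(dna), estimated_size_bytes(dna))
```

    A uniform four-letter sequence costs 2 bits per base under this model, the same as naive 2-bit packing, so any gain has to come from skewed frequencies or from higher-order context.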

  17. Main sequence mass loss

    International Nuclear Information System (INIS)

    Brunish, W.M.; Guzik, J.A.; Willson, L.A.; Bowen, G.

    1987-01-01

    It has been hypothesized that variable stars may experience mass loss, driven, at least in part, by oscillations. The class of stars we are discussing here are the δ Scuti variables. These are variable stars with masses between about 1.2 and 2.25 M☉, lying on or very near the main sequence. According to this theory, high rotation rates enhance the rate of mass loss, so main sequence stars born in this mass range would have a range of mass loss rates, depending on their initial rotation velocity and the amplitude of the oscillations. The stars would evolve rapidly down the main sequence until (at about 1.25 M☉) a surface convection zone began to form. The presence of this convective region would slow the rotation, perhaps allowing magnetic braking to occur, and thus sharply reduce the mass loss rate. 7 refs

  18. Inaugural Genomics Automation Congress and the coming deluge of sequencing data.

    Science.gov (United States)

    Creighton, Chad J

    2010-10-01

    Presentations at Select Biosciences's first 'Genomics Automation Congress' (Boston, MA, USA) in 2010 focused on next-generation sequencing and the platforms and methodology around them. The meeting provided an overview of sequencing technologies, both new and emerging. Speakers shared their recent work on applying sequencing to profile cells for various levels of biomolecular complexity, including DNA sequences, DNA copy, DNA methylation, mRNA and microRNA. With sequencing time and costs continuing to drop dramatically, a virtual explosion of very large sequencing datasets is at hand, which will probably present challenges and opportunities for high-level data analysis and interpretation, as well as for information technology infrastructure.

  19. Electricity sequence control

    International Nuclear Information System (INIS)

    Shin, Heung Ryeol

    2010-03-01

    The book covers: an introduction to control systems, including classification and control signals; an introduction to electrical switches, such as push-buttons and detection switches; induction-type and capacitance-type sensors; control machinery; solenoid valves; the representation of sequences and types of electrical circuits using diagrams, time charts, markings and terms; logic circuits such as YES, NO, AND, OR and equivalence logic; basic electrical circuits; electrical sequence control; added conditions; special program control concerning program selection and jumps; motor control; extra circuits such as repeat circuits and a pause circuit in a conveyor; and safety regulations and rules concerning the classification of electrical accidents and protective devices for insulation.

  20. Next-generation sequencing

    DEFF Research Database (Denmark)

    Rieneck, Klaus; Bak, Mads; Jønson, Lars

    2013-01-01

    , Illumina); several millions of PCR sequences were analyzed. RESULTS: The results demonstrated the feasibility of diagnosing the fetal KEL1 or KEL2 blood group from cell-free DNA purified from maternal plasma. CONCLUSION: This method requires only one primer pair, and the large amount of sequence...... information obtained allows well for statistical analysis of the data. This general approach can be integrated into current laboratory practice and has numerous applications. Besides DNA-based predictions of blood group phenotypes, platelet phenotypes, or sickle cell anemia, and the determination of zygosity...

  1. Aspects of coverage in medical DNA sequencing

    Directory of Open Access Journals (Sweden)

    Wilson Richard K

    2008-05-01

    Full Text Available Abstract Background DNA sequencing is now emerging as an important component in biomedical studies of diseases like cancer. Short-read, highly parallel sequencing instruments are expected to be used heavily for such projects, but many design specifications have yet to be conclusively established. Perhaps the most fundamental of these is the redundancy required to detect sequence variations, which bears directly upon genomic coverage and the consequent resolving power for discerning somatic mutations. Results We address the medical sequencing coverage problem via an extension of the standard mathematical theory of haploid coverage. The expected diploid multi-fold coverage, as well as its generalization for aneuploidy are derived and these expressions can be readily evaluated for any project. The resulting theory is used as a scaling law to calibrate performance to that of standard BAC sequencing at 8× to 10× redundancy, i.e. for expected coverages that exceed 99% of the unique sequence. A differential strategy is formalized for tumor/normal studies wherein tumor samples are sequenced more deeply than normal ones. In particular, both tumor alleles should be detected at least twice, while both normal alleles are detected at least once. Our theory predicts these requirements can be met for tumor and normal redundancies of approximately 26× and 21×, respectively. We explain why these values do not differ by a factor of 2, as might intuitively be expected. Future technology developments should prompt even deeper sequencing of tumors, but the 21× value for normal samples is essentially a constant. Conclusion Given the assumptions of standard coverage theory, our model gives pragmatic estimates for required redundancy. The differential strategy should be an efficient means of identifying potential somatic mutations for further study.
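
    The redundancy figures quoted in this abstract can be explored with a simple Poisson sketch. The code below is an assumption-laden illustration, not the paper's derivation: it treats the number of reads covering a position as Poisson with mean equal to the redundancy, and for the diploid case assumes reads split evenly and independently between the two alleles. The function names are invented for the example.

```python
import math


def poisson_at_least(k: int, mean: float) -> float:
    """P(X >= k) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** i / math.factorial(i)
                for i in range(k))
    return 1.0 - below


def haploid_coverage(redundancy: float, k: int = 1) -> float:
    """Expected fraction of a haploid target covered at least k times."""
    return poisson_at_least(k, redundancy)


def diploid_coverage(redundancy: float, k: int = 1) -> float:
    """Expected fraction of positions where BOTH alleles are seen >= k times.

    Assumes each allele independently receives half of the redundancy.
    """
    return poisson_at_least(k, redundancy / 2.0) ** 2


if __name__ == "__main__":
    print("haploid, 8x, >=1 read:", haploid_coverage(8))
    print("tumor, 26x, both alleles >=2:", diploid_coverage(26, k=2))
    print("normal, 21x, both alleles >=1:", diploid_coverage(21, k=1))
```

    Under these simplified assumptions the tumor and normal settings reach similar coverage levels, which is consistent with the abstract's remark that the two redundancies do not differ by a factor of 2.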

  2. Application of high-throughput DNA sequencing in phytopathology.

    Science.gov (United States)

    Studholme, David J; Glover, Rachel H; Boonham, Neil

    2011-01-01

    The new sequencing technologies are already making a big impact in academic research on medically important microbes and may soon revolutionize diagnostics, epidemiology, and infection control. Plant pathology also stands to gain from exploiting these opportunities. This manuscript reviews some applications of these high-throughput sequencing methods that are relevant to phytopathology, with emphasis on the associated computational and bioinformatics challenges and their solutions. Second-generation sequencing technologies have recently been exploited in genomics of both prokaryotic and eukaryotic plant pathogens. They are also proving to be useful in diagnostics, especially with respect to viruses. Copyright © 2011 by Annual Reviews. All rights reserved.

  3. Epigenetics and assisted reproductive technologies

    DEFF Research Database (Denmark)

    Pinborg, Anja; Loft, Anne; Romundstad, Liv Bente

    2016-01-01

    Epigenetic modification controls gene activity without changes in the DNA sequence. The genome undergoes several phases of epigenetic programming during gametogenesis and early embryo development coinciding with assisted reproductive technologies (ART) treatments. Imprinting disorders have been...

  4. 10KP: A phylodiverse genome sequencing plan

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun

    2018-01-01

    Abstract Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here. PMID:29618049

  5. 10KP: A phylodiverse genome sequencing plan.

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Smith, Stephen A; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Li, Fay-Wei; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun; Wong, Gane Ka-Shu

    2018-03-01

    Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here.

  6. Automated constraint checking of spacecraft command sequences

    Science.gov (United States)

    Horvath, Joan C.; Alkalaj, Leon J.; Schneider, Karl M.; Spitale, Joseph M.; Le, Dang

    1995-01-01

    Robotic spacecraft are controlled by onboard sets of commands called "sequences." Determining that sequences will have the desired effect on the spacecraft can be expensive in terms of both labor and computer coding time, with different particular costs for different types of spacecraft. Specification languages and appropriate user interface to the languages can be used to make the most effective use of engineering validation time. This paper describes one specification and verification environment ("SAVE") designed for validating that command sequences have not violated any flight rules. This SAVE system was subsequently adapted for flight use on the TOPEX/Poseidon spacecraft. The relationship of this work to rule-based artificial intelligence and to other specification techniques is discussed, as well as the issues that arise in the transfer of technology from a research prototype to a full flight system.

  7. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  8. THE RHIC SEQUENCER

    International Nuclear Information System (INIS)

    VAN ZEIJTS, J.; DOTTAVIO, T.; FRAK, B.; MICHNOFF, R.

    2001-01-01

    The Relativistic Heavy Ion Collider (RHIC) has a high-level asynchronous time-line driven by a controlling program called the "Sequencer". Most high-level magnet and beam related issues are orchestrated by this system. The system also plays an important role in coordinated data acquisition and saving. We present the program, operator interface, operational impact and experience.

  9. Twin anemia polycythemia sequence

    NARCIS (Netherlands)

    Slaghekke, Femke

    2014-01-01

    In this thesis we describe that Twin Anemia Polycythemia Sequence (TAPS) is a form of chronic feto-fetal transfusion in monochorionic (identical) twins based on a small amount of blood transfusion through very small anastomoses. For the antenatal diagnosis of TAPS, Middle Cerebral Artery – Peak

  10. simple sequence repeat (SSR)

    African Journals Online (AJOL)

    In the present study, 78 mapped simple sequence repeat (SSR) markers representing 11 linkage groups of adzuki bean were evaluated for transferability to mungbean and related Vigna spp. 41 markers amplified characteristic bands in at least one Vigna species. The transferability percentage across the genotypes ranged ...

  11. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Full Text Available Many organizations apply information technologies to support their business processes. Using these technologies, actual events are recorded and checked for conformance with a predefined model. Conformance checking is an approach to measuring the fitness and appropriateness between a process model and actual events. However, when there are multiple events with the same timestamp, the traditional approach is unable to produce such measures. This study attempts to develop a sequence matching analysis. Taking conformance checking as its basis, the proposed approach uses the current control-flow technique from the process mining domain. A case study in the field of educational processes has been conducted. This study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequence of students, it yields measurements for curriculum development. Finally, the result of the proposed approach has been verified by relevant instructors for further development.

  12. Quantifying population genetic differentiation from next-generation sequencing data

    DEFF Research Database (Denmark)

    Fumagalli, Matteo; Garrett Vieira, Filipe Jorge; Korneliussen, Thorfinn Sand

    2013-01-01

    Over the last few years, new high-throughput DNA sequencing technologies have dramatically increased speed and reduced sequencing costs. However, the use of these sequencing technologies is often challenged by errors and biases associated with the bioinformatical methods used for analyzing the data... method for quantifying population genetic differentiation from next-generation sequencing data. In addition, we present a strategy to investigate population structure via Principal Components Analysis. Through extensive simulations, we compare the new method herein proposed to approaches based on genotype calling and demonstrate a marked improvement in estimation accuracy for a wide range of conditions. We apply the method to a large-scale genomic data set of domesticated and wild silkworms sequenced at low coverage. We find that we can infer the fine-scale genetic structure of the sampled...

  13. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, whole-genome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost-effective or even beneficial in all cases. In studies that do not require a whole-...

  14. Almost convergence of triple sequences

    OpenAIRE

    Ayhan Esi; M.Necdet Catalbas

    2013-01-01

    In this paper we introduce and study the concepts of almost convergence and almost Cauchy for triple sequences. We show that the set of almost convergent triple sequences of 0's and 1's is of the first category and also almost every triple sequence of 0's and 1's is not almost convergent. Keywords: almost convergence, P-convergent, triple sequence.

  15. A few Smarandache Integer Sequences

    OpenAIRE

    Ibstedt, Henry

    2010-01-01

    This paper deals with the analysis of a few Smarandache Integer Sequences which first appeared in Properties of the Numbers, F. Smarandache, University of Craiova Archives, 1975. The first four sequences are recurrence generated sequences while the last three are concatenation sequences.

  16. Discovery of novel transcripts of the human tissue kallikrein (KLK1) and kallikrein-related peptidase 2 (KLK2) in human cancer cells, exploiting Next-Generation Sequencing technology.

    Science.gov (United States)

    Adamopoulos, Panagiotis G; Kontos, Christos K; Scorilas, Andreas

    2018-03-31

    Tissue kallikrein, kallikrein-related peptidases (KLKs), and plasma kallikrein form the largest group of serine proteases in the human genome, sharing many structural and functional properties. Several KLK transcripts have been found aberrantly expressed in numerous human malignancies, confirming their prognostic or/and diagnostic values. However, the process of alternative splicing can now be studied in-depth due to the development of Next-Generation Sequencing (NGS). In the present study, we used NGS to discover novel transcripts of the KLK1 and KLK2 genes, after nested touchdown PCR. Bioinformatics analysis and PCR experiments revealed a total of eleven novel KLK transcripts (two KLK1 and nine KLK2 transcripts). In addition, the expression profiles of each novel transcript were investigated with nested PCR experiments using variant-specific primers. Since KLKs are implicated in human malignancies, qualifying as potential biomarkers, the quantification of the presented novel transcripts in human samples may have clinical applications in different types of cancer. Copyright © 2018. Published by Elsevier Inc.

  17. Application of the next generation sequencing technology in preimplantation genetic detection

    Institute of Scientific and Technical Information of China (English)

    谢美娟; 杨学习; 李明

    2017-01-01

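    The rapid development of genomic technologies, represented by next-generation sequencing (NGS), has created opportunities for comprehensive, in-depth chromosome screening and genetic diagnosis. NGS has also been rapidly adopted in clinical testing for preimplantation genetic diagnosis (PGD) and preimplantation genetic screening (PGS), where it has become a routine detection technology; its economy and reliability give it broad application prospects. Advances in single-cell whole genome amplification (WGA) allow NGS, in clinical PGD and PGS, to give a more complete picture of the genetic information of preimplantation embryos and to detect subtler differences; NGS-based PGS and PGD will bring a clear improvement in transfer success rates and in-vitro fertilization (IVF) birth rates. This article introduces the definitions of PGD and PGS, traditional PGD/PGS detection techniques, single-cell whole genome amplification, and the application of NGS in PGD/PGS.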

  18. Transcriptome sequencing of the Microarray Quality Control (MAQC) RNA reference samples using next generation sequencing

    Directory of Open Access Journals (Sweden)

    Thierry-Mieg Danielle

    2009-06-01

    Background: Transcriptome sequencing using next-generation sequencing platforms will soon be competing with DNA microarray technologies for global gene expression analysis. As a preliminary evaluation of these promising technologies, we performed deep sequencing of cDNA synthesized from the Microarray Quality Control (MAQC) reference RNA samples using Roche's 454 Genome Sequencer FLX. Results: We generated more than 3.6 million sequence reads of average length 250 bp for the MAQC A and B samples and introduced a data analysis pipeline for translating cDNA read counts into gene expression levels. Using BLAST, 90% of the reads mapped to the human genome and 64% of the reads mapped to the RefSeq database of well annotated genes with e-values ≤ 10^-20. We measured gene expression levels in the A and B samples by counting the numbers of reads that mapped to individual RefSeq genes in multiple sequencing runs to evaluate the MAQC quality metrics for reproducibility, sensitivity, specificity, and accuracy and compared the results with DNA microarrays and Quantitative RT-PCR (QRTPCR) from the MAQC studies. In addition, 88% of the reads were successfully aligned directly to the human genome using the AceView alignment programs with an average 90% sequence similarity to identify 137,899 unique exon junctions, including 22,193 new exon junctions not yet contained in the RefSeq database. Conclusion: Using the MAQC metrics for evaluating the performance of gene expression platforms, the ExpressSeq results for gene expression levels showed excellent reproducibility, sensitivity, and specificity that improved systematically with increasing shotgun sequencing depth, and quantitative accuracy that was comparable to DNA microarrays and QRTPCR. In addition, a careful mapping of the reads to the genome using the AceView alignment programs shed new light on the complexity of the human transcriptome including the discovery of thousands of new splice variants.

  19. Zseq: An Approach for Preprocessing Next-Generation Sequencing Data.

    Science.gov (United States)

    Alkhateeb, Abedalrhman; Rueda, Luis

    2017-08-01

    Next-generation sequencing technology generates a huge number of reads (short sequences), which contain a vast amount of genomic data. The sequencing process, however, comes with artifacts. Preprocessing of sequences is mandatory for further downstream analysis. We present Zseq, a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, sequence duplications, and ambiguous nucleotides. Zseq finds the complexity of the sequences by counting the number of unique k-mers in each sequence as its corresponding score and also takes into account other factors such as ambiguous nucleotides or high GC-content percentage in k-mers. Based on a z-score threshold, Zseq sweeps through the sequences again and filters those with a z-score less than the user-defined threshold. The Zseq algorithm is able to provide a better mapping rate; it reduces the number of ambiguous bases significantly in comparison with other methods. Evaluation of the filtered reads has been conducted by aligning the reads and assembling the transcripts using the reference genome as well as de novo assembly. The assembled transcripts show a better discriminative ability to separate cancer and normal samples in comparison with another state-of-the-art method. Moreover, de novo assembled transcripts from the reads filtered by Zseq have longer genomic sequences than other tested methods. Estimating the threshold of the cutoff point is introduced using labeling rules with optimistic results.
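
    The scoring idea described in this abstract (count the unique k-mers per read, convert the counts to z-scores, drop reads below a threshold) can be sketched roughly as follows. This is a simplified illustration and not the Zseq implementation: the function names, the default k, and the default threshold are assumptions, GC content is ignored, and ambiguous bases are handled only by skipping k-mers that contain N.

```python
from statistics import mean, pstdev


def unique_kmer_count(seq, k):
    """Number of distinct k-mers in seq, ignoring k-mers that contain N."""
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if "N" not in kmer)


def filter_by_zscore(reads, k=3, threshold=-1.0):
    """Keep reads whose unique-k-mer z-score is at least `threshold`.

    Hypothetical helper: k and threshold are illustrative defaults only.
    """
    if not reads:
        return []
    scores = [unique_kmer_count(read, k) for read in reads]
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All reads score the same, so nothing can be separated.
        return list(reads)
    return [r for r, s in zip(reads, scores) if (s - mu) / sigma >= threshold]


reads = ["ACGTACGTAC", "AAAAAAAAAA", "ACGGTCATGC", "ANNNNNNNNA"]
kept = filter_by_zscore(reads, k=3, threshold=0.0)
```

    In this toy input the homopolymer read and the read dominated by N score below the mean and are removed, which mirrors the low-complexity filtering the abstract describes.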

  20. Harnessing Whole Genome Sequencing in Medical Mycology.

    Science.gov (United States)

    Cuomo, Christina A

    2017-01-01

    Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.

  1. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of the microbial community by giving more accurate predictions of operational taxonomic units (OTUs). Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm if the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed that, with our proposed pipeline, a 16S rRNA gene fragment longer than 150 bp provides the same accuracy as a full-length 16S rRNA sequence. This result could serve as a good starting point for experimental design and makes comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  2. BLEACHING EUCALYPTUS PULPS WITH SHORT SEQUENCES

    Directory of Open Access Journals (Sweden)

    Flaviana Reis Milagres

    2011-03-01

    Eucalyptus spp kraft pulp, due to its high content of hexenuronic acids, is quite easy to bleach. Therefore, investigations have been made attempting to decrease the number of stages in the bleaching process in order to minimize capital costs. This study focused on the evaluation of short ECF (Elemental Chlorine Free) and TCF (Totally Chlorine Free) sequences for bleaching oxygen delignified Eucalyptus spp kraft pulp to 90% ISO brightness: PMoDP (molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide), PMoD/P (molybdenum catalyzed acid peroxide, chlorine dioxide and hydrogen peroxide, without washing), PMoD(PO) (molybdenum catalyzed acid peroxide, chlorine dioxide and pressurized peroxide), D(EPO)DP (chlorine dioxide, oxidative extraction with oxygen and peroxide, chlorine dioxide and hydrogen peroxide), PMoQ(PO) (molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide), and XPMoQ(PO) (enzyme, molybdenum catalyzed acid peroxide, DTPA and pressurized peroxide). Uncommon pulp treatments, such as molybdenum catalyzed acid peroxide (PMo) and xylanase (X) bleaching stages, were used. Among the ECF alternatives, the two-stage PMoD/P sequence proved highly cost-effective without affecting pulp quality in relation to the traditional D(EPO)DP sequence and produced better quality effluent in relation to the reference. However, a four-stage sequence, XPMoQ(PO), was required to achieve full brightness using the TCF technology. This sequence was highly cost-effective although it only produced pulp of acceptable quality.

  3. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  4. Single-cell sequencing in stem cell biology.

    Science.gov (United States)

    Wen, Lu; Tang, Fuchou

    2016-04-15

    Cell-to-cell variation and heterogeneity are fundamental and intrinsic characteristics of stem cell populations, but these differences are masked when bulk cells are used for omic analysis. Single-cell sequencing technologies serve as powerful tools to dissect cellular heterogeneity comprehensively and to identify distinct phenotypic cell types, even within a 'homogeneous' stem cell population. These technologies, including single-cell genome, epigenome, and transcriptome sequencing technologies, have been developing rapidly in recent years. The application of these methods to different types of stem cells, including pluripotent stem cells and tissue-specific stem cells, has led to exciting new findings in the stem cell field. In this review, we discuss the recent progress as well as future perspectives in the methodologies and applications of single-cell omic sequencing technologies.

  5. Multilocus Sequence Typing

    OpenAIRE

    Belén, Ana; Pavón, Ibarz; Maiden, Martin C.J.

    2009-01-01

    Multilocus sequence typing (MLST) was first proposed in 1998 as a typing approach that enables the unambiguous characterization of bacterial isolates in a standardized, reproducible, and portable manner using the human pathogen Neisseria meningitidis as the exemplar organism. Since then, the approach has been applied to a large and growing number of organisms by public health laboratories and research institutions. MLST data, shared by investigators over the world via the Internet, have been ...

  6. Achalasia Carcinoma Sequence

    OpenAIRE

    Makmun, Dadang

    2001-01-01

    We report a case of carcinoma of the esophagus in a 58-year-old woman with achalasia, diagnosed 30 years earlier and initially treated surgically (myotomy), whose symptoms recurred 3 years ago. Given the progress of the disease, malignancy was strongly suspected as a result of prolonged stasis and mucosal irritation caused by achalasia (achalasia carcinoma sequence). Because of these contributing factors for the development of serious complications such as Malignan...

  7. Whole-genome comparison of two Campylobacter jejuni isolates of the same sequence type reveals multiple loci of different ancestral lineage.

    Directory of Open Access Journals (Sweden)

    Patrick J Biggs

    Campylobacter jejuni ST-474 is the most important human enteric pathogen in New Zealand, and yet this genotype is rarely found elsewhere in the world. Insight into the evolution of this organism was gained by a whole genome comparison of two ST-474, flaA SVR-14 isolates and other available C. jejuni isolates and genomes. The two isolates were collected from different sources, human (H22082) and retail poultry (P110b), at the same time and from the same geographical location. Solexa sequencing of each isolate resulted in ~1.659 Mb (H22082) and ~1.656 Mb (P110b) of assembled sequences within 28 (H22082) and 29 (P110b) contigs. We analysed 1502 genes for which we had sequences within both ST-474 isolates and within at least one of 11 C. jejuni reference genomes. Although 94.5% of genes were identical between the two ST-474 isolates, we identified 83 genes that differed by at least one nucleotide, including 55 genes with non-synonymous substitutions. These covered 101 kb and contained 672 point differences. We inferred that 22 (3.3%) of these differences were due to mutation and 650 (96.7%) were imported via recombination. Our analysis estimated 38 recombinant breakpoints within these 83 genes, which correspond to recombination events affecting at least 19 loci regions and gives a tract length estimate of ~2 kb. This includes a ~12 kb region displaying non-homologous recombination in one of the ST-474 genomes, with the insertion of two genes, including ykgC, a putative oxidoreductase, and a conserved hypothetical protein of unknown function. Furthermore, our analysis indicates that the source of this recombined DNA is more likely to have come from C. jejuni strains that are more closely related to ST-474. This suggests that the rates of recombination and mutation are similar in order of magnitude, but that recombination has been much more important for generating divergence between the two ST-474 isolates.
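
    As a quick arithmetic check of the proportions quoted in this abstract, the following sketch recomputes the shares of point differences attributed to mutation and to recombination from the counts given (22, 650 and 672). The variable and function names are illustrative only.

```python
# Counts taken from the abstract above.
total_differences = 672
by_mutation = 22
by_recombination = 650


def percent_share(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


mutation_share = percent_share(by_mutation, total_differences)
recombination_share = percent_share(by_recombination, total_differences)

# Differences imported by recombination per difference caused by mutation.
recombination_to_mutation = round(by_recombination / by_mutation, 1)
```

    The roughly thirty-fold ratio of imported to mutational differences is what supports the closing statement that recombination contributed far more to the divergence between the two isolates.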

  8. Identification and characterization of microRNAs related to salt stress in broccoli, using high-throughput sequencing and bioinformatics analysis.

    Science.gov (United States)

    Tian, Yunhong; Tian, Yunming; Luo, Xiaojun; Zhou, Tao; Huang, Zuoping; Liu, Ying; Qiu, Yihan; Hou, Bing; Sun, Dan; Deng, Hongyu; Qian, Shen; Yao, Kaitai

    2014-09-03

    MicroRNAs (miRNAs) are a new class of endogenous regulators of a broad range of physiological processes, which act by regulating gene expression post-transcriptionally. The brassica vegetable, broccoli (Brassica oleracea var. italica), is very popular with a wide range of consumers, but environmental stresses such as salinity are a problem worldwide in restricting its growth and yield. Little is known about the role of miRNAs in the response of broccoli to salt stress. In this study, broccoli subjected to salt stress and broccoli grown under control conditions were analyzed by high-throughput sequencing. Differential miRNA expression was confirmed by real-time reverse transcription polymerase chain reaction (RT-PCR). The prediction of miRNA targets was undertaken using the Kyoto Encyclopedia of Genes and Genomes (KEGG) Orthology (KO) database and Gene Ontology (GO)-enrichment analyses. Two libraries of small (or short) RNAs (sRNAs) were constructed and sequenced by high-throughput Solexa sequencing. A total of 24,511,963 and 21,034,728 clean reads, representing 9,861,236 (40.23%) and 8,574,665 (40.76%) unique reads, were obtained for control and salt-stressed broccoli, respectively. Furthermore, 42 putative known and 39 putative candidate miRNAs that were differentially expressed between control and salt-stressed broccoli were revealed by their read counts and confirmed by the use of stem-loop real-time RT-PCR. Amongst these, the putative conserved miRNAs, miR393 and miR855, and two putative candidate miRNAs, miR3 and miR34, were the most strongly down-regulated when broccoli was salt-stressed, whereas the putative conserved miRNA, miR396a, and the putative candidate miRNA, miR37, were the most up-regulated. Finally, analysis of the predicted gene targets of miRNAs using the GO and KO databases indicated that a range of metabolic and other cellular functions known to be associated with salt stress were up-regulated in broccoli treated with salt. A comprehensive

  9. Sequencing BPS spectra

    Energy Technology Data Exchange (ETDEWEB)

    Gukov, Sergei [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Max-Planck-Institut für Mathematik,Vivatsgasse 7, D-53111 Bonn (Germany); Nawata, Satoshi [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Centre for Quantum Geometry of Moduli Spaces, University of Aarhus,Nordre Ringgade 1, DK-8000 (Denmark); Saberi, Ingmar [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Stošić, Marko [CAMGSD, Departamento de Matemática, Instituto Superior Técnico,Av. Rovisco Pais, 1049-001 Lisbon (Portugal); Mathematical Institute SANU,Knez Mihajlova 36, 11000 Belgrade (Serbia); Sułkowski, Piotr [Walter Burke Institute for Theoretical Physics, California Institute of Technology,1200 E California Blvd, Pasadena, CA 91125 (United States); Faculty of Physics, University of Warsaw,ul. Pasteura 5, 02-093 Warsaw (Poland)

    2016-03-02

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  10. Sequencing BPS spectra

    International Nuclear Information System (INIS)

    Gukov, Sergei; Nawata, Satoshi; Saberi, Ingmar; Stošić, Marko; Sułkowski, Piotr

    2016-01-01

    This paper provides both a detailed study of color-dependence of link homologies, as realized in physics as certain spaces of BPS states, and a broad study of the behavior of BPS states in general. We consider how the spectrum of BPS states varies as continuous parameters of a theory are perturbed. This question can be posed in a wide variety of physical contexts, and we answer it by proposing that the relationship between unperturbed and perturbed BPS spectra is described by a spectral sequence. These general considerations unify previous applications of spectral sequence techniques to physics, and explain from a physical standpoint the appearance of many spectral sequences relating various link homology theories to one another. We also study structural properties of colored HOMFLY homology for links and evaluate Poincaré polynomials in numerous examples. Among these structural properties is a novel “sliding” property, which can be explained by using (refined) modular S-matrix. This leads to the identification of modular transformations in Chern-Simons theory and 3d N=2 theory via the 3d/3d correspondence. Lastly, we introduce the notion of associated varieties as classical limits of recursion relations of colored superpolynomials of links, and study their properties.

  11. The diploid genome sequence of an Asian individual

    DEFF Research Database (Denmark)

    Wang, Jun; Wang, Wei; Li, Ruiqiang

    2008-01-01

    Here we present the first diploid genome sequence of an Asian individual. The genome was sequenced to 36-fold average coverage using massively parallel sequencing technology. We aligned the short reads onto the NCBI human reference genome to 99.97% coverage, and guided by the reference genome, we used uniquely mapped reads to assemble a high-quality consensus sequence for 92% of the Asian individual's genome. We identified approximately 3 million single-nucleotide polymorphisms (SNPs) inside this region, of which 13.6% were not in the dbSNP database. Genotyping analysis showed that SNP identification had high accuracy and consistency, indicating the high sequence quality of this assembly. We also carried out heterozygote phasing and haplotype prediction against HapMap CHB and JPT haplotypes (Chinese and Japanese, respectively), sequence comparison with the two available individual genomes (J...

  12. The Release 6 reference sequence of the Drosophila melanogaster genome.

    Science.gov (United States)

    Hoskins, Roger A; Carlson, Joseph W; Wan, Kenneth H; Park, Soo; Mendez, Ivonne; Galle, Samuel E; Booth, Benjamin W; Pfeiffer, Barret D; George, Reed A; Svirskas, Robert; Krzywinski, Martin; Schein, Jacqueline; Accardo, Maria Carmela; Damia, Elisabetta; Messina, Giovanni; Méndez-Lago, María; de Pablos, Beatriz; Demakova, Olga V; Andreyeva, Evgeniya N; Boldyreva, Lidiya V; Marra, Marco; Carvalho, A Bernardo; Dimitri, Patrizio; Villasante, Alfredo; Zhimulev, Igor F; Rubin, Gerald M; Karpen, Gary H; Celniker, Susan E

    2015-03-01

    Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. Further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads. © 2015 Hoskins et al.; Published by Cold Spring Harbor Laboratory Press.

  13. Foundations of Sequence-to-Sequence Modeling for Time Series

    OpenAIRE

    Kuznetsov, Vitaly; Mariet, Zelda

    2018-01-01

    The availability of large amounts of time series data, paired with the performance of deep-learning algorithms on a broad class of problems, has recently led to significant interest in the use of sequence-to-sequence models for time series forecasting. We provide the first theoretical analysis of this time series forecasting framework. We include a comparison of sequence-to-sequence modeling to classical time series models, and as such our theory can serve as a quantitative guide for practiti...

  14. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure that the data produced is optimal. Although much of the procedure can be followed directly from the manufacturer's protocols, the key differences lie in the library preparation steps. This chapter presents an optimized protocol for the sequencing of fossil remains and museum specimens, commonly referred...

  15. Novel expressed sequence tag- simple sequence repeats (EST ...

    African Journals Online (AJOL)

    Using different bioinformatic criteria, the SUCEST database was used to mine for simple sequence repeat (SSR) markers. Among 42,189 clusters, 1,425 expressed sequence tag- simple sequence repeats (EST-SSRs) were identified in silico. Trinucleotide repeats were the most abundant SSRs detected. Of 212 primer pairs ...
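
    In silico SSR mining of the kind mentioned in this record usually amounts to scanning sequences for tandem repeats of a short motif. A minimal sketch is given below, assuming a regular-expression scan for perfect repeats; the function name, the motif length and the minimum repeat count are illustrative assumptions and are not the criteria used in the study.

```python
import re


def find_ssrs(seq, unit_len=3, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper: reports only perfect repeats of a fixed motif
    length and skips homopolymer runs such as AAAAAAAAAAAA.
    """
    # A motif of unit_len bases followed by at least min_repeats - 1 copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits


ssrs = find_ssrs("TTGCAGCAGCAGCAGCATT")
```

    The scan reports the leftmost reading frame of a repeat, so the CAG run in the example is returned with the motif GCA starting at offset 2.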

  16. A computational genomics pipeline for prokaryotic sequencing projects.

    Science.gov (United States)

    Kislyuk, Andrey O; Katz, Lee S; Agrawal, Sonia; Hagen, Matthew S; Conley, Andrew B; Jayaraman, Pushkala; Nelakuditi, Viswateja; Humphrey, Jay C; Sammons, Scott A; Govil, Dhwani; Mair, Raydel D; Tatti, Kathleen M; Tondella, Maria L; Harcourt, Brian H; Mayer, Leonard W; Jordan, I King

    2010-08-01

    New sequencing technologies have accelerated research on prokaryotic genomes and have made genome sequencing operations outside major genome sequencing centers routine. However, no off-the-shelf solution exists for the combined assembly, gene prediction, genome annotation and data presentation necessary to interpret sequencing data. The resulting requirement to invest significant resources into custom informatics support for genome sequencing projects remains a major impediment to the accessibility of high-throughput sequence data. We present a self-contained, automated high-throughput open source genome sequencing and computational genomics pipeline suitable for prokaryotic sequencing projects. The pipeline has been used at the Georgia Institute of Technology and the Centers for Disease Control and Prevention for the analysis of Neisseria meningitidis and Bordetella bronchiseptica genomes. The pipeline is capable of enhanced or manually assisted reference-based assembly using multiple assemblers and modes; gene predictor combining; and functional annotation of genes and gene products. Because every component of the pipeline is executed on a local machine with no need to access resources over the Internet, the pipeline is suitable for projects of a sensitive nature. Annotation of virulence-related features makes the pipeline particularly useful for projects working with pathogenic prokaryotes. The pipeline is licensed under the open-source GNU General Public License and available at the Georgia Tech Neisseria Base (http://nbase.biology.gatech.edu/). The pipeline is implemented with a combination of Perl, Bourne Shell and MySQL and is compatible with Linux and other Unix systems.

  17. Technical Considerations for Reduced Representation Bisulfite Sequencing with Multiplexed Libraries

    Science.gov (United States)

    Chatterjee, Aniruddha; Rodger, Euan J.; Stockwell, Peter A.; Weeks, Robert J.; Morison, Ian M.

    2012-01-01

    Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background. PMID:23193365

  18. Infinite sequences and series

    CERN Document Server

    Knopp, Konrad

    1956-01-01

    One of the finest expositors in the field of modern mathematics, Dr. Konrad Knopp here concentrates on a topic that is of particular interest to 20th-century mathematicians and students. He develops the theory of infinite sequences and series from its beginnings to a point where the reader will be in a position to investigate more advanced stages on his own. The foundations of the theory are therefore presented with special care, while the developmental aspects are limited by the scope and purpose of the book. All definitions are clearly stated; all theorems are proved with enough detail to ma

  19. Open source tools to exploit DNA sequence data from livestock species

    Science.gov (United States)

    Next-Generation Sequencing (NGS) is a recent technological development that allows researchers to rapidly determine the DNA sequence of an individual. The decrease in cost of NGS has brought the technology into the realm of practical applications in livestock genomics, where it can be used to genera...

  20. Fast global sequence alignment technique

    KAUST Repository

    Bonny, Mohamed Talal; Salama, Khaled N.

    2011-01-01

    fast alignment algorithm, called 'Alignment By Scanning' (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with the well-known sequence alignment algorithms, the 'GAP' (which is heuristic) and the 'Needleman

  1. Next-Generation Sequencing Platforms

    Science.gov (United States)

    Mardis, Elaine R.

    2013-06-01

    Automated DNA sequencing instruments embody an elegant interplay among chemistry, engineering, software, and molecular biology and have built upon Sanger's founding discovery of dideoxynucleotide sequencing to perform once-unfathomable tasks. Combined with innovative physical mapping approaches that helped to establish long-range relationships between cloned stretches of genomic DNA, fluorescent DNA sequencers produced reference genome sequences for model organisms and for the reference human genome. New types of sequencing instruments that permit amazing acceleration of data-collection rates for DNA sequencing have been developed. The ability to generate genome-scale data sets is now transforming the nature of biological inquiry. Here, I provide an historical perspective of the field, focusing on the fundamental developments that predated the advent of next-generation sequencing instruments and providing information about how these instruments work, their application to biological research, and the newest types of sequencers that can extract data from single DNA molecules.

  2. Laser Technology.

    Science.gov (United States)

    Gauger, Robert

    1993-01-01

    Describes lasers and indicates that learning about laser technology and creating laser technology activities are among the teacher enhancement processes needed to strengthen technology education. (JOW)

  3. ReRep: Computational detection of repetitive sequences in genome survey sequences (GSS)

    Directory of Open Access Journals (Sweden)

    Alves-Ferreira Marcelo

    2008-09-01

    Full Text Available Abstract Background Genome survey sequences (GSS) offer a preliminary global view of a genome since, unlike ESTs, they cover coding as well as non-coding DNA and include repetitive regions of the genome. A more precise estimation of the nature, quantity and variability of repetitive sequences very early in a genome sequencing project is of considerable importance, as such data strongly influence the estimation of genome coverage, library quality and progress in scaffold construction. Also, the elimination of repetitive sequences from the initial assembly process is important to avoid errors and unnecessary complexity. Repetitive sequences are also of interest in a variety of other studies, for instance as molecular markers. Results We designed and implemented a straightforward pipeline called ReRep, which combines bioinformatics tools for identifying repetitive structures in a GSS dataset. In a case study, we first applied the pipeline to a set of 970 GSSs, sequenced in our laboratory from the human pathogen Leishmania braziliensis, the causative agent of leishmaniosis, an important public health problem in Brazil. We also verified the applicability of ReRep to new sequencing technologies using a set of 454 reads of Escherichia coli. The behaviour of several parameters in the algorithm is evaluated and suggestions are made for tuning of the analysis. Conclusion The ReRep approach for identification of repetitive elements in GSS datasets proved to be straightforward and efficient. Several potential repetitive sequences were found in a L. braziliensis GSS dataset generated in our laboratory, and further validated by the analysis of a more complete genomic dataset from the EMBL and Sanger Centre databases. ReRep also identified most of the E. coli K12 repeats prior to assembly in an example dataset obtained by automated sequencing using 454 technology. The parameters controlling the algorithm behaved consistently and may be tuned to the properties

  4. Rapid Polymer Sequencer

    Science.gov (United States)

    Stolc, Viktor (Inventor); Brock, Matthew W (Inventor)

    2013-01-01

    Method and system for rapid and accurate determination of each of a sequence of unknown polymer components, such as nucleic acid components. A self-assembling monolayer of a selected substance is optionally provided on an interior surface of a pipette tip, and the interior surface is immersed in a selected liquid. A selected electrical field is impressed in a longitudinal direction, or in a transverse direction, in the tip region, a polymer sequence is passed through the tip region, and a change in an electrical current signal is measured as each polymer component passes through the tip region. Each of the measured changes in electrical current signals is compared with a database of reference electrical change signals, with each reference signal corresponding to an identified polymer component, to identify the unknown polymer component with a reference polymer component. The nanopore preferably has a pore inner diameter of no more than about 40 nm and is prepared by heating and pulling a very small section of a glass tubing.
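
    The comparison step described above (matching each measured change in current against a database of reference signals) can be sketched as a nearest-reference lookup. This is only a minimal illustration, not the patented method: the reference values, their units, and the function name `identify_components` are hypothetical and chosen purely for the example.

```python
# Hypothetical reference table: component -> expected current change
# (arbitrary units). Real reference signals are not given in the record.
REFERENCE_SIGNALS = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def identify_components(measured_changes, references=REFERENCE_SIGNALS):
    """Map each measured current change to the closest reference component."""
    calls = []
    for change in measured_changes:
        # Pick the reference whose signal is nearest to the measurement.
        best = min(references, key=lambda name: abs(references[name] - change))
        calls.append(best)
    return "".join(calls)


print(identify_components([1.1, 3.8, 2.2, 2.9]))  # prints "ATCG"
```

    A real system would presumably compare richer signal features than a single scalar and would need a rule for rejecting measurements that match no reference well.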

  5. Putting instruction sequences into effect

    NARCIS (Netherlands)

    Bergstra, J.A.

    2011-01-01

    An attempt is made to define the concept of execution of an instruction sequence. It is found to be a special case of directly putting into effect of an instruction sequence. Directly putting into effect of an instruction sequence comprises interpretation as well as execution. Directly putting into

  6. Region segmentation along image sequence

    International Nuclear Information System (INIS)

    Monchal, L.; Aubry, P.

    1995-01-01

    A method to extract regions in a sequence of images is proposed. Regions are not matched from one image to the following one. Instead, the result of a region segmentation is used as an initialization to segment the following image and to track the region along the sequence. The image sequence is exploited as a spatio-temporal event. (authors). 12 refs., 8 figs

  7. Transcriptome sequencing of the blind subterranean mole rat, Spalax galili: Utility and potential for the discovery of novel evolutionary patterns

    KAUST Repository

    Malik, Assaf; Korol, Abraham; Hübner, Sariel; Hernandez, Alvaro G.; Thimmapuram, Jyothi; Ali, Shahjahan; Glaser, Fabian; Paz, Arnon; Avivi, Aaron; Band, Mark

    2011-01-01

    sequencing of Spalax galili, a chromosomal type of S. ehrenbergi. cDNA pools from muscle and brain tissues isolated from animals exposed to hypoxic and normoxic conditions were sequenced using Sanger, GS FLX, and GS FLX Titanium technologies. Assembly

  8. DeepGO: predicting protein functions from sequence and interactions using a deep ontology-aware classifier

    KAUST Repository

    Kulmanov, Maxat; Khan, Mohammed Asif; Hoehndorf, Robert

    2017-01-01

    A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often

  9. An evaluation of Comparative Genome Sequencing (CGS) by comparing two previously-sequenced bacterial genomes

    Directory of Open Access Journals (Sweden)

    Herring Christopher D

    2007-08-01

    Full Text Available Abstract Background With the development of new technology, it has recently become practical to resequence the genome of a bacterium after experimental manipulation. It is critical though to know the accuracy of the technique used, and to establish confidence that all of the mutations were detected. Results In order to evaluate the accuracy of genome resequencing using the microarray-based Comparative Genome Sequencing service provided by Nimblegen Systems Inc., we resequenced the E. coli strain W3110 Kohara using MG1655 as a reference, both of which have been completely sequenced using traditional sequencing methods. CGS detected 7 of 8 small sequence differences, one large deletion, and 9 of 12 IS element insertions present in W3110, but did not detect a large chromosomal inversion. In addition, we confirmed that CGS also detected 2 SNPs, one deletion and 7 IS element insertions that are not present in the genome sequence, which we attribute to changes that occurred after the creation of the W3110 lambda clone library. The false positive rate for SNPs was one per 244 Kb of genome sequence. Conclusion CGS is an effective way to detect multiple mutations present in one bacterium relative to another, and while highly cost-effective, is prone to certain errors. Mutations occurring in repeated sequences or in sequences with a high degree of secondary structure may go undetected. It is also critical to follow up on regions of interest in which SNPs were not called because they often indicate deletions or IS element insertions.

  10. Applications of nanotechnology, next generation sequencing and microarrays in biomedical research.

    Science.gov (United States)

    Elingaramil, Sauli; Li, Xiaolong; He, Nongyue

    2013-07-01

    Next-generation sequencing technologies, microarrays and advances in bionanotechnology have had an enormous impact on research within a short time frame. This impact appears certain to increase further as many biomedical institutions are now acquiring these prevailing new technologies. Beyond conventional sampling of genome content, wide-ranging applications are rapidly evolving for next-generation sequencing, microarrays and nanotechnology. To date, these technologies have been applied in a variety of contexts, including whole-genome sequencing, targeted resequencing and discovery of transcription factor binding sites, noncoding RNA expression profiling and molecular diagnostics. This paper thus discusses current applications of nanotechnology, next-generation sequencing technologies and microarrays in biomedical research and highlights the transforming potential these technologies offer.

  11. Log-balanced combinatorial sequences

    Directory of Open Access Journals (Sweden)

    Tomislav Došlic

    2005-01-01

    Full Text Available We consider log-convex sequences that satisfy an additional constraint imposed on their rate of growth. We call such sequences log-balanced. It is shown that all such sequences satisfy a pair of double inequalities. Sufficient conditions for log-balancedness are given for the case when the sequence satisfies a two- (or more- term linear recurrence. It is shown that many combinatorially interesting sequences belong to this class, and, as a consequence, that the above-mentioned double inequalities are valid for all of them.
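
    The abstract does not state the growth constraint explicitly. The sketch below assumes a commonly cited formulation: a positive sequence (a_n) is log-balanced if (a_n) is log-convex and (a_n / n!) is log-concave. Under that assumption, the double inequality a_n^2 <= a_{n-1} a_{n+1} <= (1 + 1/n) a_n^2 follows directly, and the code checks it exactly with rational arithmetic. The function names are illustrative, and the inequality shown is derived from the assumed definition rather than quoted from the paper.

```python
from fractions import Fraction
from math import comb, factorial


def is_log_convex(a):
    """a_n^2 <= a_{n-1} * a_{n+1} for all interior n."""
    return all(a[n] ** 2 <= a[n - 1] * a[n + 1] for n in range(1, len(a) - 1))


def is_log_concave(a):
    """a_n^2 >= a_{n-1} * a_{n+1} for all interior n."""
    return all(a[n] ** 2 >= a[n - 1] * a[n + 1] for n in range(1, len(a) - 1))


def is_log_balanced(a):
    """Assumed definition: (a_n) log-convex and (a_n / n!) log-concave."""
    scaled = [Fraction(x, factorial(n)) for n, x in enumerate(a)]
    return is_log_convex(a) and is_log_concave(scaled)


def satisfies_double_inequality(a):
    """Check a_n^2 <= a_{n-1} a_{n+1} <= (1 + 1/n) a_n^2 for n >= 1."""
    return all(
        a[n] ** 2 <= a[n - 1] * a[n + 1] <= Fraction(n + 1, n) * a[n] ** 2
        for n in range(1, len(a) - 1)
    )


# Catalan numbers as a test sequence (indexing starts at n = 0).
catalan = [comb(2 * n, n) // (n + 1) for n in range(20)]
print(is_log_balanced(catalan), satisfies_double_inequality(catalan))
```

    As a contrast, the sequence (n!)^2 is log-convex but grows too quickly to pass the second condition, so `is_log_balanced` rejects it.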

  12. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Science.gov (United States)

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030

  13. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    Directory of Open Access Journals (Sweden)

    Nathan D. Olson

    2015-03-01

    Full Text Available This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.

  14. Next-Generation Sequencing: From Understanding Biology to Personalized Medicine

    Directory of Open Access Journals (Sweden)

    Benjamin Meder

    2013-03-01

    Full Text Available Within just a few years, the new methods for high-throughput next-generation sequencing have generated completely novel insights into the heritability and pathophysiology of human disease. In this review, we wish to highlight the benefits of the current state-of-the-art sequencing technologies for genetic and epigenetic research. We illustrate how these technologies help to constantly improve our understanding of genetic mechanisms in biological systems and summarize the progress made so far. This can be exemplified by the case of heritable heart muscle diseases, so-called cardiomyopathies. Here, next-generation sequencing is able to identify novel disease genes, and first clinical applications demonstrate the successful translation of this technology into personalized patient care.

  15. Genome-wide identification and comparative analysis of conserved and novel microRNAs in grafted watermelon by high-throughput sequencing.

    Science.gov (United States)

    Liu, Na; Yang, Jinghua; Guo, Shaogui; Xu, Yong; Zhang, Mingfang

    2013-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs involved in post-transcriptional gene regulation that play a critical role in plant growth, development and stress response. However, less is known about miRNA involvement in grafting behaviors, especially in the watermelon (Citrullus lanatus L.) crop, which is one of the most important agricultural crops worldwide. Grafting is commonly used in watermelon production in attempts to improve its adaptation to abiotic and biotic stresses, in particular to the soil-borne fusarium wilt disease. In this study, Solexa sequencing was used to discover small RNA populations and compare miRNAs on a genome-wide scale in a watermelon grafting system. A total of 11,458,476, 11,614,094 and 9,339,089 raw reads representing 2,957,751, 2,880,328 and 2,964,990 unique sequences were obtained from the scions of self-grafted watermelon and watermelon grafted onto bottle gourd and squash at the two true-leaf stage, respectively. 39 known miRNAs belonging to 30 miRNA families and 80 novel miRNAs were identified in our small RNA dataset. Compared with self-grafted watermelon, 20 (5 known miRNA families and 15 novel miRNAs) and 47 (17 known miRNA families and 30 novel miRNAs) miRNAs were expressed significantly differently in watermelon grafted onto bottle gourd and squash, respectively. MiRNAs were expressed differentially when watermelon was grafted onto different rootstocks, suggesting that miRNAs might play an important role in diverse biological and metabolic processes in watermelon and that grafting may regulate plant growth and development, as well as adaptation to stresses, by changing miRNA expression. The small RNA transcriptomes obtained in this study provided insights into molecular aspects of miRNA-mediated regulation in grafted watermelon. Obviously, this result would provide a basis for further unravelling the mechanism on how miRNAs information is exchanged between scion and rootstock in grafted

  16. Study and realisation of a programmable generator of pulse sequences, for nuclear magnetic resonance

    International Nuclear Information System (INIS)

    Lambert, Daniel

    1974-01-01

    After having recalled the operation of pulse-based nuclear magnetic resonance and the use of pulse sequences in NMR-based measurements, and outlined the need for a pulse sequence generator, the author reports the design and realisation of such a device. He describes its general organisation with its base sequence, base clock, sequence start, duration, displays, data transfers, data processing, and signal distribution. He presents the chosen technology (ECL logics), the sequence base set, time bases, multiplexers, comparison sets, the distribution set, the sequence programming, the sampling and output set. He reports tests and the use of the so-designed generator [fr

  17. New MR pulse sequence

    International Nuclear Information System (INIS)

    Harms, S.E.; Flamig, D.P.; Griffey, R.H.

    1990-01-01

    This paper describes a method for fat suppression for three-dimensional MR imaging. The FATS (fat-suppressed acquisition with echo time shortened) sequence employs a pair of opposing adiabatic half-passage RF pulses tuned on fat resonance. The imaging parameters are as follows: TR, 20 msec; TE, 21.7-3.2 msec; 1,024 x 128 x 128 acquired matrix; imaging time, approximately 11 minutes. A series of 54 examinations were performed. Excellent fat suppression with water excitation is achieved in all cases. The orbital images demonstrate superior resolution of small orbital lesions. The high signal-to-noise ratio (SNR) in cranial studies demonstrates excellent petrous bone and internal auditory canal anatomy

  18. Bioinformatics for Next Generation Sequencing Data

    Directory of Open Access Journals (Sweden)

    Alberto Magi

    2010-09-01

    Full Text Available The emergence of next-generation sequencing (NGS) platforms imposes increasing demands on statistical methods and bioinformatic tools for the analysis and the management of the huge amounts of data generated by these technologies. Even at the early stages of their commercial availability, a large number of software tools already exist for analyzing NGS data. These tools can be fit into many general categories including alignment of sequence reads to a reference, base-calling and/or polymorphism detection, de novo assembly from paired or unpaired reads, structural variant detection and genome browsing. This manuscript aims to guide readers in the choice of the available computational tools that can be used to face the several steps of the data analysis workflow.

  19. Bacterial identification and subtyping using DNA microarray and DNA sequencing.

    Science.gov (United States)

    Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

    2012-01-01

    The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies will not only identify, isotype, and serotype pathogenic bacteria, but will also aid in the discovery of new gene functions by detecting gene expression in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple-to-use technique that can produce accurate and quantitative analysis of DNA sequences with great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.

  20. Quack: A quality assurance tool for high throughput sequence data.

    Science.gov (United States)

    Thrash, Adam; Arick, Mark; Peterson, Daniel G

    2018-05-01

    The quality of data generated by high-throughput DNA sequencing tools must be rapidly assessed in order to determine how useful the data may be in making biological discoveries; higher quality data leads to more confident results and conclusions. Due to the ever-increasing size of data sets and the importance of rapid quality assessment, tools that analyze sequencing data should quickly produce easily interpretable graphics. Quack addresses these issues by generating information-dense visualizations from FASTQ files at a speed far surpassing other publicly available quality assurance tools in a manner independent of sequencing technology. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
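
    To make concrete the kind of per-position statistic a FASTQ quality tool might plot, here is a minimal sketch that computes mean Phred quality at each read position. It is not Quack's implementation. It assumes the common four-line FASTQ record layout and Phred+33 quality encoding, and the function name `mean_quality_by_position` is hypothetical.

```python
from collections import defaultdict


def mean_quality_by_position(fastq_lines, offset=33):
    """Return the mean Phred score at each read position.

    Assumes four lines per record (header, sequence, '+', quality)
    and Phred+33 encoding unless another offset is supplied.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    lines = [line.rstrip("\n") for line in fastq_lines]
    for i in range(0, len(lines) - 3, 4):
        quality = lines[i + 3]  # fourth line of the record
        for pos, ch in enumerate(quality):
            totals[pos] += ord(ch) - offset
            counts[pos] += 1
    return [totals[p] / counts[p] for p in sorted(counts)]


records = ["@read1", "ACGT", "+", "IIII", "@read2", "ACG", "+", "5II"]
print(mean_quality_by_position(records))  # [30.0, 40.0, 40.0, 40.0]
```

    Reads of different lengths are handled by averaging only over the reads that reach a given position, which is why the fourth value above comes from a single read.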

  1. Implementation of FMEA analysis within the Just-in-Sequence logistics technology

    OpenAIRE

    FRANĚK, Václav

    2013-01-01

    The main objective of this master's thesis is to design an implementation of FMEA analysis for the Just-in-Sequence logistics technology at Robert Bosch in České Budějovice. The operational objectives are to define the process and the product produced using Just-in-Sequence technology, to analyze the newly developed Just-in-Sequence technology, and to compare its advantages over the original method.

  2. An integrated semiconductor device enabling non-optical genome sequencing.

    Science.gov (United States)

    Rothberg, Jonathan M; Hinz, Wolfgang; Rearick, Todd M; Schultz, Jonathan; Mileski, William; Davey, Mel; Leamon, John H; Johnson, Kim; Milgrew, Mark J; Edwards, Matthew; Hoon, Jeremy; Simons, Jan F; Marran, David; Myers, Jason W; Davidson, John F; Branting, Annika; Nobile, John R; Puc, Bernard P; Light, David; Clark, Travis A; Huber, Martin; Branciforte, Jeffrey T; Stoner, Isaac B; Cawley, Simon E; Lyons, Michael; Fu, Yutao; Homer, Nils; Sedova, Marina; Miao, Xin; Reed, Brian; Sabina, Jeffrey; Feierstein, Erika; Schorn, Michelle; Alanjary, Mohammad; Dimalanta, Eileen; Dressman, Devin; Kasinskas, Rachel; Sokolsky, Tanya; Fidanza, Jacqueline A; Namsaraev, Eugeni; McKernan, Kevin J; Williams, Alan; Roth, G Thomas; Bustillo, James

    2011-07-20

    The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

  3. Googling DNA sequences on the World Wide Web.

    Science.gov (United States)

    Hajibabaei, Mehrdad; Singer, Gregory A C

    2009-11-10

    New web-based technologies provide an excellent opportunity for sharing and accessing information and for using the web as a platform for interaction and collaboration. Although several specialized tools are available for analyzing DNA sequence information, conventional web-based tools have not been utilized for bioinformatics applications. We have developed a novel algorithm and implemented it for searching species-specific genomic sequences, DNA barcodes, by using popular web-based methods such as Google. We developed an alignment-independent, character-based algorithm based on dividing a sequence library (DNA barcodes) and a query sequence into words. The actual search is conducted by conventional search tools such as the freely available Google Desktop Search. We implemented our algorithm in two exemplar packages. We developed pre- and post-processing software to provide customized input and output services, respectively. Our analysis of all publicly available DNA barcode sequences shows high accuracy as well as rapid results. Our method makes use of conventional web-based technologies for specialized genetic data. It provides a robust and efficient solution for sequence search on the web. The integration of our search method for large-scale sequence libraries such as DNA barcodes provides an excellent web-based tool for accessing this information and linking it to other available categories of information on the web.
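
    The word-based search idea can be sketched as follows: split every library sequence into short words, build an inverted index from word to sequence name (the role a text search engine would play), and rank library entries by how many query words they share. This is a minimal sketch under stated assumptions, not the authors' algorithm: the use of overlapping words, the word length `k=4`, the toy sequences, and all function names are illustrative choices.

```python
from collections import Counter, defaultdict


def to_words(sequence, k=4):
    """Split a sequence into overlapping words of length k (an assumption)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_index(library, k=4):
    """Inverted index: word -> set of library sequence names containing it."""
    index = defaultdict(set)
    for name, seq in library.items():
        for word in to_words(seq, k):
            index[word].add(name)
    return index


def search(query, index, k=4):
    """Rank library entries by the number of distinct query words they share."""
    hits = Counter()
    for word in set(to_words(query, k)):
        for name in index.get(word, ()):
            hits[name] += 1
    return hits.most_common()


# Toy library with made-up sequences, for illustration only.
library = {"species_A": "ACGTACGTGG", "species_B": "TTGACCATGA"}
index = build_index(library)
print(search("CGTACGTG", index))  # [('species_A', 5)]
```

    Because matching is done on exact words rather than on an alignment, a single mismatch removes up to k words from the match count, so the choice of word length trades sensitivity against specificity.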

  4. Sequencing genes in silico using single nucleotide polymorphisms

    Directory of Open Access Journals (Sweden)

    Zhang Xinyi

    2012-01-01

    Full Text Available Abstract Background The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. Results To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. Conclusions Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate

  5. Combinatorial Pooling Enables Selective Sequencing of the Barley Gene Space

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R.; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J.

    2013-01-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding. PMID:23592960
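
    The deconvolution idea in the abstract above can be illustrated with a toy sketch. It assumes, purely for illustration, that every BAC is placed in a distinct set of pools (its "signature") and that a read is assigned to any BAC whose whole signature appears among the pools where the read was observed. The pooling design actually used by the authors, and their handling of sequencing errors, are not described in this record and are not reproduced here; the names make_signatures and deconvolve are hypothetical.

```python
from itertools import combinations


def make_signatures(n_bacs, n_pools, pools_per_bac):
    """Give each BAC a distinct set of pools (a toy design, assumed here)."""
    combos = combinations(range(n_pools), pools_per_bac)
    signatures = {}
    for bac, combo in zip(range(n_bacs), combos):
        signatures[bac] = frozenset(combo)
    if len(signatures) < n_bacs:
        raise ValueError("not enough distinct pool combinations for all BACs")
    return signatures


def deconvolve(observed_pools, signatures):
    """Return the BACs whose full signature lies within the observed pools."""
    observed = frozenset(observed_pools)
    return [bac for bac, sig in signatures.items() if sig <= observed]


# 10 BACs spread over 5 pools, 3 pools per BAC (C(5, 3) = 10 signatures).
signatures = make_signatures(n_bacs=10, n_pools=5, pools_per_bac=3)
print(deconvolve({0, 2, 3}, signatures))     # a read seen in exactly one signature
print(deconvolve({0, 1, 2, 3}, signatures))  # a read seen in four pools is ambiguous
```

    In this toy design a read observed in more pools than one signature covers matches several BACs, which is one reason a real design has to bound how many clones can share a read.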

  6. Combinatorial pooling enables selective sequencing of the barley gene space.

    Directory of Open Access Journals (Sweden)

    Stefano Lonardi

    2013-04-01

    Full Text Available For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  7. Combinatorial pooling enables selective sequencing of the barley gene space.

    Science.gov (United States)

    Lonardi, Stefano; Duma, Denisa; Alpert, Matthew; Cordero, Francesca; Beccuti, Marco; Bhat, Prasanna R; Wu, Yonghui; Ciardo, Gianfranco; Alsaihati, Burair; Ma, Yaqin; Wanamaker, Steve; Resnik, Josh; Bozdag, Serdar; Luo, Ming-Cheng; Close, Timothy J

    2013-04-01

    For the vast majority of species, including many economically or ecologically important organisms, progress in biological research is hampered by the lack of a reference genome sequence. Despite recent advances in sequencing technologies, several factors still limit the availability of such a critical resource. At the same time, many research groups and international consortia have already produced BAC libraries and physical maps and now are in a position to proceed with the development of whole-genome sequences organized around a physical map anchored to a genetic map. We propose a BAC-by-BAC sequencing protocol that combines combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when preparing sequencing libraries for hundreds or thousands of DNA samples, such as in this case gene-bearing minimum-tiling-path BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of millions of short reads and assign them to the correct BAC clones (deconvolution) so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is very accurate, and the resulting BAC assemblies have high quality. Results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate and the BAC assemblies have good quality. While our method cannot provide the level of completeness that one would achieve with a comprehensive whole-genome sequencing project, we show that it is quite successful in reconstructing the gene sequences within BACs. In the case of plants such as barley, this level of sequence knowledge is sufficient to support critical end-point objectives such as map-based cloning and marker-assisted breeding.

  8. Universal sequence map (USM) of arbitrary discrete sequences

    Directory of Open Access Journals (Sweden)

    Almeida Jonas S

    2002-02-01

    Full Text Available Abstract Background For over a decade the idea of representing biological sequences in a continuous coordinate space has maintained its appeal but not been fully realized. The basic idea is that any sequence of symbols may define trajectories in the continuous space conserving all its statistical properties. Ideally, such a representation would allow scale independent sequence analysis – without the context of fixed memory length. A simple example would consist in being able to infer the homology between two sequences solely by comparing the coordinates of any two homologous units. Results We have successfully identified such an iterative function for bijective mapping ψ of discrete sequences into objects of continuous state space that enable scale-independent sequence analysis. The technique, named Universal Sequence Mapping (USM), is applicable to sequences with an arbitrary length and arbitrary number of unique units and generates a representation where map distance estimates sequence similarity. The novel USM procedure is based on earlier work by these and other authors on the properties of Chaos Game Representation (CGR). The latter enables the representation of 4 unit type sequences (like DNA) as an order free Markov Chain transition table. The properties of USM are illustrated with test data and can be verified for other data by using the accompanying web-based tool: http://bioinformatics.musc.edu/~jonas/usm/. Conclusions USM is shown to enable a statistical mechanics approach to sequence analysis. The scale independent representation frees sequence analysis from the need to assume a memory length in the investigation of syntactic rules.

  9. How Next-Generation Sequencing Has Aided Our Understanding of the Sequence Composition and Origin of B Chromosomes

    Directory of Open Access Journals (Sweden)

    Alevtina Ruban

    2017-10-01

    Full Text Available Accessory, supernumerary, or—most simply—B chromosomes, are found in many eukaryotic karyotypes. These small chromosomes do not follow the usual pattern of segregation, but rather are transmitted in a higher than expected frequency. As increasingly being demonstrated by next-generation sequencing (NGS), their structure comprises fragments of standard (A) chromosomes, although in some plant species, their sequence also includes contributions from organellar genomes. Transcriptomic analyses of various animal and plant species have revealed that, contrary to what used to be the common belief, some of the B chromosome DNA is protein-encoding. This review summarizes the progress in understanding B chromosome biology enabled by the application of next-generation sequencing technology and state-of-the-art bioinformatics. In particular, a contrast is drawn between a direct sequencing approach and a strategy based on a comparative genomics as alternative routes that can be taken towards the identification of B chromosome sequences.

  10. Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

    Science.gov (United States)

    Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

    2017-07-01

    A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using a permutation of the four DNA base codes: adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have now become highly advanced, high-throughput sequencing technologies. So far, DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the bases A, T, G, C appear in Q, and PA + PT + PG + PC = 1. Furthermore, using Quaternion representations we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. Moreover, this scoring matrix produces better, higher-quality match and mismatch scores between two DNA base codes. In the implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. As a result, we found that the Quaternion representations improve the quality of the sequence alignment score, and we can conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
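
    A minimal sketch of how a probability-vector representation might feed a Needleman-Wunsch score is given below. It is written in Python rather than the Octave used by the authors. The abstract does not state how the dot product is turned into match and mismatch scores, so the linear rescaling in score(), the gap penalty, and the use of "N" for a fully ambiguous base are all assumptions made for illustration.

```python
def base_vector(symbol):
    """Map a base code to probabilities (PA, PT, PG, PC) that sum to 1."""
    table = {
        "A": (1.0, 0.0, 0.0, 0.0),
        "T": (0.0, 1.0, 0.0, 0.0),
        "G": (0.0, 0.0, 1.0, 0.0),
        "C": (0.0, 0.0, 0.0, 1.0),
        "N": (0.25, 0.25, 0.25, 0.25),  # assumed: fully ambiguous base
    }
    return table[symbol]


def score(x, y, match=1.0, mismatch=-1.0):
    """Score two base codes from the dot product of their vectors.

    The dot product is 1 for identical unambiguous bases and 0 for different
    ones; it is rescaled linearly onto [mismatch, match] (an assumption).
    """
    dot = sum(p * q for p, q in zip(base_vector(x), base_vector(y)))
    return mismatch + (match - mismatch) * dot


def needleman_wunsch(a, b, gap=-1.0):
    """Return the optimal global alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    f = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        f[i][0] = i * gap
    for j in range(1, cols):
        f[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            f[i][j] = max(
                f[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align two bases
                f[i - 1][j] + gap,                            # gap in b
                f[i][j - 1] + gap,                            # gap in a
            )
    return f[-1][-1]


print(needleman_wunsch("GATTACA", "GATTACA"))
print(needleman_wunsch("GATNACA", "GATTACA"))
```

    With this rescaling an ambiguous base is penalized less than a definite mismatch, which is the kind of graded scoring the abstract attributes to the dot product method.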

  11. Short sequence motifs, overrepresented in mammalian conserved non-coding sequences

    Energy Technology Data Exchange (ETDEWEB)

    Minovitsky, Simon; Stegmaier, Philip; Kel, Alexander; Kondrashov, Alexey S.; Dubchak, Inna

    2007-02-21

    Background: A substantial fraction of non-coding DNA sequences of multicellular eukaryotes is under selective constraint. In particular, ~5 percent of the human genome consists of conserved non-coding sequences (CNSs). CNSs differ from other genomic sequences in their nucleotide composition and must play important functional roles, which mostly remain obscure. Results: We investigated relative abundances of short sequence motifs in all human CNSs present in the human/mouse whole-genome alignments vs. three background sets of sequences: (i) weakly conserved or unconserved non-coding sequences (non-CNSs); (ii) near-promoter sequences (located between nucleotides -500 and -1500, relative to a start of transcription); and (iii) random sequences with the same nucleotide composition as that of CNSs. When compared to non-CNSs and near-promoter sequences, CNSs possess an excess of AT-rich motifs, often containing runs of identical nucleotides. In contrast, when compared to random sequences, CNSs contain an excess of GC-rich motifs which, however, lack CpG dinucleotides. Thus, abundance of short sequence motifs in human CNSs, taken as a whole, is mostly determined by their overall compositional properties and not by overrepresentation of any specific short motifs. These properties are: (i) high AT-content of CNSs, (ii) a tendency, probably due to context-dependent mutation, of A's and T's to clump, (iii) presence of short GC-rich regions, and (iv) avoidance of CpG contexts, due to their hypermutability. Only a small number of short motifs, overrepresented in all human CNSs are similar to binding sites of transcription factors from the FOX family. Conclusion: Human CNSs as a whole appear to be too broad a class of sequences to possess strong footprints of any short sequence-specific functions. Such footprints should be studied at the level of functional subclasses of CNSs, such as those which flank genes with a particular pattern of expression. Overall properties of CNSs are affected by

  12. ABS: Sequence alignment by scanning

    KAUST Repository

    Bonny, Mohamed Talal

    2011-08-01

    Sequence alignment is an essential tool in almost any computational biology research. It processes large database sequences and is considered to be a heavy consumer of computation time. Heuristic algorithms are used to get approximate but fast results. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with the well-known alignment algorithms, FASTA (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to 76% enhancement in alignment score when it is compared with the FASTA algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  13. ABS: Sequence alignment by scanning

    KAUST Repository

    Bonny, Mohamed Talal; Salama, Khaled N.

    2011-01-01

    Sequence alignment is an essential tool in almost any computational biology research. It processes large database sequences and is considered to be a heavy consumer of computation time. Heuristic algorithms are used to get approximate but fast results. We introduce a fast alignment algorithm, called Alignment By Scanning (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with the well-known alignment algorithms, FASTA (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to 76% enhancement in alignment score when it is compared with the FASTA algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  14. Fast global sequence alignment technique

    KAUST Repository

    Bonny, Mohamed Talal

    2011-11-01

    Bioinformatics databases are growing exponentially in size. Processing these large amounts of data may take hours even if supercomputers are used. One of the most important processing tools in bioinformatics is sequence alignment. We introduce a fast alignment algorithm, called 'Alignment By Scanning' (ABS), to provide an approximate alignment of two DNA sequences. We compare our algorithm with the well-known sequence alignment algorithms, 'GAP' (which is heuristic) and 'Needleman-Wunsch' (which is optimal). The proposed algorithm achieves up to 51% enhancement in alignment score when it is compared with the GAP algorithm. The evaluations are conducted using different lengths of DNA sequences. © 2011 IEEE.

  15. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    Science.gov (United States)

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  16. Multilocus Sequence Typing of Total-Genome-Sequenced Bacteria

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Rasmussen, Simon

    2012-01-01

    Accurate strain identification is essential for anyone working with bacteria. For many species, multilocus sequence typing (MLST) is considered the "gold standard" of typing, but it is traditionally performed in an expensive and time-consuming manner. As the costs of whole-genome sequencing (WGS...

  17. Chaos game representation (CGR)-walk model for DNA sequences

    International Nuclear Information System (INIS)

    Jie, Gao; Zhen-Yuan, Xu

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model. (cross-disciplinary physics and related areas of science and technology)
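
    The iterative mapping and the recoverability of the source sequence can be sketched as follows. This is the commonly used CGR construction for DNA, in which each step moves halfway toward the corner assigned to the current nucleotide; the particular corner assignment is a convention assumed here, and the CGR-walk time series and ARFIMA fitting described in the abstract are not implemented. Exact fractions are used so that recovery is not limited by floating-point precision.

```python
from fractions import Fraction

# Corner of the unit square assigned to each nucleotide (an assumed convention).
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}
HALF = Fraction(1, 2)


def cgr_points(seq):
    """Return the CGR coordinate reached after each nucleotide of seq."""
    x, y = HALF, HALF  # start at the centre of the square
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2  # move halfway toward the corner
        points.append((x, y))
    return points


def recover(point, length):
    """Rebuild the last `length` nucleotides from a single CGR coordinate."""
    x, y = point
    bases = []
    for _ in range(length):
        # The quadrant containing the point identifies the latest nucleotide.
        cx, cy = int(x > HALF), int(y > HALF)
        bases.append(next(b for b, c in CORNERS.items() if c == (cx, cy)))
        x, y = 2 * x - cx, 2 * y - cy  # undo one halving step
    return "".join(reversed(bases))


points = cgr_points("GATTACA")
print(points[-1])
print(recover(points[-1], 7))
```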

  18. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery

    Directory of Open Access Journals (Sweden)

    Kirkness Ewen

    2006-10-01

    Full Text Available Abstract Background Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. Results The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. Conclusion We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and

  19. SVX Sequencer Board

    International Nuclear Information System (INIS)

    Utes, M.

    1997-01-01

    The SVX Sequencer boards are 9U by 280mm circuit boards that reside in slots 2 through 21 of each of eight Eurocard crates in the D0 Detector Platform. The basic purpose is to control the SVX chips for data acquisition and when a trigger occurs, to gather the SVX data and relay the data to the VRB boards in the Movable Counting House. Functions and features are as follows: (1) Initialization of eight SVX chip strings using the MIL-STD-1553 data bus; (2) Real time manipulation of the SVX control lines to effect data acquisition, digitization, and readout based on the NRZ/Clock signals from the Controller; (3) Conversion of 8-bit electrical SVX readout data to an optical signal operating at 1.062 Gbit/sec, sent to the VRB. Eight HDIs will be serviced per board; (4) Built-in logic analyzer which can record the most important control and data lines during a data acquisition cycle and put this recorded information onto the 1553 bus; (5) Identification header and end of data trailer tacked onto data stream; (6) 1553 register which can read the current values of the control and data lines; (7) 1553 register which can test the optical link; (8) 1553 registers for crossing pulse width, calibration pulse voltage, and calibration pipeline select; (9) 1553 register for reading the optical drivers status link; (10) 1553 register for power control of SVX chips and ignoring bad SVX strings; (11) Front panel displays and LEDs show the board status at a glance; (12) In-system programmable EPLDs are programmed via 1553 or Altera's 'Bitblaster'; (13) Automatic readout abort after 45 µs; (14) Supplies BUSY signal back to Trigger Framework; (15) Supports a heartbeat system to prevent excessive SVX current draw; and (16) Supports an SVX power trip feature if heartbeat failure occurs.

  20. Sequence Algebra, Sequence Decision Diagrams and Dynamic Fault Trees

    International Nuclear Information System (INIS)

    Rauzy, Antoine B.

    2011-01-01

    A large attention has been focused on the Dynamic Fault Trees in the past few years. By adding new gates to static (regular) Fault Trees, Dynamic Fault Trees aim to take into account dependencies among events. Merle et al. proposed recently an algebraic framework to give a formal interpretation to these gates. In this article, we extend Merle et al.'s work by adopting a slightly different perspective. We introduce Sequence Algebras that can be seen as Algebras of Basic Events, representing failures of non-repairable components. We show how to interpret Dynamic Fault Trees within this framework. Finally, we propose a new data structure to encode sets of sequences of Basic Events: Sequence Decision Diagrams. Sequence Decision Diagrams are very much inspired from Minato's Zero-Suppressed Binary Decision Diagrams. We show that all operations of Sequence Algebras can be performed on this data structure.

  1. Sport Technology

    CSIR Research Space (South Africa)

    Kirkbride, T

    2007-11-01

    Full Text Available Technology is transforming the games themselves and at times with dire consequences. Tony Kirkbride, Head: CSIR Technology Centre said there are a variety of sports technologies and there have been advances in material sciences and advances...

  2. Assistive Technology

    Science.gov (United States)

    ... Assistive technology (AT) is any service or tool that helps ... be difficult or impossible. For older adults, such technology may be a walker to improve mobility or ...

  3. Rhipicephalus microplus dataset of nonredundant raw sequence reads from 454 GS FLX sequencing of Cot-selected (Cot = 660) genomic DNA

    Science.gov (United States)

    A reassociation kinetics-based approach was used to reduce the complexity of genomic DNA from the Deutsch laboratory strain of the cattle tick, Rhipicephalus microplus, to facilitate genome sequencing. Selected genomic DNA (Cot value = 660) was sequenced using 454 GS FLX technology, resulting in 356...

  4. Nano technology

    International Nuclear Information System (INIS)

    Lee, In Sik

    2002-03-01

    This book is an introduction to nano technology. It describes what nano technology is, the alpha and omega of nano technology, the future of Korean nano technology, and the future of humankind with nano technology. Its contents are: the nano period is coming; an engine of creation; what is molecular engineering; a huge nano technology; techniques for making small things; nano materials with exorbitant possibility; the key to the nano world; the most desirable nano technology in the bio industry; the government's nano development plan; the direction of development for nano technology; and children of heart.

  5. Rover Technologies

    Data.gov (United States)

    National Aeronautics and Space Administration — Develop and mature rover technologies supporting robotic exploration including rover design, controlling rovers over time delay and for exploring . Technology...

  6. Chameleon sequences in neurodegenerative diseases

    International Nuclear Information System (INIS)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-01-01

    Chameleon sequences can adopt either an alpha helix, a beta sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into identifying peptides and proteins responsible for neurodegeneration. In this research, we benefitted from the large PDB and performed a sequence analysis on Chameleons, where we developed an algorithm to extract peptide segments with identical sequences, but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity less than 25%. Our data was classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in Chameleons. We also found that there are proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) with the largest number of chameleons (640 and 405 respectively). These proteins have known roles in neurodegenerative diseases. Therefore it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study on them can shed light on curing and preventing neurodegenerative diseases.
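
    The extraction step described above (peptide segments with identical sequences but different structures) might look roughly like the sketch below. The authors' actual algorithm, segment length, and structure assignment are not given in this record, so everything here is an assumption for illustration: each protein is supplied as a sequence plus a same-length secondary-structure string over H (helix), E (strand) and C (coil), only windows with a single uniform state are considered, and the function name find_chameleons and the example peptides are hypothetical.

```python
from collections import defaultdict


def find_chameleons(proteins, k=6):
    """Find k-residue peptides that occur in more than one structural state.

    proteins: iterable of (sequence, structure) pairs of equal length, where
    structure uses H (helix), E (strand) and C (coil).
    Returns a dict mapping each chameleon peptide to a sorted tuple of states.
    """
    states_seen = defaultdict(set)
    for seq, ss in proteins:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure must have equal length")
        for i in range(len(seq) - k + 1):
            window = ss[i:i + k]
            if len(set(window)) == 1:  # keep only uniformly structured windows
                states_seen[seq[i:i + k]].add(window[0])
    return {
        peptide: tuple(sorted(states))
        for peptide, states in states_seen.items()
        if len(states) > 1
    }


# Two made-up proteins sharing the peptide VKLGIV in helix and strand states.
example = [
    ("AAVKLGIVE", "CCHHHHHHC"),
    ("TTVKLGIVS", "CEEEEEEEC"),
]
print(find_chameleons(example))
```

    Grouping the returned state tuples, for example ('E', 'H') against ('C', 'H'), would give counts analogous to the HE, HC and CE classes mentioned in the abstract.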

  7. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of 32P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T1, T2, and S1) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions

  8. Chameleon sequences in neurodegenerative diseases.

    Science.gov (United States)

    Bahramali, Golnaz; Goliaei, Bahram; Minuchehr, Zarrin; Salari, Ali

    2016-03-25

    Chameleon sequences can adopt either an alpha helix, a beta sheet or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into identifying peptides and proteins responsible for neurodegeneration. In this research, we benefitted from the large PDB and performed a sequence analysis on Chameleons, where we developed an algorithm to extract peptide segments with identical sequences, but different structures. In order to find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity less than 25%. Our data was classified into "helix to strand (HE)", "helix to coil (HC)" and "strand to coil (CE)" alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then sorted out the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in Chameleons. We also found that there are proteins such as Insulin Degrading Enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) with the largest number of chameleons (640 and 405 respectively). These proteins have known roles in neurodegenerative diseases. Therefore it can be inferred that other CFPs can serve as key proteins in neurodegeneration, and a study on them can shed light on curing and preventing neurodegenerative diseases. Copyright © 2016 Elsevier Inc. All rights reserved.

  9. Farey sequences and resistor networks

    Indian Academy of Sciences (India)

    Green's function, while the perturbation of a network is investigated in [3]. ... In Theorem 1 below, we employ the Farey sequence to establish a strict .... We next show that the Farey sequence method is applicable for circuits with n or fewer.
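
    For readers unfamiliar with the Farey sequence named in this record, the following Python sketch generates the Farey sequence of order n with the standard next-term recurrence. It illustrates the sequence only; the resistor-network theorems of the paper are not reproduced, and the function name `farey` is an assumption.

```python
from fractions import Fraction


def farey(n):
    """Farey sequence of order n: reduced fractions in [0, 1] with denominator <= n, ascending."""
    if n < 1:
        raise ValueError("order must be a positive integer")
    a, b, c, d = 0, 1, 1, n          # first two terms: 0/1 and 1/n
    seq = [Fraction(a, b)]
    while c <= n:
        k = (n + b) // d
        # Standard recurrence: the next term follows from the two previous ones.
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append(Fraction(a, b))
    return seq


f5 = farey(5)
```

    Adjacent terms a/b and c/d of a Farey sequence satisfy bc - ad = 1, which makes a convenient sanity check for any implementation.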

  10. Chameleon sequences in neurodegenerative diseases

    Energy Technology Data Exchange (ETDEWEB)

    Bahramali, Golnaz [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Goliaei, Bahram, E-mail: goliaei@ut.ac.ir [Institute of Biochemistry and Biophysics, University of Tehran, Tehran (Iran, Islamic Republic of); Minuchehr, Zarrin, E-mail: minuchehr@nigeb.ac.ir [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of); Salari, Ali [Department of Systems Biotechnology, National Institute of Genetic Engineering and Biotechnology, (NIGEB), Tehran (Iran, Islamic Republic of)

    2016-03-25

    Chameleon sequences can adopt either an alpha helix, a beta sheet, or a coil conformation. Defining chameleon sequences in the PDB (Protein Data Bank) may yield insight into the peptides and proteins responsible for neurodegeneration. In this research, we took advantage of the large PDB and performed a sequence analysis of chameleons, developing an algorithm to extract peptide segments with identical sequences but different structures. To find new chameleon sequences, we extracted a set of 8315 non-redundant protein sequences from the PDB with an identity of less than 25%. Our data were classified into “helix to strand (HE)”, “helix to coil (HC)” and “strand to coil (CE)” alterations. We also analyzed the occurrence of singlet and doublet amino acids and the solvent accessibility in the chameleon sequences; we then identified the proteins with the largest number of chameleon sequences and named them Chameleon Flexible Proteins (CFPs) in our dataset. Our data revealed that Gly, Val, Ile, Tyr and Phe are the major amino acids in chameleons. We also found that proteins such as insulin-degrading enzyme (IDE) and GTP-binding nuclear protein Ran (RAN) have the largest numbers of chameleons (640 and 405, respectively). These proteins have known roles in neurodegenerative diseases. It can therefore be inferred that other CFPs may serve as key proteins in neurodegeneration, and that studying them can shed light on curing and preventing neurodegenerative diseases.

  11. Commercial Art: Scope and Sequence.

    Science.gov (United States)

    Nashville - Davidson County Metropolitan Public Schools, TN.

    This scope and sequence guide, developed for a commercial art vocational education program, represents an initial step in the development of a systemwide articulated curriculum sequence for all vocational programs within the Metropolitan Nashville Public School System. It was developed as a result of needs expressed by teachers, parents, and the…

  12. Sequence Capture versus Restriction Site Associated DNA Sequencing for Shallow Systematics.

    Science.gov (United States)

    Harvey, Michael G; Smith, Brian Tilston; Glenn, Travis C; Faircloth, Brant C; Brumfield, Robb T

    2016-09-01

    Sequence capture and restriction site associated DNA sequencing (RAD-Seq) are two genomic enrichment strategies for applying next-generation sequencing technologies to systematics studies. At shallow timescales, such as within species, RAD-Seq has been widely adopted among researchers, although there has been little discussion of the potential limitations and benefits of RAD-Seq and sequence capture. We discuss a series of issues that may impact the utility of sequence capture and RAD-Seq data for shallow systematics in non-model species. We review prior studies that used both methods, and investigate differences between the methods by re-analyzing existing RAD-Seq and sequence capture data sets from a Neotropical bird (Xenops minutus). We suggest that the strengths of RAD-Seq data sets for shallow systematics are the wide dispersion of markers across the genome, the relative ease and cost of laboratory work, the deep coverage and read overlap at recovered loci, and the high overall information that results. Sequence capture's benefits include flexibility and repeatability in the genomic regions targeted, success using low-quality samples, more straightforward read orthology assessment, and higher per-locus information content. The utility of a method in systematics, however, rests not only on its performance within a study, but on the comparability of data sets and inferences with those of prior work. In RAD-Seq data sets, comparability is compromised by low overlap of orthologous markers across species and the sensitivity of genetic diversity in a data set to an interaction between the level of natural heterozygosity in the samples examined and the parameters used for orthology assessment. In contrast, sequence capture of conserved genomic regions permits interrogation of the same loci across divergent species, which is preferable for maintaining comparability among data sets and studies for the purpose of drawing general conclusions about the impact of

  13. Rapid Diagnostics of Onboard Sequences

    Science.gov (United States)

    Starbird, Thomas W.; Morris, John R.; Shams, Khawaja S.; Maimone, Mark W.

    2012-01-01

    Keeping track of sequences onboard a spacecraft is challenging. When reviewing Event Verification Records (EVRs) of sequence executions on the Mars Exploration Rover (MER), operators often found themselves wondering which version of a named sequence the EVR corresponded to. The lack of this information drastically impacts the operators' diagnostic capabilities as well as their situational awareness with respect to the commands the spacecraft has executed, since the EVRs do not provide argument values or explanatory comments. Having this information immediately available can be instrumental in diagnosing critical events and can significantly enhance the overall safety of the spacecraft. This software provides auditing capability that can eliminate that uncertainty while diagnosing critical conditions. Furthermore, the RESTful interface provides a simple way for sequencing tools to automatically retrieve binary compiled sequence SCMFs (Space Command Message Files) on demand. It also enables developers to change the underlying database while maintaining the same interface to the existing applications. The logging capabilities are also beneficial to operators when they are trying to recall how they solved a similar problem many days ago: this software enables automatic recovery of SCMF and RML (Robot Markup Language) sequence files directly from the command EVRs, eliminating the need for people to find and validate the corresponding sequences. To address the lack of auditing capability for sequences onboard a spacecraft during earlier missions, extensive logging support was added on the Mars Science Laboratory (MSL) sequencing server. This server is responsible for generating all MSL binary SCMFs from RML input sequences. The sequencing server logs every SCMF it generates into a MySQL database, as well as the high-level RML file and dictionary name inputs used to create the SCMF. The SCMF is then indexed by a hash value that is automatically included in all command

  14. Accident sequence quantification with KIRAP

    International Nuclear Information System (INIS)

    Kim, Tae Un; Han, Sang Hoon; Kim, Kil You; Yang, Jun Eon; Jeong, Won Dae; Chang, Seung Cheol; Sung, Tae Yong; Kang, Dae Il; Park, Jin Hee; Lee, Yoon Hwan; Hwang, Mi Jeong.

    1997-01-01

    The tasks of probabilistic safety assessment (PSA) consist of the identification of initiating events, the construction of an event tree for each initiating event, the construction of fault trees for the event tree logic, the analysis of reliability data, and finally the accident sequence quantification. In the PSA, accident sequence quantification calculates the core damage frequency and performs importance analysis and uncertainty analysis. Accident sequence quantification requires an understanding of the whole PSA model, because it has to combine all event tree and fault tree models, and it requires an efficient computer code, because the computation takes a long time. The Advanced Research Group of the Korea Atomic Energy Research Institute (KAERI) has developed the PSA workstation KIRAP (Korea Integrated Reliability Analysis Code Package) for PSA work. This report describes the procedures for performing accident sequence quantification, the use of KIRAP's cut set generator, and the method for performing accident sequence quantification with KIRAP. (author). 6 refs
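
    As a rough illustration of what quantifying an accident sequence from minimal cut sets involves, the Python sketch below applies the common rare-event approximation (sum over cut sets of the product of their basic-event values). It does not describe KIRAP's actual code or interface; the function names, event names and numerical values are all hypothetical.

```python
from math import prod


def cut_set_frequency(cut_set, values):
    """Product of the initiating-event frequency and basic-event probabilities in one cut set."""
    return prod(values[event] for event in cut_set)


def sequence_frequency(cut_sets, values):
    """Rare-event approximation: sum of the individual minimal cut set frequencies."""
    return sum(cut_set_frequency(cs, values) for cs in cut_sets)


# Hypothetical values: one initiating-event frequency (per year) and three failure probabilities.
values = {
    "IE_LOCA": 1e-3,
    "PUMP_A_FAILS": 1e-2,
    "PUMP_B_FAILS": 1e-2,
    "OPERATOR_ERROR": 5e-2,
}
# Hypothetical minimal cut sets for a single accident sequence.
cut_sets = [
    ("IE_LOCA", "PUMP_A_FAILS", "PUMP_B_FAILS"),
    ("IE_LOCA", "OPERATOR_ERROR"),
]
total = sequence_frequency(cut_sets, values)
```

    The approximation slightly overestimates the exact result because it ignores overlaps between cut sets, which is usually acceptable when the individual probabilities are small.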

  15. Accident sequence quantification with KIRAP

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Tae Un; Han, Sang Hoon; Kim, Kil You; Yang, Jun Eon; Jeong, Won Dae; Chang, Seung Cheol; Sung, Tae Yong; Kang, Dae Il; Park, Jin Hee; Lee, Yoon Hwan; Hwang, Mi Jeong

    1997-01-01

    The tasks of probabilistic safety assessment (PSA) consist of the identification of initiating events, the construction of an event tree for each initiating event, the construction of fault trees for the event tree logic, the analysis of reliability data, and finally the accident sequence quantification. In the PSA, accident sequence quantification calculates the core damage frequency and performs importance analysis and uncertainty analysis. Accident sequence quantification requires an understanding of the whole PSA model, because it has to combine all event tree and fault tree models, and it requires an efficient computer code, because the computation takes a long time. The Advanced Research Group of the Korea Atomic Energy Research Institute (KAERI) has developed the PSA workstation KIRAP (Korea Integrated Reliability Analysis Code Package) for PSA work. This report describes the procedures for performing accident sequence quantification, the use of KIRAP's cut set generator, and the method for performing accident sequence quantification with KIRAP. (author). 6 refs.

  16. Repeated DNA sequences in fungi

    Energy Technology Data Exchange (ETDEWEB)

    Dutta, S K

    1974-11-01

    Several fungal species, representatives of all broad groups like basidiomycetes, ascomycetes and phycomycetes, were examined for the nature of repeated DNA sequences by DNA:DNA reassociation studies using hydroxyapatite chromatography. All of the fungal species tested contained 10 to 20 percent repeated DNA sequences. There are approximately 100 to 110 copies of repeated DNA sequences, each with a piece size of approximately 4 x 10^7 daltons. Repeated DNA sequence homoduplexes showed on average a 5 °C difference in T_e50 (temperature at which 50 percent of duplexes dissociate) values from the corresponding homoduplexes of unfractionated whole DNA. It is suggested that a part of the repetitive sequences in fungi constitutes mitochondrial DNA and a part of it constitutes nuclear DNA. (auth)

  17. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study is to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and to design more reasonable vaccines to prevent tuberculosis. We assembled the data from high-throughput sequencing with SOAPdenovo software and obtained many contigs and scaffolds. Many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the ends of contigs and performed PCR amplification in order to link these contigs and scaffolds. By using various enzymes for PCR amplification, adjusting the PCR reaction conditions, and combining this with clone construction for sequencing, all the gaps were closed. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank at the National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with a GC content of 65.65%. The problems and strategies of the finishing step of BCG Tice sequencing are described here, in the hope of offering some experience to those who are involved in the finishing step of genome sequencing. The microarray data were verified by our results.
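
    The GC content figure quoted in this abstract is a simple ratio, sketched below in Python. This is a generic calculation, not the authors' pipeline; the function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T")
    # Ambiguity codes such as N are excluded from both numerator and denominator.
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


example = gc_content("ATGCNN")  # toy sequence, two G/C out of four counted bases
```

    Applied to a finished genome, the same ratio would be computed over the full assembled sequence, for instance after reading it from a FASTA file.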

  18. A Framework for Developing Expertise in Engaging the Technological World.

    Science.gov (United States)

    Puk, Tom

    1996-01-01

    Technology education needs a pluralistic approach that begins with a perspective of what a technology is and follows a sequence of levels of engagement with it: functionality, intuitive excellence, conceptual understanding, and self-transcendence. (SK)

  19. Systems genetics of complex diseases using RNA-sequencing methods

    DEFF Research Database (Denmark)

    Mazzoni, Gianluca; Kogelman, Lisette; Suravajhala, Prashanth

    2015-01-01

    Next generation sequencing technologies have enabled the generation of huge quantities of biological data, and nowadays extensive datasets at different ‘omics levels have been generated. Systems genetics is a powerful approach that allows to integrate different ‘omics level and understand the bio...

  20. SNP Discovery In Marine Fish Species By 454 Sequencing

    DEFF Research Database (Denmark)

    Panitz, Frank; Nielsen, Rasmus Ory; van Houdt, Jeroen K J

    2011-01-01

    Based on the 454 Next-Generation-Sequencing technology (Roche) a high throughput screening method was devised in order to generate novel genetic markers (SNPs). SNP discovery was performed for three target species of marine fish: hake (Merluccius merluccius), herring (Clupea harengus) and sole...

  1. Illumina-based de novo transcriptome sequencing and analysis

    Indian Academy of Sciences (India)

    In the present study, we used Illumina HiSeq technology to perform de novo assembly of heart and musk gland transcriptomes from the Chinese forest musk deer. A total of 239,383 transcripts and 176,450 unigenes were obtained, of which 37,329 unigenes were matched to known sequences in the NCBI nonredundant ...

  2. Low-pass whole-genome sequencing in clinical cytogenetics

    DEFF Research Database (Denmark)

    Dong, Zirui; Zhang, Jun; Hu, Ping

    2016-01-01

    Purpose: Chromosomal microarray analysis is the gold standard for copy-number variant (CNV) detection in prenatal and postnatal diagnosis. We aimed to determine whether next-generation sequencing (NGS) technology could be an alternative method for CNV detection in routine clinical application. Me...

  3. Next-generation sequencing approaches to understanding the oral microbiome

    NARCIS (Netherlands)

    Zaura, E.

    2012-01-01

    Until recently, the focus in dental research has been on studying a small fraction of the oral microbiome—so-called opportunistic pathogens. With the advent of next-generation sequencing (NGS) technologies, researchers now have the tools that allow for profiling of the microbiomes and metagenomes at

  4. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have already been used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The technologies most frequently used were PCR followed by capillary electrophoresis and SNP typing arrays, respectively. Here, we developed a noninvasive prenatal paternity testing (NIPAT) method based on SNP typing with maternal plasma DNA sequencing. We evaluated the influencing factors (minor allele frequency (MAF), the total number of SNPs, fetal fraction and effective sequencing depth) and designed three different selective SNP panels... paternity test using STR multiplex system. Our study proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic applications in the future.

  5. Advancing analytical algorithms and pipelines for billions of microbial sequences.

    Science.gov (United States)

    Gonzalez, Antonio; Knight, Rob

    2012-02-01

    The vast number of microbial sequences resulting from sequencing efforts using new technologies require us to re-assess currently available analysis methodologies and tools. Here we describe trends in the development and distribution of software for analyzing microbial sequence data. We then focus on one widely used set of methods, dimensionality reduction techniques, which allow users to summarize and compare these vast datasets. We conclude by emphasizing the utility of formal software engineering methods for the development of computational biology tools, and the need for new algorithms for comparing microbial communities. Such large-scale comparisons will allow us to fulfill the dream of rapid integration and comparison of microbial sequence data sets, in a replicable analytical environment, in order to describe the microbial world we inhabit. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. Haematobia irritans dataset of raw sequence reads from Illumina-based transcriptome sequencing of specific tissues and life stages

    Science.gov (United States)

    Illumina HiSeq technology was used to sequence the transcriptome from various dissected tissues and life stages from the horn fly, Haematobia irritans. These samples include eggs (0, 2, 4, and 9 hours post-oviposition), adult fly gut, adult fly legs, adult fly malpighian tubule, adult fly ovary, adu...

  7. Sparc: a sparsity-based consensus algorithm for long erroneous sequencing reads

    Directory of Open Access Journals (Sweden)

    Chengxi Ye

    2016-06-01

    Full Text Available Motivation. The third generation sequencing (3GS) technology generates long sequences of thousands of bases. However, its current error rates are estimated in the range of 15–40%, significantly higher than those of the prevalent next generation sequencing (NGS) technologies (less than 1%). Fundamental bioinformatics tasks such as de novo genome assembly and variant calling require high-quality sequences that need to be extracted from these long but erroneous 3GS sequences. Results. We describe a versatile and efficient linear complexity consensus algorithm Sparc to facilitate de novo genome assembly. Sparc builds a sparse k-mer graph using a collection of sequences from a targeted genomic region. The heaviest path which approximates the most likely genome sequence is searched through a sparsity-induced reweighted graph as the consensus sequence. Sparc supports using NGS and 3GS data together, which leads to significant improvements in both cost efficiency and computational efficiency. Experiments with Sparc show that our algorithm can efficiently provide high-quality consensus sequences using both PacBio and Oxford Nanopore sequencing technologies. With only 30× PacBio data, Sparc can reach a consensus with error rate <0.5%. With the more challenging Oxford Nanopore data, Sparc can also achieve similar error rate when combined with NGS data. Compared with the existing approaches, Sparc calculates the consensus with higher accuracy, and uses approximately 80% less memory and time. Availability. The source code is available for download at https://github.com/yechengxi/Sparc.

  8. GROUPING WEB ACCESS SEQUENCES USING SEQUENCE ALIGNMENT METHOD

    OpenAIRE

    BHUPENDRA S CHORDIA; KRISHNAKANT P ADHIYA

    2011-01-01

    In web usage mining, grouping of web access sequences can be used to determine the behavior or intent of a set of users. The key problem in grouping web sessions is how to measure the similarity between web sessions. There are many shortcomings in traditional measurement methods. The task of grouping web sessions based on similarity, which consists of maximizing the intra-group similarity while minimizing the inter-group similarity, is done using a sequence alignment method. This paper introduces a new method to group we...

  9. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  10. Electronic technology

    International Nuclear Information System (INIS)

    Kim, Jin Su

    2010-07-01

    This book is composed of five chapters, which introduces electronic technology about understanding of electronic, electronic component, radio, electronic application, communication technology, semiconductor on its basic, free electron and hole, intrinsic semiconductor and semiconductor element, Diode such as PN junction diode, characteristic of junction diode, rectifier circuit and smoothing circuit, transistor on structure of transistor, characteristic of transistor and common emitter circuit, electronic application about electronic equipment, communication technology and education, robot technology and high electronic technology.

  11. LPTAU, Quasi Random Sequence Generator

    International Nuclear Information System (INIS)

    Sobol, Ilya M.

    1993-01-01

    1 - Description of program or function: LPTAU generates quasi-random sequences. These are uniformly distributed sets of L=M N points in the N-dimensional unit cube: I^N = [0,1] x ... x [0,1]. These sequences are used as nodes for multidimensional integration; as searching points in global optimization; as trial points in multi-criteria decision making; as quasi-random points for quasi Monte Carlo algorithms. 2 - Method of solution: Uses LP-TAU sequence generation (see references). 3 - Restrictions on the complexity of the problem: The number of points that can be generated is L 30 . The dimension of the space cannot exceed 51.
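
    To give a feel for what a quasi-random point set looks like and how it is used for integration, the Python sketch below builds a Halton sequence from van der Corput radical inverses. This is deliberately a simpler relative of the LP-TAU (Sobol) construction, not the LPTAU program itself; the function names and the test integrand are assumptions.

```python
def van_der_corput(i, base=2):
    """Radical inverse of the positive integer i in the given base, a value in [0, 1)."""
    x, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom  # mirror the base-b digits of i about the radix point
    return x


def halton(n_points, bases=(2, 3)):
    """First n_points of the Halton sequence, one coprime base per dimension."""
    return [tuple(van_der_corput(i, b) for b in bases) for i in range(1, n_points + 1)]


# Use the points as integration nodes: the integral of x*y over the unit square is 1/4.
points = halton(1000)
estimate = sum(x * y for x, y in points) / len(points)
```

    Because the points fill the unit square more evenly than pseudo-random draws, the estimate typically converges faster than plain Monte Carlo for smooth integrands.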

  12. Weak disorder in Fibonacci sequences

    Energy Technology Data Exchange (ETDEWEB)

    Ben-Naim, E [Theoretical Division and Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); Krapivsky, P L [Department of Physics and Center for Molecular Cybernetics, Boston University, Boston, MA 02215 (United States)

    2006-05-19

    We study how weak disorder affects the growth of the Fibonacci series. We introduce a family of stochastic sequences that grow by the normal Fibonacci recursion with probability 1 - ε, but follow a different recursion rule with a small probability ε. We focus on the weak disorder limit and obtain the Lyapunov exponent that characterizes the typical growth of the sequence elements, using perturbation theory. The limiting distribution for the ratio of consecutive sequence elements is obtained as well. A number of variations to the basic Fibonacci recursion including shift, doubling and copying are considered. (letter to the editor)
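
    The setup in this abstract can be explored numerically with a short Python sketch that estimates the Lyapunov exponent by simulation. The abstract does not spell out the alternative recursion rules, so the rule used here (x_n = x_{n-1} + x_{n-3}) is purely an assumed stand-in, as are the function name and parameters; the paper itself uses perturbation theory rather than simulation.

```python
import math
import random


def lyapunov_estimate(epsilon, steps=100_000, seed=0):
    """Estimate the growth rate (1/n) * ln(x_n) of a disordered Fibonacci sequence.

    With probability 1 - epsilon: x_n = x_{n-1} + x_{n-2} (Fibonacci rule).
    With probability epsilon:     x_n = x_{n-1} + x_{n-3} (assumed alternative rule).
    """
    rng = random.Random(seed)
    a, b, c = 1.0, 1.0, 1.0  # x_{n-3}, x_{n-2}, x_{n-1}
    log_growth = 0.0
    for _ in range(steps):
        nxt = c + a if rng.random() < epsilon else c + b
        # Rescale so the newest term is 1; accumulate the log of the factor removed.
        log_growth += math.log(nxt)
        a, b, c = b / nxt, c / nxt, 1.0
    return log_growth / steps


clean = lyapunov_estimate(0.0)      # close to ln of the golden ratio
disordered = lyapunov_estimate(0.1)
```

    Rescaling at every step avoids floating-point overflow, since the unscaled terms grow exponentially.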

  13. Generation of control sequences for a pilot-disassembly system

    Science.gov (United States)

    Seliger, Guenther; Kim, Hyung-Ju; Keil, Thomas

    2002-02-01

    Closing the product and material cycles has emerged as a paradigm for industry in the 21st century. Disassembly plays a key role in a life cycle economy since it enables the recovery of resources. A partly automated disassembly system should adapt to a large variety of products and different degrees of devaluation. The number of products to be disassembled can also vary strongly. To cope with these demands, an approach to generate on-line disassembly control sequences will be presented. To respond to these demands, the technological feasibility is considered within a procedure for the generation of disassembly control sequences. Procedures are designed to find available and technologically feasible disassembly processes. The control system is formed by modularised and parameterised control units at the cell level within the entire control architecture. In the first development stage, product and process analyses were carried out on a sample product, a washing machine. Furthermore, a generalized disassembly process was defined. Afterwards these processes were structured into primary and secondary functions. In the second stage the disassembly control at the technological level was investigated. Factors were the availability of the disassembly tools and the technological feasibility of the disassembly processes within the disassembly system. Technically alternative disassembly processes are determined as a result of the availability of the tools and the technological feasibility of the processes. The fourth phase was the concept for the generation of the disassembly control sequences. The approach will be validated in a prototypical disassembly system.

  14. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    Science.gov (United States)

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to the lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 MB), S. kudriavzevii (11.18 MB) and C. elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
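
    The read depths discussed in this abstract translate into read counts through the usual relation depth = (number of reads x read length) / genome size. The Python sketch below shows that arithmetic; the function names and the 100 bp read length are assumptions for illustration, not values taken from the study.

```python
import math


def expected_depth(n_reads, read_length_bp, genome_size_bp):
    """Average sequencing depth: total sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(genome_size_bp, depth, read_length_bp):
    """Smallest number of reads whose total length reaches the requested depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)


# Genome size as quoted for E. coli in the abstract (4.6 million bases); 100 bp reads assumed.
ecoli_bp = 4_600_000
ecoli_reads_50x = reads_needed(ecoli_bp, depth=50, read_length_bp=100)
```

    For paired-end data, each pair contributes two reads, so the number of fragments to sequence is half the figure returned here.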

  15. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    Directory of Open Access Journals (Sweden)

    Takeru Nakazato

    Full Text Available High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between the diseases extracted from SRA and their feature profiles. The project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.
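
    The pairing of SRA IDs with PMIDs described in this abstract can be approximated by scanning article text for accession-like strings. The Python sketch below is not the authors' extraction method: the regular expression is only a rough approximation of the SRA/ENA/DDBJ accession format, and the function name, PMIDs and text snippets are hypothetical.

```python
import re

# Approximate pattern: S/E/D prefix, "R", a record-type letter, then at least six digits
# (e.g. SRP000001, ERR000002). This is an assumption and may miss or over-match some IDs.
SRA_ACCESSION = re.compile(r"\b[SED]R[APRSXZ]\d{6,}\b")


def extract_pairs(articles):
    """Return sorted (accession, pmid) pairs found in a {pmid: full_text} mapping."""
    pairs = set()
    for pmid, text in articles.items():
        for accession in SRA_ACCESSION.findall(text):
            pairs.add((accession, pmid))
    return sorted(pairs)


# Hypothetical article texts keyed by made-up PMIDs.
articles = {
    "10000001": "Reads were deposited under SRP000001 and ERR000002.",
    "10000002": "SRX12 is too short to match; see DRX000003 instead.",
}
pairs = extract_pairs(articles)
```

    A production version would likely validate each candidate accession against the archive before accepting the pair, since free text can contain look-alike strings.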

  16. Short read sequence typing (SRST): multi-locus sequence types from short reads

    Directory of Open Access Journals (Sweden)

    Inouye Michael

    2012-07-01

    Full Text Available Abstract Background Multi-locus sequence typing (MLST) has become the gold standard for population analyses of bacterial pathogens. This method focuses on the sequences of a small number of loci (usually seven) to divide the population and is simple, robust and facilitates comparison of results between laboratories and over time. Over the last decade, researchers and population health specialists have invested substantial effort in building up public MLST databases for nearly 100 different bacterial species, and these databases contain a wealth of important information linked to MLST sequence types such as time and place of isolation, host or niche, serotype and even clinical or drug resistance profiles. Recent advances in sequencing technology mean it is increasingly feasible to perform bacterial population analysis at the whole genome level. This offers massive gains in resolving power and genetic profiling compared to MLST, and will eventually replace MLST for bacterial typing and population analysis. However, given the wealth of data currently available in MLST databases, it is crucial to maintain backwards compatibility with MLST schemes so that new genome analyses can be understood in their proper historical context. Results We present a software tool, SRST, for quick and accurate retrieval of sequence types from short read sets, using inputs easily downloaded from public databases. SRST uses read mapping and an allele assignment score incorporating sequence coverage and variability to determine the most likely allele at each MLST locus. Analysis of over 3,500 loci in more than 500 publicly accessible Illumina read sets showed SRST to be highly accurate at allele assignment. SRST output is compatible with common analysis tools such as eBURST, Clonal Frame or PhyloViz, allowing easy comparison between novel genome data and MLST data. Alignment, fastq and pileup files can also be generated for novel alleles. Conclusions SRST is a novel

  17. A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

    Directory of Open Access Journals (Sweden)

    Quail Michael A

    2012-07-01

    Full Text Available Abstract Background Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions All three fast turnaround sequencers evaluated here were able to generate usable sequence. However, there are key differences between the quality of that data and the applications it will support.

  18. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad Isa

    2018-03-01

    Leukemia, better known as blood cancer, is a very deadly form of cancer. Cancer cells can be detected by analysing DNA in a laboratory test. This study focused on the local alignment of leukemia and non-leukemia DNA sequences obtained from NCBI, using the Smith-Waterman algorithm, which was introduced by T. F. Smith and M. S. Waterman in 1981. The algorithm looks for the most similar regions of a pair of sequences by giving a negative value to each unequal base pair (mismatch) and a positive value to each equal base pair (match). The maximum positive value then marks the end of the alignment, and the minimum value (zero) marks its start. This study will use sequences of leukemia and 3 sequences of non-leukemia.
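
    The match/mismatch scoring and the traceback from the maximum cell described above can be sketched in Python. This is a minimal sketch, not the authors' implementation: the function name `smith_waterman`, the scores (match = 2, mismatch = -1) and the linear gap penalty (gap = -1) are assumptions chosen for illustration, since the record does not state the values used in the study.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    The scoring values are illustrative assumptions, not taken from the record.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1] and b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores are floored at zero, which is what makes the alignment local.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("GGGACGTCCC", "TTACGTAA"))
```

    The table costs O(len(a) x len(b)) time and memory, so for long genomic sequences a banded or heuristic aligner is usually preferred.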

  19. Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants

    Directory of Open Access Journals (Sweden)

    Sima Taheri

    2018-02-01

    Full Text Available Microsatellites, or simple sequence repeats (SSRs), are one of the most informative and multi-purpose genetic markers exploited in plant functional genomics. However, the discovery of SSRs and development using traditional methods are laborious, time-consuming, and costly. Recently, the availability of high-throughput sequencing technologies has enabled researchers to identify a substantial number of microsatellites at less cost and effort than traditional approaches. Illumina is a noteworthy transcriptome sequencing technology that is currently used in SSR marker development. Although 454 pyrosequencing datasets can be used for SSR development, this type of sequencing is no longer supported. This review aims to present an overview of the next generation sequencing, with a focus on the efficient use of de novo transcriptome sequencing (RNA-Seq) and related tools for mining and development of microsatellites in plants.
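
    Mining perfect microsatellites from assembled sequence can be illustrated with a short regular-expression scan. This is a minimal sketch under stated assumptions: the function name `find_ssrs` and the minimum repeat counts per motif length are hypothetical illustration values, and the record does not say which tools or thresholds the review recommends.

```python
import re

# Minimum number of tandem copies per motif length (illustrative assumption).
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k in sorted(min_repeats):
        n = min_repeats[k]
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # because the shorter motif length reports those loci.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TC"
    print(find_ssrs(example))
```

    Start positions are 0-based. Compound or imperfect repeats are not handled here and would need an extra merging step.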

  20. Integrated sequence analysis. Final report

    International Nuclear Information System (INIS)

    Andersson, K.; Pyy, P.

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  1. Optimization of sequence alignment for simple sequence repeat regions

    Directory of Open Access Journals (Sweden)

    Ogbonnaya Francis C

    2011-07-01

    Full Text Available Abstract Background Microsatellites, or simple sequence repeats (SSRs), are tandemly repeated DNA sequences, including tandem copies of specific sequences no longer than six bases, that are distributed in the genome. SSR has been used as a molecular marker because it is easy to detect and is used in a range of applications, including genetic diversity, genome mapping, and marker assisted selection. It is also very mutable because of slipping in the DNA polymerase during DNA replication. This unique mutation increases the insertion/deletion (INDEL) mutation frequency to a high ratio, more than in other types of molecular markers such as single nucleotide polymorphisms (SNPs). SNPs are more frequent than INDELs. Therefore, all designed algorithms for sequence alignment fit the vast majority of the genomic sequence without considering microsatellite regions as unique sequences that require special consideration. The old algorithm is limited in its application because there are many overlaps between different repeat units which result in false evolutionary relationships. Findings To overcome the limitation of the aligning algorithm when dealing with SSR loci, a new algorithm was developed using PERL script with a Tk graphical interface. This program is based on aligning sequences after determining the repeated units first, and the last SSR nucleotides positions. This results in a shifting process according to the inserted repeated unit type. When studying the phylogenetic relations before and after applying the new algorithm, many differences in the trees were obtained by increasing the SSR length and complexity. However, less distance between different lineages was observed after applying the new algorithm. Conclusions The new algorithm produces better estimates for aligning SSR loci because it reflects more reliable evolutionary relations between different lineages. It reduces overlapping during SSR alignment, which results in a more realistic

  2. ADDRESS SEQUENCES FOR MULTI RUN RAM TESTING

    Directory of Open Access Journals (Sweden)

    V. N. Yarmolik

    2014-01-01

    Full Text Available A universal approach for generation of address sequences with specified properties is proposed and analyzed. A modified version of the Antonov and Saleev algorithm for Sobol sequences generation is chosen as a mathematical description of the proposed method. Within the framework of the proposed universal approach, the Sobol sequences form a subset of the address sequences. Other subsets are also formed, which are Gray sequences, anti-Gray sequences, counter sequences and sequences with specified properties.
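
    Three of the address-sequence families named above can be illustrated directly in Python. This is a minimal sketch under stated assumptions: it does not implement the modified Antonov and Saleev algorithm or Sobol sequences, the function names are hypothetical, and the anti-Gray construction shown (each (m-1)-bit Gray word followed by its m-bit complement) is one common construction that may differ from the one used in the paper.

```python
def counter_sequence(m):
    """Plain binary counting over all m-bit addresses."""
    return list(range(1 << m))


def gray_sequence(m):
    """Reflected Gray code: consecutive addresses differ in exactly one bit."""
    return [i ^ (i >> 1) for i in range(1 << m)]


def anti_gray_sequence(m):
    """Consecutive addresses differ in m or m-1 bits (assumed construction)."""
    mask = (1 << m) - 1
    out = []
    for g in gray_sequence(m - 1):
        out.append(g)         # Gray word with a leading 0 bit
        out.append(g ^ mask)  # its bitwise complement within m bits
    return out


def hamming(a, b):
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    for name, seq in [("counter", counter_sequence(3)),
                      ("gray", gray_sequence(3)),
                      ("anti-gray", anti_gray_sequence(3))]:
        steps = [hamming(x, y) for x, y in zip(seq, seq[1:])]
        print(name, [format(x, "03b") for x in seq], steps)
```

    Printing the Hamming distance between neighbouring addresses makes the difference between the families visible: always 1 for the Gray sequence, and alternating m and m-1 for the anti-Gray sequence.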

  3. Fast and secure retrieval of DNA sequences

    NARCIS (Netherlands)

    2014-01-01

    Sequence models are retrieved from a sequences index. The sequence models model DNA or RNA sequences stored in a database, and each comprises a finite memory tree source model and parameters for the finite memory tree source model. One or more DNA or RNA sequences stored in the database are

  4. Decidability of uniform recurrence of morphic sequences

    OpenAIRE

    Durand, Fabien

    2012-01-01

    We prove that the uniform recurrence of morphic sequences is decidable. For this we show that the number of derived sequences of uniformly recurrent morphic sequences is bounded. As a corollary we obtain that uniformly recurrent morphic sequences are primitive substitutive sequences.

  5. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  6. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (D) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  7. Field-based species identification in eukaryotes using real-time nanopore sequencing.

    OpenAIRE

    Papadopulos, Alexander; Devey, Dion; Helmstetter, Andrew; Parker, Joe

    2017-01-01

    Advances in DNA sequencing and informatics have revolutionised biology over the past four decades, but technological limitations have left many applications unexplored. Recently, portable, real-time, nanopore sequencing (RTnS) has become available. This offers opportunities to rapidly collect and analyse genomic data anywhere. However, the generation of datasets from large, complex genomes has been constrained to laboratories. The portability and long DNA sequences of RTnS offer great potenti...

  8. Next generation sequencing in clinical medicine: Challenges and lessons for pathology and biomedical informatics

    Directory of Open Access Journals (Sweden)

    Rama R Gullapalli

    2012-01-01

    Full Text Available The Human Genome Project (HGP) provided the initial draft of mankind's DNA sequence in 2001. The HGP was produced by 23 collaborating laboratories using Sanger sequencing of mapped regions as well as shotgun sequencing techniques in a process that occupied 13 years at a cost of ~$3 billion. Today, Next Generation Sequencing (NGS) techniques represent the next phase in the evolution of DNA sequencing technology at dramatically reduced cost compared to traditional Sanger sequencing. A single laboratory today can sequence the entire human genome in a few days for a few thousand dollars in reagents and staff time. Routine whole exome or even whole genome sequencing of clinical patients is well within the realm of affordability for many academic institutions across the country. This paper reviews current sequencing technology methods and upcoming advancements in sequencing technology as well as challenges associated with data generation, data manipulation and data storage. Implementation of routine NGS data in cancer genomics is discussed along with potential pitfalls in the interpretation of the NGS data. The overarching importance of bioinformatics in the clinical implementation of NGS is emphasized. [7] We also review the issue of physician education which also is an important consideration for the successful implementation of NGS in the clinical workplace. NGS technologies represent a golden opportunity for the next generation of pathologists to be at the leading edge of the personalized medicine approaches coming our way. Often under-emphasized issues of data access and control as well as potential ethical implications of whole genome NGS sequencing are also discussed. Despite some challenges, it's hard not to be optimistic about the future of personalized genome sequencing and its potential impact on patient care and the advancement of knowledge of human biology and disease in the near future.

  9. Uncovering of Classical Swine Fever Virus adaptive response to vaccination by Next Generation Sequencing

    DEFF Research Database (Denmark)

    Fahnøe, Ulrik; Orton, Richard; Höper, Dirk

    Next Generation Sequencing (NGS) has rapidly become the preferred technology in nucleotide sequencing, and can be applied to unravel molecular adaptation of RNA viruses such as Classical Swine Fever Virus (CSFV). However, the detection of low frequency variants within viral populations by NGS...... is affected by errors introduced during sample preparation and sequencing, and so far no definitive solution to this problem has been presented....

  10. Civil Engineering Technology Program Guide.

    Science.gov (United States)

    Georgia Univ., Athens. Dept. of Vocational Education.

    This program guide presents civil engineering technology curriculum for technical institutes in Georgia. The general information section contains the following: purpose and objectives; program description, including admissions, typical job titles, and accreditation and certification; and curriculum model, including standard curriculum sequence and…

  11. Complete Genome Sequence of the Probiotic Strain Lactobacillus salivarius LPM01.

    Science.gov (United States)

    Chenoll, Empar; Codoñer, Francisco M; Martinez-Blanch, Juan F; Acevedo-Piérart, Marcelo; Ormeño, M Loreto; Ramón, Daniel; Genovés, Salvador

    2016-11-23

    Lactobacillus salivarius LPM01 (DSM 22150) is a probiotic strain able to improve health status in immunocompromised people. Here, we report its complete genome sequence deciphered by PacBio single-molecule real-time (SMRT) technology. Analysis of the sequence may provide insights into its functional activity and safety assessment. Copyright © 2016 Chenoll et al.

  12. Sequence analysis and over-expression of ribosomal protein S28 ...

    African Journals Online (AJOL)

    RPS28 is a component of the 40S small ribosomal subunit encoded by RPS28 gene, which is specific to eukaryotes. The cDNA and the genomic sequence of RPS28 were cloned successfully from the Giant Panda using RT-PCR technology and Touchdown-PCR, respectively. Both sequences were analyzed preliminarily ...

  13. Genome sequence analysis with MonetDB - A case study on Ebola virus diversity

    NARCIS (Netherlands)

    Cijvat, R.; Manegold, S.; Kersten, M.; Klau, G.W.; Schönhuth, A.; Marschall, T.; Zhang, Y.

    2015-01-01

    Next-generation sequencing (NGS) technology has led the life sciences into the big data era. Today, sequencing genomes takes little time and cost, but yields terabytes of data to be stored and analyzed. Biologists are often exposed to excessively time consuming and error-prone data management and

  14. Genome sequencing of Deutsch strain of cattle ticks, Rhipicephalus microplus: Raw Pac Bio reads.

    Science.gov (United States)

    Pac Bio RS II whole genome shotgun sequencing technology was used to sequence the genome of the cattle tick, Rhipicephalus microplus. The DNA was derived from 14 day old eggs from the Deutsch Texas outbreak strain reared at the USDA-ARS Cattle Fever Tick Research Laboratory, Edinburg, TX. Each corre...

  15. Casting Technology.

    Science.gov (United States)

    Wright, Michael D.; And Others

    1992-01-01

    Three articles discuss (1) casting technology as it relates to industry, with comparisons of shell casting, shell molding, and die casting; (2) evaporative pattern casting for metals; and (3) high technological casting with silicone rubber. (JOW)

  16. Living Technology

    DEFF Research Database (Denmark)

    2010-01-01

    This book is aimed at anyone who is interested in learning more about living technology, whether coming from business, the government, policy centers, academia, or anywhere else. Its purpose is to help people to learn what living technology is, what it might develop into, and how it might impact...... our lives. The phrase 'living technology' was coined to refer to technology that is alive as well as technology that is useful because it shares the fundamental properties of living systems. In particular, the invention of this phrase was called for to describe the trend of our technology becoming...... increasingly life-like or literally alive. Still, the phrase has different interpretations depending on how one views what life is. This book presents nineteen perspectives on living technology. Taken together, the interviews convey the collective wisdom on living technology's power and promise, as well as its...

  17. Technology transfer

    International Nuclear Information System (INIS)

    1998-01-01

    On the basis of the technological opportunities and the environmental targets of the various sectors of the energy system, this paper intends to combine these opportunities and objectives with economic and social development through technology transfer and information dissemination.

  18. Calculation of Tajima's D and other neutrality test statistics from low depth next-generation sequencing data

    DEFF Research Database (Denmark)

    Korneliussen, Thorfinn Sand; Moltke, Ida; Albrechtsen, Anders

    2013-01-01

    A number of different statistics are used for detecting natural selection using DNA sequencing data, including statistics that are summaries of the frequency spectrum, such as Tajima's D. These statistics are now often being applied in the analysis of Next Generation Sequencing (NGS) data. Howeve......, estimates of frequency spectra from NGS data are strongly affected by low sequencing coverage; the inherent technology dependent variation in sequencing depth causes systematic differences in the value of the statistic among genomic regions....
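
    For reference, the classical Tajima's D computed from fully called, aligned sequences can be sketched in Python as below. This is the standard textbook estimator, not the low-depth method of the record, which addresses the bias that arises when the frequency spectrum is estimated from low-coverage NGS data. The function name `tajimas_d` and the example sequences are hypothetical.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Classical Tajima's D for a list of aligned, equal-length sequences."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        return math.nan  # D is undefined without polymorphism

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two estimators of theta, scaled by its std. dev.
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    print(tajimas_d(["TAAA", "ATAA", "AATA", "AAAT"]))  # singletons only
    print(tajimas_d(["AAAA", "AAAT", "AATT", "ATTT"]))  # more intermediate-frequency variants
```

    An excess of rare variants, as in the first example, gives a negative value; an excess of intermediate-frequency variants gives a positive one.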

  19. Identifying structural variants using linked-read sequencing data.

    Science.gov (United States)

    Elyanow, Rebecca; Wu, Hsin-Ta; Raphael, Benjamin J

    2017-11-03

    Structural variation, including large deletions, duplications, inversions, translocations, and other rearrangements, is common in human and cancer genomes. A number of methods have been developed to identify structural variants from Illumina short-read sequencing data. However, reliable identification of structural variants remains challenging because many variants have breakpoints in repetitive regions of the genome and thus are difficult to identify with short reads. The recently developed linked-read sequencing technology from 10X Genomics combines a novel barcoding strategy with Illumina sequencing. This technology labels all reads that originate from a small number (~5-10) DNA molecules ~50Kbp in length with the same molecular barcode. These barcoded reads contain long-range sequence information that is advantageous for identification of structural variants. We present Novel Adjacency Identification with Barcoded Reads (NAIBR), an algorithm to identify structural variants in linked-read sequencing data. NAIBR predicts novel adjacencies in an individual genome resulting from structural variants using a probabilistic model that combines multiple signals in barcoded reads. We show that NAIBR outperforms several existing methods for structural variant identification - including two recent methods that also analyze linked-reads - on simulated sequencing data and 10X whole-genome sequencing data from the NA12878 human genome and the HCC1954 breast cancer cell line. Several of the novel somatic structural variants identified in HCC1954 overlap known cancer genes. Software is available at compbio.cs.brown.edu/software. braphael@princeton.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  20. Next-generation sequencing in schizophrenia and other neuropsychiatric disorders.

    Science.gov (United States)

    Schreiber, Matthew; Dorschner, Michael; Tsuang, Debby

    2013-10-01

    Schizophrenia is a debilitating lifelong illness that lacks a cure and poses a worldwide public health burden. The disease is characterized by a heterogeneous clinical and genetic presentation that complicates research efforts to identify causative genetic variations. This review examines the potential of current findings in schizophrenia and in other related neuropsychiatric disorders for application in next-generation technologies, particularly whole-exome sequencing (WES) and whole-genome sequencing (WGS). These approaches may lead to the discovery of underlying genetic factors for schizophrenia and may thereby identify novel therapeutic targets for this devastating disorder. © 2013 Wiley Periodicals, Inc.

  1. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    of data generation, new bioinformatics approaches have been developed to cope with the large amount of sequencing reads obtained in these experiments. In this chapter, we first introduce HTS technologies and their usage in molecular biology and discuss the problem of mapping sequencing reads...... to their genomic origin. We then in detail describe two approaches that offer very fast heuristics to solve the mapping problem in a feasible runtime. In particular, we describe the BLAT algorithm, and we give an introduction to the Burrows-Wheeler Transform and the mapping algorithms based on this transformation....
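
    The Burrows-Wheeler Transform and the backward search that mapping algorithms build on it can be sketched in a few lines of Python. This is a minimal sketch for exact matching only: the function names are hypothetical, the suffix array is built by naive sorting, and the occurrence table is stored in full, whereas production read mappers use compressed indexes and allow mismatches.

```python
def build_fm_index(text):
    """Return (bwt, suffix_array, C, occ) for text, using '$' as terminator."""
    text += "$"  # '$' sorts before A, C, G, T in ASCII
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around to '$').
    bwt = "".join(text[i - 1] for i in sa)

    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1
    # C[c] = number of characters in the text that are smaller than c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return bwt, sa, C, occ


def backward_search(pattern, index):
    """Return sorted 0-based start positions of exact matches of pattern."""
    bwt, sa, C, occ = index
    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):  # extend the match one character to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    idx = build_fm_index("ACGTACGTGACG")
    print(idx[0])
    print(backward_search("ACG", idx))
```

    Each pattern character costs a constant number of table lookups, so the search time depends on the read length rather than on the genome length.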

  2. Earthing Technology

    NARCIS (Netherlands)

    Blok, Vincent

    2017-01-01

    In this article, we reflect on the conditions under which new technologies emerge in the Anthropocene and raise the question of how to conceptualize sustainable technologies therein. To this end, we explore an eco-centric approach to technology development, called biomimicry. We discuss opposing

  3. Technology Tiers

    DEFF Research Database (Denmark)

    Karlsson, Christer

    2015-01-01

    A technology tier is a level in a product system: final product, system, subsystem, component, or part. As a concept, it contrasts traditional “vertical” special technologies (for example, mechanics and electronics) and focuses “horizontal” feature technologies such as product characteristics...

  4. The Biomolecule Sequencer Project: Nanopore Sequencing as a Dual-Use Tool for Crew Health and Astrobiology Investigations

    Science.gov (United States)

    John, K. K.; Botkin, D. S.; Burton, A. S.; Castro-Wallace, S. L.; Chaput, J. D.; Dworkin, J. P.; Lehman, N.; Lupisella, M. L.; Mason, C. E.; Smith, D. J.

    2016-01-01

    Human missions to Mars will fundamentally transform how the planet is explored, enabling new scientific discoveries through more sophisticated sample acquisition and processing than can currently be implemented in robotic exploration. The presence of humans also poses new challenges, including ensuring astronaut safety and health and monitoring contamination. Because the capability to transfer materials to Earth will be extremely limited, there is a strong need for in situ diagnostic capabilities. Nucleotide sequencing is a particularly powerful tool because it can be used to: (1) mitigate microbial risks to crew by allowing identification of microbes in water, in air, and on surfaces; (2) identify optimal treatment strategies for infections that arise in crew members; and (3) track how crew members, microbes, and mission-relevant organisms (e.g., farmed plants) respond to conditions on Mars through transcriptomic and genomic changes. Sequencing would also offer benefits for science investigations occurring on the surface of Mars by permitting identification of Earth-derived contamination in samples. If Mars contains indigenous life, and that life is based on nucleic acids or other closely related molecules, sequencing would serve as a critical tool for the characterization of those molecules. Therefore, spaceflight-compatible nucleic acid sequencing would be an important capability for both crew health and astrobiology exploration. Advances in sequencing technology on Earth have been driven largely by needs for higher throughput and read accuracy. Although some reduction in size has been achieved, nearly all commercially available sequencers are not compatible with spaceflight due to size, power, and operational requirements. Exceptions are nanopore-based sequencers that measure changes in current caused by DNA passing through pores; these devices are inherently much smaller and require significantly less power than sequencers using other detection methods

  5. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  6. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  7. SRComp: short read sequence compression using burstsort and Elias omega coding.

    Directory of Open Access Journals (Sweden)

    Jeremy John Selva

    Full Text Available Next-generation sequencing (NGS) technologies permit the rapid production of vast amounts of data at low cost. Economical data storage and transmission hence becomes an increasingly important challenge for NGS experiments. In this paper, we introduce a new non-reference based read sequence compression tool called SRComp. It works by first employing a fast string-sorting algorithm called burstsort to sort read sequences in lexicographical order and then Elias omega-based integer coding to encode the sorted read sequences. SRComp has been benchmarked on four large NGS datasets, where experimental results show that it can run 5-35 times faster than current state-of-the-art read sequence compression tools such as BEETL and SCALCE, while retaining comparable compression efficiency for large collections of short read sequences. SRComp is a read sequence compression tool that is particularly valuable in certain applications where compression time is of major concern.
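
    Elias omega coding, the integer code named above, can be sketched in Python as follows. This is a minimal sketch of the code itself, using character strings of '0' and '1' for readability. How SRComp maps sorted reads to integers is not described in the record, so that step is not shown, and the function names are hypothetical.

```python
def elias_omega_encode(n):
    """Encode a positive integer as an Elias omega bit string."""
    if n < 1:
        raise ValueError("Elias omega codes are defined for integers >= 1")
    code = "0"  # terminating zero
    while n > 1:
        group = bin(n)[2:]   # binary representation of n, always starting with '1'
        code = group + code  # prepend the group
        n = len(group) - 1   # next, encode the length of this group minus one
    return code


def elias_omega_decode(bits, pos=0):
    """Decode one integer starting at bits[pos]; return (value, next_pos)."""
    n = 1
    while bits[pos] == "1":
        length = n + 1  # the next group is n + 1 bits long
        n = int(bits[pos:pos + length], 2)
        pos += length
    return n, pos + 1   # skip the terminating zero


def decode_stream(bits):
    """Decode a concatenation of Elias omega codes into a list of integers."""
    values, pos = [], 0
    while pos < len(bits):
        value, pos = elias_omega_decode(bits, pos)
        values.append(value)
    return values


if __name__ == "__main__":
    numbers = [1, 2, 4, 16, 100]
    stream = "".join(elias_omega_encode(n) for n in numbers)
    print(stream, decode_stream(stream))
```

    The code is self-delimiting, so values can be concatenated without separators, and small integers receive the shortest codes.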

  8. Transformed composite sequences for improved qubit addressing

    Science.gov (United States)

    Merrill, J. True; Doret, S. Charles; Vittorini, Grahame; Addison, J. P.; Brown, Kenneth R.

    2014-10-01

    Selective laser addressing of a single atom or atomic ion qubit can be improved using narrow-band composite pulse sequences. We describe a Lie-algebraic technique to generalize known narrow-band sequences and introduce sequences related by dilation and rotation of sequence generators. Our method improves known narrow-band sequences by decreasing both the pulse time and the residual error. Finally, we experimentally demonstrate these composite sequences using 40Ca+ ions trapped in a surface-electrode ion trap.

  9. Characteristics of alternating current hopping conductivity in DNA sequences

    International Nuclear Information System (INIS)

    Song-Shan, Ma; Hui, Xu; Huan-You, Wang; Rui, Guo

    2009-01-01

    This paper presents a model to describe the alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system and electrons are transported via hopping between localized states. It finds that the AC conductivity of DNA sequences increases as the frequency of the external electric field rises, and that it takes the form $\sigma_{ac}(\omega) \sim \omega^{2}\ln^{2}(1/\omega)$. The AC conductivity of DNA sequences also increases with temperature, although this dependence is weak. Meanwhile, at low temperatures the AC conductivity in the off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport of DNA sequences. For p < 0.5, the conductivity of a DNA sequence decreases as p increases, while for p ≥ 0.5, the conductivity increases as p increases. (cross-disciplinary physics and related areas of science and technology)

  10. Efficient alignment of pyrosequencing reads for re-sequencing applications

    Directory of Open Access Journals (Sweden)

    Russo Luis MS

    2011-05-01

    Full Text Available Background: Over the past few years, new massively parallel DNA sequencing technologies have emerged. These platforms generate massive amounts of data per run, greatly reducing the cost of DNA sequencing. However, these techniques also raise important computational difficulties, mostly due to the huge volume of data produced, but also because of some of their specific characteristics such as read length and sequencing errors. Among the most critical problems is that of efficiently and accurately mapping reads to a reference genome in the context of re-sequencing projects. Results: We present an efficient method for the local alignment of pyrosequencing reads produced by the GS FLX (454) system against a reference sequence. Our approach explores the characteristics of the data in these re-sequencing applications and uses state-of-the-art indexing techniques combined with a flexible seed-based approach, leading to a fast and accurate algorithm which needs very little user parameterization. An evaluation performed using real and simulated data shows that our proposed method outperforms a number of mainstream tools on the quantity and quality of successful alignments, as well as on the execution time. Conclusions: The proposed methodology was implemented in a software tool called TAPyR (Tool for the Alignment of Pyrosequencing Reads), which is publicly available from http://www.tapyr.net.

  11. Sequence tagging reveals unexpected modifications in toxicoproteomics

    Science.gov (United States)

    Dasari, Surendra; Chambers, Matthew C.; Codreanu, Simona G.; Liebler, Daniel C.; Collins, Ben C.; Pennington, Stephen R.; Gallagher, William M.; Tabb, David L.

    2010-01-01

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications. PMID:21214251

  12. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

    Science.gov (United States)

    Kwok, Hin; Chiang, Alan Kwok Shing

    2016-02-24

    Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  13. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes

    Directory of Open Access Journals (Sweden)

    Hin Kwok

    2016-02-01

    Full Text Available Genomic sequences of Epstein–Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

  14. Targeted amplicon sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics.

    Science.gov (United States)

    Bybee, Seth M; Bracken-Grissom, Heather; Haynes, Benjamin D; Hermansen, Russell A; Byers, Robert L; Clement, Mark J; Udall, Joshua A; Wilcox, Edward R; Crandall, Keith A

    2011-01-01

    Next-gen sequencing technologies have revolutionized data collection in genetic studies and advanced genome biology to novel frontiers. However, to date, next-gen technologies have been used principally for whole genome sequencing and transcriptome sequencing. Yet many questions in population genetics and systematics rely on sequencing specific genes of known function or diversity levels. Here, we describe a targeted amplicon sequencing (TAS) approach capitalizing on next-gen capacity to sequence large numbers of targeted gene regions from a large number of samples. Our TAS approach is easily scalable, simple in execution, neither time- nor labor-intensive, relatively inexpensive, and can be applied to a broad diversity of organisms and/or genes. Our TAS approach includes a bioinformatic application, BarcodeCrucher, that takes raw next-gen sequence reads, performs quality control checks and converts the data into FASTA format organized by gene and sample, ready for phylogenetic analyses. We demonstrate our approach by sequencing targeted genes of known phylogenetic utility to estimate a phylogeny for the Pancrustacea. We generated data from 44 taxa using 68 different 10-bp multiplexing identifiers. The overall quality of data produced was robust and was informative for phylogeny estimation. The potential for this method to produce copious amounts of data from a single 454 plate (e.g., 325 taxa for 24 loci) significantly reduces sequencing expenses incurred from traditional Sanger sequencing. We further discuss the advantages and disadvantages of this method, while offering suggestions to enhance the approach.
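
    The step of sorting reads by their 10-bp multiplexing identifiers and writing them out as FASTA can be sketched as follows. This is an illustrative sketch only, not the BarcodeCrucher implementation: the function names, the exact-match rule for identifiers, and the sample names are all assumptions, and the sorting by gene and the quality control checks mentioned in the abstract are left out.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_len=10):
    """Group reads by the identifier at their 5' end.

    reads: iterable of (read_id, sequence) pairs.
    barcode_to_sample: dict mapping identifier sequence -> sample name.
    Returns (by_sample, unassigned); the identifier is trimmed from assigned reads.
    """
    by_sample = defaultdict(list)
    unassigned = []
    for read_id, seq in reads:
        tag, insert = seq[:barcode_len], seq[barcode_len:]
        sample = barcode_to_sample.get(tag)   # exact match only (assumption)
        if sample is None:
            unassigned.append((read_id, seq))
        else:
            by_sample[sample].append((read_id, insert))
    return dict(by_sample), unassigned


def to_fasta(records):
    """Format (read_id, sequence) pairs as FASTA text."""
    return "".join(f">{read_id}\n{seq}\n" for read_id, seq in records)


if __name__ == "__main__":
    # Hypothetical identifiers and reads, for illustration only.
    barcodes = {"ACGTACGTAC": "sample_A", "TGCATGCATG": "sample_B"}
    reads = [
        ("r1", "ACGTACGTACGGGTTTAAA"),
        ("r2", "TGCATGCATGCCCAAATTT"),
        ("r3", "NNNNNNNNNNGGGGGGGGG"),
    ]
    grouped, rest = demultiplex(reads, barcodes)
    for sample, records in grouped.items():
        print(sample)
        print(to_fasta(records), end="")
    print(len(rest), "read(s) unassigned")
```

    Exact matching keeps the sketch short; a production tool would typically also tolerate a small number of sequencing errors in the identifier.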

  15. Sensemaking technologies

    DEFF Research Database (Denmark)

    Madsen, Charlotte Øland

    Research scope: The scope of the project is to study technological implementation processes by using Weick's sensemaking concept (Weick, 1995). The purpose of using a social constructivist approach to investigate technological implementation processes is to find out how new technologies transform......, Orlikowski 2000). Viewing the use of technology as a process of enactment opens up for investigating the social processes of interpreting new technology into the organisation (Orlikowski 2000). The scope of the PhD project will therefore be to gain a deeper understanding of how the enactment of new...... & Brass, 1990; Kling 1991; Orlikowski 2000). It also demonstrates that technology is a flexible variable adapted to the organisation's needs, culture, climate and management philosophy, thus leading to different uses and outcomes of the same technology in different organisations (Barley 1986; 1990...

  16. Technology roadmaps

    Energy Technology Data Exchange (ETDEWEB)

    Pearson, B. [Natural Resources Canada, Ottawa, ON (Canada). CANMET Energy Technology Centre

    2003-07-01

    The purpose of a technology road map is to define the state of a current technology, relevant market issues, and future market needs; to develop a plan that industry can follow to provide these new products and services; and to map technology pathways and performance goals for bringing these products and services to market. The three stages (planning, implementation, and reviewing and updating), benefits, and status of the Clean Coal Technology Roadmap are outlined. Action Plan 2000, a $1.7 million 2000 Climate Change Technology and Innovation Program, which uses the technology roadmapping process, is described. The members of the management steering committee for the Clean Coal Technology Roadmap are listed. A flowsheet showing activities until November 2004, when the final clean coal road map is due, is included.

  17. Sequences, groups, and number theory

    CERN Document Server

    Rigo, Michel

    2018-01-01

    This collaborative book presents recent trends on the study of sequences, including combinatorics on words and symbolic dynamics, and new interdisciplinary links to group theory and number theory. Other chapters branch out from those areas into subfields of theoretical computer science, such as complexity theory and theory of automata. The book is built around four general themes: number theory and sequences, word combinatorics, normal numbers, and group theory. Those topics are rounded out by investigations into automatic and regular sequences, tilings and theory of computation, discrete dynamical systems, ergodic theory, numeration systems, automaton semigroups, and amenable groups.  This volume is intended for use by graduate students or research mathematicians, as well as computer scientists who are working in automata theory and formal language theory. With its organization around unified themes, it would also be appropriate as a supplemental text for graduate level courses.

  18. Explaining the harmonic sequence paradox.

    Science.gov (United States)

    Schmidt, Ulrich; Zimper, Alexander

    2012-05-01

    According to the harmonic sequence paradox, an expected utility decision maker's willingness to pay for a gamble whose expected payoffs evolve according to the harmonic series is finite if and only if his marginal utility of additional income becomes zero for rather low payoff levels. Since the assumption of zero marginal utility is implausible for finite payoff levels, expected utility theory - as well as its standard generalizations such as cumulative prospect theory - are apparently unable to explain a finite willingness to pay. This paper presents first an experimental study of the harmonic sequence paradox. Additionally, it demonstrates that the theoretical argument of the harmonic sequence paradox only applies to time-patient decision makers, whereas the paradox is easily avoided if time-impatience is introduced. ©2011 The British Psychological Society.

  19. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley.

    Directory of Open Access Journals (Sweden)

    Martin Mascher

    Full Text Available The rapid development of next-generation sequencing platforms has enabled the use of sequencing for routine genotyping across a range of genetics studies and breeding applications. Genotyping-by-sequencing (GBS), a low-cost, reduced representation sequencing method, is becoming a common approach for whole-genome marker profiling in many species. With quickly developing sequencing technologies, adapting current GBS methodologies to new platforms will leverage these advancements for future studies. To test new semiconductor sequencing platforms for GBS, we genotyped a barley recombinant inbred line (RIL) population. Based on a previous GBS approach, we designed bar code and adapter sets for the Ion Torrent platforms. Four sets of 24-plex libraries were constructed consisting of 94 RILs and the two parents and sequenced on two Ion platforms. In parallel, a 96-plex library of the same RILs was sequenced on the Illumina HiSeq 2000. We applied two different computational pipelines to analyze sequencing data: the reference-independent TASSEL pipeline and a reference-based pipeline using SAMtools. Sequence contigs positioned on the integrated physical and genetic map were used for read mapping and variant calling. We found high agreement in genotype calls between the different platforms and high concordance between genetic and reference-based marker order. There was, however, a paucity of SNPs that were jointly discovered by the different pipelines, indicating a strong effect of alignment and filtering parameters on SNP discovery. We show the utility of the current barley genome assembly as a framework for developing very low-cost genetic maps, facilitating high resolution genetic mapping and negating the need for developing de novo genetic maps for future studies in barley. Through demonstration of GBS on semiconductor sequencing platforms, we conclude that the GBS approach is amenable to a range of platforms and can easily be modified as new

  20. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is fewer than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumosin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  1. Appropriate Technology as Indian Technology.

    Science.gov (United States)

    Barry, Tom

    1979-01-01

    Describes the mounting enthusiasm of Indian communities for appropriate technology as an inexpensive means of providing much needed energy and job opportunities. Describes the development of several appropriate technology projects, and the goals and activities of groups involved in utilizing low scale solar technology for economic development on…

  2. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subproject 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed in a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA, HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally, the project is discussed and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply them in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  3. Matrix transformations and sequence spaces

    International Nuclear Information System (INIS)

    Nanda, S.

    1983-06-01

    In most cases the most general linear operator from one sequence space into another is actually given by an infinite matrix and therefore the theory of matrix transformations has always been of great interest in the study of sequence spaces. The study of general theory of matrix transformations was motivated by the special results in summability theory. This paper is a review article which gives almost all known results on matrix transformations. This also suggests a number of open problems for further study and will be very useful for research workers. (author)

  4. Green's theorem and Gorenstein sequences

    OpenAIRE

    Ahn, Jeaman; Migliore, Juan C.; Shin, Yong-Su

    2016-01-01

    We study consequences, for a standard graded algebra, of extremal behavior in Green's Hyperplane Restriction Theorem. First, we extend his Theorem 4 from the case of a plane curve to the case of a hypersurface in a linear space. Second, assuming a certain Lefschetz condition, we give a connection to extremal behavior in Macaulay's theorem. We apply these results to show that $(1,19,17,19,1)$ is not a Gorenstein sequence, and as a result we classify the sequences of the form $(1,a,a-2,a,1)$ th...

  5. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    Energy Technology Data Exchange (ETDEWEB)

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. Methodology/Principal Findings. High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  6. CREST--classification resources for environmental sequence tags.

    Directory of Open Access Journals (Sweden)

    Anders Lanzén

    Full Text Available Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com.
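
    The lowest common ancestor step mentioned in the abstract above can be sketched in a few lines. This is a generic illustration of the idea, not CREST's code: the function name and the example lineages are hypothetical, each lineage is assumed to be listed from the root down, and the rank similarity criteria described in the abstract are not modelled.

```python
def lowest_common_ancestor(lineages):
    """Return the deepest taxonomic path shared by all lineages.

    lineages: list of lists, each ordered from root to most specific taxon,
    e.g. the lineages of the best-scoring alignment hits for one read.
    """
    if not lineages:
        return []
    common = []
    # zip stops at the shortest lineage, so ranks beyond it are never compared
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            common.append(ranks[0])
        else:
            break
    return common


if __name__ == "__main__":
    # Hypothetical hit lineages for a single environmental sequence.
    hits = [
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Vibrionales"],
        ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Alteromonadales"],
        ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
    ]
    print(lowest_common_ancestor(hits))
```

    Assigning a read to the deepest node shared by all of its good hits trades resolution for safety: a read whose hits disagree at class level is reported at phylum level rather than forced into one class.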

  7. Next generation sequencing reveals the hidden diversity of zooplankton assemblages.

    Directory of Open Access Journals (Sweden)

    Penelope K Lindeque

    Full Text Available BACKGROUND: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. METHODOLOGY/PRINCIPAL FINDINGS: Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. CONCLUSIONS: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may
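
    Grouping sequences into operational taxonomic units at a 97% similarity cut-off can be illustrated with a greedy sketch. The abstract does not say which clustering program was used, so this is an assumed, simplified scheme rather than the study's method: each sequence joins the first existing cluster whose representative it matches at or above the threshold, and identity is computed position by position on equal-length, pre-aligned sequences. The function names are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def greedy_otus(sequences, threshold=0.97):
    """Cluster sequences greedily; the first member of a cluster is its representative."""
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            if identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])   # no representative was close enough
    return clusters


if __name__ == "__main__":
    a = "A" * 100
    b = "A" * 98 + "CC"          # 98% identical to a
    c = "A" * 90 + "C" * 10      # 90% identical to a
    otus = greedy_otus([a, b, c])
    print(len(otus), [len(o) for o in otus])
```

    Greedy schemes of this kind depend on input order, which is one reason OTU counts from different pipelines are not directly comparable.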

  8. Massively parallel interrogation of aptamer sequence, structure and function.

    Directory of Open Access Journals (Sweden)

    Nicholas O Fischer

    Full Text Available BACKGROUND: Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. METHODOLOGY/PRINCIPAL FINDINGS: High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and inter-chip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. CONCLUSION AND SIGNIFICANCE: The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and support a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  9. Genome-wide SNP identification by high-throughput sequencing and selective mapping allows sequence assembly positioning using a framework genetic linkage map

    Directory of Open Access Journals (Sweden)

    Xu Xiangming

    2010-12-01

    Full Text Available Background: Determining the position and order of contigs and scaffolds from a genome assembly within an organism's genome remains a technical challenge in a majority of sequencing projects. In order to exploit contemporary technologies for DNA sequencing, we developed a strategy for whole genome single nucleotide polymorphism sequencing allowing the positioning of sequence contigs onto a linkage map using the bin mapping method. Results: The strategy was tested on a draft genome of the fungal pathogen Venturia inaequalis, the causal agent of apple scab, and further validated using sequence contigs derived from the diploid plant genome Fragaria vesca. Using our novel method we were able to anchor 70% and 92% of sequence assemblies for V. inaequalis and F. vesca, respectively, to genetic linkage maps. Conclusions: We demonstrated the utility of this approach by accurately determining the bin map positions of the majority of the large sequence contigs from each genome sequence and validated our method by mapping single sequence repeat markers derived from sequence contigs on a full mapping population.

  10. Technology '90

    International Nuclear Information System (INIS)

    1991-01-01

    The US Department of Energy (DOE) laboratories have a long history of excellence in performing research and development in a number of areas, including the basic sciences, applied-energy technology, and weapons-related technology. Although technology transfer has always been an element of DOE and laboratory activities, it has received increasing emphasis in recent years as US industrial competitiveness has eroded and efforts have increased to better utilize the research and development resources the laboratories provide. This document, Technology '90, is the latest in a series that is intended to communicate some of the many opportunities available for US industry and universities to work with the DOE and its laboratories in the vital activity of improving technology transfer to meet national needs. Technology '90 is divided into three sections: Overview, Technologies, and Laboratories. The Overview section describes the activities and accomplishments of the DOE research and development program offices. The Technologies section provides descriptions of new technologies developed at the DOE laboratories. The Laboratories section presents information on the missions, programs, and facilities of each laboratory, along with a name and telephone number of a technology transfer contact for additional information. Separate papers were prepared for appropriate sections of this report

  11. A case history of technology transfer

    Science.gov (United States)

    1981-01-01

    A sequence of events occurring over the last 25 years is described, chronicling the evolution of ion-bombardment electric propulsion technology. Emphasis is placed on the latter phases of this evolution, where special efforts were made to pave the way toward the use of this technology in operational space flight systems. These efforts consisted of a planned program to focus the technology toward its end applications and an organized process that was followed to transfer the technology from the research-technology NASA Center to the user-development NASA Center and its industry team. Major milestones in this evolution, which are described, include the development of thruster technology across a large size range, the successful completion of two space electric rocket tests, SERT I and SERT II, development of power-processing technology for electric propulsion, completion of a program to make the technology ready for flight system development, and finally the technology transfer events.

  12. Getting complete genomes from complex samples using nanopore sequencing

    DEFF Research Database (Denmark)

    Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Albertsen, Mads

    Background Short read DNA sequencing and metagenomic binning workflows have made it possible to extract bacterial genome bins from environmental microbial samples containing hundreds to thousands of different species. However, these genome bins often do not represent complete genomes......, as they are mostly fragmented, incomplete and often contaminated with foreign DNA. The value of these `draft genomes` have limited, lasting value to the scientific community, as gene synteny is broken and there is some uncertainty of what is missing1. The genetic material most often missed is important multi......-copy and/or conserved marker genes such as the 16S rRNA gene, as sequence micro-heterogeneity prevents assembly of these genes in the de novo assembly. However, long read sequencing technologies are emerging promising an end to fragmented genome assemblies2. Experimental design We extracted DNA from a full...

  13. Sequences in language and text

    CERN Document Server

    Mikros, George K

    2015-01-01

    The aim of this volume is to present the diverse but highly interesting area of the quantitative analysis of the sequence of various linguistic structures. The collected articles present a wide spectrum of quantitative analyses of linguistic syntagmatic structures and explore novel sequential linguistic entities. This volume will be interesting to all researchers studying linguistics using quantitative methods.

  14. Probabilistic studies of accident sequences

    International Nuclear Information System (INIS)

    Villemeur, A.; Berger, J.P.

    1986-01-01

    For several years, Electricite de France has carried out probabilistic assessment of accident sequences for nuclear power plants. In the framework of this program many methods were developed. As the interest in these studies was increasing and as adapted methods were developed, Electricite de France has undertaken a probabilistic safety assessment of a nuclear power plant [fr

  15. MRI sequences and their parameters

    International Nuclear Information System (INIS)

    Teissier, J.M.

    1993-01-01

    Listing basic sequences and their present variants makes a synthetic classification of the various acquisition modes possible. The knowledge of the advantages of each of them, as well as of their disadvantages and restraints, seems to be an essential prerequisite to an optimal utilization of each magnetic resonance imaging system. (author)

  16. Degree sequence in message transfer

    Science.gov (United States)

    Yamuna, M.

    2017-11-01

    Message encryption is a persistent issue in current communication scenarios. Methods are being devised using various domains. Graphs satisfy numerous unique properties which can be used for message transfer. In this paper, I propose a message encryption method based on the degree sequence of graphs.
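
    As a small illustration of the object this record refers to, the sketch below computes the degree sequence of an undirected graph from an edge list and applies the Havel-Hakimi test to check whether a list of integers can be a degree sequence at all. It does not reproduce the encryption method of the paper, which the abstract does not describe; the function names `degree_sequence` and `is_graphical` and the toy edge list are assumptions made for this example.

```python
from collections import Counter


def degree_sequence(edges):
    """Return the non-increasing degree sequence of a simple undirected graph.

    Only vertices that appear in at least one edge are counted, so isolated
    vertices are not represented in this sketch.
    """
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return sorted(degrees.values(), reverse=True)


def is_graphical(sequence):
    """Havel-Hakimi test: can `sequence` be realised by a simple graph?"""
    seq = sorted(sequence, reverse=True)
    if any(d < 0 for d in seq):
        return False
    while seq and seq[0] > 0:
        d = seq.pop(0)          # take the largest remaining degree
        if d > len(seq):        # not enough other vertices to connect to
            return False
        for i in range(d):      # connect to the d next-largest vertices
            seq[i] -= 1
            if seq[i] < 0:
                return False
        seq.sort(reverse=True)
    return True


# Toy path graph a - b - c, invented for illustration.
toy_edges = [("a", "b"), ("b", "c")]
toy_degrees = degree_sequence(toy_edges)
```

    Different graphs can share one degree sequence, so any scheme built on it has to fix additional conventions; the abstract does not state which ones are used.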

  17. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  18. On primes in Lucas sequences

    Czech Academy of Sciences Publication Activity Database

    Křížek, Michal; Somer, L.

    2015-01-01

    Roč. 53, č. 1 (2015), s. 2-23 ISSN 0015-0517 R&D Projects: GA ČR GA14-02067S Institutional support: RVO:67985840 Keywords : Lucas sequence * primes Subject RIV: BA - General Mathematics http://www.fq.math.ca/Abstracts/53-1/somer.pdf

  19. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods To provide a more comprehensive and complete transcriptome of An. sinensis, RNA from eggs, larvae, pupae, male adults and female adults was pooled for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed for genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content of 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified certain numbers of An. sinensis unigenes that showed homology, or were putative 1:1 orthologues, with genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic, genomic studies, and further vector control of An. sinensis. PMID:25000941
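
    The assembly statistics quoted in this record (N50 and GC content) have standard definitions that can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the function names `n50` and `gc_content` and the toy inputs are invented here, and the code is not the pipeline used by the authors.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def gc_content(sequence):
    """Return the fraction of G and C bases in a DNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


# Toy unigene lengths and sequence, invented for illustration only.
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)
toy_gc = gc_content("ATGCGC")
```

    Because N50 weights long sequences by the bases they contain, it can sit well above the mean length, which is consistent with the reported N50 (711 bp) exceeding the reported average length (571 bp).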

  20. Soulful Technologies

    DEFF Research Database (Denmark)

    Fausing, Bent

    2010-01-01

    In 2008 Samsung introduced a mobile phone called "Soul", made with a human touch and itself including a "magic touch". Through the analysis of Nokia mobile phone TV commercials, I want to examine the function and form of digital technology in everyday images. The mobile phone and its digital camera...... and other devices are depicted by everyday aesthetics as capable of producing a unique human presence and interaction. The medium, the technology is a necessary helper of this very special and lost humanity. Without the technology, no special humanity, no soul - such is the prophecy. This personification...... or anthropomorphism is important for the branding of new technology. Technology is seen as creating a techno-transcendence towards a more qualified humanity which is in contact with fundamental human values like intuition, vision, and sensing; all the qualities that technology, industrialization, and rationalization...

  1. Application of whole genome shotgun sequencing for detection and characterization of genetically modified organisms and derived products.

    Science.gov (United States)

    Holst-Jensen, Arne; Spilsberg, Bjørn; Arulandhu, Alfred J; Kok, Esther; Shi, Jianxin; Zel, Jana

    2016-07-01

    The emergence of high-throughput, massive or next-generation sequencing technologies has created a completely new foundation for molecular analyses. Various selective enrichment processes are commonly applied to facilitate detection of predefined (known) targets. Such approaches, however, inevitably introduce a bias and are prone to miss unknown targets. Here we review the application of high-throughput sequencing technologies and the preparation of fit-for-purpose whole genome shotgun sequencing libraries for the detection and characterization of genetically modified and derived products. The potential impact of these new sequencing technologies for the characterization, breeding selection, risk assessment, and traceability of genetically modified organisms and genetically modified products is yet to be fully acknowledged. The published literature is reviewed, and the prospects for future developments and use of the new sequencing technologies for these purposes are discussed.

  2. Is sequence awareness mandatory for perceptual sequence learning: An assessment using a pure perceptual sequence learning design.

    Science.gov (United States)

    Deroost, Natacha; Coomans, Daphné

    2018-02-01

    We examined the role of sequence awareness in a pure perceptual sequence learning design. Participants had to react to the target's colour that changed according to a perceptual sequence. By varying the mapping of the target's colour onto the response keys, motor responses changed randomly. The effect of sequence awareness on perceptual sequence learning was determined by manipulating the learning instructions (explicit versus implicit) and assessing the amount of sequence awareness after the experiment. In the explicit instruction condition (n = 15), participants were instructed to intentionally search for the colour sequence, whereas in the implicit instruction condition (n = 15), they were left uninformed about the sequenced nature of the task. Sequence awareness after the sequence learning task was tested by means of a questionnaire and the process-dissociation-procedure. The results showed that the instruction manipulation had no effect on the amount of perceptual sequence learning. Based on their report to have actively applied their sequence knowledge during the experiment, participants were subsequently regrouped in a sequence strategy group (n = 14, of which 4 participants from the implicit instruction condition and 10 participants from the explicit instruction condition) and a no-sequence strategy group (n = 16, of which 11 participants from the implicit instruction condition and 5 participants from the explicit instruction condition). Only participants of the sequence strategy group showed reliable perceptual sequence learning and sequence awareness. These results indicate that perceptual sequence learning depends upon the continuous employment of strategic cognitive control processes on sequence knowledge. Sequence awareness is suggested to be a necessary but not sufficient condition for perceptual learning to take place. Copyright © 2018 Elsevier B.V. All rights reserved.

  3. Determining mutant spectra of three RNA viral samples using ultra-deep sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Chen, H

    2012-06-06

    RNA viruses have extremely high mutation rates that enable the virus to adapt to new host environments and even jump from one species to another. As part of a viral transmission study, three viral samples collected from naturally infected animals were sequenced using Illumina paired-end technology at ultra-deep coverage. In order to determine the mutant spectra within the viral quasispecies, it is critical to understand the sequencing error rates and control for false positive calls of viral variants (point mutations). I will estimate the sequencing error rate from two control sequences and characterize the mutant spectra in the natural samples with this error rate.
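
    One common way to use a control-derived error rate, sketched here as an assumption rather than as the author's procedure, is to pool mismatches over the control alignments and then ask how many non-reference reads at a site would be unlikely to arise from errors alone under a binomial model. The function names, the significance level `alpha`, and the toy numbers are all hypothetical.

```python
from math import comb


def error_rate(mismatches, bases):
    """Per-base error rate pooled over control alignments."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return mismatches / bases


def binomial_tail(count, depth, rate):
    """P(X >= count) for X ~ Binomial(depth, rate).

    Exact summation; suitable for modest depths only. At ultra-deep
    coverage a log-space or library implementation would be needed.
    """
    return sum(
        comb(depth, i) * rate**i * (1 - rate) ** (depth - i)
        for i in range(count, depth + 1)
    )


def min_variant_count(depth, rate, alpha=1e-3):
    """Smallest read count whose probability under errors alone is below alpha."""
    for count in range(depth + 1):
        if binomial_tail(count, depth, rate) < alpha:
            return count
    return depth + 1  # no count at this depth is significant


# Toy control data, invented for illustration: 5 mismatches in 1000 bases.
toy_rate = error_rate(5, 1000)
toy_threshold = min_variant_count(depth=100, rate=toy_rate)
```

    A single pooled rate ignores position- and context-specific error profiles, so this is at best a first-pass filter.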

  4. Evaluating de Bruijn graph assemblers on 454 transcriptomic data.

    Directory of Open Access Journals (Sweden)

    Xianwen Ren

    Full Text Available Next generation sequencing (NGS) technologies have greatly changed the landscape of transcriptomic studies of non-model organisms. Since there is no reference genome available, de novo assembly methods play key roles in the analysis of these data sets. Because of the huge amount of data generated by NGS technologies for each run, many assemblers, e.g., ABySS, Velvet and Trinity, are developed based on a de Bruijn graph due to its time- and space-efficiency. However, most of these assemblers were developed initially for the Illumina/Solexa platform. The performance of these assemblers on 454 transcriptomic data is unknown. In this study, we evaluated and compared the relative performance of these de Bruijn graph based assemblers on both simulated and real 454 transcriptomic data. The results suggest that Trinity, the Illumina/Solexa-specialized transcriptomic assembler, performs the best among the multiple de Bruijn graph assemblers, comparable to or even outperforming the standard 454 assembler Newbler which is based on the overlap-layout-consensus algorithm. Our evaluation is expected to provide helpful guidance for researchers to choose assemblers when analyzing 454 transcriptomic data.
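
    For readers unfamiliar with the data structure named in this record, the sketch below builds a bare de Bruijn graph from a set of reads: each k-mer becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. This is a textbook construction only; it omits error correction, reverse complements, and graph simplification, and it is not the implementation used by ABySS, Velvet, or Trinity. The function name and toy reads are assumptions.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn multigraph: nodes are (k-1)-mers, edges are k-mers.

    Repeated k-mers produce repeated edges, so edge multiplicity reflects
    how often a k-mer was observed.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])  # prefix -> suffix
    return dict(graph)


# Two overlapping toy reads, invented for illustration.
toy_reads = ["ACGTA", "CGTAC"]
toy_graph = de_bruijn_graph(toy_reads, k=3)
```

    Memory use grows with the number of distinct k-mers rather than with the number of reads, which is one reason this representation suits high-coverage data.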

  5. Globalization & technology

    DEFF Research Database (Denmark)

    Narula, Rajneesh

    Technology and globalization are interdependent processes. Globalization has a fundamental influence on the creation and diffusion of technology, which, in turn, affects the interdependence of firms and locations. This volume examines the international aspect of this interdependence at two levels...... of innovation" understanding of learning. Narula and Smith reconcile an important paradox. On the one hand, locations and firms are increasingly interdependent through supranational organisations, regional integration, strategic alliances, and the flow of investments, technologies, ideas and people...

  6. Army Technology

    Science.gov (United States)

    2015-02-01

    that allows them to perform applied research under the Institute for Biotechnology research team 1 2 3 20 | ARMY TECHNOLOGY MAGAZINE ...DASA(R&T) Deputy Assistant Secretary of the Army for Research and Technology Download the magazine , view online or read each individual story with...Army photo by Conrad Johnson) Front and back cover designs by Joe Stephens EXECUTIVE DEPUTY TO THE COMMANDING GENERAL Army Technology Magazine is an

  7. Technology alliances

    International Nuclear Information System (INIS)

    Torgerson, D.F.; Boczar, P.G.; Kugler, G.

    1991-10-01

    In the field of nuclear technology, Canada and Korea developed a highly successful relationship that could serve as a model for other high-technology industries. This is particularly significant when one considers the complexity and technical depth required to design, build and operate a nuclear reactor. This paper will outline the overall framework for technology transfer and cooperation between Canada and Korea, and will focus on cooperation in nuclear R and D between the two countries

  8. Technological risks

    International Nuclear Information System (INIS)

    Klinke, A.; Renn, O.

    1998-01-01

    The empirical part about the technological risks deals with different technologies: nuclear energy, early warning systems of nuclear weapons and NBC-weapons, and electromagnetic fields. The potential of damage, the contemporary management strategies and the relevant characteristics will be described for each technology: risks of nuclear energy; risks of early warning systems of nuclear weapons and NBC-weapons; risks of electromagnetic fields. (authors)

  9. Technological risks

    Energy Technology Data Exchange (ETDEWEB)

    Klinke, A.; Renn, O. [Center of Technology Assessment in Baden-Wuerttemberg, Stuttgart (Germany)

    1998-07-01

    The empirical part about the technological risks deals with different technologies: nuclear energy, early warning systems of nuclear weapons and NBC-weapons, and electromagnetic fields. The potential of damage, the contemporary management strategies and the relevant characteristics will be described for each technology: risks of nuclear energy; risks of early warning systems of nuclear weapons and NBC-weapons; risks of electromagnetic fields. (authors)

  10. Next generation sequencing and its applications in forensic genetics

    DEFF Research Database (Denmark)

    Børsting, Claus; Morling, Niels

    2015-01-01

    articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs......It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have...... matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific...

  11. Chemistry Technology

    Data.gov (United States)

    Federal Laboratory Consortium — Chemistry technology experts at NCATS engage in a variety of innovative translational research activities, including:Design of bioactive small molecules.Development...

  12. Teaching Task Sequencing via Verbal Mediation.

    Science.gov (United States)

    Rusch, Frank R.; And Others

    1987-01-01

    Verbal sequence training was used to teach a moderately mentally retarded woman to sequence job-related tasks. Learning to say the tasks in the proper sequence resulted in the employee performing her tasks in that sequence, and the employee was capable of mediating her own work behavior when scheduled changes occurred. (Author/JDD)

  13. Repdigits in k-Lucas sequences

    Indian Academy of Sciences (India)

    57(2) 2000 243-254) proved that 11 is the largest number with only one distinct digit (the so-called repdigit) in the sequence (L_n^{(2)})_n. In this paper, we address a similar problem in the family of k-Lucas sequences. We also show that the k-Lucas sequences have similar properties to those of k-Fibonacci sequences ...
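
    The result quoted for the classical Lucas sequence (the k = 2 case, with L_0 = 2 and L_1 = 1) can be checked numerically over a finite range. The sketch below is such a check, not a proof, and it does not cover general k; the function names and the choice of 200 terms are assumptions for illustration.

```python
def lucas_numbers(count):
    """Yield the first `count` Lucas numbers: L_0 = 2, L_1 = 1, L_n = L_{n-1} + L_{n-2}."""
    a, b = 2, 1
    for _ in range(count):
        yield a
        a, b = b, a + b


def is_repdigit(n):
    """Return True if the decimal expansion of n uses a single distinct digit."""
    return len(set(str(n))) == 1


# Repdigits among the first 200 Lucas numbers (a finite check only).
repdigits = sorted({n for n in lucas_numbers(200) if is_repdigit(n)})
```

    Over this range the largest repdigit found is 11, in agreement with the statement in the abstract.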

  14. Next generation sequencing and its applications in forensic genetics.

    Science.gov (United States)

    Børsting, Claus; Morling, Niels

    2015-09-01

    It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  15. Next-Generation Sequencing of Antibody Display Repertoires

    Directory of Open Access Journals (Sweden)

    Romain Rouet

    2018-02-01

    Full Text Available In vitro selection technology has transformed the development of therapeutic monoclonal antibodies. Using methods such as phage, ribosome, and yeast display, high affinity binders can be selected from diverse repertoires. Here, we review strategies for the next-generation sequencing (NGS) of phage- and other antibody-display libraries, as well as NGS platforms and analysis tools. Moreover, we discuss recent examples relating to the use of NGS to assess library diversity, clonal enrichment, and affinity maturation.

  16. Statistical Approaches for Next-Generation Sequencing Data

    OpenAIRE

    Qiao, Dandi

    2012-01-01

    During the last two decades, genotyping technology has advanced rapidly, which enabled the tremendous success of genome-wide association studies (GWAS) in the search of disease susceptibility loci (DSLs). However, only a small fraction of the overall predicted heritability can be explained by the DSLs discovered. One possible explanation for this "missing heritability" phenomenon is that many causal variants are rare. The recent development of high-throughput next-generation sequencing (NGS) ...

  17. Southern-by-Sequencing: A Robust Screening Approach for Molecular Characterization of Genetically Modified Crops

    Directory of Open Access Journals (Sweden)

    Gina M. Zastrow-Hayes

    2015-03-01

    Full Text Available Molecular characterization of events is an integral part of the advancement process during genetically modified (GM) crop product development. Assessment of these events is traditionally accomplished by polymerase chain reaction (PCR) and Southern blot analyses. Southern blot analysis can be time-consuming and comparatively expensive and does not provide sequence-level detail. We have developed a sequence-based application, Southern-by-Sequencing (SbS), utilizing sequence capture coupled with next-generation sequencing (NGS) technology to replace Southern blot analysis for event selection in a high-throughput molecular characterization environment. SbS is accomplished by hybridizing indexed and pooled whole-genome DNA libraries from GM plants to biotinylated probes designed to target the sequence of transformation plasmids used to generate events within the pool. This sequence capture process enriches the sequence data obtained for targeted regions of interest (transformation plasmid DNA). Taking advantage of the DNA adjacent to the targeted bases (referred to as next-to-target sequence) that accompanies the targeted transformation plasmid sequence, the data analysis detects plasmid-to-genome and plasmid-to-plasmid junctions introduced during insertion into the plant genome. Analysis of these junction sequences provides sequence-level information as to the following: the number of insertion loci including detection of unlinked, independently segregating, small DNA fragments; copy number; rearrangements, truncations, or deletions of the intended insertion DNA; and the presence of transformation plasmid backbone sequences. This molecular evidence from SbS analysis is used to characterize and select GM plants meeting optimal molecular characterization criteria. SbS technology has proven to be a robust event screening tool for use in a high-throughput molecular characterization environment.

  18. The role of next generation sequencing for the development and testing of veterinary biologics

    Science.gov (United States)

    Next generation sequencing technology has become widely available and it offers many new opportunities in vaccine technology. Both human and veterinary medicine has numerous examples of adventitious agents being found in live vaccines. In veterinary medicine a continuing trend is the use of viral ...

  19. Intra-Genomic Internal Transcribed Spacer Region Sequence Heterogeneity and Molecular Diagnosis in Clinical Microbiology.

    Science.gov (United States)

    Zhao, Ying; Tsang, Chi-Ching; Xiao, Meng; Cheng, Jingwei; Xu, Yingchun; Lau, Susanna K P; Woo, Patrick C Y

    2015-10-22

    Internal transcribed spacer region (ITS) sequencing is the most extensively used technology for accurate molecular identification of fungal pathogens in clinical microbiology laboratories. Intra-genomic ITS sequence heterogeneity, which makes fungal identification based on direct sequencing of PCR products difficult, has rarely been reported in pathogenic fungi. During the process of performing ITS sequencing on 71 yeast strains isolated from various clinical specimens, direct sequencing of the PCR products showed ambiguous sequences in six of them. After cloning the PCR products into plasmids for sequencing, interpretable sequencing electropherograms could be obtained. For each of the six isolates, 10-49 clones were selected for sequencing and two to seven intra-genomic ITS copies were detected. The identities of these six isolates were confirmed to be Candida glabrata (n=2), Pichia (Candida) norvegensis (n=2), Candida tropicalis (n=1) and Saccharomyces cerevisiae (n=1). Multiple sequence alignment revealed that one to four intra-genomic ITS polymorphic sites were present in the six isolates, and all these polymorphic sites were located in the ITS1 and/or ITS2 regions. We report and describe the first evidence of intra-genomic ITS sequence heterogeneity in four different pathogenic yeasts, which occurred exclusively in the ITS1 and ITS2 spacer regions for the six isolates in this study.

  20. Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.

    Science.gov (United States)

    Bai, Xin; Tang, Kujin; Ren, Jie; Waterman, Michael; Sun, Fengzhu

    2017-10-03

    Alignment-free sequence comparison using counts of word patterns (grams, k-tuples) has become an active research topic due to the large amount of sequence data from the new sequencing technologies. Genome sequences are frequently modelled by Markov chains and the likelihood ratio test or the corresponding approximate χ²-statistic has been suggested to compare two sequences. However, it is not known how to best choose the word length k in such studies. We develop an optimal strategy to choose k by maximizing the statistical power of detecting differences between two sequences. Let the orders of the Markov chains for the two sequences be r_1 and r_2, respectively. We show through both simulations and theoretical studies that the optimal k = max(r_1, r_2) + 1 for both long sequences and next generation sequencing (NGS) read data. The orders of the Markov chains may be unknown and several methods have been developed to estimate the orders of Markov chains based on both long sequences and NGS reads. We study the power loss of the statistics when the estimated orders are used. It is shown that the power loss is minimal for some of the estimators of the orders of Markov chains. Our studies provide guidelines on choosing the optimal word length for the comparison of Markov sequences.
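
    The word-length rule stated in this abstract, one more than the larger of the two Markov orders, is easy to apply once the orders are known or estimated. The sketch below encodes that rule and the word counting it feeds into; it does not implement the χ²-statistic or the order estimators, and the function names and toy sequence are assumptions.

```python
from collections import Counter


def optimal_word_length(order_1, order_2):
    """Word length suggested by the abstract: k = max(r_1, r_2) + 1."""
    if order_1 < 0 or order_2 < 0:
        raise ValueError("Markov orders must be non-negative")
    return max(order_1, order_2) + 1


def count_words(sequence, k):
    """Count overlapping words (k-tuples) of length k in a sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Hypothetical orders r_1 = 1 and r_2 = 2 give k = 3.
k = optimal_word_length(1, 2)
toy_counts = count_words("ACGACG", k)
```

    The resulting count vectors for two sequences are the inputs a word-count statistic would compare.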

  1. Technology Catalogue

    International Nuclear Information System (INIS)

    1994-02-01

    The Department of Energy's Office of Environmental Restoration and Waste Management (EM) is responsible for remediating its contaminated sites and managing its waste inventory in a safe and efficient manner. EM's Office of Technology Development (OTD) supports applied research and demonstration efforts to develop and transfer innovative, cost-effective technologies to its site clean-up and waste management programs within EM's Office of Environmental Restoration and Office of Waste Management. The purpose of the Technology Catalogue is to provide performance data on OTD-developed technologies to scientists and engineers assessing and recommending technical solutions within the Department's clean-up and waste management programs, as well as to industry, other federal and state agencies, and the academic community. OTD's applied research and demonstration activities are conducted in programs referred to as Integrated Demonstrations (IDs) and Integrated Programs (IPs). The IDs test and evaluate systems, consisting of coupled technologies, at specific sites to address generic problems, such as the sensing, treatment, and disposal of buried waste containers. The IPs support applied research activities in specific applications areas, such as in situ remediation, efficient separations processes, and site characterization. The Technology Catalogue is a means for communicating the status of the development of these innovative technologies. The FY93 Technology Catalogue features technologies successfully demonstrated in the field through IDs and sufficiently mature to be used in the near-term. Technologies from the following IDs are featured in the FY93 Technology Catalogue: Buried Waste ID (Idaho National Engineering Laboratory, Idaho); Mixed Waste Landfill ID (Sandia National Laboratories, New Mexico); Underground Storage Tank ID (Hanford, Washington); Volatile organic compound (VOC) Arid ID (Richland, Washington); and VOC Non-Arid ID (Savannah River Site, South Carolina)

  2. Thermally activated technologies: Technology Roadmap

    Energy Technology Data Exchange (ETDEWEB)

    None, None

    2003-05-01

    The purpose of this Technology Roadmap is to outline a set of actions for government and industry to develop thermally activated technologies for converting America’s wasted heat resources into a reservoir of pollution-free energy for electric power, heating, cooling, refrigeration, and humidity control. Fuel flexibility is important. The actions also cover thermally activated technologies that use fossil fuels, biomass, and ultimately hydrogen, along with waste heat.

  3. Genetic technologies and ethics.

    Science.gov (United States)

    Ardekani, Ali M

    2009-01-01

    In the past decade, the human genome has been completely sequenced and the knowledge from it has begun to influence the fields of biological and social sciences in fundamental ways. Identification of about 25000 genes in the human genome is expected to create great benefits in diagnosis and treatment of diseases in the coming years. However, genetic technologies have also created many interesting and difficult ethical issues which can affect the human societies now and in the future. Application of genetic technologies in the areas of stem cells, cloning, gene therapy, genetic manipulation, gene selection, sex selection and preimplantation diagnosis has created a great potential for the human race to influence and change human life on earth as we know it today. Therefore, it is important for leaders of societies in the modern world to pay attention to the advances in genetic technologies and prepare themselves and those institutions under their command to face the challenges which these new technologies induce in the areas of ethics, law and social policies.

  4. Technology Exhibition

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1979-09-15

    Linked to the 25th Anniversary celebrations, an exhibition of some of CERN's technological achievements was opened on 22 June. Set up in a new 600 m² Exhibition Hall on the CERN site, the exhibition is divided into eight technology areas: magnets, vacuum, computers and data handling, survey and alignment, radiation protection, beam monitoring and handling, detectors, and workshop techniques.

  5. Radiation Technology

    International Nuclear Information System (INIS)

    1990-01-01

    The conference was organized to evaluate the application directions of radiation technology in Vietnam and to utilize the Irradiation Centre in Hanoi with the Co-60 source of 110 kCi. The investigation and study of technico-economic feasibility for technology development to various items of food and non-food objects was reported. (N.H.A)

  6. Technology Transformation

    Science.gov (United States)

    Scott, Heather; McGill, Toria

    2011-01-01

    Social networking and other technologies, if used judiciously, present the means to integrate 21st century skills into the classroom curriculum. But they also introduce challenges that educators must overcome. Increased concerns about plagiarism and access to technology can test educators' creativity and school resources. Air Academy High School,…

  7. Maritime Technology

    DEFF Research Database (Denmark)

    Sørensen, Herman

    1997-01-01

    Elementary introduction to the subject "Maritime Technology". The contents include drawings, sketches and references in English without any supplementary text.

  8. Sensemaking technology

    DEFF Research Database (Denmark)

    Madsen, Charlotte Øland

    Research objective: The object of the LOK research project is to gain a better understanding of the technological strategic processes in organisations by using the concept/metaphor of sensemaking. The project will investigate the technological strategies in organisations in order to gain a deeper...... understanding of the cognitive competencies and barriers towards implementing new technology in organisations. The research will therefore concentrate on researching the development process in the organisation's perception of the external environmental elements of customers, suppliers, competitors, internal...... and external technology and legislation and the internal environmental elements of structure, power relations and political arenas. All of these variables have influence on which/how technologies are implemented thus creating different outcomes all depending on the social dynamics that are triggered by changes...

  9. Sequencing and annotation of mitochondrial genomes from individual parasitic helminths.

    Science.gov (United States)

    Jex, Aaron R; Littlewood, D Timothy; Gasser, Robin B

    2015-01-01

    Mitochondrial (mt) genomics has significant implications in a range of fundamental areas of parasitology, including evolution, systematics, and population genetics as well as explorations of mt biochemistry, physiology, and function. Mt genomes also provide a rich source of markers to aid molecular epidemiological and ecological studies of key parasites. However, there is still a paucity of information on mt genomes for many metazoan organisms, particularly parasitic helminths, which has often related to challenges linked to sequencing from tiny amounts of material. The advent of next-generation sequencing (NGS) technologies has paved the way for low cost, high-throughput mt genomic research, but there have been obstacles, particularly in relation to post-sequencing assembly and analyses of large datasets. In this chapter, we describe protocols for the efficient amplification and sequencing of mt genomes from small portions of individual helminths, and highlight the utility of NGS platforms to expedite mt genomics. In addition, we recommend approaches for manual or semi-automated bioinformatic annotation and analyses to overcome the bioinformatic "bottleneck" to research in this area. Taken together, these approaches have demonstrated applicability to a range of parasites and provide prospects for using complete mt genomic sequence datasets for large-scale molecular systematic and epidemiological studies. In addition, these methods have broader utility and might be readily adapted to a range of other medium-sized molecular regions (i.e., 10-100 kb), including large genomic operons, and other organellar (e.g., plastid) and viral genomes.

  10. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    Science.gov (United States)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  11. Genotyping common and rare variation using overlapping pool sequencing

    Directory of Open Access Journals (Sweden)

    Pasaniuc Bogdan

    2011-07-01

    Full Text Available Abstract Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions: In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.

  12. Long-read sequencing data analysis for yeasts.

    Science.gov (United States)

    Yue, Jia-Xing; Liti, Gianni

    2018-06-01

    Long-read sequencing technologies have become increasingly popular due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast Saccharomyces cerevisiae has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here, we present a modular computational framework named long-read sequencing data analysis for yeasts (LRSDAY), the first one-stop solution that streamlines this process. Starting from the raw sequencing reads, LRSDAY can produce chromosome-level genome assembly and comprehensive genome annotation in a highly automated manner with minimal manual intervention, which is not possible using any alternative tool available to date. The annotated genomic features include centromeres, protein-coding genes, tRNAs, transposable elements (TEs), and telomere-associated elements. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable to virtually any eukaryotic organism. When applying LRSDAY to an S. cerevisiae strain, it takes ∼41 h to generate a complete and well-annotated genome from ∼100× Pacific Biosciences (PacBio) running the basic workflow with four threads. Basic experience working within the Linux command-line environment is recommended for carrying out the analysis using LRSDAY.

  13. Nonparametric Inference for Periodic Sequences

    KAUST Repository

    Sun, Ying

    2012-02-01

    This article proposes a nonparametric method for estimating the period and values of a periodic sequence when the data are evenly spaced in time. The period is estimated by a "leave-out-one-cycle" version of cross-validation (CV) and complements the periodogram, a widely used tool for period estimation. The CV method is computationally simple and implicitly penalizes multiples of the smallest period, leading to a "virtually" consistent estimator of integer periods. This estimator is investigated both theoretically and by simulation. We also propose a nonparametric test of the null hypothesis that the data have constant mean against the alternative that the sequence of means is periodic. Finally, our methodology is demonstrated on three well-known time series: the sunspots and lynx trapping data, and the El Niño series of sea surface temperatures. © 2012 American Statistical Association and the American Society for Quality.
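
    The cross-validation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cv_score` and `estimate_period` are hypothetical, squared error is an assumed loss, and ties are broken in favour of the smallest period. For a candidate period p, each observation is predicted by the mean of the other observations at the same phase; because each cycle contributes one observation per phase, this amounts to leaving out the cycle that contains the predicted point.

```python
import math


def cv_score(x, p):
    """Mean squared leave-out-one-cycle prediction error for candidate period p."""
    n = len(x)
    total, used = 0.0, 0
    for i in range(n):
        # Observations sharing the phase of x[i], excluding x[i] itself
        # (the only same-phase point inside the left-out cycle).
        others = [x[j] for j in range(i % p, n, p) if j != i]
        if not others:
            continue  # phase observed only once: no prediction possible
        pred = sum(others) / len(others)
        total += (x[i] - pred) ** 2
        used += 1
    return total / used if used else math.inf


def estimate_period(x, max_period):
    """Candidate period in 1..max_period with the smallest CV score."""
    return min(range(1, max_period + 1), key=lambda p: cv_score(x, p))
```

    On noise-free data, multiples of the true period score equally well, so this sketch relies on `min` returning the first (smallest) minimiser; the penalisation of multiples described in the abstract arises with noisy data and is not reproduced here.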

  14. Multi-qubit compensation sequences

    International Nuclear Information System (INIS)

    Tomita, Y; Merrill, J T; Brown, K R

    2010-01-01

    The Hamiltonian control of n qubits requires precision control of both the strength and timing of interactions. Compensation pulses relax the precision requirements by reducing unknown but systematic errors. Using composite pulse techniques designed for single qubits, we show that systematic errors for n-qubit systems can be corrected to arbitrary accuracy given either two non-commuting control Hamiltonians with identical systematic errors or one error-free control Hamiltonian. We also examine composite pulses in the context of quantum computers controlled by two-qubit interactions. For quantum computers based on the XY interaction, single-qubit composite pulse sequences naturally correct systematic errors. For quantum computers based on the Heisenberg or exchange interaction, the composite pulse sequences reduce the logical single-qubit gate errors but increase the errors for logical two-qubit gates.

  15. Cassini Mission Sequence Subsystem (MSS)

    Science.gov (United States)

    Alland, Robert

    2011-01-01

    This paper describes my work with the Cassini Mission Sequence Subsystem (MSS) team during the summer of 2011. It gives some background on the motivation for this project and describes the expected benefit to the Cassini program. It then introduces the two tasks that I worked on - an automatic system auditing tool and a series of corrections to the Cassini Sequence Generator (SEQ_GEN) - and the specific objectives these tasks were to accomplish. Next, it details the approach I took to meet these objectives and the results of this approach, followed by a discussion of how the outcome of the project compares with my initial expectations. The paper concludes with a summary of my experience working on this project, lists what the next steps are, and acknowledges the help of my Cassini colleagues.

  16. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    International Nuclear Information System (INIS)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki

    2015-01-01

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  17. Exploring the Mechanisms of Gastrointestinal Cancer Development Using Deep Sequencing Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Matsumoto, Tomonori; Shimizu, Takahiro; Takai, Atsushi; Marusawa, Hiroyuki, E-mail: maru@kuhp.kyoto-u.ac.jp [Department of Gastroenterology and Hepatology, Graduate School of Medicine, Kyoto University, 54 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507 (Japan)

    2015-06-15

    Next-generation sequencing (NGS) technologies have revolutionized cancer genomics due to their high throughput sequencing capacity. Reports of the gene mutation profiles of various cancers by many researchers, including international cancer genome research consortia, have increased over recent years. In addition to detecting somatic mutations in tumor cells, NGS technologies enable us to approach the subject of carcinogenic mechanisms from new perspectives. Deep sequencing, a method of optimizing the high throughput capacity of NGS technologies, allows for the detection of genetic aberrations in small subsets of premalignant and/or tumor cells in noncancerous chronically inflamed tissues. Genome-wide NGS data also make it possible to clarify the mutational signatures of each cancer tissue by identifying the precise pattern of nucleotide alterations in the cancer genome, providing new information regarding the mechanisms of tumorigenesis. In this review, we highlight these new methods taking advantage of NGS technologies, and discuss our current understanding of carcinogenic mechanisms elucidated from such approaches.

  18. Sequence complexity and work extraction

    International Nuclear Information System (INIS)

    Merhav, Neri

    2015-01-01

    We consider a simplified version of a solvable model by Mandal and Jarzynski, which constructively demonstrates the interplay between work extraction and the increase of the Shannon entropy of an information reservoir which is in contact with a physical system. We extend Mandal and Jarzynski’s main findings in several directions: first, we allow sequences of correlated bits rather than just independent bits. Secondly, at least for the case of binary information, we show that, in fact, the Shannon entropy is only one measure of complexity of the information that must increase in order for work to be extracted. The extracted work can also be upper bounded in terms of the increase in other quantities that measure complexity, like the predictability of future bits from past ones. Third, we provide an extension to the case of non-binary information (i.e. a larger alphabet), and finally, we extend the scope to the case where the incoming bits (before the interaction) form an individual sequence, rather than a random one. In this case, the entropy before the interaction can be replaced by the Lempel–Ziv (LZ) complexity of the incoming sequence, a fact that gives rise to an entropic meaning of the LZ complexity, not only in information theory, but also in physics. (paper)
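
    As a rough illustration of the Lempel–Ziv complexity mentioned in this abstract, the sketch below counts phrases in an incremental (LZ78-style) parsing of a symbol string. Which LZ variant the paper uses is not stated in the abstract, so the choice of LZ78 is an assumption, and the function name `lz78_phrase_count` is hypothetical.

```python
def lz78_phrase_count(seq: str) -> int:
    """Number of phrases in the LZ78 incremental parsing of seq.

    Each phrase is the shortest prefix of the remaining input that has not
    appeared as a phrase before; a trailing incomplete phrase is counted too.
    """
    phrases = set()
    current = ""
    for symbol in seq:
        current += symbol
        if current not in phrases:
            phrases.add(current)  # new phrase found: record it and restart
            current = ""
    return len(phrases) + (1 if current else 0)
```

    Highly regular sequences parse into few, long phrases, while irregular sequences of the same length parse into more phrases, which is the sense in which the phrase count serves as a complexity measure for an individual sequence.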

  19. Entropic fluctuations in DNA sequences

    Science.gov (United States)

    Thanos, Dimitrios; Li, Wentian; Provata, Astero

    2018-03-01

    The Local Shannon Entropy (LSE) in blocks is used as a complexity measure to study the information fluctuations along DNA sequences. The LSE of a DNA block maps the local base arrangement information to a single numerical value. It is shown that despite this reduction of information, LSE allows to extract meaningful information related to the detection of repetitive sequences in whole chromosomes and is useful in finding evolutionary differences between organisms. More specifically, large regions of tandem repeats, such as centromeres, can be detected based on their low LSE fluctuations along the chromosome. Furthermore, an empirical investigation of the appropriate block sizes is provided and the relationship of LSE properties with the structure of the underlying repetitive units is revealed by using both computational and mathematical methods. Sequence similarity between the genomic DNA of closely related species also leads to similar LSE values at the orthologous regions. As an application, the LSE covariance function is used to measure the evolutionary distance between several primate genomes.
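
    A minimal sketch of a block-wise entropy profile, in the spirit of the LSE described above, is given below. It is an assumption-laden simplification: entropy is computed from single-base frequencies within non-overlapping blocks, whereas the paper may define the block measure differently; the function names are hypothetical.

```python
import math
from collections import Counter


def block_entropy(block: str) -> float:
    """Shannon entropy (in bits) of the symbol composition of one block."""
    n = len(block)
    counts = Counter(block)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def local_shannon_entropy(seq: str, block_size: int) -> list[float]:
    """Entropy of consecutive non-overlapping blocks (trailing partial block dropped)."""
    return [
        block_entropy(seq[i:i + block_size])
        for i in range(0, len(seq) - block_size + 1, block_size)
    ]
```

    Under this simplification, a run of a single base gives an entropy of 0 bits and a block with all four bases equally represented gives 2 bits, so low and weakly fluctuating values along a chromosome would flag compositionally simple regions.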

  20. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, making possible the initiation of large scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  1. Next-Generation Sequencing Workflow for NSCLC Critical Samples Using a Targeted Sequencing Approach by Ion Torrent PGM™ Platform.

    Science.gov (United States)

    Vanni, Irene; Coco, Simona; Truini, Anna; Rusmini, Marta; Dal Bello, Maria Giovanna; Alama, Angela; Banelli, Barbara; Mora, Marco; Rijavec, Erika; Barletta, Giulia; Genova, Carlo; Biello, Federica; Maggioni, Claudia; Grossi, Francesco

    2015-12-03

    Next-generation sequencing (NGS) is a cost-effective technology capable of screening several genes simultaneously; however, its application in a clinical context requires an established workflow to acquire reliable sequencing results. Here, we report an optimized NGS workflow analyzing 22 lung cancer-related genes to sequence critical samples such as DNA from formalin-fixed paraffin-embedded (FFPE) blocks and circulating free DNA (cfDNA). Snap frozen and matched FFPE gDNA from 12 non-small cell lung cancer (NSCLC) patients, whose gDNA fragmentation status was previously evaluated using a multiplex PCR-based quality control, were successfully sequenced with Ion Torrent PGM™. The robust bioinformatic pipeline allowed us to correctly call both Single Nucleotide Variants (SNVs) and indels with a detection limit of 5%, achieving 100% specificity and 96% sensitivity. This workflow was also validated in 13 FFPE NSCLC biopsies. Furthermore, a specific protocol for low input gDNA capable of producing good sequencing data with high coverage, high uniformity, and a low error rate was also optimized. In conclusion, we demonstrate the feasibility of obtaining gDNA from FFPE samples suitable for NGS by performing appropriate quality controls. The optimized workflow, capable of screening low input gDNA, highlights NGS as a potential tool in the detection, disease monitoring, and treatment of NSCLC.

  2. Rapid whole genome sequencing and precision neonatology.

    Science.gov (United States)

    Petrikin, Joshua E; Willig, Laurel K; Smith, Laurie D; Kingsmore, Stephen F

    2015-12-01

    Traditionally, genetic testing has been too slow, or perceived to be impractical, for the initial management of the critically ill neonate. Technological advances have led to the ability to sequence and interpret the entire genome of a neonate in as little as 26 h. As the cost and turnaround time of testing decrease, the utility of whole genome sequencing (WGS) of neonates for acute and latent genetic illness increases. Analyzing the entire genome allows for concomitant evaluation of the currently identified 5588 single gene diseases. When applied to a select population of ill infants in a level IV neonatal intensive care unit, WGS yielded a diagnosis of a causative genetic disease in 57% of patients. These diagnoses may lead to clinical management changes ranging from transition to palliative care for uniformly lethal conditions to alteration or initiation of medical or surgical therapy to improve outcomes in others. Thus, institution of 2-day WGS at time of acute presentation opens the possibility of early implementation of precision medicine. This implementation may create opportunities for early interventional, frequently novel or off-label therapies that may alter disease trajectory in infants with what would otherwise be fatal disease. Widespread deployment of rapid WGS and precision medicine will raise ethical issues pertaining to interpretation of variants of unknown significance, discovery of incidental findings related to adult onset conditions and carrier status, and implementation of medical therapies for which little is known in terms of risks and benefits. Despite these challenges, precision neonatology has significant potential both to decrease infant mortality related to genetic diseases with onset in newborns and to facilitate parental decision making regarding transition to palliative care. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Method and apparatus for biological sequence comparison

    Science.gov (United States)

    Marr, T.G.; Chang, W.I.

    1997-12-23

    A method and apparatus are disclosed for comparing biological sequences from a known source of sequences, with a subject (query) sequence. The apparatus takes as input a set of target similarity levels (such as evolutionary distances in units of PAM), and finds all fragments of known sequences that are similar to the subject sequence at each target similarity level, and are long enough to be statistically significant. The invention device filters out fragments from the known sequences that are too short, or have a lower average similarity to the subject sequence than is required by each target similarity level. The subject sequence is then compared only to the remaining known sequences to find the best matches. The filtering member divides the subject sequence into overlapping blocks, each block being sufficiently large to contain a minimum-length alignment from a known sequence. For each block, the filter member compares the block with every possible short fragment in the known sequences and determines a best match for each comparison. The determined set of short fragment best matches for the block provide an upper threshold on alignment values. Regions of a certain length from the known sequences that have a mean alignment value upper threshold greater than a target unit score are concatenated to form a union. The current block is compared to the union and provides an indication of best local alignment with the subject sequence. 5 figs.

  4. Memory and learning with rapid audiovisual sequences

    Science.gov (United States)

    Keller, Arielle S.; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed. PMID:26575193

  5. Memory and learning with rapid audiovisual sequences.

    Science.gov (United States)

    Keller, Arielle S; Sekuler, Robert

    2015-01-01

    We examined short-term memory for sequences of visual stimuli embedded in varying multisensory contexts. In two experiments, subjects judged the structure of the visual sequences while disregarding concurrent, but task-irrelevant auditory sequences. Stimuli were eight-item sequences in which varying luminances and frequencies were presented concurrently and rapidly (at 8 Hz). Subjects judged whether the final four items in a visual sequence identically replicated the first four items. Luminances and frequencies in each sequence were either perceptually correlated (Congruent) or were unrelated to one another (Incongruent). Experiment 1 showed that, despite encouragement to ignore the auditory stream, subjects' categorization of visual sequences was strongly influenced by the accompanying auditory sequences. Moreover, this influence tracked the similarity between a stimulus's separate audio and visual sequences, demonstrating that task-irrelevant auditory sequences underwent a considerable degree of processing. Using a variant of Hebb's repetition design, Experiment 2 compared musically trained subjects and subjects who had little or no musical training on the same task as used in Experiment 1. Test sequences included some that intermittently and randomly recurred, which produced better performance than sequences that were generated anew for each trial. The auditory component of a recurring audiovisual sequence influenced musically trained subjects more than it did other subjects. This result demonstrates that stimulus-selective, task-irrelevant learning of sequences can occur even when such learning is an incidental by-product of the task being performed.

  6. Multineuronal Spike Sequences Repeat with Millisecond Precision

    Directory of Open Access Journals (Sweden)

    Koki Matsumoto

    2013-06-01

    Full Text Available Cortical microcircuits are nonrandomly wired by neurons. As a natural consequence, spikes emitted by microcircuits are also nonrandomly patterned in time and space. One of the prominent spike organizations is a repetition of fixed patterns of spike series across multiple neurons. However, several questions remain unsolved, including how precisely spike sequences repeat, how the sequences are spatially organized, how many neurons participate in sequences, and how different sequences are functionally linked. To address these questions, we monitored spontaneous spikes of hippocampal CA3 neurons ex vivo using a high-speed functional multineuron calcium imaging technique that allowed us to monitor spikes with millisecond resolution and to record the location of spiking and nonspiking neurons. Multineuronal spike sequences were overrepresented in spontaneous activity compared to the statistical chance level. Approximately 75% of neurons participated in at least one sequence during our observation period. The participants were sparsely dispersed and did not show specific spatial organization. The number of sequences relative to the chance level decreased when larger time frames were used to detect sequences. Thus, sequences were precise at the millisecond level. Sequences often shared common spikes with other sequences; parts of sequences were subsequently relayed by following sequences, generating complex chains of multiple sequences.

  7. A complete mitochondrial genome sequence from a mesolithic wild aurochs (Bos primigenius).

    LENUS (Irish Health Repository)

    Edwards, Ceiridwen J

    2010-01-01

    BACKGROUND: The derivation of domestic cattle from the extinct wild aurochs (Bos primigenius) has been well-documented by archaeological and genetic studies. Genetic studies point towards the Neolithic Near East as the centre of origin for Bos taurus, with some lines of evidence suggesting possible, albeit rare, genetic contributions from locally domesticated wild aurochsen across Eurasia. Inferences from these investigations have been based largely on the analysis of partial mitochondrial DNA sequences generated from modern animals, with limited sequence data from ancient aurochsen samples. Recent developments in DNA sequencing technologies, however, are affording new opportunities for the examination of genetic material retrieved from extinct species, providing new insight into their evolutionary history. Here we present DNA sequence analysis of the first complete mitochondrial genome (16,338 base pairs) from an archaeologically-verified and exceptionally-well preserved aurochs bone sample. METHODOLOGY: DNA extracts were generated from an aurochs humerus bone sample recovered from a cave site located in Derbyshire, England and radiocarbon-dated to 6,738 ± 68 calibrated years before present. These extracts were prepared for both Sanger and next generation DNA sequencing technologies (Illumina Genome Analyzer). In total, 289.9 megabases (22.48%) of the post-filtered DNA sequences generated using the Illumina Genome Analyzer from this sample mapped with confidence to the bovine genome. A consensus B. primigenius mitochondrial genome sequence was constructed and was analysed alongside all available complete bovine mitochondrial genome sequences. CONCLUSIONS: For all nucleotide positions where both Sanger and Illumina Genome Analyzer sequencing methods gave high-confidence calls, no discrepancies were observed. Sequence analysis reveals evidence of heteroplasmy in this sample and places this mitochondrial genome sequence securely within a previously identified

  8. Ergonomics technology

    Science.gov (United States)

    Jones, W. L.

    1977-01-01

    Major areas of research and development in ergonomics technology for space environments are discussed. Attention is given to possible applications of the technology developed by NASA in industrial settings. A group of mass spectrometers for gas analysis capable of fully automatic operation has been developed for atmosphere control on spacecraft; a version for industrial use has been constructed. Advances have been made in personal cooling technology, remote monitoring of medical information, and aerosol particle control. Experience gained by NASA during the design and development of portable life support units has recently been applied to improve breathing equipment used by fire fighters.

  9. Diagnostics of Primary Immunodeficiencies through Next Generation Sequencing

    Directory of Open Access Journals (Sweden)

    Vera Gallo

    2016-11-01

    Full Text Available Background: Recently, a growing number of novel genetic defects underlying primary immunodeficiencies (PID) have been identified, bringing the number of PIDs to more than 250 well-defined forms. Next-generation sequencing (NGS) technologies and proper filtering strategies greatly contributed to this rapid evolution, providing the possibility to rapidly and simultaneously analyze large numbers of genes or the whole exome. Objective: To evaluate the role of targeted next-generation sequencing and whole exome sequencing in the diagnosis of a case series, characterized by complex or atypical clinical features suggesting a PID, difficult to diagnose using the current diagnostic procedures. Methods: We retrospectively analyzed genetic variants identified through targeted next-generation sequencing or whole exome sequencing in 45 patients with complex PID of unknown etiology. Results: 40 variants were identified using targeted next-generation sequencing, while 5 were identified using whole exome sequencing. Newly identified genetic variants were classified into 4 groups: (I) variations associated with a well-defined PID; (II) variations associated with atypical features of a well-defined PID; (III) functionally relevant variations potentially involved in the immunological features; (IV) non-diagnostic genotype, in whom the link with phenotype is missing. We reached a conclusive genetic diagnosis in 7/45 patients (~16%). Among them, 4 patients presented with a typical well-defined PID. In the remaining 3 cases, mutations were associated with unexpected clinical features, expanding the phenotypic spectrum of typical PIDs. In addition, we identified 31 variants in 10 patients with complex phenotype, individually not causative per se of the disorder. Conclusion: NGS technologies represent a cost-effective and rapid first-line genetic approach for the evaluation of complex PIDs. Whole exome sequencing, despite a moderately higher cost compared to targeted sequencing, is

  10. Static multiplicities in heterogeneous azeotropic distillation sequences

    DEFF Research Database (Denmark)

    Esbjerg, Klavs; Andersen, Torben Ravn; Jørgensen, Sten Bay

    1998-01-01

    In this paper the results of a bifurcation analysis on heterogeneous azeotropic distillation sequences are given. Two sequences suitable for ethanol dehydration are compared: the 'direct' and the 'indirect' sequence. It is shown that the two sequences, despite their similarities, exhibit very...... different static behavior. The method of Petlyuk and Avet'yan (1971) and Bekiaris et al. (1993), which assumes infinite reflux and an infinite number of stages, is extended and applied to heterogeneous azeotropic distillation sequences. The predictions are substantiated through simulations. The static sequence...

  11. Evaluation of second-generation sequencing of 19 dilated cardiomyopathy genes for clinical applications.

    Science.gov (United States)

    Gowrisankar, Sivakumar; Lerner-Ellis, Jordan P; Cox, Stephanie; White, Emily T; Manion, Megan; LeVan, Kevin; Liu, Jonathan; Farwell, Lisa M; Iartchouk, Oleg; Rehm, Heidi L; Funke, Birgit H

    2010-11-01

    Medical sequencing for diseases with locus and allelic heterogeneities has been limited by the high cost and low throughput of traditional sequencing technologies. "Second-generation" sequencing (SGS) technologies allow the parallel processing of a large number of genes and, therefore, offer great promise for medical sequencing; however, their use in clinical laboratories is still in its infancy. Our laboratory offers clinical resequencing for dilated cardiomyopathy (DCM) using an array-based platform that interrogates 19 of more than 30 genes known to cause DCM. We explored both the feasibility and cost effectiveness of using PCR amplification followed by SGS technology for sequencing these 19 genes in a set of five samples enriched for known sequence alterations (109 unique substitutions and 27 insertions and deletions). While the analytical sensitivity for substitutions was comparable to that of the DCM array (98%), SGS technology performed better than the DCM array for insertions and deletions (90.6% versus 58%). Overall, SGS performed substantially better than did the current array-based testing platform; however, the operational cost and projected turnaround time do not meet our current standards. Therefore, efficient capture methods and/or sample pooling strategies that shorten the turnaround time and decrease reagent and labor costs are needed before implementing this platform into routine clinical applications.

  12. Technology Innovation

    Science.gov (United States)

    EPA produces innovative technologies and facilitates their creation in line with the Agency mission to create products such as the stormwater calculator, remote sensing, innovation clusters, and low-cost air sensors.

  13. Technology | FNLCR

    Science.gov (United States)

    The Frederick National Laboratory develops and applies advanced, next-generation technologies to solve basic and applied problems in the biomedical sciences, and serves as a national resource of shared high-tech facilities.

  14. Plasma technology

    International Nuclear Information System (INIS)

    Drouet, M.G.

    1984-03-01

    IREQ was contracted by the Canadian Electrical Association to review plasma technology and assess the potential for application of this technology in Canada. A team of experts in the various aspects of this technology was assembled and each team member was asked to contribute to this report on the applications of plasma pertinent to his or her particular field of expertise. The following areas were examined in detail: iron, steel and strategic-metals production; surface treatment by spraying; welding and cutting; chemical processing; drying; and low-temperature treatment. A large market for the penetration of electricity has been identified. To build up confidence in the technology, support should be provided for selected R&D projects, plasma torch demonstrations at full power, and large-scale plasma process testing.

  15. Exploration technology

    Energy Technology Data Exchange (ETDEWEB)

    Roennevik, H.C. [Saga Petroleum A/S, Forus (Norway)]

    1996-12-31

    The paper evaluates exploration technology. Topics discussed are: Visions; the subsurface challenge; the creative tension; the exploration process; seismic; geology; organic geochemistry; seismic resolution; integration; drilling; value creation. 4 refs., 22 figs.

  16. A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

    Science.gov (United States)

    Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

    2017-01-01

    This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches, and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not supported. PacBio assemblies had few limitations overall, and gaps were explained as the cumulative effect of lower-than-average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly, which included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
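
    GC content is one of the gap properties named in the abstract above. The abstract does not say how it was computed, so the following is only a minimal sketch of the usual definition (G plus C bases divided by all unambiguous bases). The function name `gc_content` and the example sequence are hypothetical and not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored. This is an illustrative
    definition, not the method used in the study.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


if __name__ == "__main__":
    # Hypothetical gap sequence, used only to show the call.
    example_gap = "ATGCGCGCNNATTA"
    print(f"GC fraction: {gc_content(example_gap):.2f}")
```

    Ignoring ambiguous bases is a design choice made here so that runs of N inside a gap do not lower the reported fraction.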

  17. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.
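
    The output figures quoted in the abstract above are mutually consistent, which a short calculation can show. This is a sketch under two stated assumptions: that the 25 M single reads correspond to 25 M clusters, each yielding two reads in a paired-end run (hence 50 M paired-end reads), and that 1 Gb means 10^9 bases. The function name `run_yield_bases` is hypothetical.

```python
def run_yield_bases(clusters: int, read_length: int, paired_end: bool) -> int:
    """Total bases sequenced in one run.

    Assumes every cluster yields one read (single-end) or two reads
    (paired-end) of exactly `read_length` bases.
    """
    reads_per_cluster = 2 if paired_end else 1
    return clusters * reads_per_cluster * read_length


if __name__ == "__main__":
    clusters = 25_000_000  # assumption: 25 M single reads = 25 M clusters
    total = run_yield_bases(clusters, read_length=300, paired_end=True)
    print(total / 1e9)  # 15.0, matching the quoted 15 Gb for a 2 x 300 bp run
```

    Under the same assumptions, the shortest configuration quoted (1 × 36 base pairs) would give 0.9 Gb per run.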

  18. SCORE DIGITAL TECHNOLOGY: THE CONVERGENCE

    Directory of Open Access Journals (Sweden)

    Chernyshov Alexander V.

    2013-12-01

    Full Text Available Explores the role of digital scorewriters in today's culture, education, music industry, and media environment. The main principle of the development of this software is not only publishing innovation (relating to sheet music) but also integration into the areas of composition, arrangement, education, and the creative process for works based on digital technology (film, television and radio broadcasting, the Internet, audio and video art). The convergence of music-computer technology is therefore a pervasive phenomenon: the notation program is combined with the capabilities of a MIDI sequencer and an audio and video editor. The article contains a unique interview with the creator of music notation processors.

  19. Technological risk

    Energy Technology Data Exchange (ETDEWEB)

    Dierkes, M; Coppock, R; Edwards, S

    1980-01-01

    The book begins with brief statements from representatives of political organizations. Part II presents an overview of the discussion about the control and management of technological progress. Parts III and IV discuss important elements in citizens' perception of technological risks and the development of consensus on how to deal with them. In Part V practical problems in the application of risk assessment and management, and in Part VI additional points are summarized.

  20. Lasers technology

    International Nuclear Information System (INIS)

    2014-01-01

    The Laser Technology Program of IPEN is developed by the Center for Lasers and Applications (CLA) and is committed to the development of new lasers based on the research of new optical materials and new resonator technologies. Laser applications and research occur within several areas such as Nuclear, Medicine, Dentistry, Industry, Environment and Advanced Research. Additional goals of the Program are human resource development and innovation, in association with Brazilian universities and commercial partners.