WorldWideScience

Sample records for chip-seq accurately predicts

  1. Comparing genome-wide chromatin profiles using ChIP-chip or ChIP-seq

    NARCIS (Netherlands)

    Johannes, Frank; Wardenaar, Rene; Colomé Tatché, Maria; Mousson, Florence; de Graaf, Petra; Mokry, Michal; Guryev, Victor; Timmers, H. Th. Marc; Cuppen, Edwin; Jansen, Ritsert C.; Bateman, Alex

    2010-01-01

    Motivation: ChIP-chip and ChIP-seq technologies provide genome-wide measurements of various types of chromatin marks at an unprecedented resolution. With ChIP samples collected from different tissue types and/or individuals, we can now begin to characterize stochastic or systematic changes in

  2. Comparing genome-wide chromatin profiles using ChIP-chip or ChIP-seq

    NARCIS (Netherlands)

    Johannes, F.; Wardenaar, R.; Colome-Tatche, M.; Mousson, F.; de Graaf, P.; Mokry, M.; Guryev, V.; Timmers, H.T.; Cuppen, E.; Jansen, R.

    2010-01-01

    MOTIVATION: ChIP-chip and ChIP-seq technologies provide genome-wide measurements of various types of chromatin marks at an unprecedented resolution. With ChIP samples collected from different tissue types and/or individuals, we can now begin to characterize stochastic or systematic changes in

  3. iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.

    Science.gov (United States)

    Yang, Chia-Chun; Andrews, Erik H; Chen, Min-Hsuan; Wang, Wan-Yu; Chen, Jeremy J W; Gerstein, Mark; Liu, Chun-Chi; Cheng, Chao

    2016-08-12

    Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) has been widely used to determine the genomic occupation of transcription factors (TFs). We have previously developed a probabilistic method, called TIP (Target Identification from Profiles), to identify TF target genes using ChIP-seq/ChIP-chip data. To achieve high specificity, TIP applies a conservative method to estimate significance of target genes, with the trade-off being a relatively low sensitivity of target gene identification compared to other methods. Additionally, TIP's output does not render binding-peak locations or intensity, information highly useful for visualization and general experimental biological use, while the variability of ChIP-seq/ChIP-chip file formats has made input into TIP more difficult than desired. To improve upon these facets, here we present a refined TIP with key extensions. First, it implements a Gaussian mixture model for p-value estimation, increasing target gene identification sensitivity and more accurately capturing the shape of TF binding profile distributions. Second, it enables the incorporation of TF binding-peak data by identifying their locations in significant target gene promoter regions and quantifies their strengths. Finally, for full ease of implementation we have incorporated it into a web server ( http://syslab3.nchu.edu.tw/iTAR/ ) that enables flexibility of input file format, can be used across multiple species and genome assembly versions, and is freely available for public use. The web server additionally performs GO enrichment analysis for the identified target genes to reveal the potential function of the corresponding TF. The iTAR web server provides a user-friendly interface and supports target gene identification in seven species, ranging from yeast to human. To facilitate investigating the quality of ChIP-seq/ChIP-chip data, the web server generates the chart of the

  4. CMT: a constrained multi-level thresholding approach for ChIP-Seq data analysis.

    Directory of Open Access Journals (Sweden)

    Iman Rezaeian

    Full Text Available Genome-wide profiling of DNA-binding proteins using ChIP-Seq has emerged as an alternative to ChIP-chip methods. ChIP-Seq technology offers many advantages over ChIP-chip arrays, including but not limited to less noise, higher resolution, and more coverage. Several algorithms have been developed to take advantage of these abilities and find enriched regions by analyzing ChIP-Seq data. However, the complexity of analyzing various patterns of ChIP-Seq signals still needs the development of new algorithms. Most current algorithms use various heuristics to detect regions accurately. However, despite how many formulations are available, it is still difficult to accurately determine individual peaks corresponding to each binding event. We developed Constrained Multi-level Thresholding (CMT), an algorithm used to detect enriched regions on ChIP-Seq data. CMT employs a constraint-based module that can target regions within a specific range. We show that CMT has higher accuracy in detecting enriched regions (peaks) by objectively assessing its performance relative to other previously proposed peak finders. This is shown by testing three algorithms on the well-known FoxA1 data set, four transcription factors (with a total of six antibodies) for Drosophila melanogaster, and the H3K4ac antibody dataset.

  5. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

    Directory of Open Access Journals (Sweden)

    Dewey Colin N

    2011-08-01

    Full Text Available Abstract Background RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. Results We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. Conclusions RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost

  6. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

    Science.gov (United States)

    Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias

    2015-06-25

    Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

  7. Ancestry prediction in Singapore population samples using the Illumina ForenSeq kit.

    Science.gov (United States)

    Ramani, Anantharaman; Wong, Yongxun; Tan, Si Zhen; Shue, Bing Hong; Syn, Christopher

    2017-11-01

    ) was generally high in the absence of admixture. Misclassification occurred in admixed individuals, who were likely offspring of inter-ethnic marriages, and hence whose self-reported bio-geographic ancestries were dependent on that of their fathers, and in individuals of minority sub-populations with inter-ethnic beliefs. The ancestry prediction capabilities of the 59 SNPs on the ForenSeq kit were reasonably effective in differentiating the Singapore Chinese, Malay and Indian sub-populations, and will be of use for investigative purposes. However, there is potential for more accurate prediction through the evaluation of other AIM sets. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Experiment list: SRX186172 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 1=YY1 || chip antibody manufacturer 1=Abcam || chip antibody 2=YY1 || chip antibody manufacturer 2=Santa Cru...ip-Seq; Mus musculus; ChIP-Seq source_name=Rag1 -/- pro-B cells || chip antibody

  9. ChIP-seq Accurately Predicts Tissue-Specific Activity of Enhancers

    Energy Technology Data Exchange (ETDEWEB)

    Visel, Axel; Blow, Matthew J.; Li, Zirong; Zhang, Tao; Akiyama, Jennifer A.; Holt, Amy; Plajzer-Frick, Ingrid; Shoukry, Malak; Wright, Crystal; Chen, Feng; Afzal, Veena; Ren, Bing; Rubin, Edward M.; Pennacchio, Len A.

    2009-02-01

    A major yet unresolved quest in decoding the human genome is the identification of the regulatory sequences that control the spatial and temporal expression of genes. Distant-acting transcriptional enhancers are particularly challenging to uncover since they are scattered amongst the vast non-coding portion of the genome. Evolutionary sequence constraint can facilitate the discovery of enhancers, but fails to predict when and where they are active in vivo. Here, we performed chromatin immunoprecipitation with the enhancer-associated protein p300, followed by massively-parallel sequencing, to map several thousand in vivo binding sites of p300 in mouse embryonic forebrain, midbrain, and limb tissue. We tested 86 of these sequences in a transgenic mouse assay, which in nearly all cases revealed reproducible enhancer activity in those tissues predicted by p300 binding. Our results indicate that in vivo mapping of p300 binding is a highly accurate means for identifying enhancers and their associated activities and suggest that such datasets will be useful to study the role of tissue-specific enhancers in human biology and disease on a genome-wide scale.

  10. Accurate detection of carcinoma cells by use of a cell microarray chip.

    Directory of Open Access Journals (Sweden)

    Shohei Yamamura

    Full Text Available BACKGROUND: Accurate detection and analysis of circulating tumor cells plays an important role in the diagnosis and treatment of metastatic cancer. METHODS AND FINDINGS: A cell microarray chip was used to detect spiked carcinoma cells among leukocytes. The chip, with 20,944 microchambers (105 µm width and 50 µm depth), was made from polystyrene; and the formation of monolayers of leukocytes in the microchambers was observed. Cultured human T lymphoblastoid leukemia (CCRF-CEM) cells were used to examine the potential of the cell microarray chip for the detection of spiked carcinoma cells. A T lymphoblastoid leukemia suspension was dispersed on the chip surface, followed by 15 min standing to allow the leukocytes to settle down into the microchambers. Approximately 29 leukocytes were found in each microchamber when about 600,000 leukocytes in total were dispersed onto a cell microarray chip. Similarly, when leukocytes isolated from human whole blood were used, approximately 89 leukocytes entered each microchamber when about 1,800,000 leukocytes in total were placed onto the cell microarray chip. After washing the chip surface, PE-labeled anti-cytokeratin monoclonal antibody and APC-labeled anti-CD326 (EpCAM) monoclonal antibody solution were dispersed onto the chip surface and allowed to react for 15 min; and then a microarray scanner was employed to detect any fluorescence-positive cells within 20 min. In the experiments using spiked carcinoma cells (NCI-H1650, 0.01 to 0.0001%), accurate detection of carcinoma cells was achieved with PE-labeled anti-cytokeratin monoclonal antibody. Furthermore, verification of carcinoma cells in the microchambers was performed by double staining with the above monoclonal antibodies. CONCLUSION: The potential application of the cell microarray chip for the detection of CTCs was shown, thus demonstrating accurate detection by double staining for cytokeratin and EpCAM at the single carcinoma cell level.

  11. An effective approach for identification of in vivo protein-DNA binding sites from paired-end ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Wilson Zoe A

    2010-02-01

    Full Text Available Abstract Background ChIP-Seq, which combines chromatin immunoprecipitation (ChIP) with high-throughput massively parallel sequencing, is increasingly being used for identification of protein-DNA interactions in vivo in the genome. However, maximizing the effectiveness of data analysis of such sequences requires the development of new algorithms that are able to accurately predict DNA-protein binding sites. Results Here, we present SIPeS (Site Identification from Paired-end Sequencing), a novel algorithm for precise identification of binding sites from short reads generated by paired-end Solexa ChIP-Seq technology. In this paper we used ChIP-Seq data from the Arabidopsis basic helix-loop-helix transcription factor ABORTED MICROSPORES (AMS), which is expressed within the anther during pollen development. The results show that SIPeS has better resolution for binding site identification compared to two existing ChIP-Seq peak detection algorithms, Cisgenome and MACS. Conclusions When compared to Cisgenome and MACS, SIPeS shows better resolution for binding site discovery. Moreover, SIPeS is designed to calculate the mappable genome length accurately with the fragment length based on the paired-end reads. Dynamic baselines are also employed to effectively discriminate closely adjacent binding sites, for effective binding sites discovery, which is of particular value when working with high-density genomes.

  12. Parallel factor ChIP provides essential internal control for quantitative differential ChIP-seq.

    Science.gov (United States)

    Guertin, Michael J; Cullen, Amy E; Markowetz, Florian; Holding, Andrew N

    2018-04-17

    A key challenge in quantitative ChIP combined with high-throughput sequencing (ChIP-seq) is the normalization of data in the presence of genome-wide changes in occupancy. Analysis-based normalization methods were developed for transcriptomic data and these are dependent on the underlying assumption that total transcription does not change between conditions. For genome-wide changes in transcription factor (TF) binding, these assumptions do not hold true. The challenges in normalization are confounded by experimental variability during sample preparation, processing and recovery. We present a novel normalization strategy utilizing an internal standard of unchanged peaks for reference. Our method can be readily applied to monitor genome-wide changes by ChIP-seq that are otherwise lost or misrepresented through analytical normalization. We compare our approach to normalization by total read depth and two alternative methods that utilize external experimental controls to study TF binding. We successfully resolve the key challenges in quantitative ChIP-seq analysis and demonstrate its application by monitoring the loss of Estrogen Receptor-alpha (ER) binding upon fulvestrant treatment, ER binding in response to estradiol, ER mediated change in H4K12 acetylation and profiling ER binding in patient-derived xenografts. This is supported by an adaptable pipeline to normalize and quantify differential TF binding genome-wide and generate metrics for differential binding at individual sites.
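
    The idea of normalizing against an internal standard of unchanged peaks can be illustrated with a small arithmetic sketch. This is not the authors' pipeline: the function names and all read counts below are hypothetical, and the scaling rule (matching summed reads over the reference peaks) is one simple assumption about how such a standard could be used.

```python
def scale_factor(reference_a, reference_b):
    """Factor that rescales condition B so that the summed reads over the
    reference peaks (assumed unchanged between conditions) match condition A."""
    total_a = sum(reference_a)
    total_b = sum(reference_b)
    if total_b == 0:
        raise ValueError("reference peaks have no reads in condition B")
    return total_a / total_b


def normalize(counts, factor):
    """Apply the scaling factor to a list of per-peak read counts."""
    return [c * factor for c in counts]


# Hypothetical read counts, invented purely for illustration.
reference_untreated = [200, 400, 400]  # peaks assumed not to change
reference_treated = [100, 200, 200]    # same peaks, half the recovery
target_untreated = [800, 600]          # peaks of the TF under study
target_treated = [100, 75]

factor = scale_factor(reference_untreated, reference_treated)
target_treated_norm = normalize(target_treated, factor)
fold_changes = [t / u for t, u in zip(target_treated_norm, target_untreated)]

print(factor)               # 2.0
print(target_treated_norm)  # [200.0, 150.0]
print(fold_changes)         # [0.25, 0.25]
```

    In this toy case the raw counts would suggest an eightfold loss of binding, while the reference-based scaling attributes half of that drop to sample recovery and reports a fourfold loss.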

  13. SeqAPASS: Predicting chemical susceptibility to threatened/endangered species

    Science.gov (United States)

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS; https://seqapass.epa.gov/seqapass/) application was devel...

  14. Experiment list: SRX150661 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Adenocarcinoma 59396606,71.7,11.1,1200 GSM935582: Harvard ChipSeq HeLa-S3 BRF1 std source_name=HeLa-S3 ||... biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq

  15. Experiment list: SRX150495 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Adenocarcinoma 62508352,67.6,8.4,1556 GSM935416: Harvard ChipSeq HeLa-S3 ZZZ3 std source_name=HeLa-S3 || ...biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq

  16. Experiment list: SRX150565 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available =Adenocarcinoma 54953593,74.3,12.2,1703 GSM935486: Harvard ChipSeq HeLa-S3 BDP1 std source_name=HeLa-S3 || b...iomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq |

  17. Experiment list: SRX150586 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Barr Virus 33195472,90.4,25.9,15633 GSM935507: Harvard ChipSeq GM12878 NF-YB IgG-mus source_name=GM12878 ||...?PgId=165&q=GM12878 || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq || dat

  18. Experiment list: SRX150496 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ein-Barr Virus 63040797,85.0,19.7,1435 GSM935417: Harvard ChipSeq GM12878 SPT20 std source_name=GM12878 || b...gId=165&q=GM12878 || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq || datat

  19. Experiment list: SRX150585 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Barr Virus 32926476,94.0,12.0,2668 GSM935506: Harvard ChipSeq GM12878 NF-YA IgG-mus source_name=GM12878 || ...PgId=165&q=GM12878 || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipSeq || data

  20. A comparison of the whole genome approach of MeDIP-seq to the targeted approach of the Infinium HumanMethylation450 BeadChip® for methylome profiling.

    Directory of Open Access Journals (Sweden)

    Christine Clark

    Full Text Available DNA methylation is one of the most studied epigenetic marks in the human genome, with the result that the desire to map the human methylome has driven the development of several methods to map DNA methylation on a genomic scale. Our study presents the first comparison of two of these techniques - the targeted approach of the Infinium HumanMethylation450 BeadChip® with the immunoprecipitation and sequencing-based method, MeDIP-seq. Both methods were initially validated with respect to bisulfite sequencing as the gold standard and then assessed in terms of coverage, resolution and accuracy. The regions of the methylome that can be assayed by both methods and those that can only be assayed by one method were determined and the discovery of differentially methylated regions (DMRs) by both techniques was examined. Our results show that the Infinium HumanMethylation450 BeadChip® and MeDIP-seq show a good positive correlation (Spearman correlation of 0.68) on a genome-wide scale and can both be used successfully to determine differentially methylated loci in RefSeq genes, CpG islands, shores and shelves. MeDIP-seq, however, allows a wider interrogation of methylated regions of the human genome, including thousands of non-RefSeq genes and repetitive elements, all of which may be of importance in disease. In our study MeDIP-seq allowed the detection of 15,709 differentially methylated regions, nearly twice as many as the array-based method (8070), which may result in a more comprehensive study of the methylome.
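
    For readers unfamiliar with the statistic quoted above, a Spearman correlation is a Pearson correlation computed on ranks. The sketch below shows the calculation in plain Python; the two value lists are invented for illustration and are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-region methylation measurements from two platforms.
array_values = [0.1, 0.4, 0.35, 0.8, 0.9]
medip_values = [2, 10, 12, 30, 25]

rho = spearman(array_values, medip_values)
print(round(rho, 3))  # 0.8
```

    Because only ranks enter the calculation, the two platforms can report values on entirely different scales and still be compared this way.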

  1. Experiment list: SRX107410 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Adenocarcinoma 37378122,96.3,56.7,376 GSM838388: h3k36me3 si23 ChIP-Seq; Homo sapiens; ChIP-Seq source_name=Hela cells knock...down Med23 || chip antibody=H3K36me3 || treatment=knockdown Med23 || cell line=HeLa || chip

  2. Experiment list: SRX085443 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX085443 mm9 Input control Input control Neural Cerebellum MeSH Description=The pa...ntain balance, and learn motor skills. 38330550,73.2,10.7,866 GSM769020: lab ChipSeq Cerebellum Input source_name=Cerebellum... Cancer Research || datatype=ChipSeq || datatype description=ChIP-Seq || cell=Cerebellum... || cell organism=mouse || cell description=Cerebellum || cell sex=M || antibody=Input || antibody de...on=Standard input signal for most experiments. || controlid=Cerebellum/Input/std || labversion=05/27/09 Lane

  3. Experiment list: SRX150629 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sue Diagnosis=Fibrocystic Disease 27949151,89.1,5.9,589 GSM935550: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pc...t 12hr Input std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harva...rd University || datatype=ChipSeq || datatype descriptio

  4. Experiment list: SRX150494 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available n-Barr Virus 44912180,85.6,7.7,1806 GSM935415: Harvard ChipSeq GM12878 GCN5 std source_name=GM12878 || bioma...ard || lab description=Struhl - Harvard University || datatype=ChipSeq || datatype ...terial_provider=Coriell; http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM12878 || lab=Harv

  5. Experiment list: SRX150667 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available t|Tissue Diagnosis=Fibrocystic Disease 69172664,86.5,35.3,28780 GSM935588: Harvard ChipSeq MCF10A-Er-Src EtO...H 0.01pct Pol2 std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Har...vard University || datatype=ChipSeq || datatype descript

  6. Experiment list: SRX150535 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available t|Tissue Diagnosis=Fibrocystic Disease 69171580,86.8,43.7,20874 GSM935456: Harvard ChipSeq MCF10A-Er-Src 4OH...TAM 1uM 36hr Pol2 std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || ...lab description=Struhl - Harvard University || datatype=ChipSeq || datatype descr

  7. Experiment list: SRX150562 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Barr Virus 57294082,73.4,6.6,1909 GSM935483: Harvard ChipSeq GM12878 ZZZ3 std source_name=GM12878 || biomat...rd || lab description=Struhl - Harvard University || datatype=ChipSeq || datatype d...erial_provider=Coriell; http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM12878 || lab=Harva

  8. Experiment list: SRX085450 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX085450 mm9 Histone H3K4me3 Neural Cerebellum MeSH Description=The part of brain ...e, and learn motor skills. 40109154,65.1,24.9,36907 GSM769027: lab ChipSeq Cerebellum H3K4me3 source_name=Cerebellum...Research || datatype=ChipSeq || datatype description=ChIP-Seq || cell=Cerebellum ...|| cell organism=mouse || cell description=Cerebellum || cell sex=M || antibody=H3K4me3 || antibody antibody...Standard input signal for most experiments. || controlid=Cerebellum/Input/std || labversion=5/26/09 Lane 6 |

  9. Experiment list: SRX085441 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX085441 mm9 Histone H3K4me1 Neural Cerebellum MeSH Description=The part of brain ...e, and learn motor skills. 34909537,81.0,5.9,29316 GSM769018: lab ChipSeq Cerebellum H3K4me1 source_name=Cerebellum...esearch || datatype=ChipSeq || datatype description=ChIP-Seq || cell=Cerebellum |...| cell organism=mouse || cell description=Cerebellum || cell sex=M || antibody=H3K4me1 || antibody antibodyd...iption=Standard input signal for most experiments. || controlid=Cerebellum/Input/std || labversion=12/09/09

  10. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    Science.gov (United States)

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.

  11. HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Maher Christopher A

    2010-07-01

    Full Text Available Abstract Background Protein-DNA interaction constitutes a basic mechanism for the genetic regulation of target gene expression. Deciphering this mechanism has been a daunting task due to the difficulty in characterizing protein-bound DNA on a large scale. A powerful technique has recently emerged that couples chromatin immunoprecipitation (ChIP) with next-generation sequencing (ChIP-Seq). This technique provides a direct survey of the cistrome of transcription factors and other chromatin-associated proteins. In order to realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed to analyze the massive amount of data generated by this method. Results Here we introduce HPeak, a Hidden Markov model (HMM)-based Peak-finding algorithm for analyzing ChIP-Seq data to identify protein-interacting genomic regions. In contrast to the majority of available ChIP-Seq analysis software packages, HPeak is a model-based approach allowing for rigorous statistical inference. This approach enables HPeak to accurately infer genomic regions enriched with sequence reads by assuming realistic probability distributions, in conjunction with a novel weighting scheme on the sequencing read coverage. Conclusions Using biologically relevant data collections, we found that HPeak showed a higher prevalence of the expected transcription factor binding motifs in ChIP-enriched sequences relative to the control sequences when compared to other currently available ChIP-Seq analysis approaches. Additionally, in comparison to the ChIP-chip assay, ChIP-Seq provides higher resolution along with improved sensitivity and specificity of binding site detection. Additional file and the HPeak program are freely available at http://www.sph.umich.edu/csg/qin/HPeak.

  12. Experiment list: SRX758028 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available agnosis=Carcinoma 82846335,87.8,5.7,22076 GSM1543791: LNCaP TOP1 ChIP-seq 30min vehicle; Homo sapiens; ChIP-...Seq source_name=ChIP-seq with BLRP-tagged TOP1, 30min treatment with vehicle || cell line=LNCaP || chip anti

  13. Experiment list: SRX148878 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available nosis=Carcinoma 71832151,98.1,33.1,1626 GSM934593: H2B ChIP-Seq USP49 knockdown; Homo sapiens; ChIP-Seq sour...ce_name=H2B ChIP-Seq USP49 knockdown || cell line=HCT116 || cell type=colorectal carcinoma || chip antibody=

  14. Searching for an Accurate Marker-Based Prediction of an Individual Quantitative Trait in Molecular Plant Breeding.

    Science.gov (United States)

    Fu, Yong-Bi; Yang, Mo-Hua; Zeng, Fangqin; Biligetu, Bill

    2017-01-01

    Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also performed a theoretical reasoning on the effectiveness of these markers in a trait-specific prediction, and verified the reasoning through computer simulation. To the end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding.

  15. Experiment list: SRX148876 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available agnosis=Carcinoma 62096923,98.1,90.9,255 GSM934591: uH2b ChIP-Seq USP49 knockdown; Homo sapiens; ChIP-Seq so...urce_name=uH2b ChIP-Seq USP49 knockdown || cell line=HCT116 || cell type=colorectal carcinoma || chip antibo

  16. Integration of RNA-Seq and RPPA data for survival time prediction in cancer patients.

    Science.gov (United States)

    Isik, Zerrin; Ercan, Muserref Ece

    2017-10-01

    Integration of several types of patient data in a computational framework can accelerate the identification of more reliable biomarkers, especially for prognostic purposes. This study aims to identify biomarkers that can successfully predict the potential survival time of a cancer patient by integrating the transcriptomic (RNA-Seq), proteomic (RPPA), and protein-protein interaction (PPI) data. The proposed method -RPBioNet- employs a random walk-based algorithm that works on a PPI network to identify a limited number of protein biomarkers. Later, the method uses gene expression measurements of the selected biomarkers to train a classifier for the survival time prediction of patients. RPBioNet was applied to classify kidney renal clear cell carcinoma (KIRC), glioblastoma multiforme (GBM), and lung squamous cell carcinoma (LUSC) patients based on their survival time classes (long- or short-term). The RPBioNet method correctly identified the survival time classes of patients with between 66% and 78% average accuracy for three data sets. RPBioNet operates with only 20 to 50 biomarkers and can achieve on average 6% higher accuracy compared to the closest alternative method, which uses only RNA-Seq data in the biomarker selection. Further analysis of the most predictive biomarkers highlighted genes that are common for both cancer types, as they may be driver proteins responsible for cancer progression. The novelty of this study is the integration of a PPI network with mRNA and protein expression data to identify more accurate prognostic biomarkers that can be used for clinical purposes in the future. Copyright © 2017 Elsevier Ltd. All rights reserved.
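
    The random walk step on a PPI network can be sketched as follows. The abstract does not say which random-walk variant RPBioNet uses, so this sketch assumes a random walk with restart from a set of seed proteins on an undirected graph in which every node has at least one neighbour. The function name, the restart probability and the toy graph in the usage note are hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a random walk with restart.

    adjacency: dict mapping each node to a list of its neighbours
               (assumed undirected, every node has at least one neighbour).
    seeds:     set of nodes the walker jumps back to with probability `restart`.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            # Each node spreads its non-restarting mass evenly over its neighbours.
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p
```

    For a toy path graph such as {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]} with seed {"A"}, nodes closer to the seed receive higher scores. Ranking proteins by such scores is one plausible way to shortlist a limited number of candidate biomarkers.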

  17. Computational Methods for ChIP-seq Data Analysis and Applications

    KAUST Repository

    Ashoor, Haitham

    2017-04-25

    The development of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technology has enabled the construction of genome-wide maps of protein-DNA interaction. Such maps provide information about transcriptional regulation at the epigenetic level (histone modifications and histone variants) and at the level of transcription factor (TF) activity. This dissertation presents novel computational methods for ChIP-seq data analysis and applications. The work of this dissertation addresses four main challenges. First, I address the problem of detecting histone modifications from ChIP-seq cancer samples. The presence of copy number variations (CNVs) in cancer samples results in statistical biases that lead to inaccurate predictions when standard methods are used. To overcome this issue I developed HMCan, a specially designed algorithm to handle ChIP-seq cancer data by accounting for the presence of CNVs. When using ChIP-seq data from cancer cells, HMCan demonstrates unbiased and accurate predictions compared to the standard state-of-the-art methods. Second, I address the problem of identifying changes in histone modifications between two ChIP-seq samples with different genetic backgrounds (for example cancer vs. normal). In addition to CNVs, differences in antibody efficiency between samples and the presence of sample replicates are challenges for this problem. To overcome these issues, I developed the HMCan-diff algorithm as an extension to HMCan. HMCan-diff implements robust normalization methods to address the challenges listed above. HMCan-diff significantly outperforms other state-of-the-art methods on data containing cancer samples. Third, I investigate and analyze predictions of different methods for enhancer prediction based on ChIP-seq data. The analysis shows that predictions generated by different methods are poorly overlapping. To overcome this issue, I developed DENdb, a database that integrates enhancer predictions from different methods. DENdb also

  18. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.

    Science.gov (United States)

    Song, Li; Florea, Liliana

    2015-01-01

    Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing. We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read. Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.

  19. Multifrequency Excitation Method for Rapid and Accurate Dynamic Test of Micromachined Gyroscope Chips

    Directory of Open Access Journals (Sweden)

    Yan Deng

    2014-10-01

    Full Text Available A novel multifrequency excitation (MFE) method is proposed to realize rapid and accurate dynamic testing of micromachined gyroscope chips. Compared with the traditional sweep-frequency excitation (SFE) method, the computational time for testing one chip under four modes at a 1-Hz frequency resolution and 600-Hz bandwidth was dramatically reduced from 10 min to 6 s. A multifrequency signal with an equal amplitude and initial linear-phase-difference distribution was generated to ensure test repeatability and accuracy. The current test system based on LabVIEW using the SFE method was modified to use the MFE method without any hardware changes. The experimental results verified that the MFE method can be an ideal solution for large-scale dynamic testing of gyroscope chips and gyroscopes.

  20. Experiment list: SRX485203 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346544: Rhino ChIP from control germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq ...source_name=Rhino ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult ||... Sex=female || tissue=ovary || germline knock-down=control || chip antibody=custo

  1. Experiment list: SRX485202 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346543: Rhino ChIP from control germline knock-down ovaries, replicate 1; Drosophila melanogaster; ChIP-Seq ...source_name=Rhino ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult ||... Sex=female || tissue=ovary || germline knock-down=control || chip antibody=custo

  2. Experiment list: SRX485210 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 6551: Deadlock ChIP from deadlock germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name...=Deadlock ChIP from deadlock germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=fe...male || tissue=ovary || germline knock-down=deadlock || chip antibody=custom-made

  3. Experiment list: SRX485211 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346552: Cutoff ChIP from control germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name=...Cutoff ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female... || tissue=ovary || germline knock-down=control || chip antibody=custom-made rabb

  4. In Silico Pooling of ChIP-seq Control Experiments

    Science.gov (United States)

    Sun, Guannan; Srinivasan, Rajini; Lopez-Anido, Camila; Hung, Holly A.; Svaren, John; Keleş, Sündüz

    2014-01-01

    As next generation sequencing technologies are becoming more economical, large-scale ChIP-seq studies are enabling the investigation of the roles of transcription factor binding and epigenome on phenotypic variation. Studying such variation requires individual level ChIP-seq experiments. Standard designs for ChIP-seq experiments employ a paired control per ChIP-seq sample. Genomic coverage for control experiments is often sacrificed to increase the resources for ChIP samples. However, the quality of ChIP-enriched regions identifiable from a ChIP-seq experiment depends on the quality and the coverage of the control experiments. Insufficient coverage leads to loss of power in detecting enrichment. We investigate the effect of in silico pooling of control samples within multiple biological replicates, multiple treatment conditions, and multiple cell lines and tissues across multiple datasets with varying levels of genomic coverage. Our computational studies suggest guidelines for performing in silico pooling of control experiments. Using vast amounts of ENCODE data, we show that pairwise correlations between control samples originating from multiple biological replicates, treatments, and cell lines/tissues can be grouped into two classes representing whether or not in silico pooling leads to power gain in detecting enrichment between the ChIP and the control samples. Our findings have important implications for multiplexing samples. PMID:25380244

  5. Experiment list: SRX185907 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Homo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, control, FOXM1 ChIP || cell_line=MCF-...7 || cell_type=ER-positive breast adenocarcinoma cells || treatment=DMSO || chip_

  6. Experiment list: SRX150520 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 49296691,89.4,24.9,46885 GSM935441: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pct c-Myc Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harv...|| antibody vendorid=sc-764 || control=Harvard_Control || control description=input library was prepared at Harvard. || control=Harva...rd_Control || control description=input library was prepared at Harvard...ard University || datatype=ChipSeq || datatype descripti

  7. Experiment list: SRX150478 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 66690540,98.1,24.9,110111 GSM935398: Harvard ChipSeq MCF10A-Er-Src 4OHTAM 1uM 12hr c-Fos Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || ...lab description=Struhl - Harvard University || datatype=ChipSeq || datatype descr... is a leucine-zipper. || antibody vendorname=Santa Cruz Biotech || antibody vendorid=sc-7202 || control=Harvard..._Control || control description=input library was prepared at Harvard. || control=Harvard

  8. Experiment list: SRX150696 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -1 || cell organism=human || cell description=pancreatic carcinoma, (PMID: 1140870) PANC-1 was established from a panc...SRX150696 hg19 Input control Input control Pancreas PANC-1 Tissue=Pancreas/Duct|Dis...ease=Epithelioid Carcinoma 41671673,95.8,10.4,1584 GSM935617: USC ChipSeq PANC-1 Input UCDavis source_name=PANC... || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || cell=PANC...reatic carcinoma, which was extracted via pancreatico-duodenectomy specimen from a 56-year-old Cau

  9. Experiment list: SRX199860 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available | cell organism=human || cell description=pancreatic carcinoma, (PMID: 1140870) PANC-1 was established from a panc...SRX199860 hg19 Input control Input control Pancreas PANC-1 Tissue=Pancreas/Duct|Dis...ease=Epithelioid Carcinoma 27365308,98.2,3.6,969 GSM1022632: UW ChipSeq PANC-1 InputRep1 source_name=PANC-1 ...datatype=ChipSeq || datatype description=Chromatin IP Sequencing || cell=PANC-1 |...reatic carcinoma, which was extracted via pancreatico-duodenectomy specimen from a 56-year-old Caucasi

  10. Experiment list: SRX977433 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ates NA 65512838,98.3,15.4,34790 GSM1648684: RNAPII ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 15days of induction || cell type=reprogramming intermediate || genotype=RNAPII-GF

  11. Experiment list: SRX977429 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ates NA 57685706,97.8,16.7,27894 GSM1648680: RNAPII ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 3days of induction || cell type=reprogramming intermediate || genotype=RNAPII-GFP/

  12. Experiment list: SRX977432 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ates NA 56429535,98.1,16.9,32459 GSM1648683: RNAPII ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 11days of induction || cell type=reprogramming intermediate || genotype=RNAPII-GF

  13. Experiment list: SRX185915 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available mo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, control, FOXM1 ChIP || cell_line=MCF-7 |...| cell_type=ER-positive breast adenocarcinoma cells || treatment=DMSO || chip_tar

  14. Experiment list: SRX185909 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available omo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, control, FOXM1 ChIP || cell_line=MCF-7 ...|| cell_type=ER-positive breast adenocarcinoma cells || treatment=DMSO || chip_ta

  15. Experiment list: SRX185917 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available omo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, control, FOXM1 ChIP || cell_line=MCF-7 ...|| cell_type=ER-positive breast adenocarcinoma cells || treatment=DMSO || chip_ta

  16. Experiment list: SRX485205 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 46546: Rhino ChIP from deadlock germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name=R...hino ChIP from deadlock germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female ...|| tissue=ovary || germline knock-down=deadlock || chip antibody=custom-made rabb

  17. Experiment list: SRX485212 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346553: Cutoff ChIP from cutoff germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name=C...utoff ChIP from cutoff germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female |...| tissue=ovary || germline knock-down=cutoff || chip antibody=custom-made rabbit

  18. Experiment list: SRX485206 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346547: Rhino ChIP from cutoff germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name=Rh...ino ChIP from cutoff germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female || ...tissue=ovary || germline knock-down=cutoff || chip antibody=custom-made rabbit po

  19. Experiment list: SRX485209 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346550: Deadlock ChIP from control germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_nam...e=Deadlock ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=fe...male || tissue=ovary || germline knock-down=control || chip antibody=custom-made

  20. A Time-predictable Memory Network-on-Chip

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Chong, David VH; Puffitsch, Wolfgang

    2014-01-01

    To derive safe bounds on worst-case execution times (WCETs), all components of a computer system need to be time-predictable: the processor pipeline, the caches, the memory controller, and memory arbitration on a multicore processor. This paper presents a solution for time-predictable memory arbitration and access for chip-multiprocessors. The memory network-on-chip is organized as a tree with time-division multiplexing (TDM) of accesses to the shared memory. The TDM based arbitration completely decouples processor cores and allows WCET analysis of the memory accesses on individual cores without...
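
    Why TDM arbitration makes per-core analysis possible can be shown with a small sketch. It is a simplified model, not the paper's design: it assumes a round-robin schedule with one fixed-length slot per core, and that a request can only be served starting at the beginning of its own core's slot. The function names and parameters are hypothetical.

```python
def tdm_wait(arrival, core, n_cores, slot_len):
    """Cycles a request waits until the next start of its core's TDM slot.

    Assumed model: core i owns cycles [i * slot_len, (i + 1) * slot_len)
    of every period of n_cores * slot_len cycles.
    """
    period = n_cores * slot_len
    slot_start = core * slot_len
    # Python's % is non-negative here, so this is the distance to the next slot start.
    return (slot_start - arrival) % period


def worst_case_latency(core, n_cores, slot_len):
    """Worst-case wait over all arrival cycles plus one slot of service time."""
    period = n_cores * slot_len
    worst_wait = max(tdm_wait(t, core, n_cores, slot_len) for t in range(period))
    return worst_wait + slot_len
```

    Under these assumptions the bound is n_cores * slot_len - 1 + slot_len cycles for every core. It depends only on the schedule, not on what the other cores are doing, which is the decoupling property the abstract refers to.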

  1. Searching for an Accurate Marker-Based Prediction of an Individual Quantitative Trait in Molecular Plant Breeding

    Directory of Open Access Journals (Sweden)

    Yong-Bi Fu

    2017-07-01

    Full Text Available Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also performed a theoretical reasoning on the effectiveness of these markers in a trait-specific prediction, and verified the reasoning through computer simulation. To the end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding.

  2. Searching for an Accurate Marker-Based Prediction of an Individual Quantitative Trait in Molecular Plant Breeding

    Science.gov (United States)

    Fu, Yong-Bi; Yang, Mo-Hua; Zeng, Fangqin; Biligetu, Bill

    2017-01-01

    Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also performed a theoretical reasoning on the effectiveness of these markers in a trait-specific prediction, and verified the reasoning through computer simulation. To the end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding. PMID:28729875

  3. Experiment list: SRX485220 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 53 GSM1346561: RNA Polymerase II ChIP from rhino germline knock-down ovaries; Drosophila melanogaster; ChIP-...Seq source_name=RNA Polymerase II ChIP from rhino germline knock-down ovaries || developmental stage=4-6 day...s old adult || Sex=female || tissue=ovary || germline knock-down=rhino || chip an

  4. Experiment list: SRX485204 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346545: Rhino ChIP from rhino germline knock-down ovaries; Drosophila melanogaster; ChIP-Seq source_name=Rhi...no ChIP from rhino germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female || ti...ssue=ovary || germline knock-down=rhino || chip antibody=custom-made rabbit polyc

  5. Experiment list: SRX485208 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 346549: Rhino ChIP from piwi germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq sou...rce_name=Rhino ChIP from piwi germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=f...emale || tissue=ovary || germline knock-down=piwi || chip antibody=custom-made ra

  6. Experiment list: SRX190029 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available l organism=human || cell description=pancreatic carcinoma, (PMID: 1140870) PANC-1 was established from a panc...SRX190029 hg19 Input control Input control Pancreas PANC-1 Tissue=Pancreas/Duct|Dis...ease=Epithelioid Carcinoma 27365308,98.2,3.6,980 GSM945246: UW ChipSeq PANC-1 Input source_name=PANC-1 || bi...ype=ChipSeq || datatype description=Chromatin IP Sequencing || cell=PANC-1 || cel...reatic carcinoma, which was extracted via pancreatico-duodenectomy specimen from a 56-year-old Caucasian in

  7. Experiment list: SRX977378 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 23908,98.7,9.6,43208 GSM1648629: total Oct4 ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming ...intermediate after 11days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa26-

  8. Experiment list: SRX977377 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 40719,98.1,13.8,57202 GSM1648628: total Oct4 ChipSeq day7; Mus musculus; ChIP-Seq source_name=reprogramming ...intermediate after 7days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa26-M

  9. Experiment list: SRX977366 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available NA 60072229,98.6,9.6,78331 GSM1648617: flag Oct4 ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 24hour of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Ros

  10. Experiment list: SRX977375 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 20178,94.6,10.8,52926 GSM1648626: total Oct4 ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming ...intermediate after 3days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa26-M

  11. Experiment list: SRX977379 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 93532,97.8,13.3,43901 GSM1648630: total Oct4 ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 15days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa26

  12. Experiment list: SRX977368 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available NA 58474287,97.8,9.8,84163 GSM1648619: flag Oct4 ChipSeq day5; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 5days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa

  13. Experiment list: SRX977369 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available NA 58631406,98.5,10.3,72036 GSM1648620: flag Oct4 ChipSeq day7; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 7days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Ros

  14. Experiment list: SRX684775 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ng || cell type=intermediate stage of somatic cell reprogramming || chip antibody=M...,29.2,290321 GSM1483904: pre-iPS.H3.MNase-ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogrammi

  15. Experiment list: SRX977371 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available NA 63273752,96.7,14.1,37891 GSM1648622: flag Oct4 ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming... intermediate after 15days of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ R

  16. Experiment list: SRX1090864 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ming || cell type=intermediate stage of somatic cell reprogramming || chip antibody...5,12.8,491 GSM1816301: pre-iPS rep.H3.MNase-ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogram

  17. Experiment list: SRX107407 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Adenocarcinoma 37670219,66.5,15.7,675 GSM838385: hnrnpl sictrl ChIP-Seq; Homo sapiens; ChIP-Seq source_name=Hela cells knock...down control || chip antibody=hnRNP L || treatment=knockdown control || cell line=HeLa

  18. Accurate wavelength prediction of photonic crystal resonant reflection and applications in refractive index measurement

    DEFF Research Database (Denmark)

    Hermannsson, Pétur Gordon; Vannahme, Christoph; Smith, Cameron L. C.

    2014-01-01

    In the past decade, photonic crystal resonant reflectors have been increasingly used as the basis for label-free biochemical assays in lab-on-a-chip applications. In both designing and interpreting experimental results, an accurate model describing the optical behavior of such structures is essential. Here, an analytical method for precisely predicting the absolute positions of resonantly reflected wavelengths is presented. The model is experimentally verified to be highly accurate using nanoreplicated, polymer-based photonic crystal grating reflectors with varying grating periods and superstrate materials. The importance of accounting for material dispersion in order to obtain accurate simulation results is highlighted, and a method for doing so using an iterative approach is demonstrated. Furthermore, an application for the model is demonstrated, in which the material dispersion...

  19. Experiment list: SRX1122118 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available M1833874: ChIP-seq H3K27Ac DMSO; Homo sapiens; ChIP-Seq source_name=Cultured Leukemic Blasts || chip antibod...y=H3K27ac (ActiveMotif, ab4729) || tissue=Cultured Leukemic Blasts || treatment compound=DMSO http://dbarchi

  20. Experiment list: SRX977374 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 16864,98.6,9.4,51198 GSM1648625: total Oct4 ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming i...ntermediate after 24hour of induction || cell type=reprogramming intermediate || genotype=Oct4-GFP/ Rosa26-M

  1. Experiment list: SRX152077 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ll organism=human || cell description=pancreatic carcinoma, (PMID: 1140870) PANC-1 was established from a panc...SRX152077 hg19 Histone H3K4me3 Pancreas PANC-1 Tissue=Pancreas/Duct|Disease=Epithel...ioid Carcinoma 53620150,97.5,34.9,29597 GSM945856: USC ChipSeq PANC-1 H3K4me3B UCDavis source_name=PANC-1 ||...type=ChipSeq || datatype description=Chromatin IP Sequencing || cell=PANC-1 || ce...reatic carcinoma, which was extracted via pancreatico-duodenectomy specimen from a 56-year-old Caucasian i

  2. Experiment list: SRX352046 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SM1232564: CSB M CHIP; Homo sapiens; ChIP-Seq source_name=fibroblast_menadione_CSB-ChIP || cell type=fibroblast || treated with=menad...ione || chip antibody=Mouse monoclonal anti-CSB N Terminus (1B1) http://dbarchive.b

  3. Experiment list: SRX144526 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available stein-Barr Virus transformed 11803840,92.5,91.6,38 GSM922971: NRF2 ChIP vehicle treated rep2; Homo sapiens; ...ChIP-Seq source_name=NRF2 ChIP vehicle treated || biomaterial_provider=Coriell; h

  4. Experiment list: SRX151245 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 0: CTCF ChIPSeq; Homo sapiens; ChIP-Seq source_name=BCBL1 pleural effusion lymphoma, CTCF ChIP || cell line=...BCBL1 || cell type=KSHV-infected pleural effusion lymphoma cells || chip antibody=rabbit anti-CTCF || antibo

  5. Experiment list: SRX485222 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 4me2 ChIP from control germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq source_na...me=H3K4me2 ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=fe...male || tissue=ovary || germline knock-down=control || chip antibody=Anti-dimethy

  6. Experiment list: SRX485221 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K4me2 ChIP from control germline knock-down ovaries, replicate 1; Drosophila melanogaster; ChIP-Seq source_n...ame=H3K4me2 ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=f...emale || tissue=ovary || germline knock-down=control || chip antibody=Anti-dimeth

  7. An integrated model of multiple-condition ChIP-Seq data reveals predeterminants of Cdx2 binding.

    Directory of Open Access Journals (Sweden)

    Shaun Mahony

    2014-03-01

    Full Text Available Regulatory proteins can bind to different sets of genomic targets in various cell types or conditions. To reliably characterize such condition-specific regulatory binding we introduce MultiGPS, an integrated machine learning approach for the analysis of multiple related ChIP-seq experiments. MultiGPS is based on a generalized Expectation Maximization framework that shares information across multiple experiments for binding event discovery. We demonstrate that our framework enables the simultaneous modeling of sparse condition-specific binding changes, sequence dependence, and replicate-specific noise sources. MultiGPS encourages consistency in reported binding event locations across multiple-condition ChIP-seq datasets and provides accurate estimation of ChIP enrichment levels at each event. MultiGPS's multi-experiment modeling approach thus provides a reliable platform for detecting differential binding enrichment across experimental conditions. We demonstrate the advantages of MultiGPS with an analysis of Cdx2 binding in three distinct developmental contexts. By accurately characterizing condition-specific Cdx2 binding, MultiGPS enables novel insight into the mechanistic basis of Cdx2 site selectivity. Specifically, the condition-specific Cdx2 sites characterized by MultiGPS are highly associated with pre-existing genomic context, suggesting that such sites are pre-determined by cell-specific regulatory architecture. However, MultiGPS-defined condition-independent sites are not predicted by pre-existing regulatory signals, suggesting that Cdx2 can bind to a subset of locations regardless of genomic environment. A summary of this paper appears in the proceedings of the RECOMB 2014 conference, April 2-5.

  8. AtRTD2: A Reference Transcript Dataset for accurate quantification of alternative splicing and expression changes in Arabidopsis thaliana RNA-seq data

    KAUST Repository

    Zhang, Runxuan

    2016-05-06

    Background: Alternative splicing is the major post-transcriptional mechanism by which gene expression is regulated and affects a wide range of processes and responses in most eukaryotic organisms. RNA-sequencing (RNA-seq) can generate genome-wide quantification of individual transcript isoforms to identify changes in expression and alternative splicing. RNA-seq is an essential modern tool but its ability to accurately quantify transcript isoforms depends on the diversity, completeness and quality of the transcript information. Results: We have developed a new Reference Transcript Dataset for Arabidopsis (AtRTD2) for RNA-seq analysis containing over 82k non-redundant transcripts, whereby 74,194 transcripts originate from 27,667 protein-coding genes. A total of 13,524 protein-coding genes have at least one alternatively spliced transcript in AtRTD2 such that about 60% of the 22,453 protein-coding, intron-containing genes in Arabidopsis undergo alternative splicing. More than 600 putative U12 introns were identified in more than 2,000 transcripts. AtRTD2 was generated from transcript assemblies of ca. 8.5 billion pairs of reads from 285 RNA-seq data sets obtained from 129 RNA-seq libraries and merged along with the previous version, AtRTD, and Araport11 transcript assemblies. AtRTD2 increases the diversity of transcripts and through application of stringent filters represents the most extensive and accurate transcript collection for Arabidopsis to date. We have demonstrated a generally good correlation of alternative splicing ratios from RNA-seq data analysed by Salmon and experimental data from high resolution RT-PCR. However, we have observed inaccurate quantification of transcript isoforms for genes with multiple transcripts which have variation in the lengths of their UTRs. This variation is not effectively corrected in RNA-seq analysis programmes and will therefore impact RNA-seq analyses generally. To address this, we have tested different genome

  9. Experiment list: SRX684777 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ing || cell type=intermediate stage of somatic cell reprogramming || chip antibody=...,98.5,5.8,239 GSM1483906: pre-iPS.H3K27me3.ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogramm

  10. Experiment list: SRX1122119 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 04 GSM1833875: ChIP-seq H3K27Ac GSI; Homo sapiens; ChIP-Seq source_name=Cultured Leukemic Blasts || chip ant...ibody=H3K27ac (ActiveMotif, ab4729) || tissue=Cultured Leukemic Blasts || treatment compound=GSI BMS-906024

  11. Experiment list: SRX1090865 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available amming || cell type=intermediate stage of somatic cell reprogramming || chip antibo...,95.9,8.5,270 GSM1816302: pre-iPS rep.H3K4me3.ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogr

  12. Experiment list: SRX684776 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ng || cell type=intermediate stage of somatic cell reprogramming || chip antibody=a...98.0,10.6,335 GSM1483905: pre-iPS.H3K4me3.ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogrammi

  13. Experiment list: SRX897943 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 153,97.9,13.9,15440 GSM1624628: ChIP seq Renilla Sox2 IP day3; Mus musculus; ChIP-Seq source_name=OKSM reprogramming... intermediates from Mouse Embryonic Fibroblasts || strain=Black6-129X1/SvJ || cell type=OKSM reprogramming

  14. Experiment list: SRX684778 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.0,9.9,325 GSM1483907: pre-iPS.H3K9me3.ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell reprogrammin...g || cell type=intermediate stage of somatic cell reprogramming || chip antibody=an

  15. Experiment list: SRX150568 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Adenocarcinoma 59265240,72.4,16.4,4779 GSM935489: Harvard ChipSeq HeLa-S3 RPC155 std source_name=HeLa-S3 ...|| biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=ChipS

  16. Experiment list: SRX507380 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available + (wildtype) || age of animals=1-5 day old || tissue=Ovaries || chip antibody=anti-HP1 || chip antibody vend...1770: WT anti-HP1- replicate#2; Drosophila melanogaster; ChIP-Seq source_name=WT_WT_anti-HP1 || strain=piwi/

  17. Experiment list: SRX176054 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available nosis=Carcinoma 13338805,91.2,4.9,792 GSM984386: LNCAP AR vehicle; Homo sapiens; ChIP-Seq source_name=prosta...te cancer cells || cell line=LNCaP || chip antibody=AR || chip antibody manufacturer=Abcam || treatment=EtOH vehicle

  18. Experiment list: SRX144525 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available neage=mesoderm|Description=parental cell type to lymphoblastoid cell lines 14487710,85.8,82.8,188 GSM922970: NRF2 ChIP vehicle... treated rep1; Homo sapiens; ChIP-Seq source_name=NRF2 ChIP vehicle treated || biomaterial

  19. Experiment list: SRX144524 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available neage=mesoderm|Description=parental cell type to lymphoblastoid cell lines 4766716,6.2,89.4,0 GSM922969: NRF2 ChIP vehicle... treated pilot; Homo sapiens; ChIP-Seq source_name=NRF2 ChIP vehicle treated || biomaterial_pr

  20. Experiment list: SRX151246 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 11: SMC1 ChIPSeq; Homo sapiens; ChIP-Seq source_name=BCBL1 pleural effusion lymphoma, SMC1 ChIP || cell line...=BCBL1 || cell type=KSHV-infected pleural effusion lymphoma cells || chip antibody=rabbit anti-SMC1 || antib

  1. Amplification of pico-scale DNA mediated by bacterial carrier DNA for small-cell-number transcription factor ChIP-seq

    DEFF Research Database (Denmark)

    Jakobsen, Janus S; Bagger, Frederik O; Hasemann, Marie S

    2015-01-01

    BACKGROUND: Chromatin-Immunoprecipitation coupled with deep sequencing (ChIP-seq) is used to map transcription factor occupancy and generate epigenetic profiles genome-wide. The requirement of nano-scale ChIP DNA for generation of sequencing libraries has impeded ChIP-seq on in vivo tissues of low cell numbers. RESULTS: We describe a robust, simple and scalable methodology for ChIP-seq of low-abundant cell populations, verified down to 10,000 cells. By employing non-mammalian genome mapping bacterial carrier DNA during amplification, we reliably amplify down to 50 pg of ChIP DNA from transcription factor (CEBPA) and histone mark (H3K4me3) ChIP. We further demonstrate that genomic profiles are highly resilient to changes in carrier DNA to ChIP DNA ratios. CONCLUSIONS: This represents a significant advance compared to existing technologies, which involve either complex steps of pre...

  2. Experiment list: SRX897941 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 081,98.0,12.4,11292 GSM1624626: ChIP seq Chaf1a.166 Sox2 IP day3; Mus musculus; ChIP-Seq source_name=OKSM reprogramming... intermediates from Mouse Embryonic Fibroblasts || strain=Black6-129X1/SvJ || cell type=OKSM reprogramming

  3. Experiment list: SRX1090866 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available gramming || cell type=intermediate stage of somatic cell reprogramming || chip anti...1,96.8,4.5,197 GSM1816303: pre-iPS rep.H3K27me3.ChIP-Seq; Mus musculus; ChIP-Seq source_name=intermediate stage of somatic cell repro

  4. Experiment list: SRX107409 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Adenocarcinoma 38890980,97.0,45.8,396 GSM838387: h3k36me3 sictrl ChIP-Seq; Homo sapiens; ChIP-Seq source_name=Hela cells knock...down control || chip antibody=H3K36me3 || treatment=knockdown control || cell line=HeLa ||

  5. Experiment list: SRX485218 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K9me3 ChIP from piwi germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq source_name...=H3K9me3 ChIP from piwi germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female ...|| tissue=ovary || germline knock-down=piwi || chip antibody=Histone H3K9me3 anti

  6. Experiment list: SRX485213 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K9me3 ChIP from control germline knock-down ovaries, replicate 1; Drosophila melanogaster; ChIP-Seq source_n...ame=H3K9me3 ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=f...emale || tissue=ovary || germline knock-down=control || chip antibody=Histone H3K

  7. Experiment list: SRX485214 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K9me3 ChIP from control germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq source_n...ame=H3K9me3 ChIP from control germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=f...emale || tissue=ovary || germline knock-down=control || chip antibody=Histone H3K

  8. Mixture models reveal multiple positional bias types in RNA-Seq data and lead to accurate transcript concentration estimates.

    Directory of Open Access Journals (Sweden)

    Andreas Tuerk

    2017-05-01

    Full Text Available Accuracy of transcript quantification with RNA-Seq is negatively affected by positional fragment bias. This article introduces Mix2 (rd. "mixquare"), a transcript quantification method which uses a mixture of probability distributions to model and thereby neutralize the effects of positional fragment bias. The parameters of Mix2 are trained by Expectation Maximization, resulting in simultaneous transcript abundance and bias estimates. We compare Mix2 to Cufflinks, RSEM, eXpress and PennSeq, state-of-the-art quantification methods implementing some form of bias correction. On four synthetic biases we show that the accuracy of Mix2 overall exceeds the accuracy of the other methods and that its bias estimates converge to the correct solution. We further evaluate Mix2 on real RNA-Seq data from the Microarray and Sequencing Quality Control (MAQC, SEQC) Consortia. On MAQC data, Mix2 achieves improved correlation to qPCR measurements with a relative increase in R² between 4% and 50%. Mix2 also yields repeatable concentration estimates across technical replicates with a relative increase in R² between 8% and 47% and reduced standard deviation across the full concentration range. We further observe more accurate detection of differential expression with a relative increase in true positives between 74% and 378% for 5% false positives. In addition, Mix2 reveals 5 dominant biases in MAQC data deviating from the common assumption of a uniform fragment distribution. On SEQC data, Mix2 yields higher consistency between measured and predicted concentration ratios. A relative error of 20% or less is obtained for 51% of transcripts by Mix2, 40% of transcripts by Cufflinks and RSEM and 30% by eXpress. Titration order consistency is correct for 47% of transcripts for Mix2, 41% for Cufflinks and RSEM and 34% for eXpress. We further observe improved repeatability across laboratory sites with a relative increase in R² between 8% and 44% and reduced standard deviation.
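
    As a rough illustration of the idea named in the abstract above (fitting a mixture of probability distributions by Expectation Maximization), the following is a minimal sketch of EM for a two-component one-dimensional Gaussian mixture. It is not the Mix2 model: the choice of Gaussian components, the function names (`normal_pdf`, `em_two_gaussians`) and the simulated "fragment positions" are all assumptions made for this example only.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_two_gaussians(xs, iters=100):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns (weights, means, sds). Generic textbook EM, used here only
    to illustrate the E-step / M-step alternation.
    """
    n = len(xs)
    ordered = sorted(xs)
    # Crude initialisation: quartiles as starting means, overall spread as sd.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(xs) / n
    sd0 = math.sqrt(sum((x - overall_mean) ** 2 for x in xs) / n) or 1.0
    sds = [sd0, sd0]
    weights = [0.5, 0.5]

    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and sd from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component has collapsed; leave it unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds


# Simulated relative positions along a transcript (hypothetical data):
# 30% of fragments cluster near 0.2 and 70% near 0.8.
random.seed(0)
positions = [random.gauss(0.2, 0.05) for _ in range(300)]
positions += [random.gauss(0.8, 0.05) for _ in range(700)]

weights, means, sds = em_two_gaussians(positions)
```

    On this simulated input the two fitted means should land close to the two cluster centres, with mixture weights near the simulated proportions. A real positional-bias model would use distributions over fragment start positions and would estimate transcript abundances jointly, which this sketch does not attempt.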

  9. Experiment list: SRX153146 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Seq source_name=Human breast adenocarcinoma cell-line MCF7 || cell-line=MCF7 || passage=5 || chip antibody=...n=Pleura|Tissue Diagnosis=Adenocarcinoma 60170246,98.4,5.7,16756 GSM946850: MCF7 H3K27ac; Homo sapiens; ChIP

  10. Experiment list: SRX176063 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available =Carcinoma 11279321,95.5,3.6,13985 GSM984395: LNCAP ACH3 vehicle; Homo sapiens; ChIP-Seq source_name=prostat...e cancer cells || cell line=LNCaP || chip antibody=AcH3 || chip antibody manufacturer=Millipore || treatment=EtOH vehicle

  11. Experiment list: SRX176057 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available nosis=Carcinoma 21582823,90.1,7.3,1074 GSM984389: 22RV1 AR vehicle; Homo sapiens; ChIP-Seq source_name=prost...ate cancer cells || cell line=22RV1 || chip antibody=AR || chip antibody manufacturer=Abcam || treatment=EtOH vehicle

  12. Experiment list: SRX144527 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available neage=mesoderm|Description=parental cell type to lymphoblastoid cell lines 8704444,92.1,92.5,9 GSM922972: NRF2 ChIP vehicle... treated rep3; Homo sapiens; ChIP-Seq source_name=NRF2 ChIP vehicle treated || biomaterial_pr

  13. Experiment list: SRX160914 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available M970829: IgG for KSHV LANA; Homo sapiens; ChIP-Seq source_name=BCBL1 pleural effusion lymphoma, IgG ChIP || ...cell line=BCBL1 || cell type=KSHV-infected pleural effusion lymphoma cells || chip antibody=Rabbit IgG [Sant

  14. Experiment list: SRX160915 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available M970828: IgG for CTCF SMC1; Homo sapiens; ChIP-Seq source_name=BCBL1 pleural effusion lymphoma, IgG ChIP || ...cell line=BCBL1 || cell type=KSHV-infected pleural effusion lymphoma cells || chip antibody=Mouse IgG [Santa

  15. Predicting stimulation-dependent enhancer-promoter interactions from ChIP-Seq time course data

    Directory of Open Access Journals (Sweden)

    Tomasz Dzida

    2017-09-01

    Full Text Available We have developed a machine learning approach to predict stimulation-dependent enhancer-promoter interactions using evidence from changes in genomic protein occupancy over time. The occupancy of estrogen receptor alpha (ERα), RNA polymerase (Pol II) and histone marks H2AZ and H3K4me3 were measured over time using ChIP-Seq experiments in MCF7 cells stimulated with estrogen. A Bayesian classifier was developed which uses the correlation of temporal binding patterns at enhancers and promoters and genomic proximity as features to predict interactions. This method was trained using experimentally determined interactions from the same system and was shown to achieve much higher precision than predictions based on the genomic proximity of nearest ERα binding. We use the method to identify a genome-wide confident set of ERα target genes and their regulatory enhancers. Validation with publicly available GRO-Seq data demonstrates that our predicted targets are much more likely to show early nascent transcription than predictions based on genomic ERα binding proximity alone.
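
    The two feature types mentioned in the abstract above (correlation of temporal binding patterns and genomic proximity) can be sketched in a few lines. The example below is a hypothetical Gaussian naive Bayes classifier, not the authors' model: the function names, the log10 distance transform, the Gaussian likelihoods and every number in the toy training set are assumptions chosen purely for illustration.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length time series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def features(enh_signal, prom_signal, enh_pos, prom_pos):
    """Feature vector: (temporal correlation, log10 genomic distance in bp)."""
    return (pearson(enh_signal, prom_signal),
            math.log10(abs(enh_pos - prom_pos) + 1))


def fit_gaussian(values):
    """Mean and (floored) variance of a list of numbers."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, max(var, 1e-6)


def log_gauss(x, mean, var):
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def train(examples):
    """examples: list of (feature_tuple, label), label 1 = interacting."""
    model = {}
    for label in (0, 1):
        rows = [f for f, y in examples if y == label]
        prior = len(rows) / len(examples)
        params = [fit_gaussian([r[i] for r in rows]) for i in range(2)]
        model[label] = (prior, params)
    return model


def posterior_interacting(model, feat):
    """Posterior probability of the 'interacting' class via Bayes' rule."""
    logp = {}
    for label, (prior, params) in model.items():
        logp[label] = math.log(prior) + sum(
            log_gauss(x, m, v) for x, (m, v) in zip(feat, params))
    top = max(logp.values())
    e = {k: math.exp(v - top) for k, v in logp.items()}
    return e[1] / (e[0] + e[1])


# Toy training set (made-up values): interacting pairs are correlated and close.
training = [((0.90, 4.0), 1), ((0.80, 4.3), 1), ((0.95, 3.8), 1),
            ((0.10, 5.5), 0), ((-0.30, 5.8), 0), ((0.20, 5.2), 0)]
model = train(training)

# Hypothetical occupancy time courses at an enhancer and two promoters.
near_pair = features([1, 2, 4, 8], [2, 3, 5, 9], 1_000_000, 1_010_000)
far_pair = features([1, 2, 4, 8], [8, 4, 2, 1], 1_000_000, 1_500_000)
p_near = posterior_interacting(model, near_pair)
p_far = posterior_interacting(model, far_pair)
```

    With these made-up inputs, the co-varying nearby pair receives a high posterior and the anti-correlated distant pair a low one. A real application would train on experimentally determined interactions, as the abstract describes.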

  16. Experiment list: SRX977401 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,13.8,95651 GSM1648652: H3K27ac ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 24hour of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA

  17. Experiment list: SRX977402 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.8,11.6,43739 GSM1648653: H3K27ac ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 3days of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA

  18. Experiment list: SRX977406 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,8.3,18501 GSM1648657: H3K27ac ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 15days of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA

  19. Experiment list: SRX218536 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available strain=C57Bl/6 || age/gender=2-3 month old males || chip antibody=Santa Cruz Tech. rIgG (sc-2027) http://db....7,16.5,1194 GSM1067409: Rabbit IgG mouse liver ChIP-seq; Mus musculus; ChIP-Seq GEO Accession=GSM1067409 ||

  20. Experiment list: SRX977405 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,7.4,10239 GSM1648656: H3K27ac ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 11days of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA

  1. Experiment list: SRX153147 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Seq source_name=Human breast adenocarcinoma cell-line MCF7 || cell-line=MCF7 || passage=5 || chip antibody=...on=Pleura|Tissue Diagnosis=Adenocarcinoma 64054379,98.7,5.2,764 GSM946851: MCF7 H3K27me3; Homo sapiens; ChIP

  2. Experiment list: SRX153148 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Seq source_name=Human breast adenocarcinoma cell-line MCF7 || cell-line=MCF7 || passage=5 || chip antibody=...n=Pleura|Tissue Diagnosis=Adenocarcinoma 57306360,95.7,15.1,2666 GSM946852: MCF7 H3K9me3; Homo sapiens; ChIP

  3. Experiment list: SRX485216 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 3K9me3 ChIP from rhino germline knock-down ovaries, replicate 2; Drosophila melanogaster; ChIP-Seq source_na...me=H3K9me3 ChIP from rhino germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=fema...le || tissue=ovary || germline knock-down=rhino || chip antibody=Histone H3K9me3

  4. Experiment list: SRX485215 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K9me3 ChIP from rhino germline knock-down ovaries, replicate 1; Drosophila melanogaster; ChIP-Seq source_nam...e=H3K9me3 ChIP from rhino germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=femal...e || tissue=ovary || germline knock-down=rhino || chip antibody=Histone H3K9me3 a

  5. Experiment list: SRX485217 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 3K9me3 ChIP from piwi germline knock-down ovaries, replicate 1; Drosophila melanogaster; ChIP-Seq source_nam...e=H3K9me3 ChIP from piwi germline knock-down ovaries || developmental stage=4-6 days old adult || Sex=female... || tissue=ovary || germline knock-down=piwi || chip antibody=Histone H3K9me3 ant

  6. Experiment list: SRX977412 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.3,28.8,54450 GSM1648663: H3K4me3 ChipSeq day5; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 5days of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtTA

  7. Experiment list: SRX977420 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ,98.5,9.4,5405 GSM1648671: H3K27me3 ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 3days of induction || cell type=reprogramming intermediate || genotype=H3K27me3-GFP/ Rosa26-M2rtTA

  8. Experiment list: SRX977415 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 97.4,25.7,24253 GSM1648666: H3K4me3 ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming intermed...iate after 15days of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtT

  9. Experiment list: SRX977394 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.1,7.7,102329 GSM1648645: H3K4me1 ChipSeq day5; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 5days of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA

  10. Experiment list: SRX977423 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ,99.0,6.9,4818 GSM1648674: H3K27me3 ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming intermed...iate after 11days of induction || cell type=reprogramming intermediate || genotype=H3K27me3-GFP/ Rosa26-M2rt

  11. Experiment list: SRX977392 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,6.9,91916 GSM1648643: H3K4me1 ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming intermedia...te after 24hour of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA

  12. Experiment list: SRX977411 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.2,25.6,52503 GSM1648662: H3K4me3 ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 3days of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtTA

  13. Experiment list: SRX977410 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.3,25.0,46365 GSM1648661: H3K4me3 ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 24hour of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtTA

  14. Experiment list: SRX977421 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ,98.5,10.5,6589 GSM1648672: H3K27me3 ChipSeq day5; Mus musculus; ChIP-Seq source_name=reprogramming intermed...iate after 5days of induction || cell type=reprogramming intermediate || genotype=H3K27me3-GFP/ Rosa26-M2rtT

  15. Experiment list: SRX977396 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.3,8.4,86055 GSM1648647: H3K4me1 ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 11days of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA

  16. Experiment list: SRX977413 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 97.5,28.8,20673 GSM1648664: H3K4me3 ChipSeq day7; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 7days of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtTA

  17. Experiment list: SRX977419 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ,99.0,7.8,3913 GSM1648670: H3K27me3 ChipSeq day1; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 24hour of induction || cell type=reprogramming intermediate || genotype=H3K27me3-GFP/ Rosa26-M2rtT

  18. Experiment list: SRX977404 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 99.1,5.7,17625 GSM1648655: H3K27ac ChipSeq day7; Mus musculus; ChIP-Seq source_name=reprogramming intermedia...te after 7days of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA t

  19. Experiment list: SRX977414 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.5,25.1,26501 GSM1648665: H3K4me3 ChipSeq day11; Mus musculus; ChIP-Seq source_name=reprogramming intermed...iate after 11days of induction || cell type=reprogramming intermediate || genotype=H3K4me3-GFP/ Rosa26-M2rtT

  20. Experiment list: SRX977397 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,8.9,45421 GSM1648648: H3K4me1 ChipSeq day15; Mus musculus; ChIP-Seq source_name=reprogramming intermedi...ate after 15days of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA

  1. Experiment list: SRX977403 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.9,9.4,35286 GSM1648654: H3K27ac ChipSeq day5; Mus musculus; ChIP-Seq source_name=reprogramming intermedia...te after 5days of induction || cell type=reprogramming intermediate || genotype=H3K27ac-GFP/ Rosa26-M2rtTA t

  2. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

    Science.gov (United States)

    Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

    2016-03-01

    Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a workflow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when using RNA-Seq as the sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/. Contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu. Supplementary data are available at Bioinformatics online.

  3. Experiment list: SRX119679 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 8,18360 GSM874985: ES.H3K27me3; Homo sapiens; ChIP-Seq source_name=H1 human Embryonic stem cells || cell line=H1 || treatment=diagnos...tic sample (pre-treatment) || chip antibody=H3K27me3 || chip antibody manufacturer=

  4. Experiment list: SRX119684 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 2,13603 GSM874990: ES.H3K79me2; Homo sapiens; ChIP-Seq source_name=H1 human Embryonic stem cell || cell line=H1 || treatment=diagnost...ic sample (pre-treatment) || chip antibody=H3K79me2 || chip antibody manufacturer=A

  5. Experiment list: SRX977393 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.8,8.9,31577 GSM1648644: H3K4me1 ChipSeq day3; Mus musculus; ChIP-Seq source_name=reprogramming intermedia...te after 3days of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA t

  6. Experiment list: SRX977395 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 98.7,7.3,62664 GSM1648646: H3K4me1 ChipSeq day7; Mus musculus; ChIP-Seq source_name=reprogramming intermedia...te after 7days of induction || cell type=reprogramming intermediate || genotype=H3K4me1-GFP/ Rosa26-M2rtTA t

  7. Experiment list: SRX507384 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available + (wildtype) || age of animals=1-5 day old || tissue=Ovaries || chip antibody=Anti-H3K4me2 || chip antibody ... Anti-H3K4me2- replicate#2; Drosophila melanogaster; ChIP-Seq source_name=WT_WT_Anti-H3K4me2 || strain=piwi/

  8. Experiment list: SRX507382 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available + (wildtype) || age of animals=1-5 day old || tissue=Ovaries || chip antibody=Anti-H3K9me3 || chip antibody ... Anti-H3K9me3- replicate#2; Drosophila melanogaster; ChIP-Seq source_name=WT_WT_Anti-H3K9me3 || strain=piwi/

  9. Experiment list: SRX176067 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sis=Carcinoma 6619400,91.7,7.2,13648 GSM984399: LNCAP H3K4ME3 vehicle; Homo sapiens; ChIP-Seq source_name=pr...ostate cancer cells || cell line=LNCaP || chip antibody=H3K4Me3 || chip antibody manufacturer=Millipore || treatment=EtOH vehicle

  10. Experiment list: SRX485219 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 56 GSM1346560: RNA Polymerase II ChIP from control germline knock-down ovaries; Drosophila melanogaster; ChI...P-Seq source_name=RNA Polymerase II ChIP from control germline knock-down ovaries || developmental stage=4-6... days old adult || Sex=female || tissue=ovary || germline knock-down=control || c

  11. In silico site-directed mutagenesis informs species-specific predictions of chemical susceptibility derived from the Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool

    Science.gov (United States)

    The Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to address needs for rapid, cost effective methods of species extrapolation of chemical susceptibility. Specifically, the SeqAPASS tool compares the primary sequence (Level 1), functiona...

  12. Limitations and possibilities of low cell number ChIP-seq

    Directory of Open Access Journals (Sweden)

    Gilfillan Gregor D

    2012-11-01

    Full Text Available Abstract Background: Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-seq) offers high resolution, genome-wide analysis of DNA-protein interactions. However, current standard methods require abundant starting material in the range of 1–20 million cells per immunoprecipitation, and remain a bottleneck to the acquisition of biologically relevant epigenetic data. Using a ChIP-seq protocol optimised for low cell numbers (down to 100,000 cells / IP), we examined the performance of the ChIP-seq technique on a series of decreasing cell numbers. Results: We present an enhanced native ChIP-seq method tailored to low cell numbers that represents a 200-fold reduction in input requirements over existing protocols. The protocol was tested over a range of starting cell numbers covering three orders of magnitude, enabling determination of the lower limit of the technique. At low input cell numbers, increased levels of unmapped and duplicate reads reduce the number of unique reads generated, and can drive up sequencing costs and affect sensitivity if ChIP is attempted from too few cells. Conclusions: The optimised method presented here considerably reduces the input requirements for performing native ChIP-seq. It extends the applicability of the technique to isolated primary cells and rare cell populations (e.g. biobank samples, stem cells), and in many cases will alleviate the need for cell culture and any associated alteration of epigenetic marks. However, this study highlights a challenge inherent to ChIP-seq from low cell numbers: as cell input numbers fall, levels of unmapped sequence reads and PCR-generated duplicate reads rise. We discuss a number of solutions to overcome the effects of reducing cell number that may aid further improvements to ChIP performance.
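
    The arithmetic behind two statements in the abstract above can be made explicit. The sketch below checks that going from the upper end of the quoted standard range (20 million cells) to 100,000 cells is a 200-fold reduction, and shows how rising unmapped and duplicate fractions shrink the number of unique mapped reads. The function names and all read counts and fractions are hypothetical values chosen for illustration; they are not figures from the study.

```python
def fold_reduction(standard_cells, low_input_cells):
    """How many times less input the low-cell protocol needs."""
    return standard_cells / low_input_cells


def usable_reads(total_reads, mapped_fraction, duplicate_fraction):
    """Unique mapped reads left after removing unmapped and duplicate reads."""
    mapped = total_reads * mapped_fraction
    return round(mapped * (1.0 - duplicate_fraction))


# 20 million cells (upper end of the quoted range) versus 100,000 cells.
fold = fold_reduction(20_000_000, 100_000)

# Hypothetical sequencing runs of equal depth (made-up fractions):
high_input = usable_reads(20_000_000, mapped_fraction=0.90, duplicate_fraction=0.10)
low_input = usable_reads(20_000_000, mapped_fraction=0.60, duplicate_fraction=0.50)

print(f"fold reduction in input: {fold:.0f}")
print(f"unique reads, high input: {high_input}")
print(f"unique reads, low input:  {low_input}")
```

    Under these assumed fractions, the same sequencing depth yields far fewer unique reads from the low-input sample, which is the cost and sensitivity trade-off the abstract points to.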

  13. Experiment list: SRX688848 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available d prostate cancer cell line || treatment=vehicle || chip antibody=rabbit anti-ASH... prostate cancer cells, vehicle, ASH2 ChIP || cell line=VCaP || cell type=vertebral metastatic lesion-derive...agnosis=Carcinoma 25750434,89.3,6.2,7152 GSM1489926: vcap ash2l veh; Homo sapiens; ChIP-Seq source_name=VCaP

  14. SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts.

    Science.gov (United States)

    Ryan, Michael C; Cleland, James; Kim, RyangGuk; Wong, Wing Chung; Weinstein, John N

    2012-09-15

    SpliceSeq is a resource for RNA-Seq data that provides a clear view of alternative splicing and identifies potential functional changes that result from splice variation. It displays intuitive visualizations and prioritized lists of results that highlight splicing events and their biological consequences. SpliceSeq unambiguously aligns reads to gene splice graphs, facilitating accurate analysis of large, complex transcript variants that cannot be adequately represented in other formats. SpliceSeq is freely available at http://bioinformatics.mdanderson.org/main/SpliceSeq:Overview. The application is a Java program that can be launched via a browser or installed locally. Local installation requires MySQL and Bowtie. Contact: mryan@insilico.us.com. Supplementary data are available at Bioinformatics online.

  15. Optimal use of tandem biotin and V5 tags in ChIP assays

    NARCIS (Netherlands)

    K.E. Kolodziej (Katarzyna); F. Pourfarzad, F. (Farzin); E. de Boer (Ernie); S. Krpic (Sanja); F.G. Grosveld (Frank); J. Strouboulis (John)

    2009-01-01

    Background: Chromatin immunoprecipitation (ChIP) assays coupled to genome arrays (Chip-on-chip) or massive parallel sequencing (ChIP-seq) lead to the genome wide identification of binding sites of chromatin associated proteins. However, the highly variable quality of antibodies and the

  16. Single-Cell mRNA-Seq Using the Fluidigm C1 System and Integrated Fluidics Circuits.

    Science.gov (United States)

    Gong, Haibiao; Do, Devin; Ramakrishnan, Ramesh

    2018-01-01

    Single-cell mRNA-seq is a valuable tool to dissect expression profiles and to understand the regulatory network of genes. Microfluidics is well suited for single-cell analysis owing both to the small volume of the reaction chambers and the ease of automation. Here we describe the workflow of single-cell mRNA-seq using C1 IFC, which can isolate and process up to 96 cells. Both the on-chip procedure (lysis, reverse transcription, and preamplification PCR) and off-chip sequencing library preparation protocols are described. The workflow generates full-length mRNA information, which is more valuable than 3' end counting methods for many applications.

  17. Experiment list: SRX142526 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available source_name=C2C12 || biomaterial_provider=Barbara Wold lab || lab=Caltech-m || lab description=Wold - Califonia Institute of Technolo...gy || datatype=ChipSeq || datatype description=Chromatin

  18. Genomic prediction of starch content and chipping quality in tetraploid potato using genotyping-by-sequencing

    DEFF Research Database (Denmark)

    Sverrisdóttir, Elsa; Byrne, Stephen; Nielsen, Ea Høegh Riis

    2017-01-01

    continue to fall. In this study, we have generated genomic prediction models for starch content and chipping quality in tetraploid potato to facilitate varietal development. Chipping quality was evaluated as the colour of a potato chip after frying following cold induced sweetening. We used genotyping...... genomic estimated breeding values. Cross-validated prediction correlations of 0.56 and 0.73 were obtained within the training population for starch content and chipping quality, respectively, while correlations were lower when predicting performance in the test panel, at 0.30–0.31 and 0...

  19. Experiment list: SRX620297 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available body=Pol II (Santa Cruz Biotechnology, N20, sc-899) http://dbarchive.biosciencedb...IP-Seq source_name=NIH3T3 fibroblasts || culture condition=continuous culture || chemicals=DMSO || chip anti

  20. Experiment list: SRX142520 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Califonia Institute of Technology || datatype=ChipSeq || datatype description=Ch...t 24hr source_name=C2C12 || biomaterial_provider=Barbara Wold lab || lab=Caltech-m || lab description=Wold -

  1. Experiment list: SRX143619 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Califonia Institute of Technology || datatype=ChipSeq || datatype description=Ch...t 24hr source_name=C2C12 || biomaterial_provider=Barbara Wold lab || lab=Caltech-m || lab description=Wold -

  2. Experiment list: SRX142522 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Califonia Institute of Technology || datatype=ChipSeq || datatype description=Ch...t 60hr source_name=C2C12 || biomaterial_provider=Barbara Wold lab || lab=Caltech-m || lab description=Wold -

  3. SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles

    KAUST Repository

    Wong, Kachun; Li, Yue; Peng, Chengbin; Zhang, Zhaolei

    2014-01-01

    Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene

  4. Experiment list: SRX655691 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ; Mus musculus; ChIP-Seq source_name=MEFs cells knockout MED23 || cell type=mouse embryonic fibroblast || ge...notype/variation=MED23 knockout || chip antibody=H2Bub http://dbarchive.bioscienc

  5. Experiment list: SRX191071 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available Mus musculus; ChIP-Seq source_name=Kdm2b knockdown mouse ES cells || chip antibody=Flag || antibody manufact... || cell type=embryonbic stem cells || genotype=Kdm2b knockdown http://dbarchive.

  6. Experiment list: SRX620294 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available body=Pol II (Santa Cruz Biotechnology, N20, sc-899) http://dbarchive.biosciencedb...s; ChIP-Seq source_name=NIH3T3 fibroblasts || culture condition=serum starved || chemicals=DMSO || chip anti

  7. Experiment list: SRX891837 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available denocarcinoma || chip antibody=none (input) http://dbarc...ut, treated, replicate 2; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 || cell line=MDA-MB-231 || cell type=triple negative breast a

  8. Experiment list: SRX185908 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available epton 1; Homo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, thiostrepton, FOXM1 ChIP || c...ell_line=MCF-7 || cell_type=ER-positive breast adenocarcinoma cells || treatment=

  9. Experiment list: SRX185916 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available on 4; Homo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, thiostrepton, FOXM1 ChIP || cell..._line=MCF-7 || cell_type=ER-positive breast adenocarcinoma cells || treatment=thi

  10. Experiment list: SRX745835 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available d Pol II; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || treatment=untreated || sample type=Pleural effu...sion || passages=14-17 || chip antibody=Homemade Anti-Po

  11. Experiment list: SRX143851 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available lance, and learn motor skills. 57796523,71.5,23.1,31516 GSM918759: LICR ChipSeq Cerebellum CTCF adult-8wks s...cerebellar nuclei. Its function is to coordinate voluntary movements, maintain ba

  12. Experiment list: SRX185910 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ton 2; Homo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, thiostrepton, FOXM1 ChIP || cel...l_line=MCF-7 || cell_type=ER-positive breast adenocarcinoma cells || treatment=th

  13. Experiment list: SRX185918 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ton 7; Homo sapiens; ChIP-Seq source_name=MCF-7 breast adenocarcinoma cells, thiostrepton, FOXM1 ChIP || cel...l_line=MCF-7 || cell_type=ER-positive breast adenocarcinoma cells || treatment=th

  14. Experiment list: SRX286394 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 3: AR Cast1 [kidney, castrated+vehicle]; Mus musculus; ChIP-Seq source_name=AR_Cast1 || strain=wild type ICR... || tissue=kidney || age=8-12 weeks old || treatment=castrated+vehicle || chip an

  15. Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Minucci Saverio

    2011-10-01

    Full Text Available Abstract Background: High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput and reproducibility and saving time. Results: Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools available today. Installation on a Mac platform requires very basic computational skills, while configuration and usage are supported by a user-friendly graphical user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing every parameter through a simple configuration text file that can be generated using a dedicated user-friendly web form. In terms of execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is well suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD, and its usage can potentially be extended to any sequencing platform. Conclusions: Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First, it can be installed and run by wet-lab biologists on a Mac machine. Second, it can handle an unlimited number of samples, making it convenient for large analyses. In this context, computational biologists can increase the reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers: This article was reviewed by Gavin Huttley, George

  16. Experiment list: SRX1165098 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available tibody=CREB1 Santa Cruz Biotechnology, sc-240 http://dbarchive.biosciencedbc.jp/k...apiens; ChIP-Seq source_name=HepG2 cells || cell line=HepG2 || cell type=Hepatocellular Carcinoma || chip an

  17. Experiment list: SRX115969 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ; ChIP-Seq source_name=Breast cancer cells || cell lines=MCF-7 || agent=E2 || time=24 hr || chip antibody=ERα, Santa Cruz Biotechnolo...gy, sc-8005 X http://dbarchive.biosciencedbc.jp/kyushu-u

  18. Experiment list: SRX1427025 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available pring produced at one birth by a viviparous animal. 32763382,74.3,19.7,747 GSM1937508: SHP ChIP-seq with vehicle...week || genotype=wildtype || treatment=vehicle || chip antibody=SHP (sc-30169) ht

  19. Experiment list: SRX655689 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 8698: pol2 KO ChIPSeq; Mus musculus; ChIP-Seq source_name=MEFs cells knockout MED23 || cell type=mouse embry...onic fibroblast || genotype/variation=MED23 knockout || chip antibody=Pol II http

  20. Optimizing ChIP-seq peak detectors using visual labels and supervised machine learning.

    Science.gov (United States)

    Hocking, Toby Dylan; Goerner-Potvin, Patricia; Morin, Andreanne; Shao, Xiaojian; Pastinen, Tomi; Bourque, Guillaume

    2017-02-15

    Many peak detection algorithms have been proposed for ChIP-seq data analysis, but it is not obvious which algorithm and what parameters are optimal for any given dataset. In contrast, regions with and without obvious peaks can be easily labeled by visual inspection of aligned read counts in a genome browser. We propose a supervised machine learning approach for ChIP-seq data analysis, using labels that encode qualitative judgments about which genomic regions contain or do not contain peaks. The main idea is to manually label a small subset of the genome, and then learn a model that makes consistent peak predictions on the rest of the genome. We created 7 new histone mark datasets with 12 826 visually determined labels, and analyzed 3 existing transcription factor datasets. We observed that default peak detection parameters yield high false positive rates, which can be reduced by learning parameters using a relatively small training set of labeled data from the same experiment type. We also observed that labels from different people are highly consistent. Overall, these data indicate that our supervised labeling method is useful for quantitatively training and testing peak detection algorithms. Availability: labeled histone mark data at http://cbio.ensmp.fr/~thocking/chip-seq-chunk-db/ ; R package to compute the label error of predicted peaks at https://github.com/tdhock/PeakError. Contact: toby.hocking@mail.mcgill.ca or guil.bourque@mcgill.ca. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
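
The idea of scoring predicted peaks against visually determined labels can be illustrated with a minimal Python sketch. It is not the PeakError package's implementation: the two label types (`"peaks"` and `"noPeaks"`), the half-open interval convention, and the function names `overlaps` and `label_error` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]      # (start, end), assumed half-open
Label = Tuple[int, int, str]    # (start, end, annotation)


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def label_error(peaks: List[Interval], labels: List[Label]) -> Dict[str, int]:
    """Count labels that the predicted peaks contradict.

    Hypothetical simplified scheme: a "noPeaks" label overlapped by any
    predicted peak is a false positive; a "peaks" label overlapped by no
    predicted peak is a false negative.
    """
    false_pos = 0
    false_neg = 0
    for start, end, annotation in labels:
        n_hits = sum(overlaps((start, end), peak) for peak in peaks)
        if annotation == "noPeaks" and n_hits > 0:
            false_pos += 1
        elif annotation == "peaks" and n_hits == 0:
            false_neg += 1
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "errors": false_pos + false_neg,
        "labels": len(labels),
    }


predicted = [(100, 200), (500, 600)]
labelled = [
    (50, 150, "peaks"),      # overlapped by a predicted peak: consistent
    (250, 400, "noPeaks"),   # no predicted peak inside: consistent
    (550, 700, "noPeaks"),   # predicted peak inside: false positive
    (800, 900, "peaks"),     # nothing predicted: false negative
]
result = label_error(predicted, labelled)
```

Under this scheme, a peak caller's parameters could be chosen by minimizing `result["errors"]` over a small labeled training set, which is the kind of supervised tuning the abstract describes.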

  1. Experiment list: SRX190193 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available rce_name=HL-60 || biomaterial_provider=ATCC || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Mouse monoclonal to RNA polymerase II CTD repeat YSPTSPS antibody... (4H8) - ChIP Grade. Antibody Target: POL2 || antibody targetdescription=This gene encod...es the largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes || antibody... vendorname=abcam || antibody vendorid=ab5408 || controlid=SL

  2. Experiment list: SRX100504 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available .1 source_name=U87 || biomaterial_provider=ATCC || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antib...odydescription=Mouse monoclonal to RNA polymerase II CTD repeat YSPTSPS antibody... (4H8) - ChIP Grade. Antibody Target: POL2 || antibody targetdescription=This gene e...ncodes the largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes || antibody... vendorname=abcam || antibody vendorid=ab5408 || controli

  3. Experiment list: SRX100529 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available aterial_provider=WiCell Research Institute || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Mouse monoclonal to RNA polymerase II CTD repeat YSPTSPS antibody... (4H8) - ChIP Grade. Antibody Target: POL2 || antibody targetdescription=This gene encode...s the largest subunit of RNA polymerase II, the polymerase responsible for synthesizing messenger RNA in eukaryotes || antibody... vendorname=abcam || antibody vendorid=ab5408 || controlid=SL9

  4. Sequence- vs. chip-assisted genomic selection: accurate biological information is advised.

    Science.gov (United States)

    Pérez-Enciso, Miguel; Rincón, Juan C; Legarra, Andrés

    2015-05-09

    The development of next-generation sequencing technologies (NGS) has made the use of whole-genome sequence data for routine genetic evaluations possible, which has triggered a considerable interest in animal and plant breeding fields. Here, we investigated whether complete or partial sequence data can improve upon existing SNP (single nucleotide polymorphism) array-based selection strategies by simulation using a mixed coalescence - gene-dropping approach. We simulated 20 or 100 causal mutations (quantitative trait nucleotides, QTN) within 65 predefined 'gene' regions, each 10 kb long, within a genome composed of ten 3-Mb chromosomes. We compared prediction accuracy by cross-validation using a medium-density chip (7.5 k SNPs), a high-density (HD, 17 k) and sequence data (335 k). Genetic evaluation was based on a GBLUP method. The simulations showed: (1) a law of diminishing returns with increasing number of SNPs; (2) a modest effect of SNP ascertainment bias in arrays; (3) a small advantage of using whole-genome sequence data vs. HD arrays i.e. ~4%; (4) a minor effect of NGS errors except when imputation error rates are high (≥20%); and (5) if QTN were known, prediction accuracy approached 1. Since this is obviously unrealistic, we explored milder assumptions. We showed that, if all SNPs within causal genes were included in the prediction model, accuracy could also dramatically increase by ~40%. However, this criterion was highly sensitive to either misspecification (including wrong genes) or to the use of an incomplete gene list; in these cases, accuracy fell rapidly towards that reached when all SNPs from sequence data were blindly included in the model. Our study shows that, unless an accurate prior estimate on the functionality of SNPs can be included in the predictor, there is a law of diminishing returns with increasing SNP density. As a result, use of whole-genome sequence data may not result in a highly increased selection response over high

  5. Experiment list: SRX150451 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Leukemia Chronic Myelogenous 39880346,54.9,7.2,2209 GSM935371: Harvard ChipSeq K562 SIRT6 std source_name...=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype

  6. Experiment list: SRX150472 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Leukemia Chronic Myelogenous 38544300,59.2,11.1,1031 GSM935392: Harvard ChipSeq K562 NELFe std source_nam...e=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatyp

  7. Experiment list: SRX191067 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available us; ChIP-Seq source_name=Kdm2b knockdown mouse ES cells || chip antibody=Ezh2 || antibody manufacturer=Cell ...ine=E14 || cell type=embryonbic stem cells || genotype=Kdm2b knockdown http://dba

  8. Experiment list: SRX1121725 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ChIPSeq; Mus musculus; ChIP-Seq source_name=MEFs cells knockout MED23 || cell type=mouse embryonic fibroblas...t || genotype/variation=MED23 knockout || chip antibody=H3K4me3 http://dbarchive.

  9. Experiment list: SRX248443 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available n PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=40 min || cell type=Pleural effus...ion || passages=14-17 || chip antibody=Homemade Anti-Pol

  10. Experiment list: SRX248446 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available usion || passages=14-17 || chip antibody=Homemade Anti-P...in PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=320 min || cell type=Pleural eff

  11. Experiment list: SRX248445 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available usion || passages=14-17 || chip antibody=Homemade Anti-P...in PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=160 min || cell type=Pleural eff

  12. Experiment list: SRX248442 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available n PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=20 min || cell type=Pleural effus...ion || passages=14-17 || chip antibody=Homemade Anti-Pol

  13. Experiment list: SRX248444 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available n PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=80 min || cell type=Pleural effus...ion || passages=14-17 || chip antibody=Homemade Anti-Pol

  14. Experiment list: SRX248441 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available n PolII rep; Homo sapiens; ChIP-Seq source_name=Breast cancer cells || cell line=MCF7 || time point=10 min || cell type=Pleural effus...ion || passages=14-17 || chip antibody=Homemade Anti-Pol

  15. Experiment list: SRX150623 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Leukemia Chronic Myelogenous 34396876,78.6,11.1,16076 GSM935544: Harvard ChipSeq K562 HMGN3 std source_na...me=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || dataty

  16. Experiment list: SRX150471 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 34337514,69.2,8.6,1665 GSM935391: Harvard ChipSeq K562 ATF3 std source_name=K...562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=C

  17. Experiment list: SRX150423 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sis=Leukemia Chronic Myelogenous 19694334,54.9,12.8,4256 GSM935343: Harvard ChipSeq K562 TFIIIC-110 std sour...ce_name=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || d

  18. Experiment list: SRX150474 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Leukemia Chronic Myelogenous 16833014,69.7,4.8,2339 GSM935394: Harvard ChipSeq K562 GTF2B std source_name...=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype

  19. Experiment list: SRX150452 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 17157530,93.1,18.0,2344 GSM935372: Harvard ChipSeq K562 RPC155 std source_nam...e=K562 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatyp

  20. Experiment list: SRX185912 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available MB-231 foxm1 Thiostrepton 1; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 breast adenocarcinoma cells, thio...strepton, FOXM1 ChIP || cell_line=MDA-MB-231 || cell_type=ER-negative breast adenocarcinoma

  1. Experiment list: SRX185911 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available B-231 foxm1 DMSO 1; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 breast adenocarcinoma cells, control, FOXM...1 ChIP || cell_line=MDA-MB-231 || cell_type=ER-negative breast adenocarcinoma cel

  2. Experiment list: SRX185914 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available B-231 foxm1 Thiostrepton 2; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 breast adenocarcinoma cells, thios...trepton, FOXM1 ChIP || cell_line=MDA-MB-231 || cell_type=ER-negative breast adenocarcinoma

  3. Experiment list: SRX172567 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available P-Seq source_name=Embryonic Stem Cell || background mouse strain=129SvJae/C57BL/6 || chip antibody=streptavidin beads... || antibody vendor/catalog=Invitrogen 656-01 Dynabeads® MyOne? Streptavidin T1 || genotype/variati

  4. Experiment list: SRX191073 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ulus; ChIP-Seq source_name=Kdm2b knockdown mouse ES cells || chip antibody=Ring1b || antibody manufacturer=C...ll line=E14 || cell type=embryonbic stem cells || genotype=Kdm2b knockdown http:/

  5. Experiment list: SRX998277 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ial decompensation with sepsis || postmortem delay=4.2 hrs || experiment type=ChIP-Seq || chip antibody=H3K2...n female occipital pole tissue || tissue=Occipital pole || gender=female || age=68 || Cause of death=Myocard

  6. Experiment list: SRX150674 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 18469470,89.4,7.1,725 GSM935595: Harvard ChipSeq K562 BRF1 std source_name=K5...62 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=Ch

  7. Experiment list: SRX150569 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 51487836,63.2,7.7,861 GSM935490: Harvard ChipSeq K562 BRF2 std source_name=K5...62 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=Ch

  8. Experiment list: SRX891827 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available d, replicate 2; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 || cell line=MDA-MB-231 || cell type=triple negative breast adenocarcin...oma || chip antibody=H3K9ac, Millipore #07-352, Lot 2388

  9. Experiment list: SRX185913 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available MB-231 foxm1 DMSO 2; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 breast adenocarcinoma cells, control, FOX...M1 ChIP || cell_line=MDA-MB-231 || cell_type=ER-negative breast adenocarcinoma ce

  10. Experiment list: SRX891826 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ed, replicate 1; Homo sapiens; ChIP-Seq source_name=MDA-MB-231 || cell line=MDA-MB-231 || cell type=triple negative breast adenocarci...noma || chip antibody=H3K9ac, Millipore #07-352, Lot 238

  11. Experiment list: SRX810435 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available idity in Shneider S2 Drosophila medium(Gibco) supplement...ster; ChIP-Seq source_name=Cell culture || cell line=S2 || chip antibody=none || growth protocol=S2 cells were grown in 24°C, 95% hum

  12. Experiment list: SRX810433 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available er; ChIP-Seq source_name=Cell culture || cell line=S2 || chip antibody=none || growth protocol=S2 cells were grown in 24°C, 95% humid...ity in Shneider S2 Drosophila medium(Gibco) supplemented

  13. Experiment list: SRX810434 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available idity in Shneider S2 Drosophila medium(Gibco) supplement...ster; ChIP-Seq source_name=Cell culture || cell line=S2 || chip antibody=none || growth protocol=S2 cells were grown in 24°C, 95% hum

  14. Experiment list: SRX172568 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available us; ChIP-Seq source_name=Embryonic Stem Cell || background mouse strain=129SvJae/C57BL/6 || chip antibody=streptavidin beads... || antibody vendor/catalog=Invitrogen 656-01 Dynabeads® MyOne? Streptavidin T1 || genotype/

  15. Experiment list: SRX643466 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available itory cortex) || chip antibody=input || tissue=Female Brain: medial superior temporal gyrus (secondary audit...,725 GSM1423167: Area 13 Brain1 input; Homo sapiens; ChIP-Seq source_name=Female Brain: medial superior temporal gyrus (secondary aud

  16. Experiment list: SRX1427026 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available of offspring produced at one birth by a viviparous animal. 22029930,97.6,9.8,279 GSM1937509: Input for SHP ChIP-seq with vehicle...ver || age=8-12 week || genotype=wildtype || treatment=vehicle || chip antibody=n

  17. Experiment list: SRX190244 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 1610.1 source_name=PANC-1 || biomaterial_provider=ATCC || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibod...y antibodydescription=Mouse monoclonal to RNA polymerase... II CTD repeat YSPTSPS antibody (4H8) - ChIP Grade. Antibody Target: POL2 || antibody targetdescription=This...r RNA in eukaryotes || antibody vendorname=abcam || antibody vendorid=ab5408 || c...ontrolid=SL2340 || labexpid=SL2343,SL5609 || softwareversion=MACS || cell sex=M || antibody=Pol2-4H8 || antibody antibody

  18. Experiment list: SRX150720 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sue Diagnosis=Fibrocystic Disease 71490650,87.2,15.5,1356 GSM935641: Harvard ChipSeq MCF10A-Er-Src Input std... source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harvard

  19. Experiment list: SRX150587 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Adenocarcinoma 32859626,93.1,16.6,6448 GSM935508: Harvard ChipSeq HeLa-S3 NF-YA IgG-rab source_name=HeLa-...S3 || biomaterial_provider=ATCC || lab=Harvard || lab description=Struhl - Harvard University || datatype=Ch

  20. Experiment list: SRX100563 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 39078535,86.0,20.8,1302 GSM803518: HudsonAlpha ChipSeq K562 BCL3 PCR1x source..._name=K562 || biomaterial_provider=ATCC || lab=HudsonAlpha || lab description=Myers - Hudson Alpha Institute

  1. Experiment list: SRX100430 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available s=Leukemia Chronic Myelogenous 47818475,79.5,9.7,26072 GSM803385: HudsonAlpha ChipSeq K562 HEY1 PCR1x source..._name=K562 || biomaterial_provider=ATCC || lab=HudsonAlpha || lab description=Myers - Hudson Alpha Institute

  2. Experiment list: SRX286381 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 7,533 GSM1146460: AR Cast1b [prostate, castrated+vehicle]; Mus musculus; ChIP-Seq source_name=AR_Cast1b || s...train=wild type ICR || tissue=prostate || age=8-12 weeks old || treatment=castrated+vehicle || chip antibody

  3. Experiment list: SRX286380 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 2,497 GSM1146459: AR Cast1a [prostate, castrated+vehicle]; Mus musculus; ChIP-Seq source_name=AR_Cast1a || s...train=wild type ICR || tissue=prostate || age=8-12 weeks old || treatment=castrated+vehicle || chip antibody

  4. Experiment list: SRX286386 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 60.2,17474 GSM1146465: FoxA1 Cast [prostate, castrated+vehicle]; Mus musculus; ChIP-Seq source_name=FoxA1_Ca...st || strain=wild type ICR || tissue=prostate || age=8-12 weeks old || treatment=castrated+vehicle || chip a

  5. Strawberry: Fast and accurate genome-guided transcript reconstruction and quantification from RNA-Seq.

    Science.gov (United States)

    Liu, Ruolin; Dickerson, Julie

    2017-11-01

    We propose a novel method and software tool, Strawberry, for transcript reconstruction and quantification from RNA-Seq data under the guidance of genome alignment and independent of gene annotation. Strawberry consists of two modules: assembly and quantification. The novelty of Strawberry is that the two modules use different optimization frameworks but utilize the same data graph structure, which allows a highly efficient, expandable and accurate algorithm for dealing with large data. The assembly module parses aligned reads into splicing graphs, and uses network flow algorithms to select the most likely transcripts. The quantification module uses a latent class model to assign read counts from the nodes of splicing graphs to transcripts. Strawberry simultaneously estimates the transcript abundances and corrects for sequencing bias through an EM algorithm. Based on simulations, Strawberry outperforms Cufflinks and StringTie in terms of both assembly and quantification accuracies. Under the evaluation of a real data set, the estimated transcript expression by Strawberry has the highest correlation with Nanostring probe counts, an independent experimental measure of transcript expression. Strawberry is written in C++14, and is available as open source software at https://github.com/ruolin/strawberry under the MIT license.

  6. Mental models accurately predict emotion transitions.

    Science.gov (United States)

    Thornton, Mark A; Tamir, Diana I

    2017-06-06

    Successful social interactions depend on people's ability to predict others' future actions and emotions. People possess many mechanisms for perceiving others' current emotional states, but how might they use this information to predict others' future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others' emotional dynamics. People could then use these mental models of emotion transitions to predict others' future emotions from currently observable emotions. To test this hypothesis, studies 1-3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants' ratings of emotion transitions predicted others' experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation (valence, social impact, rationality, and human mind) inform participants' mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants' accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone.

  7. Mental models accurately predict emotion transitions

    Science.gov (United States)

    Thornton, Mark A.; Tamir, Diana I.

    2017-01-01

    Successful social interactions depend on people’s ability to predict others’ future actions and emotions. People possess many mechanisms for perceiving others’ current emotional states, but how might they use this information to predict others’ future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others’ emotional dynamics. People could then use these mental models of emotion transitions to predict others’ future emotions from currently observable emotions. To test this hypothesis, studies 1–3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants’ ratings of emotion transitions predicted others’ experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation—valence, social impact, rationality, and human mind—inform participants’ mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants’ accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone. PMID:28533373

  8. Experiment list: SRX471838 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available : ChIPseq GS WT Bmi1; Mus musculus; ChIP-Seq source_name=Cultured germline stem cells || genotype/variation=...Wild-type || strain=CD1 x C57BL/6 || cell type=Cultured germline stem cells || chip antibody=Mouse anti-Bmi1

  9. Experiment list: SRX471836 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 0: ChIPseq GS WT Scml2; Mus musculus; ChIP-Seq source_name=Cultured germline stem cells || genotype/variatio...n=Wild-type || strain=CD1 x C57BL/6 || cell type=Cultured germline stem cells || chip antibody=Rabbit anti-S

  10. Experiment list: SRX100493 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available is=Carcinoma Hepatocellular 67635839,78.1,18.6,34708 GSM803448: HudsonAlpha ChipSeq HepG2 HEY1 v041610.1 sou...rce_name=HepG2 || biomaterial_provider=ATCC || lab=HudsonAlpha || lab description=Myers - Hudson Alpha Insti

  11. Experiment list: SRX1084162 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 0 GSM1811344: P2 HA ChIPSeq; Drosophila melanogaster; ChIP-Seq source_name=Female whole animal_paraquat || tissue=whole animal || gen...der=female || age=1-3 days || genotype=k6801/k6801;gHA-KDM5 || chip antibody=HA htt

  12. Experiment list: SRX212457 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available apiens; ChIP-Seq source_name=CD4+CD25+CD45RA- expanded memory regulatory T cells || donor=C || cell type=CD4...+CD25+CD45RA- expanded memory regulatory T cells || chip antibody=H3K27ac (abcam

  13. Experiment list: SRX212463 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sapiens; ChIP-Seq source_name=CD4+CD25-CD45RA- expanded memory conventional T cells || donor=C || cell type...=CD4+CD25-CD45RA- expanded memory conventional T cells || chip antibody=H3K4me1 (

  14. Experiment list: SRX203399 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available r lymph. (Rosen et al., Dictionary of Immunology, 1989, p169 & Abbas et al., Cellular and Molecular Immuno...logy, 2d ed, p20) 30508063,94.1,4.3,15496 GSM1033764: MM.1S RNAPII JQ1 JL ChipSeq;

  15. Experiment list: SRX1084161 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available tissue=whole animal || gender=female || age=1-3 days || genotype=k6801/k6801;gHA-KDM5 || chip antibody=HA ht...,0 GSM1811343: P1 HA ChIPSeq; Drosophila melanogaster; ChIP-Seq source_name=Female whole animal_paraquat ||

  16. Experiment list: SRX837357 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ource_name=ChIP-Seq with MyoD antibody 6975 in mouse P19 cells transduced with MD(ND2bHLH) chimera || cell l...ine=P19 || transduction=MD(ND2bHLH) chimera || chip antibody=MyoD antibody 6975 h

  17. Experiment list: SRX837356 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ource_name=ChIP-Seq with MyoD antibody 6196 in mouse P19 cells transduced with MD(ND2bHLH) chimera || cell l...ine=P19 || transduction=MD(ND2bHLH) chimera || chip antibody=MyoD antibody 6196 h

  18. Experiment list: SRX192254 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 84324878,97.5,12.5,559 GSM1016423: A INPUT vehicle... donor1024; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  19. Experiment list: SRX192258 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 20609955,77.1,6.7,117 GSM1016427: B INPUT vehicle ...donor1; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  20. Experiment list: SRX192259 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 15402183,74.5,5.4,117 GSM1016428: B INPUT vehicle ...donor7; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  1. Experiment list: SRX192267 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 28874920,80.3,4.9,183 GSM1016436: C INPUT vehicle ...donor10; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  2. Experiment list: SRX192266 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 25871870,71.5,4.2,143 GSM1016435: C INPUT vehicle ...donor5; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  3. Experiment list: SRX192253 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available f offspring produced at one birth by a viviparous animal. 106959657,97.4,12.9,578 GSM1016422: A INPUT vehicle... donor1021; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=INPUT || treament=vehicle

  4. Highly expressed loci are vulnerable to misleading ChIP localization of multiple unrelated proteins

    NARCIS (Netherlands)

    Teytelman, L.; Thurtle, D.M.; Rine, J.; van Oudenaarden, A.

    2013-01-01

    Chromatin immunoprecipitation (ChIP) is the gold-standard technique for localizing nuclear proteins in the genome. We used ChIP, in combination with deep sequencing (Seq), to study the genome-wide distribution of the Silent information regulator (Sir) complex in Saccharomyces cerevisiae. We analyzed

  5. RNA-Seq-Based Transcript Structure Analysis with TrBorderExt.

    Science.gov (United States)

    Wang, Yejun; Sun, Ming-An; White, Aaron P

    2018-01-01

    RNA-Seq has become a routine strategy for genome-wide gene expression comparisons in bacteria. Despite lower resolution in transcript border parsing compared with dRNA-Seq, TSS-EMOTE, Cappable-seq, Term-seq, and others, directional RNA-Seq still illustrates its advantages: low cost, quantification and transcript border analysis with a medium resolution (±10-20 nt). To facilitate mining of directional RNA-Seq datasets especially with respect to transcript structure analysis, we developed a tool, TrBorderExt, which can parse transcript start sites and termination sites accurately in bacteria. A detailed protocol is described in this chapter for how to use the software package step by step to identify bacterial transcript borders from raw RNA-Seq data. The package was developed with Perl and R programming languages, and is accessible freely through the website: http://www.szu-bioinf.org/TrBorderExt .

  6. GC-Content Normalization for RNA-Seq Data

    Science.gov (United States)

    2011-01-01

    Background: Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results: We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions: Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes. PMID:22177264
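
    The abstract does not spell out its three within-lane approaches, so the following is only a simplified stand-in for the general idea: group genes from one lane into GC-content bins and shift each bin's log counts so that every bin has the same median. It does not reproduce the EDASeq package; the function name `gc_bin_normalize`, the bin count and the toy data are all assumptions made for illustration.

```python
import math
from statistics import median


def gc_bin_normalize(counts, gc, n_bins=4):
    """Remove a GC-dependent shift from one lane's gene-level counts.

    counts: read counts per gene; gc: GC fraction per gene.
    Returns log-scale values whose per-bin medians all equal the
    lane-wide median.
    """
    if not counts:
        return []
    log_counts = [math.log(c + 1.0) for c in counts]
    global_med = median(log_counts)
    # Order genes by GC content and cut them into equal-sized bins.
    order = sorted(range(len(counts)), key=lambda i: gc[i])
    size = math.ceil(len(order) / n_bins)
    normalized = [0.0] * len(counts)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        bin_med = median(log_counts[i] for i in idx)
        for i in idx:
            # Re-center this bin on the lane-wide median.
            normalized[i] = log_counts[i] - bin_med + global_med
    return normalized


# Toy lane in which high-GC genes have roughly tenfold higher counts.
gc = [0.30, 0.32, 0.34, 0.36, 0.60, 0.62, 0.64, 0.66]
counts = [10, 12, 14, 16, 100, 120, 140, 160]
normalized = gc_bin_normalize(counts, gc, n_bins=2)
```

    After this step the low-GC and high-GC bins share one median, while differences between genes inside a bin are preserved. A between-lane normalization would still be needed afterwards, as the abstract notes.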

  7. An improved ChIP-seq peak detection system for simultaneously identifying post-translational modified transcription factors by combinatorial fusion, using SUMOylation as an example.

    Science.gov (United States)

    Cheng, Chia-Yang; Chu, Chia-Han; Hsu, Hung-Wei; Hsu, Fang-Rong; Tang, Chung Yi; Wang, Wen-Ching; Kung, Hsing-Jien; Chang, Pei-Ching

    2014-01-01

    Post-translational modification (PTM) of transcriptional factors and chromatin remodelling proteins is recognized as a major mechanism by which transcriptional regulation occurs. Chromatin immunoprecipitation (ChIP) in combination with high-throughput sequencing (ChIP-seq) is being applied as a gold standard when studying the genome-wide binding sites of transcription factors (TFs). This has greatly improved our understanding of protein-DNA interactions on a genome-wide scale. However, current ChIP-seq peak calling tools are not sufficiently sensitive and are unable to simultaneously identify post-translational modified TFs based on ChIP-seq analysis; this is largely due to the widespread presence of multiple modified TFs. Using SUMO-1 modification as an example, we describe here an improved approach that allows the simultaneous identification of the particular genomic binding regions of all TFs with SUMO-1 modification. Traditional peak calling methods are inadequate when identifying multiple TF binding sites that involve long genomic regions, and therefore we designed a ChIP-seq processing pipeline for the detection of peaks via a combinatorial fusion method. Then, we annotate the peaks with known transcription factor binding sites (TFBS) using the Transfac Matrix Database (v7.0), which predicts potential SUMOylated TFs. Next, the peak calling result was further analyzed based on promoter proximity, TFBS annotation and a literature review, and was validated by ChIP-real-time quantitative PCR (qPCR) and ChIP-reChIP real-time qPCR. The results show clearly that SUMOylated TFs can be pinpointed using our pipeline. A methodology is presented that analyzes SUMO-1 ChIP-seq patterns and predicts related TFs. Our analysis uses three peak calling tools. The fusion of these different tools increases the precision of the peak calling results. The TFBS annotation method is able to predict potential SUMOylated TFs. Here, we offer a new approach that enhances ChIP-seq

  8. Experiment list: SRX212459 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available o sapiens; ChIP-Seq source_name=CD4+CD25-CD45RA- expanded memory conventional T cells || donor=C || cell typ...e=CD4+CD25-CD45RA- expanded memory conventional T cells || chip antibody=H3K27ac

  9. Experiment list: SRX212458 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sapiens; ChIP-Seq source_name=CD4+CD25+CD45RA- expanded memory regulatory T cells || donor=D || cell type=C...D4+CD25+CD45RA- expanded memory regulatory T cells || chip antibody=H3K27ac (abca

  10. Experiment list: SRX212464 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available o sapiens; ChIP-Seq source_name=CD4+CD25-CD45RA- expanded memory conventional T cells || donor=D || cell typ...e=CD4+CD25-CD45RA- expanded memory conventional T cells || chip antibody=H3K4me1

  11. Experiment list: SRX212460 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available o sapiens; ChIP-Seq source_name=CD4+CD25-CD45RA- expanded memory conventional T cells || donor=D || cell typ...e=CD4+CD25-CD45RA- expanded memory conventional T cells || chip antibody=H3K27ac

  12. Experiment list: SRX212462 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sapiens; ChIP-Seq source_name=CD4+CD25+CD45RA- expanded memory regulatory T cells || donor=D || cell type=C...D4+CD25+CD45RA- expanded memory regulatory T cells || chip antibody=H3K4me1 (abca

  13. Experiment list: SRX286391 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available 7,47.9,41.3,666 GSM1146470: rIgG1a [prostate, castrated+vehicle]; Mus musculus; ChIP-Seq source_name=rIgG1a ...|| strain=wild type ICR || tissue=prostate || age=8-12 weeks old || treatment=castrated+vehicle || chip anti

  14. Experiment list: SRX590292 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -rep1; Mus musculus; ChIP-Seq source_name=R-Ctrl-Flag ChIP || strain background=C57BL/6 || genotype/variation=Foxd3 conditional knock...out || cell type=embryonic stem cells (ESCs; R cells) || cell line of origin=Foxd3 conditional knock

  15. Experiment list: SRX542620 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sculus; ChIP-Seq source_name=Embryonic Stem Cells || time point=NA || treatment=no treatment || strain=129 X C57bl/6 || cell type=Par...ental mouse ES(KH2) cells || chip antibody=MacroH2A2 antibody, Abcam ab4173 http://

  16. Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks

    Directory of Open Access Journals (Sweden)

    Courdy Samir J

    2008-12-01

    Full Text Available Background: High-throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin immunoprecipitation is used to enrich for particular DNA sequences of interest and signature sequencing is used to map the regions to the genome (ChIP-Seq). Elucidation of these sites of DNA-protein binding/modification is proving instrumental in reconstructing networks of gene regulation and chromatin remodelling that direct development, response to cellular perturbation, and neoplastic transformation. Results: Here we present a package of algorithms and software that makes use of control input data to reduce false positives and estimate confidence in ChIP-Seq peaks. Several different methods were compared using two simulated spike-in datasets. Use of control input data and a normalized difference score were found to more than double the recovery of ChIP-Seq peaks at a 5% false discovery rate (FDR). Moreover, both a binomial p-value/q-value and an empirical FDR were found to predict the true FDR within 2–3 fold and are more reliable estimators of confidence than a global Poisson p-value. These methods were then used to reanalyze Johnson et al.'s neuron-restrictive silencer factor (NRSF) ChIP-Seq data without relying on extensive qPCR validated NRSF sites and the presence of NRSF binding motifs for setting thresholds. Conclusion: The methods developed and tested here show considerable promise for reducing false positives and estimating confidence in ChIP-Seq data without any prior knowledge of the ChIP target. They are part of a larger open source package freely available from http://useq.sourceforge.net/.
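
    The binomial p-value/q-value idea named in this abstract can be sketched as follows. This is a textbook construction, not the USeq package's code: each window's ChIP read count is tested against the control count under the assumption that, without enrichment, a read falls in the ChIP sample with probability `p0`, and the resulting p-values are converted to q-values with the Benjamini-Hochberg procedure. The function names, the window counts and `p0 = 0.5` (equal sequencing depth) are assumptions for illustration.

```python
from math import comb


def binomial_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvalues):
    """Convert p-values to Benjamini-Hochberg q-values (input order kept)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Hypothetical (ChIP reads, control reads) counts for three windows.
windows = [(30, 5), (12, 10), (8, 9)]
p0 = 0.5  # assumed null share of reads from the ChIP sample
pvals = [binomial_sf(chip, chip + ctrl, p0) for chip, ctrl in windows]
qvals = benjamini_hochberg(pvals)
```

    Only the first window, with 30 ChIP reads against 5 control reads, would pass a 5% FDR threshold here; the other two are close to what the null model predicts.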

  17. Impact of artefact removal on ChIP quality metrics in ChIP-seq and ChIP-exo data.

    Directory of Open Access Journals (Sweden)

    Thomas Samuel Carroll

    2014-04-01

    Full Text Available With the advent of ChIP-seq multiplexing technologies and the subsequent increase in ChIP-seq throughput, the development of working standards for the quality assessment of ChIP-seq studies has received significant attention. The ENCODE consortium’s large scale analysis of transcription factor binding and epigenetic marks as well as concordant work on ChIP-seq by other laboratories has established a new generation of ChIP-seq quality control measures. The use of these metrics alongside common processing steps has however not been evaluated. In this study, we investigate the effects of blacklisting and removal of duplicated reads on established metrics of ChIP-seq quality and show that the interpretation of these metrics is highly dependent on the ChIP-seq preprocessing steps applied. Further to this we perform the first investigation of the use of these metrics for ChIP-exo data and make recommendations for the adaptation of the NSC statistic to allow for the assessment of ChIP-exo efficiency.

  18. Experiment list: SRX150670 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available gnosis=Fibrocystic Disease 53773968,83.9,44.3,13035 GSM935591: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pct ST...AT3 std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harvard

  19. Experiment list: SRX150630 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available gnosis=Fibrocystic Disease 42446970,82.6,25.1,47688 GSM935551: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pct 12...hr STAT3 std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harvard

  20. Experiment list: SRX100511 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ,37.1,107266 GSM803466: HudsonAlpha ChipSeq H1-hESC Rad21 v041610.2 source_name=H1-hESC || biomaterial_provi...SRX100511 hg19 TFs and others RAD21 Pluripotent stem cell hESC H1 NA 109287919,77.2

  1. Experiment list: SRX153145 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ast adenocarcinoma cell-line MCF7 || cell-line=MCF7 || passage=5 || chip antibody=H...n=Pleura|Tissue Diagnosis=Adenocarcinoma 93237597,98.2,8.2,1358 GSM946849: MCF7 H3K4me1; Homo sapiens; ChIP-Seq source_name=Human bre

  2. Experiment list: SRX212461 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available iens; ChIP-Seq source_name=CD4+CD25+CD45RA- expanded memory regulatory T cells || donor=C || cell type=CD4+CD25+CD45RA- expanded memo...ry regulatory T cells || chip antibody=H3K4me1 (abcam ab

  3. Experiment list: SRX713898 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -D567 || agent=Ethanol (vehicle control) || chip antibody=AR -N20 (Santa Cruz, SC-816 lot B1012) || biologic...5527: R1D567 Eth ARv567es rep2; Homo sapiens; ChIP-Seq source_name=prostate cancer cell line || cell type=R1

  4. Experiment list: SRX713899 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available D567 || agent=Ethanol (vehicle control) || chip antibody=AR -N20 (Santa Cruz, SC-816 lot B1012) || biologica...528: R1D567 Eth ARv567es rep3; Homo sapiens; ChIP-Seq source_name=prostate cancer cell line || cell type=R1-

  5. Genotype-driven identification of a molecular network predictive of advanced coronary calcium in ClinSeq® and Framingham Heart Study cohorts.

    Science.gov (United States)

    Oguz, Cihan; Sen, Shurjo K; Davis, Adam R; Fu, Yi-Ping; O'Donnell, Christopher J; Gibbons, Gary H

    2017-10-26

    One goal of personalized medicine is leveraging the emerging tools of data science to guide medical decision-making. Achieving this using disparate data sources is most daunting for polygenic traits. To this end, we employed random forests (RFs) and neural networks (NNs) for predictive modeling of coronary artery calcium (CAC), which is an intermediate endo-phenotype of coronary artery disease (CAD). Model inputs were derived from advanced cases in the ClinSeq® discovery cohort (n=16) and the FHS replication cohort (n=36) from the 89th-99th CAC score percentile range, and age-matched controls (ClinSeq® n=16, FHS n=36) with no detectable CAC (all subjects were Caucasian males). These inputs included clinical variables and genotypes of 56 single nucleotide polymorphisms (SNPs) ranked highest in terms of their nominal correlation with the advanced CAC state in the discovery cohort. Predictive performance was assessed by computing the areas under receiver operating characteristic curves (ROC-AUC). RF models trained and tested with clinical variables generated ROC-AUC values of 0.69 and 0.61 in the discovery and replication cohorts, respectively. In contrast, in both cohorts, the set of SNPs derived from the discovery cohort were highly predictive (ROC-AUC ≥0.85) with no significant change in predictive performance upon integration of clinical and genotype variables. Using the 21 SNPs that produced optimal predictive performance in both cohorts, we developed NN models trained with ClinSeq® data and tested with FHS data and obtained high predictive accuracy (ROC-AUC=0.80-0.85) with several topologies. Several CAD and "vascular aging" related biological processes were enriched in the network of genes constructed from the predictive SNPs. We identified a molecular network predictive of advanced coronary calcium using genotype data from ClinSeq® and FHS cohorts. Our results illustrate that machine learning tools, which utilize complex interactions between disease
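
    The ROC-AUC metric used throughout this abstract has a direct pairwise reading: it is the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The sketch below computes it that way; the function name `roc_auc` and the example labels and scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for cases, 0 for controls; scores: model outputs.
    Assumes at least one case and one control are present.
    """
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0   # case ranked above control
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Two cases and two controls; one case scores below one control.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

    Three of the four case-control pairs are ordered correctly in this example, so the value is 0.75. A value of 0.5 corresponds to chance-level ranking.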

  6. Prediction of 3D chip formation in the facing cutting with lathe machine using FEM

    Science.gov (United States)

    Prasetyo, Yudhi; Tauviqirrahman, Mohamad; Rusnaldy

    2016-04-01

    This paper presents a prediction of chip formation during machining on a lathe, focusing specifically on facing cutting (face turning). The main purpose is to propose a new approach for predicting chip formation under different cutting directions, i.e., the backward and forward directions. In addition, the interaction between stress analysis and chip formation in the cutting process was also investigated. The simulations were conducted using the three-dimensional (3D) finite element method in ABAQUS software, with aluminum and high speed steel (HSS) as the workpiece and tool materials, respectively. The simulation results showed that the chip produced with the backward direction forms better than the chip produced with the conventional (forward) direction.

  7. Experiment list: SRX150497 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sue Diagnosis=Fibrocystic Disease 30305817,89.0,5.1,579 GSM935418: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pc...t 4hr Input std source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harvar

  8. Experiment list: SRX100495 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX100495 hg19 TFs and others TAF1 Pluripotent stem cell hESC H1 NA 40640767,80.2,12.3,29488 GSM803450: Huds...onAlpha ChipSeq H1-hESC TAF1 v041610.2 source_name=H1-hESC || biomaterial_provider=

  9. Experiment list: SRX100469 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX100469 hg19 TFs and others GABPA Pluripotent stem cell hESC H1 NA 59698055,78.1,10.0,16884 GSM803424: Hud...sonAlpha ChipSeq H1-hESC GABP PCR1x source_name=H1-hESC || biomaterial_provider=WiC

  10. Experiment list: SRX100587 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX100587 hg19 TFs and others EP300 Pluripotent stem cell hESC H1 NA 46380456,87.7,10.1,18476 GSM803542: Hud...sonAlpha ChipSeq H1-hESC p300 v041610.2 source_name=H1-hESC || biomaterial_provider

  11. A new, accurate predictive model for incident hypertension

    DEFF Research Database (Denmark)

    Völzke, Henry; Fung, Glenn; Ittermann, Till

    2013-01-01

    Data mining represents an alternative approach to identify new predictors of multifactorial diseases. This work aimed at building an accurate predictive model for incident hypertension using data mining procedures.

  12. On the Scalability of Time-predictable Chip-Multiprocessing

    DEFF Research Database (Denmark)

    Puffitsch, Wolfgang; Schoeberl, Martin

    2012-01-01

    Real-time systems need a time-predictable execution platform to be able to determine the worst-case execution time statically. In order to be time-predictable, several advanced processor features, such as out-of-order execution and other forms of speculation, have to be avoided. However, just using simple processors is not an option for embedded systems with high demands on computing power. In order to provide high performance and predictability we argue to use multiprocessor systems with a time-predictable memory interface. In this paper we present the scalability of a Java chip-multiprocessor system that is designed to be time-predictable. Adding time-predictable caches is mandatory to achieve scalability with a shared memory multi-processor system. As Java bytecode retains information about the nature of memory accesses, it is possible to implement a memory hierarchy that takes...

  13. Experiment list: SRX100473 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX100473 hg19 TFs and others SIN3A Pluripotent stem cell hESC H1 NA 48520029,77.7,11.0,14690 GSM803428: Hud...sonAlpha ChipSeq H1-hESC Sin3Ak-20 PCR1x source_name=H1-hESC || biomaterial_provide

  14. Experiment list: SRX524970 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX524970 hg19 Input control Input control Uterus Endometrial stromal cells NA 3525...5109,88.2,37.7,339 GSM1372862: Input case1 control; Homo sapiens; ChIP-Seq source_name=Human endometrial stromal cells || tissue=Endo...metrial stromal cells || condition=control (without any induction) || chip antibody

  15. Experiment list: SRX524980 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX524980 hg19 Input control Input control Uterus Endometrial stromal cells NA 2708...7271,98.6,31.8,308 GSM1372876: Input case2 control; Homo sapiens; ChIP-Seq source_name=Human endometrial stromal cells || tissue=Endo...metrial stromal cells || condition=control (without any induction) || chip antibody

  16. Experiment list: SRX192270 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 29213160,79.3,3.6,131 GSM1016439: C H3K27me3 vehicle donor5; ...Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K27me3 || treament=vehicle

  17. Experiment list: SRX192271 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 29600718,81.2,4.5,188 GSM1016440: C H3K27me3 vehicle donor10;... Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K27me3 || treament=vehicle

  18. Analysis of ChIP-seq Data in R/Bioconductor.

    Science.gov (United States)

    de Santiago, Ines; Carroll, Thomas

    2018-01-01

    The development of novel high-throughput sequencing methods for ChIP (chromatin immunoprecipitation) has provided a very powerful tool to study gene regulation in multiple conditions at unprecedented resolution and scale. Proactive quality-control and appropriate data analysis techniques are of critical importance to extract the most meaningful results from the data. Over the last years, an array of R/Bioconductor tools has been developed allowing researchers to process and analyze ChIP-seq data. This chapter provides an overview of the methods available to analyze ChIP-seq data based primarily on software packages from the open-source Bioconductor project. Protocols described in this chapter cover basic steps including data alignment, peak calling, quality control and data visualization, as well as more complex methods such as the identification of differentially bound regions and functional analyses to annotate regulatory regions. The steps in the data analysis process were demonstrated on publicly available data sets and will serve as a demonstration of the computational procedures routinely used for the analysis of ChIP-seq data in R/Bioconductor, from which readers can construct their own analysis pipelines.

  19. Experiment list: SRX192263 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 19707029,78.6,5.0,577 GSM1016432: B H3K4me2 vehicle donor7; Mu...s musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K4me2 || treament=vehicle

  20. Experiment list: SRX192262 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 13178340,75.6,7.8,715 GSM1016431: B H3K4me2 vehicle donor1; Mu...s musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K4me2 || treament=vehicle

  1. Experiment list: SRX192250 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 98336587,96.8,24.1,627 GSM1016419: A H3K36me3 vehicle donor10...24; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K36me3 || treament=vehicle

  2. Experiment list: SRX192257 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available produced at one birth by a viviparous animal. 73877067,96.7,23.5,560 GSM1016426: A H3K36me3 vehicle donor10...21; Mus musculus; ChIP-Seq source_name=whole liver extract || age=29-32 days || chip antibody=H3K36me3 || treament=vehicle

  3. Experiment list: SRX190205 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available labexpid=SL7110,SL8075 || softwareversion=MACS || cell sex=M || antibody=NRSF ||...ChIP protocol & AMpure XP size selection for ChIP-seq (Myers) || controlid=SL6021 || labexpid=SL7110,SL8075 || replicate=1,2 || softw...areversion=MACS http://dbarchive.biosciencedbc.jp/kyushu-u/hg19/eachData/bw/SRX1902

  4. Experiment list: SRX1056357 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available || cell type=ES cells || treated with=2.5 µM tamoxifen (Tam) || chip antibody=none http://dbarchive.bioscien...tiate into specialized cells. 35337197,98.9,19.2,224 GSM1708671: input DNA +Tam R...2; Mus musculus; ChIP-Seq source_name=input DNA_+Tam || strain=J1 || genotype/variation=inducible SetDB1 KO

  5. Experiment list: SRX026424 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX026424 mm9 RNA polymerase RNA Polymerase II Neural Cerebellum MeSH Description=T..., maintain balance, and learn motor skills. 15567107,93.1,10.8,650 GSM587797: P5 Cerebellum RNAP-II ChIP-Seq... source_name=PostNatal-Day5_Cerebellum || strain=CD1 || age=postnatal day 5 || tissue=cerebellum || chip-ant

  6. Multiplexed ChIP-Seq Using Direct Nucleosome Barcoding: A Tool for High-Throughput Chromatin Analysis.

    Science.gov (United States)

    Chabbert, Christophe D; Adjalley, Sophie H; Steinmetz, Lars M; Pelechano, Vicent

    2018-01-01

    Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) or microarray hybridization (ChIP-on-chip) are standard methods for the study of transcription factor binding sites and histone chemical modifications. However, these approaches only allow profiling of a single factor or protein modification at a time. In this chapter, we present Bar-ChIP, a higher throughput version of ChIP-Seq that relies on the direct ligation of molecular barcodes to chromatin fragments. Bar-ChIP enables the concurrent profiling of multiple DNA-protein interactions and is therefore amenable to experimental scale-up, without the need for any robotic instrumentation.

  7. Characterizing and annotating the genome using RNA-seq data.

    Science.gov (United States)

    Chen, Geng; Shi, Tieliu; Shi, Leming

    2017-02-01

    Bioinformatics methods for various RNA-seq data analyses are in fast evolution with the improvement of sequencing technologies. However, many challenges still exist in how to efficiently process the RNA-seq data to obtain accurate and comprehensive results. Here we reviewed the strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and transcriptome represent two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially those long noncoding RNAs) still can be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome- guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying those novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases to conduct transcriptomics and genetics studies can improve the results of corresponding analyses.

  8. Accurate Multisteps Traffic Flow Prediction Based on SVM

    Directory of Open Access Journals (Sweden)

    Zhang Mingheng

    2013-01-01

    Full Text Available Accurate traffic flow prediction is a prerequisite for realizing intelligent traffic control and guidance, and it is also an objective requirement for intelligent traffic management. Due to the strongly nonlinear, stochastic, time-varying characteristics of urban transport systems, artificial intelligence methods such as the support vector machine (SVM) are now receiving more and more attention in this research field. Compared with the traditional single-step prediction method, multistep prediction can predict the traffic state trends over a certain period in the future. From the perspective of dynamic decision making, this is far more important than knowing the current traffic condition. Thus, in this paper, an accurate multistep traffic flow prediction model based on SVM is proposed. The input vectors were composed of actual traffic volume, and four different types of input vectors were compared to verify their prediction performance. Finally, the model was verified with actual data in the empirical analysis phase, and the test results showed that the proposed SVM model had a good ability for traffic flow prediction and that the SVM-HPT model outperformed the other three models.

  9. Gene expression profiling of human breast tissue samples using SAGE-Seq.

    Science.gov (United States)

    Wu, Zhenhua Jeremy; Meyer, Clifford A; Choudhury, Sibgat; Shipitsin, Michail; Maruyama, Reo; Bessarabova, Marina; Nikolskaya, Tatiana; Sukumar, Saraswati; Schwartzman, Armin; Liu, Jun S; Polyak, Kornelia; Liu, X Shirley

    2010-12-01

    We present a powerful application of ultra high-throughput sequencing, SAGE-Seq, for the accurate quantification of normal and neoplastic mammary epithelial cell transcriptomes. We develop data analysis pipelines that allow the mapping of sense and antisense strands of mitochondrial and RefSeq genes, the normalization between libraries, and the identification of differentially expressed genes. We find that the diversity of cancer transcriptomes is significantly higher than that of normal cells. Our analysis indicates that transcript discovery plateaus at 10 million reads/sample, and suggests a minimum desired sequencing depth around five million reads. Comparison of SAGE-Seq and traditional SAGE on normal and cancerous breast tissues reveals higher sensitivity of SAGE-Seq to detect less-abundant genes, including those encoding for known breast cancer-related transcription factors and G protein-coupled receptors (GPCRs). SAGE-Seq is able to identify genes and pathways abnormally activated in breast cancer that traditional SAGE failed to call. SAGE-Seq is a powerful method for the identification of biomarkers and therapeutic targets in human disease.

  10. Robust design and thermal fatigue life prediction of anisotropic conductive film flip chip package

    International Nuclear Information System (INIS)

    Nam, Hyun Wook

    2004-01-01

    The use of flip-chip technology has many advantages over other approaches for high-density electronic packaging. ACF (Anisotropic Conductive Film) is one of the major flip-chip technologies, offering short chip-to-chip interconnection length, high productivity, and package miniaturization. In this study, the thermal fatigue life of an ACF-bonded flip-chip package has been predicted. Elastic and thermal properties of ACF were measured using DMA and TMA. Temperature-dependent nonlinear bi-thermal analysis was conducted and the result was compared with a Moire interferometer experiment. The calculated displacement field matched the experimental result well. Thermal fatigue analysis was also conducted. The maximum shear strain occurs at the outermost bump. A shear stress-strain curve was obtained to calculate fatigue life. A fatigue model for electronic adhesives was used to predict the thermal fatigue life of ACF-bonded flip-chip packaging. The DOE (Design Of Experiment) technique was used to find important design factors. The results show that PCB CTE (Coefficient of Thermal Expansion) and the elastic modulus of the ACF material are important material parameters. Chip width, bump pitch and bump width were chosen as important design parameters. A second DOE was conducted to obtain an RSM equation for the three chosen design parameters. The coefficient of determination (R²) for the calculated RSM equation is 0.99934. Optimum design is conducted using the RSM equation, with the MMFD (Modified Method for Feasible Direction) algorithm. The optimum values for chip width, bump pitch and bump width were 7.87 mm, 430 μm, and 78 μm, respectively. Approximately 1400 cycles are expected under optimum conditions. Reliability analysis was conducted to find guidelines for the control range of the design parameters. The sigma value was calculated while changing the standard deviation of the design variables. To acquire 6 sigma level thermal fatigue reliability, the Std. Deviation of design parameter

  11. Sentence‐Chain Based Seq2seq Model for Corpus Expansion

    Directory of Open Access Journals (Sweden)

    Euisok Chung

    2017-08-01

    Full Text Available This study focuses on a method for sequential data augmentation in order to alleviate data sparseness problems. Specifically, we present corpus expansion techniques for enhancing the coverage of a language model. Recent recurrent neural network studies show that a seq2seq model can be applied for addressing language generation issues; it has the ability to generate new sentences from given input sentences. We present a method of corpus expansion using a sentence‐chain based seq2seq model. For training the seq2seq model, sentence chains are used as triples. The first two sentences in a triple are used for the encoder of the seq2seq model, while the last sentence becomes a target sequence for the decoder. Using only internal resources, evaluation results show an improvement of approximately 7.6% relative perplexity over a baseline language model of Korean text. Additionally, from a comparison with a previous study, the sentence chain approach reduces the size of the training data by 38.4% while generating 1.4‐times the number of n‐grams with superior performance for English text.
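    The triple construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `sentence_chain_triples` is hypothetical, and it assumes that a sentence chain is simply a run of three consecutive sentences and that the two encoder sentences are joined with a space. The abstract does not specify either detail.

```python
from typing import List, Tuple


def sentence_chain_triples(sentences: List[str]) -> List[Tuple[str, str]]:
    """Build (encoder_input, decoder_target) pairs from sentence triples.

    Assumption: a chain is a sliding window of three consecutive sentences.
    The first two sentences feed the encoder; the third is the decoder target.
    """
    pairs = []
    for i in range(len(sentences) - 2):
        encoder_input = sentences[i] + " " + sentences[i + 1]
        decoder_target = sentences[i + 2]
        pairs.append((encoder_input, decoder_target))
    return pairs


if __name__ == "__main__":
    doc = ["It rained.", "The road flooded.", "Traffic stopped.", "People walked."]
    for source, target in sentence_chain_triples(doc):
        print(source, "->", target)
```

    A document with fewer than three sentences yields no training pairs under this windowing assumption.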

  12. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Science.gov (United States)

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
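    Three of the transformations named in this abstract (log, standardization, and rank-based) can be sketched for a single gene's counts across samples. This is an illustrative sketch only: the function names, the pseudocount of 1, the use of the population standard deviation, and the averaging of tied ranks are assumptions, not details taken from the paper.

```python
import math
from statistics import mean, pstdev


def log_transform(counts, pseudocount=1.0):
    """log2(count + pseudocount); the pseudocount (assumed) avoids log(0)."""
    return [math.log2(c + pseudocount) for c in counts]


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def rank_transform(values):
    """Replace each value by its 1-based rank; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


if __name__ == "__main__":
    counts = [10, 1000, 10, 5]  # one gene, four samples (made-up numbers)
    print(log_transform(counts))
    print(standardize(counts))
    print(rank_transform(counts))
```

    The rank transform discards the magnitude of the extreme count (1000) entirely, which is one intuition for why rank-based approaches can be robust to the skewness and extreme values the abstract mentions.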

  13. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Directory of Open Access Journals (Sweden)

    Isabella Zwiener

    Full Text Available Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.

  14. Experiment list: SRX150684 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available sue Diagnosis=Fibrocystic Disease 174026370,97.9,12.7,1654 GSM935605: Harvard ChipSeq MCF10A-Er-Src Input Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harvard...before peaks are called. || control=Harvard_Control || control description=input library was prepared at Harvard. || control=Harvard..._Control || control description=input library was prepared at Harvard. || controlid=

  15. Experiment list: SRX143825 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available SRX143825 mm9 Input control Input control Neural Cerebellum MeSH Description=The pa...ntain balance, and learn motor skills. 38330550,73.2,10.7,868 GSM918733: LICR ChipSeq Cerebellum Input adult-8wks source_name=Cerebel...ption=Chromatin IP Sequencing || cell=Cerebellum || cell organism=mouse || cell description=Cerebellum || ce...lum || biomaterial_provider=1)LICR lab; 2)CSHL lab || lab=LICR-m || lab description

  16. Comprehensive Assessments of RNA-seq by the SEQC Consortium: FDA-Led Efforts Advance Precision Medicine

    Directory of Open Access Journals (Sweden)

    Joshua Xu

    2016-03-01

    Full Text Available Studies on gene expression in response to therapy have led to the discovery of pharmacogenomics biomarkers and advances in precision medicine. Whole transcriptome sequencing (RNA-seq) is an emerging tool for profiling gene expression and has received wide adoption in the biomedical research community. However, its value in regulatory decision making requires rigorous assessment and consensus between various stakeholders, including the research community, regulatory agencies, and industry. The FDA-led SEquencing Quality Control (SEQC) consortium has made considerable progress in this direction, and is the subject of this review. Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were extensively evaluated at multiple sites to assess cross-site and cross-platform reproducibility. The results demonstrated that relative gene expression measurements were consistently comparable across labs and platforms, but not so for the measurement of absolute expression levels. As part of the quality evaluation, several studies were included to evaluate the utility of RNA-seq in clinical settings and safety assessment. The neuroblastoma study profiled tumor samples from 498 pediatric neuroblastoma patients by both microarray and RNA-seq. RNA-seq offers more utilities than microarray in determining the transcriptomic characteristics of cancer. However, RNA-seq and microarray-based models were comparable in clinical endpoint prediction, even when including additional features unique to RNA-seq beyond gene expression. The toxicogenomics study compared microarray and RNA-seq profiles of the liver samples from rats exposed to 27 different chemicals representing multiple toxicity modes of action. Cross-platform concordance was dependent on chemical treatment and transcript abundance. Though both RNA-seq and microarray are suitable for developing gene expression based predictive models with comparable prediction performance, RNA-seq

  17. An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.

    Science.gov (United States)

    Azofeifa, Joseph G; Allen, Mary A; Lladser, Manuel E; Dowell, Robin D

    2017-01-01

    We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.

  18. PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data.

    Science.gov (United States)

    Zhang, Yanju; Lameijer, Eric-Wubbo; 't Hoen, Peter A C; Ning, Zemin; Slagboom, P Eline; Ye, Kai

    2012-02-15

    RNA-seq is a powerful technology for the study of transcriptome profiles that uses deep-sequencing technologies. Moreover, it may be used for cellular phenotyping and help establish the etiology of diseases characterized by abnormal splicing patterns. In RNA-Seq, the exact nature of splicing events is buried in the reads that span exon-exon boundaries. The accurate and efficient mapping of these reads to the reference genome is a major challenge. We developed PASSion, a pattern growth algorithm-based pipeline for splice site detection in paired-end RNA-Seq reads. Comparing the performance of PASSion to three existing RNA-Seq analysis pipelines, TopHat, MapSplice and HMMSplicer, revealed that PASSion is competitive with these packages. Moreover, the performance of PASSion is not affected by read length and coverage. It performs better than the other three approaches when detecting junctions in highly abundant transcripts. PASSion has the ability to detect junctions that do not have known splicing motifs, which cannot be found by the other tools. Of the two public RNA-Seq datasets, PASSion predicted ≈ 137,000 and 173,000 splicing events, of which on average 82% are known junctions annotated in the Ensembl transcript database and 18% are novel. In addition, our package can discover differential and shared splicing patterns among multiple samples. The code and utilities can be freely downloaded from https://trac.nbic.nl/passion and ftp://ftp.sanger.ac.uk/pub/zn1/passion.

  19. Experiment list: SRX1056356 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available O || cell type=ES cells || treated with=2.5 µM tamoxifen (Tam) || chip antibody=H3K27me3 (Millipore: 07-449)...specialized cells. 43433717,97.9,45.2,1333 GSM1708670: H3K27me3 ChIPSeq+Tam R2; M...us musculus; ChIP-Seq source_name=H3K27me3_ChIPSeq+Tam || strain=J1 || genotype/variation=inducible SetDB1 K

  20. Forecasting forest chip energy production in Finland 2008-2014

    International Nuclear Information System (INIS)

    Linden, Mikael

    2011-01-01

    Energy policy measures aim to increase energy production from forest chips in Finland to 10 TWh by year 2010. However, on the regional level production differences are large, and the regional estimates of the potential base of raw materials for the production of forest chips are heterogeneous. In order to analyse the validity of the above target, two methods are proposed to derive forecasts for region-level energy production from forest chips in Finland in the years 2008-2014. The plant-level data from 2003-2007 gives a starting point for a detailed statistical analysis of present and future region-level forest chip production. Observed 2008 regional levels are above the estimated prediction 95% confidence intervals based on aggregation of plant-level time averages. A simple time trend model with fixed-region effects provides accurate forecasts for the years 2008-2014. Forest chip production forecast confidence intervals cover almost all regions for the 2008 levels and the estimates of potential production levels for 2014. The forecast confidence intervals are also derived with re-sampling methods, i.e. with bootstrap methods, to obtain more reliable results. Results confirm that a general materials shortfall is not expected in the near future for forest chip energy production in Finland.

  1. Chromatin immunoprecipitation (ChIP) of plant transcription factors followed by sequencing (ChIP-SEQ) or hybridization to whole genome arrays (ChIP-CHIP)

    NARCIS (Netherlands)

    Kaufmann, K.; Muiño, J.M.; Østerås, M.; Farinelli, L.; Krajewski, P.; Angenent, G.C.

    2010-01-01

    Chromatin immunoprecipitation (ChIP) is a powerful technique to study interactions between transcription factors (TFs) and DNA in vivo. For genome-wide de novo discovery of TF-binding sites, the DNA that is obtained in ChIP experiments needs to be processed for sequence identification. The sequences

  2. Quantitative ChIP-Seq Normalization Reveals Global Modulation of the Epigenome

    Directory of Open Access Journals (Sweden)

    David A. Orlando

    2014-11-01

    Full Text Available Epigenomic profiling by chromatin immunoprecipitation coupled with massively parallel DNA sequencing (ChIP-seq) is a prevailing methodology used to investigate chromatin-based regulation in biological systems such as human disease, but the lack of an empirical methodology to enable normalization among experiments has limited the precision and usefulness of this technique. Here, we describe a method called ChIP with reference exogenous genome (ChIP-Rx) that allows one to perform genome-wide quantitative comparisons of histone modification status across cell populations using defined quantities of a reference epigenome. ChIP-Rx enables the discovery and quantification of dynamic epigenomic profiles across mammalian cells that would otherwise remain hidden using traditional normalization methods. We demonstrate the utility of this method for measuring epigenomic changes following chemical perturbations and show how reference normalization of ChIP-seq experiments enables the discovery of disease-relevant changes in histone modification occupancy.
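    The idea of normalizing against a spiked-in reference genome can be sketched as below. The abstract does not state the normalization formula, so this sketch assumes a simple scheme in which each sample's signal is scaled by one over the number of millions of reads mapped to the exogenous reference; the function names and example numbers are hypothetical.

```python
def spike_in_scale_factor(reference_reads: int) -> float:
    """Assumed scheme: scale factor = 1 / (millions of reads on the reference genome)."""
    if reference_reads <= 0:
        raise ValueError("reference_reads must be positive")
    return 1e6 / reference_reads


def normalize_signal(sample_counts, reference_reads):
    """Scale per-region read counts of one sample by its spike-in factor."""
    factor = spike_in_scale_factor(reference_reads)
    return [count * factor for count in sample_counts]


if __name__ == "__main__":
    # Two samples with identical raw counts but different spike-in recovery.
    control = normalize_signal([100, 200], reference_reads=1_000_000)
    treated = normalize_signal([100, 200], reference_reads=4_000_000)
    print(control)  # scaled by 1.0
    print(treated)  # scaled by 0.25
```

    Under this assumption, a sample in which the fixed amount of reference chromatin takes up a larger share of the reads is scaled down, which is how a global loss of signal can become visible even when raw per-region counts look identical.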

  3. ORMAN: optimal resolution of ambiguous RNA-Seq multimappings in the presence of novel isoforms.

    Science.gov (United States)

    Dao, Phuong; Numanagić, Ibrahim; Lin, Yen-Yi; Hach, Faraz; Karakoc, Emre; Donmez, Nilgun; Collins, Colin; Eichler, Evan E; Sahinalp, S Cenk

    2014-03-01

    RNA-Seq technology is promising to uncover many novel alternative splicing events, gene fusions and other variations in RNA transcripts. For an accurate detection and quantification of transcripts, it is important to resolve the mapping ambiguity for those RNA-Seq reads that can be mapped to multiple loci: >17% of the reads from mouse RNA-Seq data and 50% of the reads from some plant RNA-Seq data have multiple mapping loci. In this study, we show how to resolve the mapping ambiguity in the presence of novel transcriptomic events such as exon skipping and novel indels towards accurate downstream analysis. We introduce ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads), which aims to compute the minimum number of potential transcript products for each gene and to assign each multimapping read to one of these transcripts based on the estimated distribution of the region covering the read. ORMAN achieves this objective through a combinatorial optimization formulation, which is solved through well-known approximation algorithms, integer linear programs and heuristics. On a simulated RNA-Seq dataset including a random subset of transcripts from the UCSC database, the performance of several state-of-the-art methods for identifying and quantifying novel transcripts, such as Cufflinks, IsoLasso and CLIIQ, is significantly improved through the use of ORMAN. Furthermore, in an experiment using real RNA-Seq reads, we show that ORMAN is able to resolve multimapping to produce coverage values that are similar to the original distribution, even in genes with highly non-uniform coverage. ORMAN is available at http://orman.sf.net

  4. Highly Accurate Prediction of Jobs Runtime Classes

    OpenAIRE

    Reiner-Benaim, Anat; Grabarnick, Anna; Shmueli, Edi

    2016-01-01

    Separating the short jobs from the long is a known technique to improve scheduling performance. In this paper we describe a method we developed for accurately predicting the runtime classes of the jobs to enable this separation. Our method uses the fact that the runtimes can be represented as a mixture of overlapping Gaussian distributions, in order to train a CART classifier to provide the prediction. The threshold that separates the short jobs from the long jobs is determined during the ev...

  5. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.

    Science.gov (United States)

    Chung, Dongjun; Kuan, Pei Fen; Li, Bo; Sanalkumar, Rajendran; Liang, Kun; Bresnick, Emery H; Dewey, Colin; Keleş, Sündüz

    2011-07-01

    Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
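    The allocation of multi-reads as fractional counts can be sketched as follows. This is a simplified, single-pass illustration and not the authors' weighted alignment scheme: it assumes that each multi-read is split across its candidate loci in proportion to the uni-read count at each locus plus a pseudocount. The function name, the pseudocount, and the locus labels are hypothetical.

```python
from collections import defaultdict


def allocate_multireads(uni_counts, multireads, pseudocount=1.0):
    """Add fractional multi-read counts to per-locus uni-read counts.

    uni_counts: dict mapping locus -> number of uniquely mapped reads.
    multireads: list of lists; each inner list holds the candidate loci
                of one multi-read.
    Assumption: weights are proportional to (uni-read count + pseudocount),
    so every multi-read contributes exactly 1.0 in total.
    """
    totals = defaultdict(float)
    for locus, count in uni_counts.items():
        totals[locus] += count
    for candidate_loci in multireads:
        weights = [uni_counts.get(locus, 0) + pseudocount for locus in candidate_loci]
        weight_sum = sum(weights)
        for locus, weight in zip(candidate_loci, weights):
            totals[locus] += weight / weight_sum
    return dict(totals)


if __name__ == "__main__":
    uni = {"chr1:1000": 3, "chr5:2000": 0}
    multi = [["chr1:1000", "chr5:2000"]]
    print(allocate_multireads(uni, multi))
```

    Because each read's fractions sum to one, the total count across loci equals the number of uni-reads plus the number of multi-reads, so sequencing depth increases without any read being counted twice.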

  6. Discovering transcription factor binding sites in highly repetitive regions of genomes with multi-read analysis of ChIP-Seq data.

    Directory of Open Access Journals (Sweden)

    Dongjun Chung

    2011-07-01

    Full Text Available Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.

  7. Genome-wide analysis of replication timing by next-generation sequencing with E/L Repli-seq.

    Science.gov (United States)

    Marchal, Claire; Sasaki, Takayo; Vera, Daniel; Wilson, Korey; Sima, Jiao; Rivera-Mulia, Juan Carlos; Trevilla-García, Claudia; Nogues, Coralin; Nafie, Ebtesam; Gilbert, David M

    2018-05-01

    This protocol is an extension to: Nat. Protoc. 6, 870-895 (2014); doi:10.1038/nprot.2011.328; published online 02 June 2011. Cycling cells duplicate their DNA content during S phase, following a defined program called replication timing (RT). Early- and late-replicating regions differ in terms of mutation rates, transcriptional activity, chromatin marks and subnuclear position. Moreover, RT is regulated during development and is altered in diseases. Here, we describe E/L Repli-seq, an extension of our Repli-chip protocol. E/L Repli-seq is a rapid, robust and relatively inexpensive protocol for analyzing RT by next-generation sequencing (NGS), allowing genome-wide assessment of how cellular processes are linked to RT. Briefly, cells are pulse-labeled with BrdU, and early and late S-phase fractions are sorted by flow cytometry. Labeled nascent DNA is immunoprecipitated from both fractions and sequenced. Data processing leads to a single bedGraph file containing the ratio of nascent DNA from early versus late S-phase fractions. The results are comparable to those of Repli-chip, with the additional benefits of genome-wide sequence information and an increased dynamic range. We also provide computational pipelines for downstream analyses, for parsing phased genomes using single-nucleotide polymorphisms (SNPs) to analyze RT allelic asynchrony, and for direct comparison to Repli-chip data. This protocol can be performed in up to 3 d before sequencing, and requires basic cellular and molecular biology skills, as well as a basic understanding of Unix and R.
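    The early-versus-late ratio written to a bedGraph file can be sketched as follows. This is only an illustration of the ratio idea, not the protocol's actual pipeline: the depth scaling, the `pseudocount`, the fixed bin size, and the function names are assumptions.

```python
import math


def early_late_log2_ratio(early, late, pseudocount=1.0):
    """Return log2(early / late) per bin after scaling each fraction by depth.

    early, late: read counts per genomic bin, using the same bins.
    The pseudocount (an assumed value) avoids division by zero in empty bins.
    """
    if len(early) != len(late):
        raise ValueError("early and late must cover the same bins")
    early_total = sum(early)
    late_total = sum(late)
    ratios = []
    for e, l in zip(early, late):
        e_norm = (e + pseudocount) / early_total
        l_norm = (l + pseudocount) / late_total
        ratios.append(math.log2(e_norm / l_norm))
    return ratios


def to_bedgraph(chrom, bin_size, ratios):
    """Format per-bin values as bedGraph lines (0-based, half-open intervals)."""
    return [
        f"{chrom}\t{i * bin_size}\t{(i + 1) * bin_size}\t{r:.3f}"
        for i, r in enumerate(ratios)
    ]
```

    Positive values then mark bins enriched in the early S-phase fraction, and negative values mark late-replicating bins.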

  8. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data

    Directory of Open Access Journals (Sweden)

    Gokmen Zararsiz

    2017-10-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom’s precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.

  9. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    Science.gov (United States)

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.

  10. Physics-based process modeling, reliability prediction, and design guidelines for flip-chip devices

    Science.gov (United States)

    Michaelides, Stylianos

    Flip Chip on Board (FCOB) and Chip-Scale Packages (CSPs) are relatively new technologies that are being increasingly used in the electronic packaging industry. Compared to the more widely used face-up wirebonding and TAB technologies, flip-chips and most CSPs provide the shortest possible leads, lower inductance, higher frequency, better noise control, higher density, greater input/output (I/O), smaller device footprint and lower profile. However, due to the short history and due to the introduction of several new electronic materials, designs, and processing conditions, very limited work has been done to understand the role of material, geometry, and processing parameters on the reliability of flip-chip devices. Also, with the ever-increasing complexity of semiconductor packages and with the continued reduction in time to market, it is too costly to wait until the later stages of design and testing to discover that the reliability is not satisfactory. The objective of the research is to develop integrated process-reliability models that will take into consideration the mechanics of assembly processes to be able to determine the reliability of face-down devices under thermal cycling and long-term temperature dwelling. The models incorporate the time and temperature-dependent constitutive behavior of various materials in the assembly to be able to predict failure modes such as die cracking and solder cracking. In addition, the models account for process-induced defects and macro-micro features of the assembly. Creep-fatigue and continuum-damage mechanics models for the solder interconnects and fracture-mechanics models for the die have been used to determine the reliability of the devices. The results predicted by the models have been successfully validated against experimental data. The validated models have been used to develop qualification and test procedures for implantable medical devices. In addition, the research has helped develop innovative face

  11. Accurate lithography simulation model based on convolutional neural networks

    Science.gov (United States)

    Watanabe, Yuki; Kimura, Taiki; Matsunawa, Tetsuaki; Nojima, Shigeki

    2017-07-01

    Lithography simulation is an essential technique for today's semiconductor manufacturing process. In order to calculate an entire chip in realistic time, a compact resist model is commonly used. The model is established for faster calculation. To obtain an accurate compact resist model, it is necessary to fix a complicated non-linear model function. However, it is difficult to decide on an appropriate function manually because there are many options. This paper proposes a new compact resist model using a CNN (convolutional neural network), which is a deep learning technique. The CNN model makes it possible to determine an appropriate model function and achieve accurate simulation. Experimental results show that the CNN model can reduce CD prediction errors by 70% compared with the conventional model.

  12. Transcriptome dynamics-based operon prediction in prokaryotes.

    Science.gov (United States)

    Fortino, Vittorio; Smolander, Olli-Pekka; Auvinen, Petri; Tagliaferri, Roberto; Greco, Dario

    2014-05-16

    Inferring operon maps is crucial to understanding the regulatory networks of prokaryotic genomes. Recently, RNA-seq based transcriptome studies revealed that in many bacterial species the operon structure varies with changes in environmental conditions. Therefore, new computational solutions that use both static and dynamic data are necessary to create condition-specific operon predictions. In this work, we propose a novel classification method that integrates RNA-seq based transcriptome profiles with genomic sequence features to accurately identify the operons that are expressed under a measured condition. The classifiers are trained on a small set of confirmed operons and then used to classify the remaining gene pairs of the organism studied. Finally, by linking consecutive gene pairs classified as operons, our computational approach produces condition-dependent operon maps. We evaluated our approach on various RNA-seq expression profiles of the bacteria Haemophilus somni, Porphyromonas gingivalis, Escherichia coli and Salmonella enterica. Our results demonstrate that, using features depending on both transcriptome dynamics and genome sequence characteristics, we can identify operon pairs with high accuracy. Moreover, the combination of DNA sequence and expression data results in more accurate predictions than either one alone. We present a computational strategy for the comprehensive analysis of condition-dependent operon maps in prokaryotes. Our method can be used to generate condition-specific operon maps of many bacterial organisms for which high-resolution transcriptome data is available.
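    The final step, linking consecutive gene pairs classified as operons into an operon map, can be sketched as below. The classifier itself is not shown; the function name `link_operons` and the boolean-list input are hypothetical, and the sketch assumes the genes are already ordered along one strand.

```python
def link_operons(genes, pair_is_operon):
    """Chain adjacent gene pairs predicted as co-operonic into operons.

    genes: gene identifiers in genomic order (assumed to be on one strand).
    pair_is_operon: pair_is_operon[i] is True if genes[i] and genes[i + 1]
        were classified as belonging to the same operon.
    Returns a list of operons, each a list of two or more genes.
    """
    if not genes:
        return []
    if len(pair_is_operon) != len(genes) - 1:
        raise ValueError("need exactly one prediction per adjacent gene pair")

    operons = []
    current = [genes[0]]
    for gene, linked in zip(genes[1:], pair_is_operon):
        if linked:
            current.append(gene)
        else:
            # A negative pair closes the current run; single genes are dropped.
            if len(current) > 1:
                operons.append(current)
            current = [gene]
    if len(current) > 1:
        operons.append(current)
    return operons
```

    Running the same linking step on pair predictions from different expression profiles would yield one map per condition.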

  13. Experiment list: SRX186751 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available | biomaterial_provider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide containi...ng K79 di-methylation. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark of the tra...nscriptional transition region - the region between the initiation marks (K4me3, etc) and the elongation marks (K36me3). || antibody... vendorname=Active Motif || antibody vendorid=39143 || co

  14. Experiment list: SRX186750 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available l_provider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydes...cription=Rabbit polyclonal antibody raised against a peptide containing K79 di-me...thylation. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark of the transcriptional... transition region - the region between the initiation marks (K4me3, etc) and the elongation marks (K36me3). || antibody... vendorname=Active Motif || antibody vendorid=39143 || controlid=wgEn

  15. Experiment list: SRX186672 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available RC || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide corresponding to the C-terminus of... H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibody... vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH00...3132 || replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=F || antibody=H2A.Z || antibody antibody

  16. How and how much does RAD-seq bias genetic diversity estimates?

    Science.gov (United States)

    Cariou, Marie; Duret, Laurent; Charlat, Sylvain

    2016-11-08

    RAD-seq is a powerful tool, increasingly used in population genomics. However, earlier studies have raised red flags regarding possible biases associated with this technique. In particular, polymorphism on restriction sites results in preferential sampling of closely related haplotypes, so that RAD data tends to underestimate genetic diversity. Here we (1) clarify the theoretical basis of this bias, highlighting the potential confounding effects of population structure and selection, (2) confront predictions to real data from in silico digestion of full genomes and (3) provide a proof of concept toward an ABC-based correction of the RAD-seq bias. Under a neutral and panmictic model, we confirm the previously established relationship between the true polymorphism and its RAD-based estimation, showing a more pronounced bias when polymorphism is high. Using more elaborate models, we show that selection, resulting in heterogeneous levels of polymorphism along the genome, exacerbates the bias and leads to a more pronounced underestimation. On the contrary, spatial genetic structure tends to reduce the bias. We confront the neutral and panmictic model to "ideal" empirical data (in silico RAD-sequencing) using full genomes from natural populations of the fruit fly Drosophila melanogaster and the fungus Schizophyllum commune, harbouring respectively moderate and high genetic diversity. In D. melanogaster, predictions fit the model, but the small difference between the true and RAD polymorphism makes this comparison insensitive to deviations from the model. In the highly polymorphic fungus, the model captures a large part of the bias but makes inaccurate predictions. Accordingly, ABC corrections based on this model improve the estimations, albeit with some imprecisions. The RAD-seq underestimation of genetic diversity associated with polymorphism in restriction sites becomes more pronounced when polymorphism is high. In practice, this means that in many systems where

  17. NBLDA: negative binomial linear discriminant analysis for RNA-Seq data.

    Science.gov (United States)

    Dong, Kai; Zhao, Hongyu; Tong, Tiejun; Wan, Xiang

    2016-09-13

    RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than or equal to the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated. In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications. We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data.
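    A Bayes-rule classifier built on a negative binomial model can be sketched generically as below. This is not the paper's estimator: the class means, dispersions and priors are taken as given rather than estimated by plug-in rules, size factors are omitted, and the parameterization (variance = mu + phi * mu^2) and function names are assumptions of this sketch.

```python
import math


def nb_log_pmf(x, mu, phi):
    """Log-probability of count x under a negative binomial distribution.

    Parameterized by mean mu > 0 and dispersion phi > 0, so that
    variance = mu + phi * mu**2 (r = 1 / phi is the usual size parameter).
    """
    r = 1.0 / phi
    return (
        math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
        + r * math.log(r / (r + mu))
        + x * math.log(mu / (r + mu))
    )


def classify(counts, class_means, dispersions, priors):
    """Assign a sample to the class with the highest posterior score.

    counts: observed counts, one per gene.
    class_means: dict class label -> list of per-gene means for that class.
    dispersions: per-gene dispersion values shared across classes.
    priors: dict class label -> prior probability.
    Genes are treated as independent, as in diagonal discriminant analysis.
    """
    scores = {}
    for label, means in class_means.items():
        score = math.log(priors[label])
        for x, mu, phi in zip(counts, means, dispersions):
            score += nb_log_pmf(x, mu, phi)
        scores[label] = score
    best = max(scores, key=scores.get)
    return best, scores
```

    As the dispersion phi approaches zero the negative binomial log-probability approaches the Poisson one, which is the relationship between the two classifiers that the abstract mentions.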

  18. A fast one-chip event-preprocessor and sequencer for the Simbol-X Low Energy Detector

    Science.gov (United States)

    Schanz, T.; Tenzer, C.; Maier, D.; Kendziorra, E.; Santangelo, A.

    2010-12-01

    We present an FPGA-based digital camera electronics consisting of an Event-Preprocessor (EPP) for on-board data preprocessing and a related Sequencer (SEQ) to generate the necessary signals to control the readout of the detector. The device has been originally designed for the Simbol-X low energy detector (LED). The EPP operates on 64×64 pixel images and has a real-time processing capability of more than 8000 frames per second. The already working releases of the EPP and the SEQ are now combined into one Digital-Camera-Controller-Chip (D3C).

  19. A fast one-chip event-preprocessor and sequencer for the Simbol-X Low Energy Detector

    Energy Technology Data Exchange (ETDEWEB)

    Schanz, T., E-mail: schanz@astro.uni-tuebingen.d [Kepler Center for Astro- and Particlephysics, Institut fuer Astronomie und Astrophysik Tuebingen, Sand 1, 72076 Tuebingen (Germany); Tenzer, C., E-mail: tenzer@astro.uni-tuebingen.d [Kepler Center for Astro- and Particlephysics, Institut fuer Astronomie und Astrophysik Tuebingen, Sand 1, 72076 Tuebingen (Germany); Maier, D.; Kendziorra, E.; Santangelo, A. [Kepler Center for Astro- and Particlephysics, Institut fuer Astronomie und Astrophysik Tuebingen, Sand 1, 72076 Tuebingen (Germany)

    2010-12-11

    We present an FPGA-based digital camera electronics consisting of an Event-Preprocessor (EPP) for on-board data preprocessing and a related Sequencer (SEQ) to generate the necessary signals to control the readout of the detector. The device has been originally designed for the Simbol-X low energy detector (LED). The EPP operates on 64x64 pixel images and has a real-time processing capability of more than 8000 frames per second. The already working releases of the EPP and the SEQ are now combined into one Digital-Camera-Controller-Chip (D3C).

  20. A fast one-chip event-preprocessor and sequencer for the Simbol-X Low Energy Detector

    International Nuclear Information System (INIS)

    Schanz, T.; Tenzer, C.; Maier, D.; Kendziorra, E.; Santangelo, A.

    2010-01-01

    We present an FPGA-based digital camera electronics consisting of an Event-Preprocessor (EPP) for on-board data preprocessing and a related Sequencer (SEQ) to generate the necessary signals to control the readout of the detector. The device has been originally designed for the Simbol-X low energy detector (LED). The EPP operates on 64x64 pixel images and has a real-time processing capability of more than 8000 frames per second. The already working releases of the EPP and the SEQ are now combined into one Digital-Camera-Controller-Chip (D3C).

  1. RNA-Seq Atlas of Glycine max: A guide to the soybean transcriptome

    Directory of Open Access Journals (Sweden)

    Severin Andrew J

    2010-08-01

    Background: Next generation sequencing is transforming our understanding of transcriptomes. It can determine the expression level of transcripts with a dynamic range of over six orders of magnitude from multiple tissues, developmental stages or conditions. Patterns of gene expression provide insight into functions of genes with unknown annotation. Results: The RNA Seq-Atlas presented here provides a record of high-resolution gene expression in a set of fourteen diverse tissues. Hierarchical clustering of transcriptional profiles for these tissues suggests three clades with similar profiles: aerial, underground and seed tissues. We also investigate the relationship between gene structure and gene expression and find a correlation between gene length and expression. Additionally, we find dramatic tissue-specific gene expression of both the most highly-expressed genes and the genes specific to legumes in seed development and nodule tissues. Analysis of the gene expression profiles of over 2,000 genes with preferential gene expression in seed suggests there are more than 177 genes with functional roles that are involved in the economically important seed filling process. Finally, the Seq-atlas also provides a means of evaluating existing gene model annotations for the Glycine max genome. Conclusions: This RNA-Seq atlas extends the analyses of previous gene expression atlases performed using Affymetrix GeneChip technology and provides an example of new methods to accommodate the increase in transcriptome data obtained from next generation sequencing. Data contained within this RNA-Seq atlas of Glycine max can be explored at http://www.soybase.org/soyseq.

  2. Experiment list: SRX144529 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available ChIP-Seq source_name=IgG ChIP vehicle treated || biomaterial_provider=Coriell; ht...tp://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM06993 || biomaterial_provider=Coriell; http://...ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM07000 || biomaterial_provider=Coriell; http://ccr.c...oriell.org/Sections/Search/Search.aspx?PgId=165&q=GM11882 || biomaterial_provider...=Coriell; http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM11992 || biomaterial_provider=Cori

  3. Experiment list: SRX144528 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -Seq source_name=IgG ChIP SFN treated || biomaterial_provider=Coriell; http://ccr....coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM06993 || biomaterial_provider=Coriell; http://ccr.cori...ell.org/Sections/Search/Search.aspx?PgId=165&q=GM07000 || biomaterial_provider=Coriell; http://ccr.coriell.o...rg/Sections/Search/Search.aspx?PgId=165&q=GM11882 || biomaterial_provider=Coriell...; http://ccr.coriell.org/Sections/Search/Search.aspx?PgId=165&q=GM11992 || biomaterial_provider=Coriell; htt

  4. Experiment list: SRX150477 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 70636654,97.7,24.3,106225 GSM935397: Harvard ChipSeq MCF10A-Er-Src 4OHTAM 1uM 4hr c-Fos Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || l... a leucine-zipper. || antibody vendorname=Santa Cruz Biotech || antibody vendorid=sc-7202 || control=Harvard..._Control || control description=input library was prepared at Harvard. || control=Harvard..._Control || control description=input library was prepared at Harvard. || controlid=wgEncodeEH002871

  5. Experiment list: SRX150476 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 60050220,94.5,18.5,85444 GSM935396: Harvard ChipSeq MCF10A-Er-Src 4OHTAM 1uM 36hr c-Fos Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || l...is a leucine-zipper. || antibody vendorname=Santa Cruz Biotech || antibody vendorid=sc-7202 || control=Harvard..._Control || control description=input library was prepared at Harvard. || control=Harvard..._Control || control description=input library was prepared at Harvard. || controlid=wgEncodeEH0028

  6. Experiment list: SRX186742 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available l_provider=ATCC || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydesc...ription=Rabbit polyclonal antibody raised against a peptide corresponding to the ...C-terminus of H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibody... vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH003076 || replic...ate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=M || antibody=H2A.Z || antibody antibody

  7. Experiment list: SRX186752 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available aterial_provider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide corresponding ...to the C-terminus of H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibod...y vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH000060 ||... replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=U || antibody=H2A.Z || antibody antibody

  8. Experiment list: SRX186726 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available rovider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydescri...terminus of H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibody... vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH000105 || replicat...e=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=U || antibody=H2A.Z || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide corresponding to the C-terminu

  9. Experiment list: SRX186723 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available K || biomaterial_provider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide conta...ining K79 di-methylation. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark of the ...dy vendorname=Active Motif || antibody vendorid=39143 ||... controlid=wgEncodeEH000072 || replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=M || antibody=H3K79me2 || antibody anti

  10. Experiment list: SRX190252 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available -1 || cell organism=human || cell description=pancreatic carcinoma, (PMID: 1140870) PANC-1 was established from a panc...SRX190252 hg19 Input control Input control Pancreas PANC-1 Tissue=Pancreas/Duct|Dis...ease=Epithelioid Carcinoma 117194289,90.9,30.1,1440 GSM1010796: HudsonAlpha ChipSeq PANC-1 RevXlinkChromatin...atin IP Sequencing || controlid=SL3776,SL2340 || labexpid=SL3776,SL2340 || cell=PANC...reatic carcinoma, which was extracted via pancreatico-duodenectomy specimen from a 56-year-old

  11. Influential Factors for Accurate Load Prediction in a Demand Response Context

    DEFF Research Database (Denmark)

    Wollsen, Morten Gill; Kjærgaard, Mikkel Baun; Jørgensen, Bo Nørregaard

    2016-01-01

    Accurate prediction of a building's electricity load is crucial to respond to Demand Response events with an assessable load change. However, previous work on load prediction fails to consider a wider set of possible data sources. In this paper we study different data scenarios to map the influence... Next, the time of day that is being predicted greatly influences the prediction, which is related to the weather pattern. By presenting these results we hope to improve the modeling of building loads and algorithms for Demand Response planning.

  12. Accurate predictions for the LHC made easy

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    The data recorded by the LHC experiments is of a very high quality. To get the most out of the data, precise theory predictions, including uncertainty estimates, are needed to reduce as much as possible theoretical bias in the experimental analyses. Recently, significant progress has been made in computing Next-to-Leading Order (NLO) computations, including matching to the parton shower, that allow for these accurate, hadron-level predictions. I shall discuss one of these efforts, the MadGraph5_aMC@NLO program, that aims at the complete automation of predictions at the NLO accuracy within the SM as well as New Physics theories. I’ll illustrate some of the theoretical ideas behind this program, show some selected applications to LHC physics, as well as describe the future plans.

  13. Is this the right normalization? A diagnostic tool for ChIP-seq normalization.

    Science.gov (United States)

    Angelini, Claudia; Heller, Ruth; Volkinshtein, Rita; Yekutieli, Daniel

    2015-05-09

    ChIP-seq experiments are becoming a standard approach for genome-wide profiling of protein-DNA interactions, such as detecting transcription factor binding sites, histone modification marks and RNA Polymerase II occupancy. However, when comparing a ChIP sample versus a control sample, such as Input DNA, normalization procedures have to be applied in order to remove experimental sources of bias. Despite the substantial impact that the choice of the normalization method can have on the results of a ChIP-seq data analysis, their assessment is not fully explored in the literature. In particular, there are no diagnostic tools that show whether the applied normalization is indeed appropriate for the data being analyzed. In this work we propose a novel diagnostic tool to examine the appropriateness of the estimated normalization procedure. By plotting the empirical densities of log relative risks in bins of equal read count, along with the estimated normalization constant, after logarithmic transformation, the researcher is able to assess the appropriateness of the estimated normalization constant. We use the diagnostic plot to evaluate the appropriateness of the estimates obtained by CisGenome, NCIS and CCAT on several real data examples. Moreover, we show the impact that the choice of the normalization constant can have on standard tools for peak calling such as MACS or SICER. Finally, we propose a novel procedure for controlling the FDR using sample swapping. This procedure makes use of the estimated normalization constant in order to gain power over the naive choice of constant (used in MACS and SICER), which is the ratio of the total number of reads in the ChIP and Input samples. Linear normalization approaches aim to estimate a scale factor, r, to adjust for different sequencing depths when comparing ChIP versus Input samples. The estimated scaling factor can easily be incorporated in many peak caller algorithms to improve the accuracy of the peak identification.
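    The difference between the naive scale factor and one estimated from background regions can be sketched as follows. The first function is the total-read ratio described above; the second is only loosely in the spirit of background-based estimators, and its median cutoff, the per-bin count inputs, and both function names are assumptions rather than any published tool's procedure.

```python
def naive_scale_factor(chip_bins, input_bins):
    """Ratio of total ChIP reads to total Input reads."""
    return sum(chip_bins) / sum(input_bins)


def background_scale_factor(chip_bins, input_bins, max_total=None):
    """Scale factor computed only from low-count bins, assumed to be background.

    chip_bins, input_bins: read counts over the same genomic bins.
    max_total: bins whose combined count exceeds this are excluded; by default
        the median combined count is used (a hypothetical cutoff).
    """
    totals = [c + i for c, i in zip(chip_bins, input_bins)]
    if max_total is None:
        max_total = sorted(totals)[len(totals) // 2]
    chip_bg = sum(c for c, t in zip(chip_bins, totals) if t <= max_total)
    input_bg = sum(i for i, t in zip(input_bins, totals) if t <= max_total)
    if input_bg == 0:
        raise ValueError("no Input reads in the selected background bins")
    return chip_bg / input_bg
```

    When a few strongly enriched bins hold many of the ChIP reads, the naive ratio is pulled upward by signal, whereas the background-only ratio reflects the depth difference in unbound regions.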

  14. Can phenological models predict tree phenology accurately under climate change conditions?

    Science.gov (United States)

    Chuine, Isabelle; Bonhomme, Marc; Legave, Jean Michel; García de Cortázar-Atauri, Inaki; Charrier, Guillaume; Lacointe, André; Améglio, Thierry

    2014-05-01

    The onset of the growing season of trees has advanced globally by 2.3 days/decade during the last 50 years because of global warming, and this trend is predicted to continue according to climate forecasts. The effect of temperature on plant phenology is however not linear because temperature has a dual effect on bud development. On one hand, low temperatures are necessary to break bud dormancy, and on the other hand higher temperatures are necessary to promote bud cell growth afterwards. Increasing phenological changes in temperate woody species have strong impacts on forest tree distribution and productivity, as well as crop cultivation areas. Accurate predictions of tree phenology are therefore a prerequisite to understand and foresee the impacts of climate change on forests and agrosystems. Different process-based models have been developed in the last two decades to predict the date of budburst or flowering of woody species. There are two main families: (1) one-phase models which consider only the ecodormancy phase and make the assumption that endodormancy is always broken before adequate climatic conditions for cell growth occur; and (2) two-phase models which consider both the endodormancy and ecodormancy phases and predict a date of dormancy break which varies from year to year. So far, one-phase models have been able to predict accurately tree bud break and flowering under historical climate. However, because they do not consider what happens prior to ecodormancy, and especially the possible negative effect of winter temperature warming on dormancy break, it seems unlikely that they can provide accurate predictions in future climate conditions. It is indeed well known that a lack of low temperature results in abnormal patterns of bud break and development in temperate fruit trees. An accurate modelling of the dormancy break date has thus become a major issue in phenology modelling. 
Two-phases phenological models predict that global warming should delay
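The contrast between the two model families can be sketched in a few lines. Everything numeric here (base temperature, chilling threshold, required chilling days, forcing requirement) and the simple degree-day and chill-day formulations are assumptions chosen for illustration, not the models evaluated in the study.

```python
def one_phase_budburst(temps, base=5.0, forcing_needed=50.0):
    """Day index at which accumulated degree-days above `base` reach the requirement."""
    forcing = 0.0
    for day, t in enumerate(temps):
        forcing += max(0.0, t - base)
        if forcing >= forcing_needed:
            return day
    return None  # requirement not met within the series


def two_phase_budburst(temps, chill_below=7.0, chill_needed=20,
                       base=5.0, forcing_needed=50.0):
    """Forcing only starts once enough cold days have broken endodormancy."""
    chill_days = 0
    for day, t in enumerate(temps):
        if t < chill_below:
            chill_days += 1
        if chill_days >= chill_needed:
            rest = one_phase_budburst(temps[day + 1:], base, forcing_needed)
            return None if rest is None else day + 1 + rest
    return None  # dormancy never broken within the series


# Hypothetical daily mean temperatures (degrees C): a cold and a warm winter.
cold_winter = [2.0] * 30 + [15.0] * 30
warm_winter = [8.0] * 30 + [15.0] * 30

cold_dates = (one_phase_budburst(cold_winter), two_phase_budburst(cold_winter))
warm_dates = (one_phase_budburst(warm_winter), two_phase_budburst(warm_winter))
```

With the cold winter both sketches return the same day; with the warm winter the one-phase sketch returns an earlier day while the two-phase sketch never satisfies its chilling requirement, which mirrors the concern raised in the abstract.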

  15. Prediction of Accurate Mixed Mode Fatigue Crack Growth Curves using the Paris' Law

    Science.gov (United States)

    Sajith, S.; Krishna Murthy, K. S. R.; Robi, P. S.

    2017-12-01

    Accurate information regarding crack growth times and structural strength as a function of the crack size is mandatory in damage tolerance analysis. Various equivalent stress intensity factor (SIF) models are available for prediction of mixed mode fatigue life using the Paris' law. In the present investigation these models have been compared to assess their efficacy in predicting a life close to the experimental findings, as there are no guidelines or suggestions available on the selection of these models for accurate and/or conservative predictions of fatigue life. Within the limitations of the available experimental data and the currently available numerical simulation techniques, the results of the present study attempt to outline models that would provide accurate and conservative life predictions.
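For orientation, the Paris' law relation into which an equivalent SIF range is substituted is commonly written as below. The symbols $C$, $m$, $a_0$ and $a_c$ (material constants, initial and critical crack sizes) follow the usual textbook convention and are assumptions here, since the abstract does not define them.

$$\frac{da}{dN} = C\,\left(\Delta K_{\mathrm{eq}}\right)^{m}, \qquad N_f = \int_{a_0}^{a_c} \frac{da}{C\,\left(\Delta K_{\mathrm{eq}}\right)^{m}}$$

The competing models differ only in how $\Delta K_{\mathrm{eq}}$ is built from the mode I and mode II contributions, so the predicted life $N_f$ inherits any bias of that choice.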

  16. Organ/body-on-a-chip based on microfluidic technology for drug discovery.

    Science.gov (United States)

    Kimura, Hiroshi; Sakai, Yasuyuki; Fujii, Teruo

    2018-02-01

    Although animal experiments are indispensable for preclinical screening in the drug discovery process, various issues such as ethical considerations and species differences remain. To solve these issues, cell-based assays using human-derived cells have been actively pursued. However, it remains difficult to accurately predict drug efficacy, toxicity, and organ interactions, because cultivated cells often do not retain their original organ functions and morphologies in conventional in vitro cell culture systems. In the μTAS research field, which is a part of biochemical engineering, the technologies of organ-on-a-chip, based on microfluidic devices built using microfabrication, have been widely studied recently as a novel in vitro organ model. Since it is possible to physically and chemically mimic the in vitro environment by using microfluidic device technology, maintenance of cellular function and morphology, and replication of organ interactions can be realized using organ-on-a-chip devices. So far, functions of various organs and tissues, such as the lung, liver, kidney, and gut have been reproduced as in vitro models. Furthermore, a body-on-a-chip, integrating multi-organ functions on a microfluidic device, has also been proposed for prediction of organ interactions. We herein provide a background of microfluidic systems, organ-on-a-chip and body-on-a-chip technologies, and their challenges in the future. Copyright © 2017 The Japanese Society for the Study of Xenobiotics. Published by Elsevier Ltd. All rights reserved.

  17. A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

    Directory of Open Access Journals (Sweden)

    Olson James M

    2006-04-01

    Full Text Available Abstract Background Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip® that uses multiple oligonucleotide probes (i.e. probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data. Results We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma (27 metastatic and 42 non-metastatic) samples and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic

  18. Experiment list: SRX150517 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 50667127,93.3,23.9,80828 GSM935438: Harvard ChipSeq MCF10A-Er-Src EtOH 0.01pct c-Fos Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || lab description=Struhl - Harv...me=Santa Cruz Biotech || antibody vendorid=sc-7202 || control=Harvard_Control || control description=input l...ibrary was prepared at Harvard. || control=Harvard_Control || control description...=input library was prepared at Harvard. || controlid=wgEncodeEH002871 || replicate=1 http://dbarchive.biosci

  19. Experiment list: SRX150570 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available osis=Fibrocystic Disease 64692659,93.8,19.3,37417 GSM935491: Harvard ChipSeq MCF10A-Er-Src 4OHTAM 1uM 4hr c-Myc Harvard... Control source_name=MCF10A-Er-Src || biomaterial_provider=Struhl laboratory || lab=Harvard || la...ntibody vendorname=Santa Cruz Biotech || antibody vendorid=sc-764 || control=Harvard_Control || control desc...ription=input library was prepared at Harvard. || control=Harvard_Control || cont...rol description=input library was prepared at Harvard. || controlid=wgEncodeEH002871 || replicate=1 http://d

  20. Experiment list: SRX186686 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available me=NH-A || biomaterial_provider=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide... containing K79 di-methylation. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark o...ation marks (K36me3). || antibody vendorname=Active Motif || antibody vendorid=39...143 || controlid=wgEncodeEH001027 || replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=U || antibody=H3K79me2 || antibod

  1. Experiment list: SRX186696 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available nza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydescription=Rabbit polyclonal antibody...f H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibody... vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH000093 || replicate=1,2 || s...oftwareversion=ScriptureVPaperR3 || cell sex=U || antibody=H2A.Z || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide corresponding to the C-terminus of H2AZ.

  2. Experiment list: SRX186665 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available rovider=DSMZ || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydescrip...ation. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark of the transcriptional tra...nsition region - the region between the initiation marks (K4me3, etc) and the elongation marks (K36me3). || antibody... vendorname=Active Motif || antibody vendorid=39143 || controlid=wgEncode...EH002434 || replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=M || antibody=H3K79me2 || antibody antibody

  3. Experiment list: SRX186661 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available r=DSMZ || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydescription=Rabbit polyclonal antibody...s of H2AZ. Antibody Target: H2AZ || antibody targetdescription=H2A.Z is a sequence variant of Histone H2A. || antibody... vendorname=Millipore || antibody vendorid=07-594 || controlid=wgEncodeEH002434 || replicate=1,2 |...| softwareversion=ScriptureVPaperR3 || cell sex=M || antibody=H2A.Z || antibody antibody...description=Rabbit polyclonal antibody raised against a peptide corresponding to the C-terminus of H2

  4. CASSys: an integrated software-system for the interactive analysis of ChIP-seq data

    Directory of Open Access Journals (Sweden)

    Alawi Malik

    2011-06-01

    Full Text Available The mapping of DNA-protein interactions is crucial for a full understanding of transcriptional regulation. Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) has become the standard technique for analyzing these interactions on a genome-wide scale. We have developed a software system called CASSys (ChIP-seq data Analysis Software System) spanning all steps of ChIP-seq data analysis. It supersedes the laborious application of several single command line tools. CASSys provides functionality ranging from quality assessment and control of short reads, through the mapping of reads against a reference genome (read mapping) and the detection of enriched regions (peak detection), to various follow-up analyses. The latter are accessible via a state-of-the-art web interface and can be performed interactively by the user. The follow-up analyses allow for flexible user-defined association of putative interaction sites with genes, visualization of their genomic context with an integrated genome browser, the detection of putative binding motifs, the identification of over-represented Gene Ontology terms, pathway analysis and the visualization of interaction networks. The system is client-server based, accessible via a web browser and does not require any software installation on the client side. To demonstrate CASSys's functionality we used the system for the complete data analysis of a publicly available ChIP-seq study that investigated the role of the transcription factor estrogen receptor-α in breast cancer cells.

  5. SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles

    KAUST Repository

    Wong, Kachun

    2014-09-05

    Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms by multiple transcription factors. Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, we found SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We have applied SignalSpider on the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also proposed a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level. Availability and implementation: The matrix-algebra-optimized executables and source codes are available at the authors' websites: http://www.cs.toronto.edu/∼wkc/SignalSpider. Contact: Supplementary information: Supplementary data are available at Bioinformatics online.

  6. ChIP-seq Identification of Weakly Conserved Heart Enhancers

    Energy Technology Data Exchange (ETDEWEB)

    Blow, Matthew J.; McCulley, David J.; Li, Zirong; Zhang, Tao; Akiyama, Jennifer A.; Holt, Amy; Plajzer-Frick, Ingrid; Shoukry, Malak; Wright, Crystal; Chen, Feng; Afzal, Veena; Bristow, James; Ren, Bing; Black, Brian L.; Rubin, Edward M.; Visel, Axel; Pennacchio, Len A.

    2010-07-01

    Accurate control of tissue-specific gene expression plays a pivotal role in heart development, but few cardiac transcriptional enhancers have thus far been identified. Extreme non-coding sequence conservation successfully predicts enhancers active in many tissues, but fails to identify substantial numbers of heart enhancers. Here we used ChIP-seq with the enhancer-associated protein p300 from mouse embryonic day 11.5 heart tissue to identify over three thousand candidate heart enhancers genome-wide. Compared to other tissues studied at this time-point, most candidate heart enhancers are less deeply conserved in vertebrate evolution. Nevertheless, the testing of 130 candidate regions in a transgenic mouse assay revealed that most of them reproducibly function as enhancers active in the heart, irrespective of their degree of evolutionary constraint. These results provide evidence for a large population of poorly conserved heart enhancers and suggest that the evolutionary constraint of embryonic enhancers can vary depending on tissue type.

  7. Accurate RNA consensus sequencing for high-fidelity detection of transcriptional mutagenesis-induced epimutations.

    Science.gov (United States)

    Reid-Bayliss, Kate S; Loeb, Lawrence A

    2017-08-29

    Transcriptional mutagenesis (TM) due to misincorporation during RNA transcription can result in mutant RNAs, or epimutations, that generate proteins with altered properties. TM has long been hypothesized to play a role in aging, cancer, and viral and bacterial evolution. However, inadequate methodologies have limited progress in elucidating a causal association. We present a high-throughput, highly accurate RNA sequencing method to measure epimutations with single-molecule sensitivity. Accurate RNA consensus sequencing (ARC-seq) uniquely combines RNA barcoding and generation of multiple cDNA copies per RNA molecule to eliminate errors introduced during cDNA synthesis, PCR, and sequencing. The stringency of ARC-seq can be scaled to accommodate the quality of input RNAs. We apply ARC-seq to directly assess transcriptome-wide epimutations resulting from RNA polymerase mutants and oxidative stress.

  8. Pipeline for the Analysis of ChIP-seq Data and New Motif Ranking Procedure

    KAUST Repository

    Ashoor, Haitham

    2011-06-01

    This thesis presents a computational methodology for ab-initio identification of transcription factor binding sites based on ChIP-seq data. This method consists of three main steps, namely ChIP-seq data processing, motif discovery and model selection. A novel method for ranking the models of motifs identified in this process is proposed. This method combines multiple factors in order to rank the provided candidate motifs. It combines the model coverage of the ChIP-seq fragments that contain motifs from which that model is built, the suitable background data made up of shuffled ChIP-seq fragments, and the p-value that resulted from evaluating the model on actual and background data. Two ChIP-seq datasets retrieved from the ENCODE project are used to evaluate and demonstrate the ability of the method to predict correct TFBSs with high precision. The first dataset relates to neuron-restrictive silencer factor, NRSF, while the second one corresponds to growth-associated binding protein, GABP. The pipeline system shows high-precision prediction for both datasets, as in both cases the top ranked motif closely resembles the known motifs for the respective transcription factors.

  9. Mechanism of Void Prediction in Flip Chip Packages with Molded Underfill

    Science.gov (United States)

    Wu, Kuo-Tsai; Hwang, Sheng-Jye; Lee, Huei-Huang

    2017-08-01

    Voids have always been present using the molded underfill (MUF) package process, which is a problem that needs further investigation. In this study, the process was studied using the Moldex3D numerical analysis software. The effects of gas (air vent effect) on the overall melt front were also considered. In this isothermal process containing two fluids, the gas and melt colloid interact in the mold cavity. Simulation enabled an appropriate understanding of the actual situation to be gained, and, through analysis, the void region and exact location of voids were predicted. First, the global flow end area was observed to predict the void movement trend, and then the local flow ends were observed to predict the location and size of voids. In the MUF 518 case study, simulations predicted the void region as well as the location and size of the voids. The void phenomenon in a flip chip ball grid array underfill is discussed as part of the study.

  10. Modeling ChIP sequencing in silico with applications.

    Directory of Open Access Journals (Sweden)

    Zhengdong D Zhang

    2008-08-01

    Full Text Available ChIP sequencing (ChIP-seq) is a new method for genome-wide mapping of protein binding sites on DNA. It has generated much excitement in functional genomics. To score data and determine adequate sequencing depth, both the genomic background and the binding sites must be properly modeled. To develop a computational foundation to tackle these issues, we first performed a study to characterize the observed statistical nature of this new type of high-throughput data. By linking sequence tags into clusters, we show that there are two components to the distribution of tag counts observed in a number of recent experiments: an initial power-law distribution and a subsequent long right tail. Then we develop in silico ChIP-seq, a computational method to simulate the experimental outcome by placing tags onto the genome according to particular assumed distributions for the actual binding sites and for the background genomic sequence. In contrast to current assumptions, our results show that both the background and the binding sites need to have a markedly nonuniform distribution in order to correctly model the observed ChIP-seq data, with, for instance, the background tag counts modeled by a gamma distribution. On the basis of these results, we extend an existing scoring approach by using a more realistic genomic-background model. This enables us to identify transcription-factor binding sites in ChIP-seq data in a statistically rigorous fashion.
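The idea of placing tags on a genome with a nonuniform, gamma-distributed background can be sketched as follows. This is a toy simulation under stated assumptions: the bin layout, the gamma shape and scale, the fixed enrichment multiplier at binding sites and the function name `simulate_tags` are all hypothetical and are not the authors' simulator (the abstract indicates that binding-site strengths should themselves be nonuniform, which this sketch simplifies away).

```python
import random
from collections import Counter


def simulate_tags(n_bins, n_tags, site_bins, enrichment=20.0,
                  shape=2.0, scale=1.0, seed=0):
    """Distribute n_tags over n_bins and return the tag count per bin."""
    rng = random.Random(seed)
    # Nonuniform background: one gamma-distributed sampling weight per bin.
    weights = [rng.gammavariate(shape, scale) for _ in range(n_bins)]
    # Binding sites: boost the weight of the chosen bins by a fixed factor.
    for b in site_bins:
        weights[b] *= enrichment
    # Place each tag in a bin with probability proportional to its weight.
    placed = rng.choices(range(n_bins), weights=weights, k=n_tags)
    counts = Counter(placed)
    return [counts.get(b, 0) for b in range(n_bins)]


counts = simulate_tags(n_bins=200, n_tags=5000, site_bins=[50, 120])
```

Comparing the histogram of `counts` for different background settings (for example, replacing the gamma weights with a constant) is one way to see why a uniform background produces too short a right tail.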

  11. Enzymic colorimetry-based DNA chip: a rapid and accurate assay for detecting mutations for clarithromycin resistance in the 23S rRNA gene of Helicobacter pylori.

    Science.gov (United States)

    Xuan, Shi-Hai; Zhou, Yu-Gui; Shao, Bo; Cui, Ya-Lin; Li, Jian; Yin, Hong-Bo; Song, Xiao-Ping; Cong, Hui; Jing, Feng-Xiang; Jin, Qing-Hui; Wang, Hui-Min; Zhou, Jie

    2009-11-01

    Macrolide drugs, such as clarithromycin (CAM), are a key component of many combination therapies used to eradicate Helicobacter pylori. However, resistance to CAM is increasing in H. pylori and is becoming a serious problem in H. pylori eradication therapy. CAM resistance in H. pylori is mostly due to point mutations (A2142G/C, A2143G) in the peptidyltransferase-encoding region of the 23S rRNA gene. In this study an enzymic colorimetry-based DNA chip was developed to analyse single-nucleotide polymorphisms of the 23S rRNA gene to determine the prevalence of mutations in CAM-related resistance in H. pylori-positive patients. The results of the colorimetric DNA chip were confirmed by direct DNA sequencing. In 63 samples, the incidence of the A2143G mutation was 17.46% (11/63). The results of the colorimetric DNA chip were concordant with DNA sequencing in 96.83% of results (61/63). The colorimetric DNA chip could detect wild-type and mutant signals at every site, even at a DNA concentration of 1.53 × 10^2 copies µl^-1. Thus, the colorimetric DNA chip is a reliable assay for rapid and accurate detection of mutations in the 23S rRNA gene of H. pylori that lead to CAM-related resistance, directly from gastric tissues.

  12. NNLOPS accurate predictions for $W^+W^-$ production arXiv

    CERN Document Server

    Re, Emanuele; Zanderighi, Giulia

    We present novel predictions for the production of $W^+W^-$ pairs in hadron collisions that are next-to-next-to-leading order accurate and consistently matched to a parton shower (NNLOPS). All diagrams that lead to the process $pp\\to e^- \\bar \

  13. ASTRAL, DRAGON and SEDAN scores predict stroke outcome more accurately than physicians.

    Science.gov (United States)

    Ntaios, G; Gioulekas, F; Papavasileiou, V; Strbian, D; Michel, P

    2016-11-01

    ASTRAL, SEDAN and DRAGON scores are three well-validated scores for stroke outcome prediction. Whether these scores predict stroke outcome more accurately compared with physicians interested in stroke was investigated. Physicians interested in stroke were invited to an online anonymous survey to provide outcome estimates in randomly allocated structured scenarios of recent real-life stroke patients. Their estimates were compared to scores' predictions in the same scenarios. An estimate was considered accurate if it was within 95% confidence intervals of actual outcome. In all, 244 participants from 32 different countries responded assessing 720 real scenarios and 2636 outcomes. The majority of physicians' estimates were inaccurate (1422/2636, 53.9%). 400 (56.8%) of physicians' estimates about the percentage probability of 3-month modified Rankin score (mRS) > 2 were accurate compared with 609 (86.5%) of ASTRAL score estimates (P ... DRAGON score estimates (P ... DRAGON score estimates (P ... DRAGON and SEDAN scores predict outcome of acute ischaemic stroke patients with higher accuracy compared to physicians interested in stroke. © 2016 EAN.

  14. Quantitation of ultraviolet-induced single-strand breaks using oligonucleotide chip

    International Nuclear Information System (INIS)

    Pal, Sukdeb; Kim, Min Jung; Choo, Jaebum; Kang, Seong Ho; Lee, Kyeong-Hee; Song, Joon Myong

    2008-01-01

    A simple, accurate and robust methodology was established for the direct quantification of ultraviolet (UV)-induced single-strand breaks (SSBs) using an oligonucleotide chip. Oligonucleotide chips were fabricated by covalently anchoring fluorescent-labeled ssDNAs onto silicon dioxide chip surfaces. Assuming that the probability of more than one UV-induced SSB being generated in a small oligonucleotide is extremely low, SSB formation was investigated by quantifying the endpoint probe density by fluorescence measurement upon UV irradiation. The SSB yields obtained from the highly sensitive laser-induced fluorometric determination of fluorophore-labeled oligonucleotides were found to coincide well with those predicted from a theoretical extrapolation of the results obtained for plasmid DNAs using conventional agarose gel electrophoresis. The developed method has the potential to serve as a high-throughput, sample-thrifty, and time-saving tool for more realistic and direct quantification of radiation- and chemical-induced strand breaks. It will be especially useful for determining the frequency of SSBs or lesions convertible to SSBs by specific cleaving reagents or enzymes.

  15. Experiment list: SRX186753 [Chip-atlas[Archive

    Lifescience Database Archive (English)

    Full Text Available der=Lonza || datatype=ChipSeq || datatype description=Chromatin IP Sequencing || antibody antibodydescription=Rabbit polyclonal antib...on. Antibody Target: H3K79me2 || antibody targetdescription=H3K79me2 is a mark of the transcriptional transi...tion region - the region between the initiation marks (K4me3, etc) and the elongation marks (K36me3). || antibody... vendorname=Active Motif || antibody vendorid=39143 || controlid=wgEncodeEH0...00093 || replicate=1,2 || softwareversion=ScriptureVPaperR3 || cell sex=U || antibody=H3K79me2 || antibody antibody

  16. Time-Predictable Communication on a Time-Division Multiplexing Network-on-Chip Multicore

    DEFF Research Database (Denmark)

    Sørensen, Rasmus Bo

    This thesis presents time-predictable inter-core communication on a multicore platform with a time-division multiplexing (TDM) network-on-chip (NoC) for hard real-time systems. The thesis is structured as a collection of papers that contribute within the areas of: reconfigurable TDM NoCs, static...... TDM scheduling, and time-predictable inter-core communication. More specifically, the work presented in this thesis investigates the interaction between hardware and software involved in time-predictable inter-core communication on the multicore platform. The thesis presents: a new generation...... of the Argo NoC network interface (NI) that supports instantaneous reconfiguration, a TDM traffic scheduler that generates virtual circuit (VC) configurations for the Argo NoC, and software functions for two types of intercore communication. The new generation of the Argo NoC adds the capability...

  17. A fast template matching method for LED chip Localization

    Directory of Open Access Journals (Sweden)

    Zhong Fuqiang

    2015-01-01

    Full Text Available Efficiency determines the profits of semiconductor producers, so producers spare no effort to enhance the efficiency of every procedure. The purpose of the paper is to present a method that shortens the time needed to locate the LED chips on a wafer. The method consists of three steps. First, image segmentation and blob analysis are used to predict the positions of potential chips. Second, the orientations of potential chips are predicted based on their dominant orientations. Finally, according to the positions and orientations predicted above, the chips are located precisely based on gradient orientation features. Experiments show that the algorithm is faster than the traditional method we chose for locating the LED chips. Moreover, even when the orientations of the chips on the wafer deviate greatly from the orientation of the template, the efficiency of the method is not affected.

  18. Simulating the Effect of Modulated Tool-Path Chip Breaking On Surface Texture and Chip Length

    Energy Technology Data Exchange (ETDEWEB)

    Smith, K.S.; McFarland, J.T.; Tursky, D. A.; Assaid, T. S.; Barkman, W. E.; Babelay, Jr., E. F.

    2010-04-30

    One method for creating broken chips in turning processes involves oscillating the cutting tool in the feed direction utilizing the CNC machine axes. The University of North Carolina at Charlotte and the Y-12 National Security Complex have developed and are refining a method to reliably control surface finish and chip length based on a particular machine's dynamic performance. Using computer simulations it is possible to combine the motion of the machine axes with the geometry of the cutting tool to predict the surface characteristics and map the surface texture for a wide range of oscillation parameters. These data allow the selection of oscillation parameters to simultaneously ensure broken chips and acceptable surface characteristics. This paper describes the machine dynamic testing and characterization activities as well as the computational method used for evaluating and predicting chip length and surface texture.

  19. piRNA analysis framework from small RNA-Seq data by a novel cluster prediction tool - PILFER.

    Science.gov (United States)

    Ray, Rishav; Pandey, Priyanka

    2017-12-19

    With the increasing number of studies focusing on PIWI-interacting RNAs (piRNAs), it is now pertinent to develop efficient tools dedicated to piRNA analysis. We have developed a novel cluster prediction tool called PILFER (PIrna cLuster FindER), which can accurately predict piRNA clusters from small RNA sequencing data. PILFER is an open source, easy to use tool, and can be executed even on a personal computer with minimum resources. It uses a sliding-window mechanism by integrating the expression of the reads along with the spatial information to predict the piRNA clusters. We have additionally defined a piRNA analysis pipeline incorporating PILFER to detect and annotate piRNAs and their clusters from raw small RNA sequencing data and implemented it on publicly available data from healthy germline and somatic tissues. We compared PILFER with other existing piRNA cluster prediction tools and found it to be statistically more accurate and superior in several respects, including higher cluster robustness and better memory efficiency. Overall, PILFER provides a fast and accurate solution to piRNA cluster prediction. Copyright © 2017 Elsevier Inc. All rights reserved.
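A sliding-window cluster caller that combines read expression with read position can be sketched generically as below. This is not PILFER's algorithm or interface: the window size, step, expression threshold, merge rule and the function name `call_clusters` are assumptions introduced purely to illustrate the general mechanism named in the abstract.

```python
def call_clusters(reads, window=100, step=50, min_expr=10.0):
    """reads: list of (position, expression). Returns merged (start, end) windows."""
    if not reads:
        return []
    last = max(pos for pos, _ in reads)
    clusters = []
    start = 0
    while start <= last:
        end = start + window
        # Spatial information: only reads falling inside the current window count.
        total = sum(expr for pos, expr in reads if start <= pos < end)
        if total >= min_expr:
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous call, so extend it instead of adding a new one.
                clusters[-1] = (clusters[-1][0], end)
            else:
                clusters.append((start, end))
        start += step
    return clusters


# Hypothetical reads: a dense, well-expressed region and two weak distant reads.
reads = [(10, 4.0), (40, 5.0), (90, 3.0), (130, 8.0), (600, 1.0), (620, 2.0)]
clusters = call_clusters(reads)
```

Here the first two overlapping windows both pass the threshold and are merged into one call, while the weakly expressed reads near position 600 are not reported.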

  20. Optimal use of tandem biotin and V5 tags in ChIP assays

    Directory of Open Access Journals (Sweden)

    Krpic Sanja

    2009-02-01

    Full Text Available Abstract Background Chromatin immunoprecipitation (ChIP) assays coupled to genome arrays (Chip-on-chip) or massive parallel sequencing (ChIP-seq) lead to the genome wide identification of binding sites of chromatin associated proteins. However, the highly variable quality of antibodies and the availability of epitopes in crosslinked chromatin can compromise genomic ChIP outcomes. Epitope tags have often been used as more reliable alternatives. In addition, we have employed protein in vivo biotinylation tagging as a very high affinity alternative to antibodies. In this paper we describe the optimization of biotinylation tagging for ChIP and its coupling to a known epitope tag in providing a reliable and efficient alternative to antibodies. Results Using the biotin tagged erythroid transcription factor GATA-1 as example, we describe several optimization steps for the application of the high affinity biotin streptavidin system in ChIP. We find that the omission of SDS during sonication, the use of fish skin gelatin as blocking agent and choice of streptavidin beads can lead to significantly improved ChIP enrichments and lower background compared to antibodies. We also show that the V5 epitope tag performs equally well under the conditions worked out for streptavidin ChIP and that it may suffer less from the effects of formaldehyde crosslinking. Conclusion The combined use of the very high affinity biotin tag with the less sensitive to crosslinking V5 tag provides for a flexible ChIP platform with potential implications in ChIP sequencing outcomes.

  1. Optimal use of tandem biotin and V5 tags in ChIP assays

    Science.gov (United States)

    Kolodziej, Katarzyna E; Pourfarzad, Farzin; de Boer, Ernie; Krpic, Sanja; Grosveld, Frank; Strouboulis, John

    2009-01-01

    Background Chromatin immunoprecipitation (ChIP) assays coupled to genome arrays (Chip-on-chip) or massive parallel sequencing (ChIP-seq) lead to the genome wide identification of binding sites of chromatin associated proteins. However, the highly variable quality of antibodies and the availability of epitopes in crosslinked chromatin can compromise genomic ChIP outcomes. Epitope tags have often been used as more reliable alternatives. In addition, we have employed protein in vivo biotinylation tagging as a very high affinity alternative to antibodies. In this paper we describe the optimization of biotinylation tagging for ChIP and its coupling to a known epitope tag in providing a reliable and efficient alternative to antibodies. Results Using the biotin tagged erythroid transcription factor GATA-1 as example, we describe several optimization steps for the application of the high affinity biotin streptavidin system in ChIP. We find that the omission of SDS during sonication, the use of fish skin gelatin as blocking agent and choice of streptavidin beads can lead to significantly improved ChIP enrichments and lower background compared to antibodies. We also show that the V5 epitope tag performs equally well under the conditions worked out for streptavidin ChIP and that it may suffer less from the effects of formaldehyde crosslinking. Conclusion The combined use of the very high affinity biotin tag with the less sensitive to crosslinking V5 tag provides for a flexible ChIP platform with potential implications in ChIP sequencing outcomes. PMID:19196479

  2. A dual transcript-discovery approach to improve the delimitation of gene features from RNA-seq data in the chicken model

    Directory of Open Access Journals (Sweden)

    Mickael Orgeur

    2018-01-01

    Full Text Available The sequence of the chicken genome, like several other draft genome sequences, is presently not fully covered. Gaps, contigs assigned with low confidence and uncharacterized chromosomes result in gene fragmentation and imprecise gene annotation. Transcript abundance estimation from RNA sequencing (RNA-seq) data relies on read quality, library complexity and expression normalization. In addition, the quality of the genome sequence used to map sequencing reads, and the gene annotation that defines gene features, must also be taken into account. A partially covered genome sequence causes the loss of sequencing reads from the mapping step, while an inaccurate definition of gene features induces imprecise read counts from the assignment step. Both steps can significantly bias interpretation of RNA-seq data. Here, we describe a dual transcript-discovery approach combining a genome-guided gene prediction and a de novo transcriptome assembly. This dual approach enabled us to increase the assignment rate of RNA-seq data by nearly 20% as compared to when using only the chicken reference annotation, contributing therefore to a more accurate estimation of transcript abundance. More generally, this strategy could be applied to any organism with partial genome sequence and/or lacking a manually-curated reference annotation in order to improve the accuracy of gene expression studies.

  3. Computational Methods for ChIP-seq Data Analysis and Applications

    KAUST Repository

    Ashoor, Haitham

    2017-01-01

    four main challenges. First, I address the problem of detecting histone modifications from ChIP-seq cancer samples. The presence of copy number variations (CNVs) in cancer samples results in statistical biases that lead to inaccurate predictions when

  4. MetaRNA-Seq: An Interactive Tool to Browse and Annotate Metadata from RNA-Seq Studies

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar

    2015-01-01

    Full Text Available The number of RNA-Seq studies has grown in recent years. The design of RNA-Seq studies varies from very simple (e.g., two-condition case-control) to very complicated (e.g., time series involving multiple samples at each time point with separate drug treatments). Most of these publicly available RNA-Seq studies are deposited in NCBI databases, but their metadata are scattered throughout four different databases: Sequence Read Archive (SRA), Biosample, Bioprojects, and Gene Expression Omnibus (GEO). Although the NCBI web interface is able to provide all of the metadata information, it often requires significant effort to retrieve study- or project-level information by traversing through multiple hyperlinks and going to another page. Moreover, project- and study-level metadata lack manual or automatic curation by categories, such as disease type, time series, case-control, or replicate type, which are vital to comprehending any RNA-Seq study. Here we describe “MetaRNA-Seq,” a new tool for interactively browsing, searching, and annotating RNA-Seq metadata with the capability of semiautomatic curation at the study level.

  5. Network-Based Isoform Quantification with RNA-Seq Data for Cancer Transcriptome Analysis.

    Directory of Open Access Journals (Sweden)

    Wei Zhang

    2015-12-01

    Full Text Available High-throughput mRNA sequencing (RNA-Seq) is widely used for transcript quantification of gene isoforms. Since RNA-Seq data alone is often not sufficient to accurately identify the read origins from the isoforms for quantification, we propose to explore protein domain-domain interactions as prior knowledge for integrative analysis with RNA-Seq data. We introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction network with short read alignments for transcript abundance estimation. Based on our observation that the abundances of the neighboring isoforms by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with alternating optimization of multiple EM problems. In simulation Net-RSTQ effectively improved isoform transcript quantifications when isoform co-expressions correlate with their interactions. qRT-PCR results on 25 multi-isoform genes in a stem cell line, an ovarian cancer cell line, and a breast cancer cell line also showed that Net-RSTQ estimated more consistent isoform proportions with RNA-Seq data. In the experiments on the RNA-Seq data in The Cancer Genome Atlas (TCGA), the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification of ovarian cancer, breast cancer and lung cancer. All experimental results collectively support that Net-RSTQ is a promising approach for isoform quantification. Net-RSTQ toolbox is available at http://compbio.cs.umn.edu/Net-RSTQ/.

  6. On-chip particle trapping and manipulation

    Science.gov (United States)

    Leake, Kaelyn Danielle

    model and predict a sorting method which combines fluid flow with a single optical source to automatically sort dielectric particles by size in waveguide networks. These simulations were shown to be accurate when repeated on-chip. Lastly, I introduce a particle trapping technique that uses Multimode Interference (MMI) patterns to trap multiple particles at once. The location of the traps can be adjusted, as can the number of trapping locations, by changing the input wavelength. By changing the wavelength back and forth between two values, this MMI can be used to pass a particle down the channel like a conveyor belt.

  7. rSeqNP: a non-parametric approach for detecting differential expression and splicing from RNA-Seq data.

    Science.gov (United States)

    Shi, Yang; Chinnaiyan, Arul M; Jiang, Hui

    2015-07-01

    High-throughput sequencing of transcriptomes (RNA-Seq) has become a powerful tool to study gene expression. Here we present an R package, rSeqNP, which implements a non-parametric approach to test for differential expression and splicing from RNA-Seq data. rSeqNP uses permutation tests to assess statistical significance and can be applied to a variety of experimental designs. By combining information across isoforms, rSeqNP is able to detect more differentially expressed or spliced genes from RNA-Seq data. The R package with its source code and documentation is freely available at http://www-personal.umich.edu/∼jianghui/rseqnp/. Contact: jianghui@umich.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
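
    As a generic illustration of how a permutation test assesses significance, the sketch below enumerates every relabelling of two small groups and reports the fraction with a mean difference at least as large as the observed one. It is not rSeqNP's test statistic or code (rSeqNP is an R package); the function name and the example expression values are assumptions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means.

    Illustrative only: feasible for small groups, because every way of
    splitting the pooled values into two groups is enumerated.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical expression values for one gene in two conditions.
print(permutation_p_value([1, 2, 3], [10, 11, 12]))
```

    With three samples per group there are only 20 possible relabellings, so the smallest attainable two-sided p-value is 2/20; this is why permutation approaches need enough replicates to reach small p-values.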

  8. Towards cycle-accurate performance predictions for real-time embedded systems

    NARCIS (Netherlands)

    Triantafyllidis, K.; Bondarev, E.; With, de P.H.N.; Arabnia, H.R.; Deligiannidis, L.; Jandieri, G.

    2013-01-01

    In this paper we present a model-based performance analysis method for component-based real-time systems, featuring cycle-accurate predictions of latencies and enhanced system robustness. The method incorporates the following phases: (a) instruction-level profiling of SW components, (b) modeling the

  9. Accurate Prediction of Motor Failures by Application of Multi CBM Tools: A Case Study

    Science.gov (United States)

    Dutta, Rana; Singh, Veerendra Pratap; Dwivedi, Jai Prakash

    2018-02-01

    Motor failures are very difficult to predict accurately with a single condition-monitoring tool because the electrical and mechanical systems are closely related. Electrical problems, such as phase unbalance and stator winding insulation failures, can at times lead to vibration problems, while mechanical failures, such as bearing failure, can lead to rotor eccentricity. In this case study of a 550 kW blower motor, a rotor bar crack was detected by current signature analysis and confirmed by vibration monitoring. In later months, in a similar motor, vibration monitoring predicted a bearing failure and current signature analysis confirmed it. In both cases, the predictions were found to be accurate after the motor was dismantled. This paper discusses the accurate prediction of motor failures through the use of multiple condition-monitoring tools, with two case studies.

  10. Selective amplification and sequencing of cyclic phosphate-containing RNAs by the cP-RNA-seq method.

    Science.gov (United States)

    Honda, Shozo; Morichika, Keisuke; Kirino, Yohei

    2016-03-01

    RNA digestions catalyzed by many ribonucleases generate RNA fragments that contain a 2',3'-cyclic phosphate (cP) at their 3' termini. However, standard RNA-seq methods are unable to accurately capture cP-containing RNAs because the cP inhibits the adapter ligation reaction. We recently developed a method named cP-RNA-seq that is able to selectively amplify and sequence cP-containing RNAs. Here we describe the cP-RNA-seq protocol in which the 3' termini of all RNAs, except those containing a cP, are cleaved through a periodate treatment after phosphatase treatment; hence, subsequent adapter ligation and cDNA amplification steps are exclusively applied to cP-containing RNAs. cP-RNA-seq takes ∼6 d, excluding the time required for sequencing and bioinformatics analyses, which are not covered in detail in this protocol. Biochemical validation of the existence of cP in the identified RNAs takes ∼3 d. Even though the cP-RNA-seq method was developed to identify angiogenin-generating 5'-tRNA halves as a proof of principle, the method should be applicable to global identification of cP-containing RNA repertoires in various transcriptomes.

  11. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification

    Directory of Open Access Journals (Sweden)

    Tamar Hashimshony

    2012-09-01

    Full Text Available High-throughput sequencing has allowed for unprecedented detail in gene expression analyses, yet its efficient application to single cells is challenged by the small starting amounts of RNA. We have developed CEL-Seq, a method for overcoming this limitation by barcoding and pooling samples before linearly amplifying mRNA with the use of one round of in vitro transcription. We show that CEL-Seq gives more reproducible, linear, and sensitive results than a PCR-based amplification method. We demonstrate the power of this method by studying early C. elegans embryonic development at single-cell resolution. Differential distribution of transcripts between sister cells is seen as early as the two-cell stage embryo, and zygotic expression in the somatic cell lineages is enriched for transcription factors. The robust transcriptome quantifications enabled by CEL-Seq will be useful for transcriptomic analyses of complex tissues containing populations of diverse cell types.

  12. Statistical modeling of isoform splicing dynamics from RNA-seq time series data.

    Science.gov (United States)

    Huang, Yuanhua; Sanguinetti, Guido

    2016-10-01

    Isoform quantification is an important goal of RNA-seq experiments, yet it remains problematic for genes with low expression or several isoforms. These difficulties may in principle be ameliorated by exploiting correlated experimental designs, such as time series or dosage response experiments. Time series RNA-seq experiments, in particular, are becoming increasingly popular, yet there are no methods that explicitly leverage the experimental design to improve isoform quantification. Here, we present DICEseq, the first isoform quantification method tailored to correlated RNA-seq experiments. DICEseq explicitly models the correlations between different RNA-seq experiments to aid the quantification of isoforms across experiments. Numerical experiments on simulated datasets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. On real datasets, our results show that DICEseq provides substantially more reproducible and robust quantifications, increasing the correlation of estimates from replicate datasets by up to 10% on genes with low or moderate expression levels (bottom third of all genes). Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Our results have strong implications for the design of RNA-seq experiments, and offer a novel tool for improved analysis of such datasets. Python code is freely available at http://diceseq.sf.net. Contact: G.Sanguinetti@ed.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Gene Ranking of RNA-Seq Data via Discriminant Non-Negative Matrix Factorization.

    Science.gov (United States)

    Jia, Zhilong; Zhang, Xiang; Guan, Naiyang; Bo, Xiaochen; Barnes, Michael R; Luo, Zhigang

    2015-01-01

    RNA-sequencing is rapidly becoming the method of choice for studying the full complexity of transcriptomes; however, with increasing dimensionality, accurate gene ranking is becoming increasingly challenging. This paper proposes an accurate and sensitive gene ranking method that implements discriminant non-negative matrix factorization (DNMF) for RNA-seq data. To the best of our knowledge, this is the first work to explore the utility of DNMF for gene ranking. When incorporating Fisher's discriminant criteria and setting the reduced dimension as two, DNMF learns two factors to approximate the original gene expression data, abstracting the up-regulated or down-regulated metagene by using the sample label information. The first factor denotes all the genes' weights of two metagenes as the additive combination of all genes, while the second learned factor represents the expression values of two metagenes. In the gene ranking stage, all the genes are ranked in descending order according to the differential values of the metagene weights. Leveraging the nature of NMF and Fisher's criterion, DNMF can robustly boost the gene ranking performance. The Area Under the Curve analysis of differential expression analysis on two benchmarking tests of four RNA-seq data sets with similar phenotypes showed that our proposed DNMF-based gene ranking method outperforms other widely used methods. Moreover, the Gene Set Enrichment Analysis also showed that DNMF outperforms the others. DNMF is also computationally efficient, substantially outperforming all other benchmarked methods. Consequently, we suggest DNMF is an effective method for the analysis of differential gene expression and gene ranking for RNA-seq data.

  14. "Hook"-calibration of GeneChip-microarrays: Chip characteristics and expression measures

    Directory of Open Access Journals (Sweden)

    Krohn Knut

    2008-08-01

    Full Text Available Abstract Background Microarray experiments rely on several critical steps that may introduce biases and uncertainty in downstream analyses. These steps include mRNA sample extraction, amplification and labelling, hybridization, and scanning causing chip-specific systematic variations on the raw intensity level. Also the chosen array-type and the up-to-dateness of the genomic information probed on the chip affect the quality of the expression measures. In the accompanying publication we presented theory and algorithm of the so-called hook method which aims at correcting expression data for systematic biases using a series of new chip characteristics. Results In this publication we summarize the essential chip characteristics provided by this method, analyze special benchmark experiments to estimate transcript related expression measures and illustrate the potency of the method to detect and to quantify the quality of a particular hybridization. It is shown that our single-chip approach provides expression measures responding linearly on changes of the transcript concentration over three orders of magnitude. In addition, the method calculates a detection call judging the relation between the signal and the detection limit of the particular measurement. The performance of the method in the context of different chip generations and probe set assignments is illustrated. The hook method characterizes the RNA-quality in terms of the 3'/5'-amplification bias and the sample-specific calling rate. We show that the proper judgement of these effects requires the disentanglement of non-specific and specific hybridization which, otherwise, can lead to misinterpretations of expression changes. The consequences of modifying probe/target interactions by either changing the labelling protocol or by substituting RNA by DNA targets are demonstrated. Conclusion The single-chip based hook-method provides accurate expression estimates and chip-summary characteristics

  15. Strand-Specific RNA-Seq Analyses of Fruiting Body Development in Coprinopsis cinerea.

    Directory of Open Access Journals (Sweden)

    Hajime Muraguchi

    Full Text Available The basidiomycete fungus Coprinopsis cinerea is an important model system for multicellular development. Fruiting bodies of C. cinerea are typical mushrooms, which can be produced synchronously on defined media in the laboratory. To investigate the transcriptome in detail during fruiting body development, high-throughput sequencing (RNA-seq) was performed using cDNA libraries strand-specifically constructed from 13 points (stages/tissues) with two biological replicates. The reads were aligned to 14,245 predicted transcripts, and counted for forward and reverse transcripts. Differentially expressed genes (DEGs) between two adjacent points and between vegetative mycelium and each point were detected by Tag Count Comparison (TCC). To validate RNA-seq data, expression levels of selected genes were compared using RPKM values in RNA-seq data and qRT-PCR data, and DEGs detected in microarray data were examined in MA plots of RNA-seq data by TCC. We discuss events deduced from GO analysis of DEGs. In addition, we uncovered both transcription factor candidates and antisense transcripts that are likely to be involved in developmental regulation for fruiting.
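
    The RPKM values mentioned in this abstract follow the standard definition: reads per kilobase of transcript per million mapped reads. A minimal sketch of that formula is given below; the function name and the example numbers are hypothetical and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = read_count * 1e9 / (transcript_length_bp * total_mapped_reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))
```

    For strand-specific libraries such as those described here, the same formula would be applied separately to forward and reverse read counts.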

  16. Alternative mapping of probes to genes for Affymetrix chips

    DEFF Research Database (Denmark)

    Gautier, Laurent; Møller, M.; Friis-Hansen, L.

    2004-01-01

    transcripts: the NCBI RefSeq database. We also built mappings and used them in place of the original probe to genes associations provided by the manufacturer of the arrays. Results: In a large number of cases, 36%, the probes matching a reference sequence were consistent with the grouping of probes...... by the manufacturer of the chips. For the remaining cases there were discrepancies and we show how that can affect the analysis of data. Conclusions: While the probes on Affymetrix arrays remain the same for several years, the biological knowledge concerning the genomic sequences evolves rapidly. Using up...

  17. Determining wood chip size: image analysis and clustering methods

    Directory of Open Access Journals (Sweden)

    Paolo Febbi

    2013-09-01

    Full Text Available One of the standard methods for the determination of the size distribution of wood chips is the oscillating screen method (EN 15149-1:2010). Recent literature demonstrated how image analysis could return highly accurate measure of the dimensions defined for each individual particle, and could promote a new method depending on the geometrical shape to determine the chip size in a more accurate way. A sample of wood chips (8 litres) was sieved through horizontally oscillating sieves, using five different screen hole diameters (3.15, 8, 16, 45, 63 mm); the wood chips were sorted in decreasing size classes and the mass of all fractions was used to determine the size distribution of the particles. Since the chip shape and size influence the sieving results, Wang’s theory, which concerns the geometric forms, was considered. A cluster analysis on the shape descriptors (Fourier descriptors) and size descriptors (area, perimeter, Feret diameters, eccentricity) was applied to observe the chips distribution. The UPGMA algorithm was applied on Euclidean distance. The obtained dendrogram shows a group separation according with the original three sieving fractions. A comparison has been made between the traditional sieve and clustering results. This preliminary result shows how the image analysis-based method has a high potential for the characterization of wood chip size distribution and could be further investigated. Moreover, this method could be implemented in an online detection machine for chips size characterization. An improvement of the results is expected by using supervised multivariate methods that utilize known class memberships. The main objective of the future activities will be to shift the analysis from a 2-dimensional method to a 3-dimensional acquisition process.
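
    The sieving step described above turns the mass retained in each size class into a size distribution. A small sketch of that arithmetic follows; the class labels and masses are invented for illustration and do not come from the study or from EN 15149-1.

```python
def size_distribution(fractions):
    """Convert (label, mass) pairs into mass percentages.

    fractions: list of (size_class_label, mass_in_grams), ordered by size.
    Returns a list of (label, percent_of_total, cumulative_percent).
    """
    total = sum(mass for _, mass in fractions)
    if total <= 0:
        raise ValueError("total mass must be positive")
    result = []
    cumulative = 0.0
    for label, mass in fractions:
        percent = 100.0 * mass / total
        cumulative += percent
        result.append((label, percent, cumulative))
    return result


# Hypothetical masses for three sieving fractions (grams).
sample = [("3.15-8 mm", 100), ("8-16 mm", 300), ("16-45 mm", 600)]
for label, percent, cumulative in size_distribution(sample):
    print(f"{label}: {percent:.1f}% (cumulative {cumulative:.1f}%)")
```

    An image-analysis method would replace the measured masses with per-particle descriptors, but a comparison against sieving still needs a per-class summary of this kind.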

  18. Mastering multi-depth bio-chip patterns with DVD LBRs

    Science.gov (United States)

    Carson, Doug

    2017-08-01

    Bio chip and bio disc are rapidly growing technologies used in medical, health and other industries. While there are numerous unique designs and features, these products all rely on precise three-dimensional micro-fluidic channels or arrays to move, separate and combine samples under test. These bio chip and bio disc consumables are typically manufactured by molding these parts to a precise three-dimensional pattern on a negative metal stamper, or they can be made in smaller quantities using an appropriate curable resin and a negative mold/stamper. Stampers required for bio chips have been traditionally made using either micro machining or XY stepping lithography. Both of these technologies have their advantages as well as limitations when it comes to creating micro-fluidic patterns. Significant breakthroughs in continuous maskless lithography have enabled accurate and efficient manufacturing of micro-fluidic masters using LBRs (Laser Beam Recorders) and DRIE (Deep Reactive Ion Etching). The important advantages of LBR continuous lithography vs. XY stepping lithography and micro machining are speed and cost. LBR based continuous lithography is >100x faster than XY stepping lithography and more accurate than micro machining. Several innovations were required in order to create multi-depth patterns with sub micron accuracy. By combining proven industrial LBRs with DCA's G3-VIA pattern generator and DRIE, three-dimensional bio chip masters and stampers are being manufactured efficiently and accurately.

  19. A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

    Science.gov (United States)

    Zhang, L; Liu, X J

    2016-06-03

    With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each single-RNA-seq sample, and ignore that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parameter model to capture the general tendency of non-uniformity read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.

  20. An accurate and efficient method for large-scale SSR genotyping and applications.

    Science.gov (United States)

    Li, Lun; Fang, Zhiwei; Zhou, Junfei; Chen, Hong; Hu, Zhangfeng; Gao, Lifen; Chen, Lihong; Ren, Sheng; Ma, Hongyu; Lu, Long; Zhang, Weixiong; Peng, Hai

    2017-06-02

    Accurate and efficient genotyping of simple sequence repeats (SSRs) constitutes the basis of SSRs as an effective genetic marker with various applications. However, the existing methods for SSR genotyping suffer from low sensitivity, low accuracy, low efficiency and high cost. In order to fully exploit the potential of SSRs as genetic marker, we developed a novel method for SSR genotyping, named as AmpSeq-SSR, which combines multiplexing polymerase chain reaction (PCR), targeted deep sequencing and comprehensive analysis. AmpSeq-SSR is able to genotype potentially more than a million SSRs at once using the current sequencing techniques. In the current study, we simultaneously genotyped 3105 SSRs in eight rice varieties, which were further validated experimentally. The results showed that the accuracies of AmpSeq-SSR were nearly 100 and 94% with a single base resolution for homozygous and heterozygous samples, respectively. To demonstrate the power of AmpSeq-SSR, we adopted it in two applications. The first was to construct discriminative fingerprints of the rice varieties using 3105 SSRs, which offer much greater discriminative power than the 48 SSRs commonly used for rice. The second was to map Xa21, a gene that confers persistent resistance to rice bacterial blight. We demonstrated that genome-scale fingerprints of an organism can be efficiently constructed and candidate genes, such as Xa21 in rice, can be accurately and efficiently mapped using an innovative strategy consisting of multiplexing PCR, targeted sequencing and computational analysis. While the work we present focused on rice, AmpSeq-SSR can be readily extended to animals and micro-organisms. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  1. Accurate clinical genetic testing for autoinflammatory diseases using the next-generation sequencing platform MiSeq.

    Science.gov (United States)

    Nakayama, Manabu; Oda, Hirotsugu; Nakagawa, Kenji; Yasumi, Takahiro; Kawai, Tomoki; Izawa, Kazushi; Nishikomori, Ryuta; Heike, Toshio; Ohara, Osamu

    2017-03-01

    Autoinflammatory diseases occupy one of a group of primary immunodeficiency diseases that are generally thought to be caused by mutation of genes responsible for innate immunity, rather than by acquired immunity. Mutations related to autoinflammatory diseases occur in 12 genes. For example, low-level somatic mosaic NLRP3 mutations underlie chronic infantile neurologic, cutaneous, articular syndrome (CINCA), also known as neonatal-onset multisystem inflammatory disease (NOMID). In current clinical practice, clinical genetic testing plays an important role in providing patients with quick, definite diagnoses. To increase the availability of such testing, low-cost high-throughput gene-analysis systems are required, ones that not only have the sensitivity to detect even low-level somatic mosaic mutations, but also can operate simply in a clinical setting. To this end, we developed a simple method that employs two-step tailed PCR and an NGS system, MiSeq platform, to detect mutations in all coding exons of the 12 genes responsible for autoinflammatory diseases. Using this amplicon sequencing system, we amplified a total of 234 amplicons derived from the 12 genes with multiplex PCR. This was done simultaneously and in one test tube. Each sample was distinguished by an index sequence of second PCR primers following PCR amplification. With our procedure and tips for reducing PCR amplification bias, we were able to analyze 12 genes from 25 clinical samples in one MiSeq run. Moreover, with the certified primers designed by our short program, which detects and avoids common SNPs in gene-specific PCR primers, we used this system for routine genetic testing. Our optimized procedure uses a simple protocol, which can easily be followed by virtually any office medical staff. Because of the small PCR amplification bias, we can analyze simultaneously several clinical DNA samples with low cost and can obtain sufficient read numbers to detect a low level of somatic mosaic mutations.

  2. The bench scientist's guide to statistical analysis of RNA-Seq data

    OpenAIRE

    Yendrek, Craig R.; Ainsworth, Elizabeth A.; Thimmapuram, Jyothi

    2012-01-01

    Abstract Background RNA sequencing (RNA-Seq) is emerging as a highly accurate method to quantify transcript abundance. However, analyses of the large data sets obtained by sequencing the entire transcriptome of organisms have generally been performed by bioinformatics specialists. Here we provide a step-by-step guide and outline a strategy using currently available statistical tools that results in a conservative list of differentially expressed genes. We also discuss potential sources of err...

  3. AN ACCURATE MODELING OF DELAY AND SLEW METRICS FOR ON-CHIP VLSI RC INTERCONNECTS FOR RAMP INPUTS USING BURR’S DISTRIBUTION FUNCTION

    Directory of Open Access Journals (Sweden)

    Rajib Kar

    2010-09-01

    Full Text Available This work presents an accurate and efficient model to compute the delay and slew metric of on-chip interconnect of high speed CMOS circuits for ramp input. Our metric assumption is based on the Burr’s Distribution function. The Burr’s distribution is used to characterize the normalized homogeneous portion of the step response. We used the PERI (Probability distribution function Extension for Ramp Inputs) technique that extends delay metrics and slew metric for step inputs to the more general and realistic non-step inputs. The accuracy of our models is justified with the results compared with that of SPICE simulations.

  4. WaveSeq: a novel data-driven method of detecting histone modification enrichments using wavelets.

    Directory of Open Access Journals (Sweden)

    Apratim Mitra

    Full Text Available BACKGROUND: Chromatin immunoprecipitation followed by next-generation sequencing is a genome-wide analysis technique that can be used to detect various epigenetic phenomena such as transcription factor binding sites and histone modifications. Histone modification profiles can be either punctate or diffuse, which makes it difficult to distinguish regions of enrichment from background noise. With the discovery of histone marks having a wide variety of enrichment patterns, there is an urgent need for analysis methods that are robust to various data characteristics and capable of detecting a broad range of enrichment patterns. RESULTS: To address these challenges we propose WaveSeq, a novel data-driven method of detecting regions of significant enrichment in ChIP-Seq data. Our approach utilizes the wavelet transform, is free of distributional assumptions and is robust to diverse data characteristics such as low signal-to-noise ratios and broad enrichment patterns. Using publicly available datasets we showed that WaveSeq compares favorably with other published methods, exhibiting high sensitivity and precision for both punctate and diffuse enrichment regions even in the absence of a control data set. The application of our algorithm to a complex histone modification data set helped make novel functional discoveries, which further underlined its utility in such an experimental setup. CONCLUSIONS: WaveSeq is a highly sensitive method capable of accurate identification of enriched regions in a broad range of data sets. WaveSeq can detect both narrow and broad peaks with a high degree of accuracy even in low signal-to-noise ratio data sets. WaveSeq is also suited for application in complex experimental scenarios, helping make biologically relevant functional discoveries.

  5. A deep learning-based multi-model ensemble method for cancer prediction.

    Science.gov (United States)

    Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong

    2018-01-01

    Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.

  6. Identification of innate lymphoid cells in single-cell RNA-Seq data.

    Science.gov (United States)

    Suffiotti, Madeleine; Carmona, Santiago J; Jandus, Camilla; Gfeller, David

    2017-07-01

    Innate lymphoid cells (ILCs) consist of natural killer (NK) cells and non-cytotoxic ILCs that are broadly classified into ILC1, ILC2, and ILC3 subtypes. These cells recently emerged as important early effectors of innate immunity for their roles in tissue homeostasis and inflammation. Over the last few years, ILCs have been extensively studied in mouse and human at the functional and molecular level, including gene expression profiling. However, sorting ILCs with flow cytometry for gene expression analysis is a delicate and time-consuming process. Here we propose and validate a novel framework for studying ILCs at the transcriptomic level using single-cell RNA-Seq data. Our approach combines unsupervised clustering and a new cell type classifier trained on mouse ILC gene expression data. We show that this approach can accurately identify different ILCs, especially ILC2 cells, in human lymphocyte single-cell RNA-Seq data. Our new model relies only on genes conserved across vertebrates, thereby making it in principle applicable in any vertebrate species. Considering the rapid increase in throughput of single-cell RNA-Seq technology, our work provides a computational framework for studying ILC2 cells in single-cell transcriptomic data and may help exploring their conservation in distant vertebrate species.

  7. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    Science.gov (United States)

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
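
    The duplicate rule described in this abstract (reads count as PCR duplicates only when they map to the same coordinates and also carry the same UMI) can be illustrated with a minimal Python sketch. The `Read` record, its field names, and the `remove_pcr_duplicates` function are hypothetical and are not taken from the authors' software; the sketch also assumes exact UMI matching and does not model sequencing errors within the UMI.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical aligned-read record; field names are illustrative only."""
    name: str
    chrom: str
    pos: int
    strand: str
    umi: str


def remove_pcr_duplicates(reads: Iterable[Read]) -> List[Read]:
    """Keep the first read seen for each (coordinates, UMI) combination.

    Reads with identical mapping coordinates but different UMIs are kept,
    because they are treated as independent molecules.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand, read.umi)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


example_reads = [
    Read("r1", "chr1", 100, "+", "ACGT"),
    Read("r2", "chr1", 100, "+", "ACGT"),  # same position, same UMI: duplicate
    Read("r3", "chr1", 100, "+", "TTGA"),  # same position, different UMI: kept
    Read("r4", "chr2", 500, "-", "ACGT"),  # different position: kept
]
unique_reads = remove_pcr_duplicates(example_reads)
```

    A purely coordinate-based filter would have discarded `r3` as well, which is the misidentification the abstract warns about.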

  8. Accurate approximation method for prediction of class I MHC affinities for peptides of length 8, 10 and 11 using prediction tools trained on 9mers

    DEFF Research Database (Denmark)

    Lundegaard, Claus; Lund, Ole; Nielsen, Morten

    2008-01-01

    Several accurate prediction systems have been developed for prediction of class I major histocompatibility complex (MHC):peptide binding. Most of these are trained on binding affinity data of primarily 9mer peptides. Here, we show how prediction methods trained on 9mer data can be used for accurate...

  9. iMir: an integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq.

    Science.gov (United States)

    Giurato, Giorgio; De Filippo, Maria Rosaria; Rinaldi, Antonio; Hashim, Adnan; Nassa, Giovanni; Ravo, Maria; Rizzo, Francesca; Tarallo, Roberta; Weisz, Alessandro

    2013-12-13

    Qualitative and quantitative analysis of small non-coding RNAs by next generation sequencing (smallRNA-Seq) represents a novel technology increasingly used to investigate, with high sensitivity and specificity, RNA populations comprising microRNAs and other regulatory small transcripts. Analysis of smallRNA-Seq data to gather biologically relevant information, i.e. detection and differential expression analysis of known and novel non-coding RNAs, target prediction, etc., requires implementation of multiple statistical and bioinformatics tools from different sources, each focusing on a specific step of the analysis pipeline. As a consequence, the analytical workflow is slowed down by the need for continuous interventions by the operator, a critical factor when large numbers of datasets need to be analyzed at once. We designed a novel modular pipeline (iMir) for comprehensive analysis of smallRNA-Seq data, comprising specific tools for adapter trimming, quality filtering, differential expression analysis, biological target prediction and other useful options by integrating multiple open source modules and resources in an automated workflow. As statistics is crucial in deep-sequencing data analysis, we devised and integrated in iMir tools based on different statistical approaches to allow the operator to analyze data rigorously. The pipeline created here proved to be more efficient and time-saving than currently available methods and, in addition, flexible enough to allow the user to select the preferred combination of analytical steps. We present here the results obtained by applying this pipeline to analyze simultaneously 6 smallRNA-Seq datasets from either exponentially growing or growth-arrested human breast cancer MCF-7 cells, that led to the rapid and accurate identification, quantitation and differential expression analysis of ~450 miRNAs, including several novel miRNAs and isomiRs, as well as identification of the putative mRNA targets of differentially expressed mi

  10. Heart rate during basketball game play and volleyball drills accurately predicts oxygen uptake and energy expenditure.

    Science.gov (United States)

    Scribbans, T D; Berg, K; Narazaki, K; Janssen, I; Gurd, B J

    2015-09-01

    There is currently little information regarding the ability of metabolic prediction equations to accurately predict oxygen uptake and exercise intensity from heart rate (HR) during intermittent sport. The purpose of the present study was to develop and cross-validate equations appropriate for accurately predicting oxygen cost (VO2) and energy expenditure from HR during intermittent sport participation. Eleven healthy adult males (19.9±1.1 yrs) were recruited to establish the relationship between %VO2peak and %HRmax during low-intensity steady state endurance (END), moderate-intensity interval (MOD) and high-intensity interval exercise (HI), as performed on a cycle ergometer. Three equations (END, MOD, and HI) for predicting %VO2peak based on %HRmax were developed. HR and VO2 were directly measured during basketball games (6 male, 20.8±1.0 yrs; 6 female, 20.0±1.3 yrs) and volleyball drills (12 female; 20.8±1.0 yrs). Comparisons were made between measured and predicted VO2 and energy expenditure using the 3 equations developed and 2 previously published equations. The END and MOD equations accurately predicted VO2 and energy expenditure, while the HI equation underestimated, and the previously published equations systematically overestimated, VO2 and energy expenditure. Intermittent sport VO2 and energy expenditure can be accurately predicted from heart rate data using either the END (%VO2peak = %HRmax × 1.008 − 17.17) or MOD (%VO2peak = %HRmax × 1.2 − 32) equations. These 2 simple equations provide an accessible and cost-effective method for accurate estimation of exercise intensity and energy expenditure during intermittent sport.
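
    The two linear equations quoted in this abstract are simple enough to evaluate directly. The following Python sketch is only an illustration: the function name `predict_pct_vo2peak` and the dictionary layout are assumptions, and the coefficients are copied from the abstract rather than from the full paper.

```python
# Slope and offset for %VO2peak = %HRmax * slope - offset,
# as quoted in the abstract for the END and MOD equations.
EQUATIONS = {
    "END": (1.008, 17.17),
    "MOD": (1.2, 32.0),
}


def predict_pct_vo2peak(pct_hrmax: float, equation: str = "END") -> float:
    """Return predicted %VO2peak for a heart rate given as a percentage of HRmax."""
    slope, offset = EQUATIONS[equation]
    return pct_hrmax * slope - offset


# Example: a player working at 80% of maximal heart rate.
end_estimate = predict_pct_vo2peak(80.0, "END")  # 80 * 1.008 - 17.17
mod_estimate = predict_pct_vo2peak(80.0, "MOD")  # 80 * 1.2 - 32
```

    At 80% of HRmax the two equations give nearly the same estimate (roughly 63.5% and 64% of VO2peak), which is consistent with the abstract's statement that either can be used.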

  11. A proposed holistic approach to on-chip, off-chip, test, and package interconnections

    Science.gov (United States)

    Bartelink, Dirk J.

    1998-11-01

    recognize—test is also performed using IC's. A system interconnection is proposed using multiple chips fabricated with conventional silicon processes, including MEMS technology. The system resembles an MCM that can be joined without committing to final assembly to perform at-speed testing. 50-Ohm test probes never load the circuit; only intended neighboring chips are ever connected. A `back-plane' chip provides the connection layers for both inter- and intra-chip signals and also serves as the probe card, in analogy with membrane probes now used for single-chip testing. Intra-chip connections, which require complicated connections during test that exactly match the product, are then properly made and all waveforms and loading conditions under test will be identical to those of the product. The major benefit is that all front-end chip technologies can be merged—logic, memory, RF, even passives. ESD protection is required only on external system connections. Manufacturing test information will accurately characterize process faults and thus avoid the Known-Good-Die problem that has slowed the arrival of conventional MCM's.

  12. Global Mapping of Transcription Factor Binding Sites by Sequencing Chromatin Surrogates: a Perspective on Experimental Design, Data Analysis, and Open Problems.

    Science.gov (United States)

    Wei, Yingying; Wu, George; Ji, Hongkai

    2013-05-01

    Mapping genome-wide binding sites of all transcription factors (TFs) in all biological contexts is a critical step toward understanding gene regulation. The state-of-the-art technologies for mapping transcription factor binding sites (TFBSs) couple chromatin immunoprecipitation (ChIP) with high-throughput sequencing (ChIP-seq) or tiling array hybridization (ChIP-chip). These technologies have limitations: they are low-throughput with respect to surveying many TFs. Recent advances in genome-wide chromatin profiling, including development of technologies such as DNase-seq, FAIRE-seq and ChIP-seq for histone modifications, make it possible to predict in vivo TFBSs by analyzing chromatin features at computationally determined DNA motif sites. This promising new approach may allow researchers to monitor the genome-wide binding sites of many TFs simultaneously. In this article, we discuss various experimental design and data analysis issues that arise when applying this approach. Through a systematic analysis of the data from the Encyclopedia Of DNA Elements (ENCODE) project, we compare the predictive power of individual and combinations of chromatin marks using supervised and unsupervised learning methods, and evaluate the value of integrating information from public ChIP and gene expression data. We also highlight the challenges and opportunities for developing novel analytical methods, such as resolving the one-motif-multiple-TF ambiguity and distinguishing functional and non-functional TF binding targets from the predicted binding sites. The online version of this article (doi:10.1007/s12561-012-9066-5) contains supplementary material, which is available to authorized users.

  13. PredictSNP: robust and accurate consensus classifier for prediction of disease-related mutations.

    Directory of Open Access Journals (Sweden)

    Jaroslav Bendl

    2014-01-01

    Full Text Available Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement is hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicities, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier PredictSNP, resulting into significantly improved prediction performance, and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp.

  14. Accurate clinical genetic testing for autoinflammatory diseases using the next-generation sequencing platform MiSeq

    Directory of Open Access Journals (Sweden)

    Manabu Nakayama

    2017-03-01

    Full Text Available Autoinflammatory diseases are one of a group of primary immunodeficiency diseases that are generally thought to be caused by mutation of genes responsible for innate immunity, rather than by acquired immunity. Mutations related to autoinflammatory diseases occur in 12 genes. For example, low-level somatic mosaic NLRP3 mutations underlie chronic infantile neurologic, cutaneous, articular syndrome (CINCA), also known as neonatal-onset multisystem inflammatory disease (NOMID). In current clinical practice, clinical genetic testing plays an important role in providing patients with quick, definite diagnoses. To increase the availability of such testing, low-cost high-throughput gene-analysis systems are required, ones that not only have the sensitivity to detect even low-level somatic mosaic mutations, but also can operate simply in a clinical setting. To this end, we developed a simple method that employs two-step tailed PCR and an NGS system, the MiSeq platform, to detect mutations in all coding exons of the 12 genes responsible for autoinflammatory diseases. Using this amplicon sequencing system, we amplified a total of 234 amplicons derived from the 12 genes with multiplex PCR. This was done simultaneously and in one test tube. Each sample was distinguished by an index sequence of second PCR primers following PCR amplification. With our procedure and tips for reducing PCR amplification bias, we were able to analyze 12 genes from 25 clinical samples in one MiSeq run. Moreover, with the certified primers designed by our short program, which detects and avoids common SNPs in gene-specific PCR primers, we used this system for routine genetic testing. Our optimized procedure uses a simple protocol, which can easily be followed by virtually any office medical staff. Because of the small PCR amplification bias, we can analyze several clinical DNA samples simultaneously at low cost and can obtain sufficient read numbers to detect a low level of

  15. Chromatin Immunoprecipitation (ChIP): Revisiting the Efficacy of Sample Preparation, Sonication, Quantification of Sheared DNA, and Analysis via PCR

    Science.gov (United States)

    Schoppee Bortz, Pamela D.; Wamhoff, Brian R.

    2011-01-01

    The “quantitative” ChIP, a tool commonly used to study protein-DNA interactions in cells and tissue, is a difficult assay often plagued with technical error. We present, herein, the process required to merge multiple protocols into a quick, reliable and easy method and an approach to accurately quantify ChIP DNA prior to performing PCR. We demonstrate that high intensity sonication for at least 30 min is required for full cellular disruption and maximum DNA recovery because ChIP lysis buffers fail to lyse formaldehyde-fixed cells. In addition, extracting ChIP DNA with chelex-100 yields samples that are too dilute for evaluation of shearing efficiency or quantification via nanospectrophotometry. However, DNA extracted from the Mock-ChIP supernatant via the phenol-chloroform-isoamyl alcohol (PCIA) method can be used to evaluate DNA shearing efficiency and used as the standard in a fluorescence-based microplate assay. This enabled accurate quantification of DNA in chelex-extracted ChIP samples and normalization to total DNA concentration prior to performing real-time PCR (rtPCR). Thus, a quick ChIP assay that can be completed in nine bench hours over two days has been validated along with a rapid, accurate and repeatable way to quantify ChIP DNA. The resulting rtPCR data more accurately depicts treatment effects on protein-DNA interactions of interest. PMID:22046253

  16. Hybrid De Novo Genome Assembly Using MiSeq and SOLiD Short Read Data.

    Directory of Open Access Journals (Sweden)

    Tsutomu Ikegami

    Full Text Available A hybrid de novo assembly pipeline was constructed to utilize both MiSeq and SOLiD short read data in combination in the assembly. The short read data were converted to a standard format of the pipeline, and were supplied to the pipeline components such as ABySS and SOAPdenovo. The assembly pipeline proceeded through several stages, and either MiSeq paired-end data, SOLiD mate-paired data, or both could be specified as input data at each stage separately. The pipeline was examined on the filamentous fungus Aspergillus oryzae RIB40, by aligning the assembly results against the reference sequences. Using both the MiSeq and the SOLiD data in the hybrid assembly, the alignment length was improved by a factor of 3 to 8, compared with the assemblies using either one of the data types. The number of the reproduced gene cluster regions encoding secondary metabolite biosyntheses (SMB) was also improved by the hybrid assemblies. These results imply that the MiSeq data with long read length are essential to construct accurate nucleotide sequences, while the SOLiD mate-paired reads with long insertion length enhance long-range arrangements of the sequences. The pipeline was also tested on the actinomycete Streptomyces avermitilis MA-4680, whose genome is known to have high GC content. Although the quality of the SOLiD reads was too low to perform any meaningful assemblies by themselves, the alignment length to the reference was improved by a factor of 2, compared with the assembly using only the MiSeq data.

  17. Development of a cell microarray chip for detection of circulating tumor cells

    Science.gov (United States)

    Yamamura, S.; Yatsushiro, S.; Abe, K.; Baba, Y.; Kataoka, M.

    2012-03-01

    Detection of circulating tumor cells (CTCs) in the peripheral blood of metastatic cancer patients has clinical significance in earlier diagnosis of metastases. In this study, a novel cell microarray chip for accurate and rapid detection of tumor cells from human leukocytes was developed. The chip with 20,944 microchambers (105 μm diameter and 50 μm depth) was made from polystyrene, and the surface was rendered hydrophilic by means of reactive-ion etching, which led to the formation of mono-layers of leukocytes on the microchambers. As a model of CTC detection, we spiked human bronchioalveolar carcinoma (H1650) cells into a suspension of human T lymphoblastoid leukemia (CEM) cells and detected the H1650 cells using the chip. A CEM suspension containing H1650 cells was dispersed on the chip surface, followed by 10 min standing to allow the cells to settle down into the microchambers. About 30 CEM cells were accommodated in each microchamber, with over 600,000 CEM cells in total on a chip. We could detect 1 H1650 cell per 10^6 CEM cells on the microarray by staining with fluorescence-conjugated antibody (Anti-Cytokeratin) and cell membrane marker (DiD). Thus, this cell microarray chip has high potential to be a novel tool for accurate and rapid detection of CTCs.

  18. A new, accurate predictive model for incident hypertension.

    Science.gov (United States)

    Völzke, Henry; Fung, Glenn; Ittermann, Till; Yu, Shipeng; Baumeister, Sebastian E; Dörr, Marcus; Lieb, Wolfgang; Völker, Uwe; Linneberg, Allan; Jørgensen, Torben; Felix, Stephan B; Rettig, Rainer; Rao, Bharat; Kroemer, Heyo K

    2013-11-01

    Data mining represents an alternative approach to identify new predictors of multifactorial diseases. This work aimed at building an accurate predictive model for incident hypertension using data mining procedures. The primary study population consisted of 1605 normotensive individuals aged 20-79 years with 5-year follow-up from the population-based study, that is the Study of Health in Pomerania (SHIP). The initial set was randomly split into a training and a testing set. We used a probabilistic graphical model applying a Bayesian network to create a predictive model for incident hypertension and compared the predictive performance with the established Framingham risk score for hypertension. Finally, the model was validated in 2887 participants from INTER99, a Danish community-based intervention study. In the training set of SHIP data, the Bayesian network used a small subset of relevant baseline features including age, mean arterial pressure, rs16998073, serum glucose and urinary albumin concentrations. Furthermore, we detected relevant interactions between age and serum glucose as well as between rs16998073 and urinary albumin concentrations [area under the receiver operating characteristic (AUC 0.76)]. The model was confirmed in the SHIP validation set (AUC 0.78) and externally replicated in INTER99 (AUC 0.77). Compared to the established Framingham risk score for hypertension, the predictive performance of the new model was similar in the SHIP validation set and moderately better in INTER99. Data mining procedures identified a predictive model for incident hypertension, which included innovative and easy-to-measure variables. The findings promise great applicability in screening settings and clinical practice.

  19. Dissecting Cell-Type Composition and Activity-Dependent Transcriptional State in Mammalian Brains by Massively Parallel Single-Nucleus RNA-Seq.

    Science.gov (United States)

    Hu, Peng; Fabyanic, Emily; Kwon, Deborah Y; Tang, Sheng; Zhou, Zhaolan; Wu, Hao

    2017-12-07

    Massively parallel single-cell RNA sequencing can precisely resolve cellular diversity in a high-throughput manner at low cost, but unbiased isolation of intact single cells from complex tissues such as adult mammalian brains is challenging. Here, we integrate sucrose-gradient-assisted purification of nuclei with droplet microfluidics to develop a highly scalable single-nucleus RNA-seq approach (sNucDrop-seq), which is free of enzymatic dissociation and nucleus sorting. By profiling ∼18,000 nuclei isolated from cortical tissues of adult mice, we demonstrate that sNucDrop-seq not only accurately reveals neuronal and non-neuronal subtype composition with high sensitivity but also enables in-depth analysis of transient transcriptional states driven by neuronal activity, at single-cell resolution, in vivo. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. High-sensitivity HLA typing by Saturated Tiling Capture Sequencing (STC-Seq).

    Science.gov (United States)

    Jiao, Yang; Li, Ran; Wu, Chao; Ding, Yibin; Liu, Yanning; Jia, Danmei; Wang, Lifeng; Xu, Xiang; Zhu, Jing; Zheng, Min; Jia, Junling

    2018-01-15

    Highly polymorphic human leukocyte antigen (HLA) genes are responsible for fine-tuning the adaptive immune system. High-resolution HLA typing is important for the treatment of autoimmune and infectious diseases. Additionally, it is routinely performed for identifying matched donors in transplantation medicine. Although many HLA typing approaches have been developed, the complexity, low-efficiency and high-cost of current HLA-typing assays limit their application in population-based high-throughput HLA typing for donors, which is required for creating large-scale databases for transplantation and precision medicine. Here, we present a cost-efficient Saturated Tiling Capture Sequencing (STC-Seq) approach to capturing 14 HLA class I and II genes. The highly efficient capture (an approximately 23,000-fold enrichment) of these genes allows for simplified allele calling. Tests on five genes (HLA-A/B/C/DRB1/DQB1) from 31 human samples and 351 datasets using STC-Seq showed results that were 98% consistent with the known two sets of digitals (field1 and field2) genotypes. Additionally, STC can capture genomic DNA fragments longer than 3 kb from HLA loci, making the library compatible with the third-generation sequencing. STC-Seq is a highly accurate and cost-efficient method for HLA typing which can be used to facilitate the establishment of population-based HLA databases for the precision and transplantation medicine.

  1. A Transcriptome Map of Actinobacillus pleuropneumoniae at Single-Nucleotide Resolution Using Deep RNA-Seq.

    Directory of Open Access Journals (Sweden)

    Zhipeng Su

    Full Text Available Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures). The transcriptional units

  2. SeqVISTA: a graphical tool for sequence feature visualization and comparison

    Directory of Open Access Journals (Sweden)

    Niu Tianhua

    2003-01-01

    Full Text Available Abstract Background Many readers will sympathize with the following story. You are viewing a gene sequence in Entrez, and you want to find whether it contains a particular sequence motif. You reach for the browser's "find in page" button, but those darn spaces every 10 bp get in the way. And what if the motif is on the opposite strand? Subsequently, your favorite sequence analysis software informs you that there is an interesting feature at position 13982–14013. By painstakingly counting the 10 bp blocks, you are able to examine the sequence at this location. But now you want to see what other features have been annotated close by, and this information is buried several screenfuls higher up the web page. Results SeqVISTA presents a holistic, graphical view of features annotated on nucleotide or protein sequences. This interactive tool highlights the residues in the sequence that correspond to features chosen by the user, and allows easy searching for sequence motifs or extraction of particular subsequences. SeqVISTA is able to display results from diverse sequence analysis tools in an integrated fashion, and aims to provide much-needed unity to the bioinformatics resources scattered around the Internet. Our viewer may be launched on a GenBank record by a single click of a button installed in the web browser. Conclusion SeqVISTA allows insights to be gained by viewing the totality of sequence annotations and predictions, which may be more revealing than the sum of their parts. SeqVISTA runs on any operating system with a Java 1.4 virtual machine. It is freely available to academic users at http://zlab.bu.edu/SeqVISTA.
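
    The two obstacles in the opening story (spaces every 10 bp, and motifs on the opposite strand) can be handled in a few lines of Python. This is a minimal sketch of the underlying idea, not SeqVISTA's implementation; the function names and the example sequence are made up for illustration.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(display_text: str, motif: str) -> List[Tuple[int, str]]:
    """Find a motif on both strands of a sequence shown in spaced blocks.

    Whitespace is removed before searching. Each hit is (position, strand),
    where position is the 1-based leftmost coordinate on the displayed
    strand and strand is "+" or "-".
    """
    seq = "".join(display_text.split()).upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Two 10 bp blocks separated by a space, as in a flat-file sequence display.
displayed = "ACGTACGTAC GGATCCTTAA"
forward_hits = find_motif(displayed, "ACGG")  # motif spans the block boundary
reverse_hits = find_motif(displayed, "AAGG")  # present only as CCTT on the shown strand
```

    A browser's "find in page" would miss both motifs here: the first is interrupted by the block separator and the second occurs only on the opposite strand.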

  3. An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions.

    Science.gov (United States)

    Deng, Xin; Gumm, Jordan; Karki, Suman; Eickholt, Jesse; Cheng, Jianlin

    2015-07-07

    Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.

  4. An Overview of Practical Applications of Protein Disorder Prediction and Drive for Faster, More Accurate Predictions

    Directory of Open Access Journals (Sweden)

    Xin Deng

    2015-07-01

    Full Text Available Protein disordered regions are segments of a protein chain that do not adopt a stable structure. Thus far, a variety of protein disorder prediction methods have been developed and have been widely used, not only in traditional bioinformatics domains, including protein structure prediction, protein structure determination and function annotation, but also in many other biomedical fields. The relationship between intrinsically-disordered proteins and some human diseases has played a significant role in disorder prediction in disease identification and epidemiological investigations. Disordered proteins can also serve as potential targets for drug discovery with an emphasis on the disordered-to-ordered transition in the disordered binding regions, and this has led to substantial research in drug discovery or design based on protein disordered region prediction. Furthermore, protein disorder prediction has also been applied to healthcare by predicting the disease risk of mutations in patients and studying the mechanistic basis of diseases. As the applications of disorder prediction increase, so too does the need to make quick and accurate predictions. To fill this need, we also present a new approach to predict protein residue disorder using wide sequence windows that is applicable on the genomic scale.

  5. Design of Networks-on-Chip for Real-Time Multi-Processor Systems-on-Chip

    DEFF Research Database (Denmark)

    Sparsø, Jens

    2012-01-01

    This paper addresses the design of networks-on-chips for use in multi-processor systems-on-chips - the hardware platforms used in embedded systems. These platforms typically have to guarantee real-time properties, and as the network is a shared resource, it has to provide service guarantees...... (bandwidth and/or latency) to different communication flows. The paper reviews some past work in this field and the lessons learned, and the paper discusses ongoing research conducted as part of the project "Time-predictable Multi-Core Architecture for Embedded Systems" (T-CREST), supported by the European...

  6. Multi-fidelity machine learning models for accurate bandgap predictions of solids

    International Nuclear Information System (INIS)

    Pilania, Ghanshyam; Gubernatis, James E.; Lookman, Turab

    2016-01-01

    Here, we present a multi-fidelity co-kriging statistical learning framework that combines variable-fidelity quantum mechanical calculations of bandgaps to generate a machine-learned model that enables low-cost accurate predictions of the bandgaps at the highest fidelity level. Additionally, the adopted Gaussian process regression formulation allows us to predict the underlying uncertainties as a measure of our confidence in the predictions. In using a set of 600 elpasolite compounds as an example dataset and using semi-local and hybrid exchange correlation functionals within density functional theory as two levels of fidelities, we demonstrate the excellent learning performance of the method against actual high fidelity quantum mechanical calculations of the bandgaps. The presented statistical learning method is not restricted to bandgaps or electronic structure methods and extends the utility of high throughput property predictions in a significant way.

  7. Rapid and accurate prediction and scoring of water molecules in protein binding sites.

    Directory of Open Access Journals (Sweden)

    Gregory A Ross

    Full Text Available Water plays a critical role in ligand-protein interactions. However, it is still challenging to predict accurately not only where water molecules prefer to bind, but also which of those water molecules might be displaceable. The latter is often seen as a route to optimizing affinity of potential drug candidates. Using a protocol we call WaterDock, we show that the freely available AutoDock Vina tool can be used to predict accurately the binding sites of water molecules. WaterDock was validated using data from X-ray crystallography, neutron diffraction and molecular dynamics simulations and correctly predicted 97% of the water molecules in the test set. In addition, we combined data-mining, heuristic and machine learning techniques to develop probabilistic water molecule classifiers. When applied to WaterDock predictions in the Astex Diverse Set of protein ligand complexes, we could identify whether a water molecule was conserved or displaced to an accuracy of 75%. A second model predicted whether water molecules were displaced by polar groups or by non-polar groups to an accuracy of 80%. These results should prove useful for anyone wishing to undertake rational design of new compounds where the displacement of water molecules is being considered as a route to improved affinity.

  8. Estimate the thermomechanical fatigue life of two flip chip packages

    International Nuclear Information System (INIS)

    Pash, R.A.; Ullah, H.S.; Khan, M.Z.

    2005-01-01

    The continuing demand for high-density, low-profile integrated circuit packaging has accelerated the development of flip chip structures as used in direct chip attach (DCA) technology, ball grid array (BGA) and chip scale package (CSP). In such structures the most widely used flip chip interconnects are solder joints. The reliability of flip chip structures largely depends on the reliability of solder joints. In this work solder joint fatigue life prediction for two chip scale packages is carried out. Elasto-plastic deformation behavior of the solder was simulated using ANSYS. Two-dimensional plane strain finite element models were developed for each package to numerically compute the stress and total strain of the solder joints under temperature cycling. These stress and strain values are then used to predict the solder joint lifetime through a modified Coffin-Manson equation. The effect of the solder joint's distance from the edge of the silicon die on the life of the package is explored. The solder joint fatigue response is modeled for a typical temperature cycling of -60 to 140 °C. (author)
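
    For orientation only: the abstract does not state which modification of the Coffin-Manson equation was used, so the relation below is the standard textbook form, not necessarily the authors' version. The symbols are assumptions introduced here: \Delta\varepsilon_p is the plastic strain range per thermal cycle, \varepsilon_f' the fatigue ductility coefficient, c the fatigue ductility exponent (negative), and N_f the number of cycles to failure.

```latex
\frac{\Delta\varepsilon_p}{2} = \varepsilon_f'\,(2N_f)^{c}
\quad\Longrightarrow\quad
N_f = \frac{1}{2}\left(\frac{\Delta\varepsilon_p}{2\,\varepsilon_f'}\right)^{1/c}
```

    Because c is negative, a larger strain range from the finite element model gives a shorter predicted life.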

  9. Differential contribution of visual and auditory information to accurately predict the direction and rotational motion of a visual stimulus.

    Science.gov (United States)

    Park, Seoung Hoon; Kim, Seonjin; Kwon, MinHyuk; Christou, Evangelos A

    2016-03-01

    Visual and auditory information are critical for perception and enhance the ability of an individual to respond accurately to a stimulus. However, it is unknown whether visual and auditory information contribute differentially to identifying the direction and rotational motion of the stimulus. The purpose of this study was to determine the ability of an individual to accurately predict the direction and rotational motion of the stimulus based on visual and auditory information. In this study, we recruited 9 expert table-tennis players and used table-tennis service as our experimental model. Participants watched recorded services with different levels of visual and auditory information. The goal was to anticipate the direction of the service (left or right) and the rotational motion of the service (topspin, sidespin, or cut). We recorded their responses and quantified the following outcomes: (i) directional accuracy and (ii) rotational motion accuracy. Response accuracy was the number of accurate predictions relative to the total number of trials. The ability of the participants to predict the direction of the service accurately increased with additional visual information but not with auditory information. In contrast, the ability of the participants to predict the rotational motion of the service accurately increased with the addition of auditory information to visual information but not with additional visual information alone. In conclusion, this finding demonstrates that visual information enhances the ability of an individual to accurately predict the direction of the stimulus, whereas additional auditory information enhances the ability of an individual to accurately predict the rotational motion of the stimulus.

  10. MGMT methylation analysis of glioblastoma on the Infinium methylation BeadChip identifies two distinct CpG regions associated with gene silencing and outcome, yielding a prediction model for comparisons across datasets, tumor grades, and CIMP-status.

    Science.gov (United States)

    Bady, Pierre; Sciuscio, Davide; Diserens, Annie-Claire; Bloch, Jocelyne; van den Bent, Martin J; Marosi, Christine; Dietrich, Pierre-Yves; Weller, Michael; Mariani, Luigi; Heppner, Frank L; Mcdonald, David R; Lacombe, Denis; Stupp, Roger; Delorenzi, Mauro; Hegi, Monika E

    2012-10-01

    The methylation status of the O(6)-methylguanine-DNA methyltransferase (MGMT) gene is an important predictive biomarker for benefit from alkylating agent therapy in glioblastoma. Recent studies in anaplastic glioma suggest a prognostic value for MGMT methylation. Investigation of pathogenetic and epigenetic features of this intriguingly distinct behavior requires accurate MGMT classification to assess high throughput molecular databases. Promoter methylation-mediated gene silencing is strongly dependent on the location of the methylated CpGs, complicating classification. Using the HumanMethylation450 (HM-450K) BeadChip interrogating 176 CpGs annotated for the MGMT gene, with 14 located in the promoter, two distinct regions in the CpG island of the promoter were identified with high importance for gene silencing and outcome prediction. A logistic regression model (MGMT-STP27) comprising probes cg12434587 [corrected] and cg12981137 provided good classification properties and prognostic value (kappa = 0.85; log-rank p CIMP) positive tumors was found in glioblastomas from The Cancer Genome Atlas than in low grade and anaplastic glioma cohorts, while in CIMP-negative gliomas MGMT was classified as methylated in approximately 50 % regardless of tumor grade. The proposed MGMT-STP27 prediction model allows mining of datasets derived on the HM-450K or HM-27K BeadChip to explore effects of distinct epigenetic context of MGMT methylation suspected to modulate treatment resistance in different tumor types.

  11. PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

    Science.gov (United States)

    Gao, Yubang; Wang, Huiyuan; Zhang, Hangxiao; Wang, Yongsheng; Chen, Jinfeng; Gu, Lianfeng

    2018-05-01

    Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) based on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, to comprehensively analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs). PRAPI is capable of combining Iso-Seq full-length isoforms with short read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI. Contact: lfgu@fafu.edu.cn.

  12. Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes

    Science.gov (United States)

    Rowley, Jesse W.; Oler, Andrew J.; Tolley, Neal D.; Hunter, Benjamin N.; Low, Elizabeth N.; Nix, David A.; Yost, Christian C.; Zimmerman, Guy A.

    2011-01-01

    Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the “Introduction.” PMID:21596849

  13. Nebula--a web-server for advanced ChIP-seq data analysis.

    Science.gov (United States)

    Boeva, Valentina; Lermine, Alban; Barette, Camille; Guillouf, Christel; Barillot, Emmanuel

    2012-10-01

    ChIP-seq consists of chromatin immunoprecipitation and deep sequencing of the extracted DNA fragments. It is the technique of choice for accurate characterization of the binding sites of transcription factors and other DNA-associated proteins. We present a web service, Nebula, which allows inexperienced users to perform a complete bioinformatics analysis of ChIP-seq data. Nebula was designed for both bioinformaticians and biologists. It is based on the Galaxy open source framework. Galaxy already includes a large number of functionalities for mapping reads and peak calling. We added the following to Galaxy: (i) peak calling with FindPeaks and a module for immunoprecipitation quality control, (ii) de novo motif discovery with ChIPMunk, (iii) calculation of the density and the cumulative distribution of peak locations relative to gene transcription start sites, (iv) annotation of peaks with genomic features and (v) annotation of genes with peak information. Nebula generates the graphs and the enrichment statistics at each step of the process. During steps (iii)-(v), Nebula optionally repeats the analysis on a control dataset and compares these results with those from the main dataset. Nebula can also incorporate gene expression (or gene modulation) data during these steps. In summary, Nebula is an innovative web service that provides an advanced ChIP-seq analysis pipeline with ready-to-publish results. Nebula is available at http://nebula.curie.fr/. Supplementary data are available at Bioinformatics online.

  14. RNA-seq reveals more consistent reference genes for gene expression studies in human non-melanoma skin cancers

    Directory of Open Access Journals (Sweden)

    Van L.T. Hoang

    2017-08-01

    Full Text Available Identification of appropriate reference genes (RGs) is critical to accurate data interpretation in quantitative real-time PCR (qPCR) experiments. In this study, we have utilised next generation RNA sequencing (RNA-seq) to analyse the transcriptome of a panel of non-melanoma skin cancer lesions, identifying genes that are consistently expressed across all samples. Genes encoding ribosomal proteins were amongst the most stable in this dataset. The RNA-seq data were validated using qPCR to confirm the suitability of a set of highly stable genes for use as qPCR RGs. These genes will provide a valuable resource for the normalisation of qPCR data for the analysis of non-melanoma skin cancer.

  15. Chip-Level Electromigration Reliability for Cu Interconnects

    International Nuclear Information System (INIS)

    Gall, M.; Oh, C.; Grinshpon, A.; Zolotov, V.; Panda, R.; Demircan, E.; Mueller, J.; Justison, P.; Ramakrishna, K.; Thrasher, S.; Hernandez, R.; Herrick, M.; Fox, R.; Boeck, B.; Kawasaki, H.; Haznedar, H.; Ku, P.

    2004-01-01

    Even after the successful introduction of Cu-based metallization, the electromigration (EM) failure risk has remained one of the most important reliability concerns for most advanced process technologies. Ever increasing operating current densities and the introduction of low-k materials in the backend process scheme are some of the issues that threaten reliable, long-term operation at elevated temperatures. The traditional method of verifying EM reliability only through current density limit checks is proving to be inadequate in general, or quite expensive at best. A Statistical EM Budgeting (SEB) methodology has been proposed to assess more realistic chip-level EM reliability from the complex statistical distribution of currents in a chip. To be valuable, this approach requires accurate estimation of currents for all interconnect segments in a chip. However, no efficient technique to manage the complexity of such a task for very large chip designs is known. We present an efficient method to estimate currents exhaustively for all interconnects in a chip. The proposed method uses pre-characterization of cells and macros, and steps to identify and filter out symmetrically bi-directional interconnects. We illustrate the strength of the proposed approach using a high-performance microprocessor design for embedded applications as a case study.

  16. Discovery of transcription factors and regulatory regions driving in vivo tumor development by ATAC-seq and FAIRE-seq open chromatin profiling.

    Directory of Open Access Journals (Sweden)

    Kristofer Davie

    2015-02-01

    Full Text Available Genomic enhancers regulate spatio-temporal gene expression by recruiting specific combinations of transcription factors (TFs). When TFs are bound to active regulatory regions, they displace canonical nucleosomes, making these regions biochemically detectable as nucleosome-depleted regions or accessible/open chromatin. Here we ask whether open chromatin profiling can be used to identify the entire repertoire of active promoters and enhancers underlying tissue-specific gene expression during normal development and oncogenesis in vivo. To this end, we first compare two different approaches to detect open chromatin in vivo using the Drosophila eye primordium as a model system: FAIRE-seq, based on physical separation of open versus closed chromatin; and ATAC-seq, based on preferential integration of a transposon into open chromatin. We find that both methods reproducibly capture the tissue-specific chromatin activity of regulatory regions, including promoters, enhancers, and insulators. Using both techniques, we screened for regulatory regions that become ectopically active during Ras-dependent oncogenesis, and identified 3778 regions that become (over-)activated during tumor development. Next, we applied motif discovery to search for candidate transcription factors that could bind these regions and identified AP-1 and Stat92E as key regulators. We validated the importance of Stat92E in the development of the tumors by introducing a loss of function Stat92E mutant, which was sufficient to rescue the tumor phenotype. Additionally we tested if the predicted Stat92E responsive regulatory regions are genuine, using ectopic induction of JAK/STAT signaling in developing eye discs, and observed that similar chromatin changes indeed occurred. Finally, we determine that these are functionally significant regulatory changes, as nearby target genes are up- or down-regulated. In conclusion, we show that FAIRE-seq and ATAC-seq based open chromatin profiling

  17. A Novel Fibrosis Index Comprising a Non-Cholesterol Sterol Accurately Predicts HCV-Related Liver Cirrhosis

    DEFF Research Database (Denmark)

    Ydreborg, Magdalena; Lisovskaja, Vera; Lagging, Martin

    2014-01-01

    of the present study was to create a model for accurate prediction of liver cirrhosis based on patient characteristics and biomarkers of liver fibrosis, including a panel of non-cholesterol sterols reflecting cholesterol synthesis and absorption and secretion. We evaluated variables with potential predictive...

  18. LocARNA-P: Accurate boundary prediction and improved detection of structural RNAs

    DEFF Research Database (Denmark)

    Will, Sebastian; Joshi, Tejal; Hofacker, Ivo L.

    2012-01-01

    Current genomic screens for noncoding RNAs (ncRNAs) predict a large number of genomic regions containing potential structural ncRNAs. The analysis of these data requires highly accurate prediction of ncRNA boundaries and discrimination of promising candidate ncRNAs from weak predictions. Existing...... methods struggle with these goals because they rely on sequence-based multiple sequence alignments, which regularly misalign RNA structure and therefore do not support identification of structural similarities. To overcome this limitation, we compute columnwise and global reliabilities of alignments based...... on sequence and structure similarity; we refer to these structure-based alignment reliabilities as STARs. The columnwise STARs of alignments, or STAR profiles, provide a versatile tool for the manual and automatic analysis of ncRNAs. In particular, we improve the boundary prediction of the widely used nc...

  19. Do Dual-Route Models Accurately Predict Reading and Spelling Performance in Individuals with Acquired Alexia and Agraphia?

    OpenAIRE

    Rapcsak, Steven Z.; Henry, Maya L.; Teague, Sommer L.; Carnahan, Susan D.; Beeson, Pélagie M.

    2007-01-01

    Coltheart and colleagues (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Castles, Bates, & Coltheart, 2006) have demonstrated that an equation derived from dual-route theory accurately predicts reading performance in young normal readers and in children with reading impairment due to developmental dyslexia or stroke. In this paper we present evidence that the dual-route equation and a related multiple regression model also accurately predict both reading and spelling performance in adult...

  20. Towards accurate performance prediction of a vertical axis wind turbine operating at different tip speed ratios

    NARCIS (Netherlands)

    Rezaeiha, A.; Kalkman, I.; Blocken, B.J.E.

    2017-01-01

    Accurate prediction of the performance of a vertical-axis wind turbine (VAWT) using CFD simulation requires the employment of a sufficiently fine azimuthal increment (dθ) combined with a mesh size at which essential flow characteristics can be accurately resolved. Furthermore, the domain size needs

  1. Wirebond crosstalk and cavity modes in large chip mounts for superconducting qubits

    Energy Technology Data Exchange (ETDEWEB)

    Wenner, J; Neeley, M; Bialczak, Radoslaw C; Lenander, M; Lucero, Erik; O'Connell, A D; Sank, D; Wang, H; Weides, M; Cleland, A N; Martinis, John M, E-mail: martinis@physics.ucsb.edu [Department of Physics, University of California, Santa Barbara, CA 93106 (United States)]

    2011-06-15

    We analyze the performance of a microwave chip mount that uses wirebonds to connect the chip and mount grounds. A simple impedance ladder model predicts that transmission crosstalk between two feedlines falls off exponentially with distance at low frequencies, but rises to near unity above a resonance frequency set by the chip to ground capacitance. Using SPICE simulations and experimental measurements of a scale model, the basic predictions of the ladder model were verified. In particular, by decreasing the capacitance between the chip and box grounds, the resonance frequency increased and transmission decreased. This model then influenced the design of a new mount that improved the isolation to -65 dB at 6 GHz, even though the chip dimensions were increased to 1 cm x 1 cm, three times as large as our previous devices. We measured a coplanar resonator in this mount as preparation for larger qubit chips, and were able to identify cavity, slotline, and resonator modes.

  2. Wirebond crosstalk and cavity modes in large chip mounts for superconducting qubits

    International Nuclear Information System (INIS)

    Wenner, J; Neeley, M; Bialczak, Radoslaw C; Lenander, M; Lucero, Erik; O'Connell, A D; Sank, D; Wang, H; Weides, M; Cleland, A N; Martinis, John M

    2011-01-01

    We analyze the performance of a microwave chip mount that uses wirebonds to connect the chip and mount grounds. A simple impedance ladder model predicts that transmission crosstalk between two feedlines falls off exponentially with distance at low frequencies, but rises to near unity above a resonance frequency set by the chip to ground capacitance. Using SPICE simulations and experimental measurements of a scale model, the basic predictions of the ladder model were verified. In particular, by decreasing the capacitance between the chip and box grounds, the resonance frequency increased and transmission decreased. This model then influenced the design of a new mount that improved the isolation to -65 dB at 6 GHz, even though the chip dimensions were increased to 1 cm x 1 cm, three times as large as our previous devices. We measured a coplanar resonator in this mount as preparation for larger qubit chips, and were able to identify cavity, slotline, and resonator modes.

  3. The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data.

    Science.gov (United States)

    Ambrosini, Giovanna; Dreos, René; Kumar, Sunil; Bucher, Philipp

    2016-11-18

    ChIP-seq and related high-throughput chromatin profiling assays generate ever increasing volumes of highly valuable biological data. To make sense of it, biologists need versatile, efficient and user-friendly tools for access, visualization and integrative analysis of such data. Here we present the ChIP-Seq command line tools and web server, implementing basic algorithms for ChIP-seq data analysis starting with a read alignment file. The tools are optimized for memory-efficiency and speed, thus allowing for processing of large data volumes on inexpensive hardware. The web interface provides access to a large database of public data. The ChIP-Seq tools have a modular and interoperable design in that the output from one application can serve as input to another one. Complex and innovative tasks can thus be achieved by running several tools in a cascade. The various ChIP-Seq command line tools and web services either complement or compare favorably to related bioinformatics resources in terms of computational efficiency, ease of access to public data and interoperability with other web-based tools. The ChIP-Seq server is accessible at http://ccg.vital-it.ch/chipseq/ .

  4. Bayesian calibration of power plant models for accurate performance prediction

    International Nuclear Information System (INIS)

    Boksteen, Sowande Z.; Buijtenen, Jos P. van; Pecnik, Rene; Vecht, Dick van der

    2014-01-01

    Highlights: • Bayesian calibration is applied to power plant performance prediction. • Measurements from a plant in operation are used for model calibration. • A gas turbine performance model and steam cycle model are calibrated. • An integrated plant model is derived. • Part load efficiency is accurately predicted as a function of ambient conditions. Abstract: Gas turbine combined cycles are expected to play an increasingly important role in the balancing of supply and demand in future energy markets. Thermodynamic modeling of these energy systems is frequently applied to assist in decision making processes related to the management of plant operation and maintenance. In most cases, model inputs, parameters and outputs are treated as deterministic quantities and plant operators make decisions with limited or no regard for uncertainties. As the steady integration of wind and solar energy into the energy market induces extra uncertainties, part load operation and reliability are becoming increasingly important. In the current study, methods are proposed not only to quantify various types of uncertainties in measurements and plant model parameters using measured data, but also to assess their effect on various aspects of performance prediction. The authors aim to account for model parameter and measurement uncertainty, and for systematic discrepancy of models with respect to reality. For this purpose, the Bayesian calibration framework of Kennedy and O’Hagan is used, which is especially suitable for high-dimensional industrial problems. The article derives a calibrated model of the plant efficiency as a function of ambient conditions and operational parameters, which is also accurate in part load. The article shows that complete statistical modeling of power plants not only enhances process models, but can also increase confidence in operational decisions.

  5. Accurate Holdup Calculations with Predictive Modeling & Data Integration

    Energy Technology Data Exchange (ETDEWEB)

    Azmy, Yousry [North Carolina State Univ., Raleigh, NC (United States). Dept. of Nuclear Engineering; Cacuci, Dan [Univ. of South Carolina, Columbia, SC (United States). Dept. of Mechanical Engineering

    2017-04-03

    In facilities that process special nuclear material (SNM) it is important to account accurately for the fissile material that enters and leaves the plant. Although there are many stages and processes through which materials must be traced and measured, the focus of this project is material that is “held-up” in equipment, pipes, and ducts during normal operation and that can accumulate over time into significant quantities. Accurately estimating the holdup is essential for proper SNM accounting (vis-à-vis nuclear non-proliferation), criticality and radiation safety, waste management, and efficient plant operation. Usually it is not possible to directly measure the holdup quantity and location, so these must be inferred from measured radiation fields, primarily gamma and less frequently neutrons. Current methods to quantify holdup, i.e. Generalized Geometry Holdup (GGH), primarily rely on simple source configurations and crude radiation transport models aided by ad hoc correction factors. This project seeks an alternate method of performing measurement-based holdup calculations using a predictive model that employs state-of-the-art radiation transport codes capable of accurately simulating such situations. Inverse and data assimilation methods use the forward transport model to search for a source configuration that best matches the measured data and simultaneously provide an estimate of the level of confidence in the correctness of such configuration. In this work the holdup problem is re-interpreted as an inverse problem that is under-determined, hence may permit multiple solutions. A probabilistic approach is applied to solving the resulting inverse problem. This approach rates possible solutions according to their plausibility given the measurements and initial information. This is accomplished through the use of Bayes’ Theorem that resolves the issue of multiple solutions by giving an estimate of the probability of observing each possible solution. To use

  6. Fast and Accurate Prediction of Stratified Steel Temperature During Holding Period of Ladle

    Science.gov (United States)

    Deodhar, Anirudh; Singh, Umesh; Shukla, Rishabh; Gautham, B. P.; Singh, Amarendra K.

    2017-04-01

    Thermal stratification of liquid steel in a ladle during the holding period and the teeming operation has a direct bearing on the superheat available at the caster and hence on the caster set points such as casting speed and cooling rates. The changes in the caster set points are typically carried out based on temperature measurements at the end of the tundish outlet. Thermal prediction models provide advance knowledge of the influence of process and design parameters on the steel temperature at various stages. Therefore, they can be used in making accurate decisions about the caster set points in real time. However, this requires thermal prediction models that are both fast and accurate. In this work, we develop a surrogate model for the prediction of thermal stratification using data extracted from a set of computational fluid dynamics (CFD) simulations, pre-determined using a design-of-experiments technique. A regression method is used for training the predictor. The model predicts the stratified temperature profile instantaneously for a given set of process parameters such as initial steel temperature, refractory heat content, slag thickness, and holding time. More than 96 pct of the predicted values are within an error range of ±5 K (±5 °C) when compared against the corresponding CFD results. Considering its accuracy and computational efficiency, the model can be extended for thermal control of casting operations. This work also sets a benchmark for developing similar thermal models for downstream processes such as tundish and caster.
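
    A regression surrogate of this kind can be sketched in a few lines. The example below assumes a plain linear least-squares fit on two inputs (initial superheat above 1800 K and holding time); the abstract does not state which regression form the authors used, and the "CFD" samples here are invented numbers for illustration only.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations; an intercept is added."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Build A = X^T X and b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(col + 1, n):
            f = A[k][col] / A[col][col]
            for j in range(col, n):
                A[k][j] -= f * A[col][j]
            b[k] -= f * b[col]
    # Back substitution
    coef = [0.0] * n
    for i in reversed(range(n)):
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, n))) / A[i][i]
    return coef

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def fraction_within(pred, ref, tol=5.0):
    """Share of predictions within +/- tol kelvin of the reference values."""
    return sum(abs(p - r) <= tol for p, r in zip(pred, ref)) / len(ref)

# Hypothetical training data: (superheat above 1800 K, holding time in min) -> temperature in K
X = [(73, 10), (73, 30), (93, 10), (93, 30), (83, 20)]
y = [1868.0, 1858.0, 1888.0, 1878.0, 1873.0]
coef = fit_linear(X, y)
accuracy = fraction_within([predict(coef, x) for x in X], y)
```

    The `fraction_within` helper mirrors the accuracy measure quoted in the abstract (share of predictions within ±5 K of the CFD reference).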

  7. ChIP-PIT: Enhancing the Analysis of ChIP-Seq Data Using Convex-Relaxed Pair-Wise Interaction Tensor Decomposition.

    Science.gov (United States)

    Zhu, Lin; Guo, Wei-Li; Deng, Su-Ping; Huang, De-Shuang

    2016-01-01

    In recent years, thanks to the efforts of individual scientists and research consortiums, a huge amount of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experimental data have been accumulated. Instead of investigating them independently, several recent studies have convincingly demonstrated that a wealth of scientific insights can be gained by integrative analysis of these ChIP-seq data. However, when used for the purpose of integrative analysis, a serious drawback of the current ChIP-seq technique is that it is still expensive and time-consuming to generate ChIP-seq datasets of high standard. Most researchers are therefore unable to obtain complete ChIP-seq data for several TFs in a wide variety of cell lines, which considerably limits the understanding of transcriptional regulation patterns. In this paper, we propose a novel method called ChIP-PIT to overcome the aforementioned limitation. In ChIP-PIT, ChIP-seq data corresponding to a diverse collection of cell types, TFs and genes are fused together using the three-mode pair-wise interaction tensor (PIT) model, and the prediction of unperformed ChIP-seq experimental results is formulated as a tensor completion problem. Computationally, we propose an efficient first-order method based on extensions of the coordinate descent method to learn the optimal solution of ChIP-PIT, which makes it particularly suitable for the analysis of massive-scale ChIP-seq data. Experimental evaluation on the ENCODE data illustrates the usefulness of the proposed model.

  8. Revisiting lab-on-a-chip technology for drug discovery.

    Science.gov (United States)

    Neuži, Pavel; Giselbrecht, Stefan; Länge, Kerstin; Huang, Tony Jun; Manz, Andreas

    2012-08-01

    The field of microfluidics or lab-on-a-chip technology aims to improve and extend the possibilities of bioassays, cell biology and biomedical research based on the idea of miniaturization. Microfluidic systems allow more accurate modelling of physiological situations for both fundamental research and drug development, and enable systematic high-volume testing for various aspects of drug discovery. Microfluidic systems are in development that not only model biological environments but also physically mimic biological tissues and organs; such 'organs on a chip' could have an important role in expediting early stages of drug discovery and help reduce reliance on animal testing. This Review highlights the latest lab-on-a-chip technologies for drug discovery and discusses the potential for future developments in this field.

  9. Accurate prediction of the enthalpies of formation for xanthophylls.

    Science.gov (United States)

    Lii, Jenn-Huei; Liao, Fu-Xing; Hu, Ching-Han

    2011-11-30

    This study investigates the applications of computational approaches in the prediction of enthalpies of formation (ΔH(f)) for C-, H-, and O-containing compounds. The molecular mechanics (MM4) method, density functional theory (DFT) combined with the atomic equivalent (AE) and group equivalent (GE) schemes, and DFT-based correlation corrected atomization (CCAZ) were used. We emphasized the application to xanthophylls, C-, H-, and O-containing carotenoids which consist of ∼100 atoms and extended π-delocalization systems. Within the training set, MM4 predictions are more accurate than those obtained using AE and GE; however, a systematic underestimation was observed in the extended systems. ΔH(f) for the training set molecules predicted by CCAZ combined with DFT are in very good agreement with the G3 results. The average absolute deviations (AADs) of CCAZ combined with B3LYP and MPWB1K are 0.38 and 0.53 kcal/mol compared with the G3 data, and are 0.74 and 0.69 kcal/mol compared with the available experimental data, respectively. Consistency of the CCAZ approach for the selected xanthophylls is revealed by the AAD of 2.68 kcal/mol between B3LYP-CCAZ and MPWB1K-CCAZ. Copyright © 2011 Wiley Periodicals, Inc.
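
    The average absolute deviation (AAD) quoted above is the mean of the absolute differences between predicted and reference enthalpies. A minimal sketch follows; the numbers in the usage lines are placeholders, not values from the study.

```python
def average_absolute_deviation(predicted, reference):
    """Mean of |predicted - reference| over paired values (same units, e.g. kcal/mol)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Placeholder enthalpies of formation in kcal/mol (illustrative only)
predicted = [1.0, 2.0, 3.0]
reference = [1.5, 2.0, 2.0]
aad = average_absolute_deviation(predicted, reference)
```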

  10. Usability of human Infinium MethylationEPIC BeadChip for mouse DNA methylation studies.

    Science.gov (United States)

    Needhamsen, Maria; Ewing, Ewoud; Lund, Harald; Gomez-Cabrero, David; Harris, Robert Adam; Kular, Lara; Jagodic, Maja

    2017-11-15

    The advent of array-based genome-wide DNA methylation methods has enabled quantitative measurement of single CpG methylation status at relatively low cost and sample input. Whereas the use of Infinium Human Methylation BeadChips has shown great utility in clinical studies, no equivalent tool is available for rodent animal samples. We examined the feasibility of using the new Infinium MethylationEPIC BeadChip for studying DNA methylation in mouse. In silico, we identified 19,420 EPIC probes (referred to as mEPIC probes), which align with a unique best alignment score to the bisulfite-converted reference mouse genome mm10. Further annotation revealed that 85% of mEPIC probes overlapped with mm10.refSeq genes at different genomic features including promoters (TSS1500 and TSS200), 1st exons, 5'UTRs, 3'UTRs, CpG islands, shores, shelves, open seas and FANTOM5 enhancers. Hybridization of mouse samples to Infinium Human MethylationEPIC BeadChips showed successful measurement of mEPIC probes and reproducibility between inter-array biological replicates. Finally, we demonstrated the utility of mEPIC probes for data exploration such as hierarchical clustering. Given the absence of cost- and labor-efficient genome-wide technologies in the murine system, our findings show that the Infinium MethylationEPIC BeadChip platform is suitable for investigation of the mouse methylome. Furthermore, we provide the "mEPICmanifest" with genomic features, available to users of Infinium Human MethylationEPIC arrays for mouse samples.

  11. XenoSite: accurately predicting CYP-mediated sites of metabolism with neural networks.

    Science.gov (United States)

    Zaretzki, Jed; Matlock, Matthew; Swamidass, S Joshua

    2013-12-23

    Understanding how xenobiotic molecules are metabolized is important because it influences the safety, efficacy, and dose of medicines and how they can be modified to improve these properties. The cytochrome P450s (CYPs) are proteins responsible for metabolizing 90% of drugs on the market, and many computational methods can predict which atomic sites of a molecule--sites of metabolism (SOMs)--are modified during CYP-mediated metabolism. This study improves on prior methods of predicting CYP-mediated SOMs by using new descriptors and machine learning based on neural networks. The new method, XenoSite, is faster to train and more accurate by as much as 4% or 5% for some isozymes. Furthermore, some "incorrect" predictions made by XenoSite were subsequently validated as correct predictions by revaluation of the source literature. Moreover, XenoSite output is interpretable as a probability, which reflects both the confidence of the model that a particular atom is metabolized and the statistical likelihood that its prediction for that atom is correct.

  12. A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

    KAUST Repository

    Zhang, Runxuan

    2017-04-05

    Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis and the lack of high quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.

  13. A high quality Arabidopsis transcriptome for accurate transcript-level analysis of alternative splicing

    KAUST Repository

    Zhang, Runxuan; Calixto, Cristiane P. G.; Marquez, Yamile; Venhuizen, Peter; Tzioutziou, Nikoleta A.; Guo, Wenbin; Spensley, Mark; Entizne, Juan Carlos; Lewandowska, Dominika; ten Have, Sara; Frei dit Frey, Nicolas; Hirt, Heribert; James, Allan B.; Nimmo, Hugh G.; Barta, Andrea; Kalyna, Maria; Brown, John W. S.

    2017-01-01

    Alternative splicing generates multiple transcript and protein isoforms from the same gene and thus is important in gene expression regulation. To date, RNA-sequencing (RNA-seq) is the standard method for quantifying changes in alternative splicing on a genome-wide scale. Understanding the current limitations of RNA-seq is crucial for reliable analysis and the lack of high quality, comprehensive transcriptomes for most species, including model organisms such as Arabidopsis, is a major constraint in accurate quantification of transcript isoforms. To address this, we designed a novel pipeline with stringent filters and assembled a comprehensive Reference Transcript Dataset for Arabidopsis (AtRTD2) containing 82,190 non-redundant transcripts from 34,212 genes. Extensive experimental validation showed that AtRTD2 and its modified version, AtRTD2-QUASI, for use in Quantification of Alternatively Spliced Isoforms, outperform other available transcriptomes in RNA-seq analysis. This strategy can be implemented in other species to build a pipeline for transcript-level expression and alternative splicing analyses.

  14. Crosstalk in modern on-chip interconnects a FDTD approach

    CERN Document Server

    Kaushik, B K; Patnaik, Amalendu

    2016-01-01

    The book provides accurate FDTD models for on-chip interconnects, covering the most recent advancements in materials and design. Furthermore, depending on the geometry and physical configurations, different electrical equivalent models for CNT- and GNR-based interconnects are presented. Based on the electrical equivalent models, the book also compares the performance of Cu-, CNT- and GNR-based interconnects. The proposed models are validated with HSPICE simulations. The book introduces the current research scenario in the modeling of on-chip interconnects. It presents the structure, properties, and characteristics of graphene-based on-chip interconnects and the FDTD modeling of Cu-based on-chip interconnects. The model considers the non-linear effects of the CMOS driver as well as the transmission line effects of the interconnect line, including coupling capacitance and mutual inductance effects. In a more realistic manner, the proposed model includes the effect of width-dependent MFP of the ...

  15. PredictSNP2: A Unified Platform for Accurately Evaluating SNP Effects by Exploiting the Different Characteristics of Variants in Distinct Genomic Regions.

    Science.gov (United States)

    Bendl, Jaroslav; Musil, Miloš; Štourač, Jan; Zendulka, Jaroslav; Damborský, Jiří; Brezovský, Jan

    2016-05-01

    An important message taken from human genome sequencing projects is that the human population exhibits approximately 99.9% genetic similarity. Variations in the remaining parts of the genome determine our identity, trace our history and reveal our heritage. The precise delineation of phenotypically causal variants plays a key role in providing accurate personalized diagnosis, prognosis, and treatment of inherited diseases. Several computational methods for achieving such delineation have been reported recently. However, their ability to pinpoint potentially deleterious variants is limited by the fact that their mechanisms of prediction do not account for the existence of different categories of variants. Consequently, their output is biased towards the variant categories that are most strongly represented in the variant databases. Moreover, most such methods provide numeric scores but not binary predictions of the deleteriousness of variants or confidence scores that would be more easily understood by users. We have constructed three datasets covering different types of disease-related variants, which were divided across five categories: (i) regulatory, (ii) splicing, (iii) missense, (iv) synonymous, and (v) nonsense variants. These datasets were used to develop category-optimal decision thresholds and to evaluate six tools for variant prioritization: CADD, DANN, FATHMM, FitCons, FunSeq2 and GWAVA. This evaluation revealed some important advantages of the category-based approach. The results obtained with the five best-performing tools were then combined into a consensus score. Additional comparative analyses showed that in the case of missense variations, protein-based predictors perform better than DNA sequence-based predictors. A user-friendly web interface was developed that provides easy access to the five tools' predictions, and their consensus scores, in a user-understandable format tailored to the specific features of different categories of variations. 
To

  16. A single chip with multiple talents

    CERN Multimedia

    Francesco Poppi

    2010-01-01

    The Medipix chips developed at CERN are being used in a variety of fields: from medicine to education and back to high-tech engineering. The scene is set for a bright future for this versatile technology. It didn’t take long for a brilliant team of physicists and engineers who were working on pixel detectors for the LHC to realize that the technology had great potential in medical imaging. This was the birth of the Medipix project. Fifteen years later, with the collaboration of 18 research institutes, the team has produced an advanced version of the initial ideas: Medipix3 is a device that can measure very accurately the position and energy of the photons (one by one) that hit the associated detector. Radiography and computed tomography (CT) use X-ray photons to study the human body. The different energies of the photons in the beam can be thought of as the colours of the X-ray spectrum. This is why the use of Medipix3 chips in such diagnostic techniques is referred...

  17. PySeqLab: an open source Python package for sequence labeling and segmentation.

    Science.gov (United States)

    Allam, Ahmed; Krauthammer, Michael

    2017-11-01

    Text and genomic data are composed of sequential tokens, such as words and nucleotides, that give rise to higher-order syntactic constructs. In this work, we aim at providing a comprehensive Python library implementing conditional random fields (CRFs), a class of probabilistic graphical models, for robust prediction of these constructs from sequential data. Python Sequence Labeling (PySeqLab) is an open source package for performing supervised learning in structured prediction tasks. It implements CRF models, that is, discriminative models ranging (i) from first-order to higher-order linear-chain CRFs and (ii) from first-order to higher-order semi-Markov CRFs (semi-CRFs). Moreover, it provides multiple learning algorithms for estimating model parameters, such as (i) stochastic gradient descent (SGD) and its multiple variations, (ii) structured perceptron with multiple averaging schemes supporting exact and inexact search using the 'violation-fixing' framework, (iii) a search-based probabilistic online learning algorithm (SAPO) and (iv) an interface for Broyden-Fletcher-Goldfarb-Shanno (BFGS) and the limited-memory BFGS algorithms. Viterbi and Viterbi A* are used for inference and decoding of sequences. Using PySeqLab, we built models (classifiers) and evaluated their performance in three different domains: (i) biomedical natural language processing (NLP), (ii) predictive DNA sequence analysis and (iii) human activity recognition (HAR). State-of-the-art performance comparable to machine-learning based systems was achieved in the three domains without feature engineering or the use of knowledge sources. PySeqLab is available through https://bitbucket.org/A_2/pyseqlab with tutorials and documentation. Contact: ahmed.allam@yale.edu or michael.krauthammer@yale.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
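
    Viterbi decoding, which the abstract names as the inference step, can be illustrated with a generic first-order chain. The sketch below is not PySeqLab's API; the function `viterbi`, the two states ("H" for GC-rich, "L" for AT-rich), and all probabilities are hypothetical and chosen only to show the dynamic program.

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Most likely state path for a first-order chain; all scores are log-domain."""
    best = [{s: start[s] + emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        scores, ptr = {}, {}
        for s in states:
            # Best predecessor state for landing in s at this position
            prev = max(states, key=lambda p: best[-1][p] + trans[p][s])
            scores[s] = best[-1][prev] + trans[prev][s] + emit[s][o]
            ptr[s] = prev
        best.append(scores)
        back.append(ptr)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

states = ["H", "L"]
log = math.log
start = {"H": log(0.5), "L": log(0.5)}
trans = {"H": {"H": log(0.9), "L": log(0.1)},
         "L": {"H": log(0.1), "L": log(0.9)}}
emit = {"H": {"G": log(0.4), "C": log(0.4), "A": log(0.1), "T": log(0.1)},
        "L": {"G": log(0.1), "C": log(0.1), "A": log(0.4), "T": log(0.4)}}
labels = viterbi("GGCCATTA", states, start, trans, emit)
```

    For this toy input the decoder labels the first four bases "H" and the last four "L". A CRF would replace the log-probabilities with learned feature scores while keeping the same recursion.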

  18. COBRA-Seq: Sensitive and Quantitative Methylome Profiling

    Directory of Open Access Journals (Sweden)

    Hilal Varinli

    2015-10-01

    Combined Bisulfite Restriction Analysis (COBRA) quantifies DNA methylation at a specific locus. It does so via digestion of PCR amplicons produced from bisulfite-treated DNA, using a restriction enzyme that contains a cytosine within its recognition sequence, such as TaqI. Here, we introduce COBRA-seq, a genome-wide reduced methylome method that requires minimal DNA input (0.1–1.0 mg) and can use either PCR or linear amplification to amplify the sequencing library. Variants of COBRA-seq can be used to explore CpG-depleted as well as CpG-rich regions in vertebrate DNA. The choice of enzyme influences enrichment for specific genomic features, such as CpG-rich promoters and CpG islands, or enrichment for less CpG-dense regions such as enhancers. COBRA-seq coupled with linear amplification has the additional advantage of reduced PCR bias by producing full-length fragments at high abundance. Unlike other reduced representative methylome methods, COBRA-seq has great flexibility in the choice of enzyme and can be multiplexed and tuned to reduce sequencing costs and to interrogate different numbers of sites. Moreover, COBRA-seq is applicable to non-model organisms without a reference genome and is compatible with the investigation of non-CpG methylation by using restriction enzymes containing CpA, CpT, and CpC in their recognition sites.

  19. Microbiome Data Accurately Predicts the Postmortem Interval Using Random Forest Regression Models

    Directory of Open Access Journals (Sweden)

    Aeriel Belk

    2018-02-01

    Death investigations often include an effort to establish the postmortem interval (PMI) in cases in which the time of death is uncertain. The postmortem interval can lead to the identification of the deceased and the validation of witness statements and suspect alibis. Recent research has demonstrated that microbes provide an accurate clock that starts at death and relies on ecological change in the microbial communities that normally inhabit a body and its surrounding environment. Here, we explore how to build the most robust Random Forest regression models for prediction of PMI by testing models built on different sample types (gravesoil, skin of the torso, skin of the head), gene markers (16S ribosomal RNA (rRNA), 18S rRNA, internal transcribed spacer regions (ITS)), and taxonomic levels (sequence variants, species, genus, etc.). We also tested whether particular suites of indicator microbes were informative across different datasets. Generally, results indicate that the most accurate models for predicting PMI were built using gravesoil and skin data using the 16S rRNA genetic marker at the taxonomic level of phyla. Additionally, several phyla consistently contributed highly to model accuracy and may be candidate indicators of PMI.

  20. Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data

    Directory of Open Access Journals (Sweden)

    Duan Jialei

    2012-08-01

    Background: Rapid advances in next-generation sequencing methods have provided new opportunities for transcriptome sequencing (RNA-Seq). The unprecedented sequencing depth provided by RNA-Seq makes it a powerful and cost-efficient method for transcriptome study, and it has been widely used in model organisms and non-model organisms to identify and quantify RNA. For non-model organisms lacking well-defined genomes, de novo assembly is typically required for downstream RNA-Seq analyses, including SNP discovery and identification of genes differentially expressed by phenotypes. Although RNA-Seq has been successfully used to sequence many non-model organisms, the results of de novo assembly from short reads can still be improved by using recent bioinformatic developments. Results: In this study, we used 212.6 million paired-end reads, which accounted for 16.2 Gb, to assemble the hexaploid wheat transcriptome. Two state-of-the-art assemblers, Trinity and Trans-ABySS, which use the single and multiple k-mer methods, respectively, were used, and the whole de novo assembly process was divided into the following four steps: pre-assembly, merging different samples, removal of redundancy and scaffolding. We documented every detail of these steps and how these steps influenced assembly performance to gain insight into transcriptome assembly from short reads. After optimization, the assembled transcripts were comparable to Sanger-derived ESTs in terms of both continuity and accuracy. We also provided considerable new wheat transcript data to the community. Conclusions: It is feasible to assemble the hexaploid wheat transcriptome from short reads. Special attention should be paid to dealing with multiple samples to balance the spectrum of expression levels and redundancy. To obtain an accurate overview of RNA profiling, removal of redundancy may be crucial in de novo assembly.

  1. Towards Accurate Prediction of Unbalance Response, Oil Whirl and Oil Whip of Flexible Rotors Supported by Hydrodynamic Bearings

    Directory of Open Access Journals (Sweden)

    Rob Eling

    2016-09-01

    Journal bearings are used to support rotors in a wide range of applications. In order to ensure reliable operation, accurate analyses of these rotor-bearing systems are crucial. Coupled analysis of the rotor and the journal bearing is essential in the case that the rotor is flexible. The accuracy of prediction of the model at hand depends on its comprehensiveness. In this study, we construct three bearing models of increasing modeling comprehensiveness and use these to predict the response of two different rotor-bearing systems. The main goal is to evaluate the correlation with measurement data as a function of modeling comprehensiveness: 1D versus 2D pressure prediction, distributed versus lumped thermal model, Newtonian versus non-Newtonian fluid description and non-mass-conservative versus mass-conservative cavitation description. We conclude that all three models predict the existence of critical speeds and whirl for both rotor-bearing systems. However, the two more comprehensive models in general show better correlation with measurement data in terms of frequency and amplitude. Furthermore, we conclude that a thermal network model comprising temperature predictions of the bearing surroundings is essential to obtain accurate predictions. The results of this study aid in developing accurate and computationally-efficient models of flexible rotors supported by plain journal bearings.

  2. FibroChip, a Functional DNA Microarray to Monitor Cellulolytic and Hemicellulolytic Activities of Rumen Microbiota

    Directory of Open Access Journals (Sweden)

    Sophie Comtet-Marre

    2018-02-01

    Ruminants fulfill their energy needs for growth primarily through microbial breakdown of plant biomass in the rumen. Several biotic and abiotic factors influence the efficiency of fiber degradation, which can ultimately impact animal productivity and health. To provide more insight into mechanisms involved in the modulation of fibrolytic activity, a functional DNA microarray targeting genes encoding key enzymes involved in cellulose and hemicellulose degradation by rumen microbiota was designed. Eight carbohydrate-active enzyme (CAZyme) families (GH5, GH9, GH10, GH11, GH43, GH48, CE1, and CE6) were selected which represented 392 genes from bacteria, protozoa, and fungi. The DNA microarray, designated as FibroChip, was validated using targets of increasing complexity and demonstrated sensitivity and specificity. In addition, FibroChip was evaluated for its explorative and semi-quantitative potential. Differential expression of CAZyme genes was evidenced in the rumen bacterium Fibrobacter succinogenes S85 grown on wheat straw or cellobiose. FibroChip was used to identify the expressed CAZyme genes from the targeted families in the rumen of a cow fed a mixed diet based on grass silage. Among expressed genes, those encoding GH43, GH5, and GH10 families were the most represented. Most of the F. succinogenes genes detected by the FibroChip were also detected following RNA-seq analysis of RNA transcripts obtained from the rumen fluid sample. Use of the FibroChip also indicated that transcripts of fiber degrading enzymes derived from eukaryotes (protozoa and anaerobic fungi) represented a significant proportion of the total microbial mRNA pool. FibroChip represents a reliable and high-throughput tool that enables researchers to monitor active members of fiber degradation in the rumen.

  3. FibroChip, a Functional DNA Microarray to Monitor Cellulolytic and Hemicellulolytic Activities of Rumen Microbiota.

    Science.gov (United States)

    Comtet-Marre, Sophie; Chaucheyras-Durand, Frédérique; Bouzid, Ourdia; Mosoni, Pascale; Bayat, Ali R; Peyret, Pierre; Forano, Evelyne

    2018-01-01

    Ruminants fulfill their energy needs for growth primarily through microbial breakdown of plant biomass in the rumen. Several biotic and abiotic factors influence the efficiency of fiber degradation, which can ultimately impact animal productivity and health. To provide more insight into mechanisms involved in the modulation of fibrolytic activity, a functional DNA microarray targeting genes encoding key enzymes involved in cellulose and hemicellulose degradation by rumen microbiota was designed. Eight carbohydrate-active enzyme (CAZyme) families (GH5, GH9, GH10, GH11, GH43, GH48, CE1, and CE6) were selected which represented 392 genes from bacteria, protozoa, and fungi. The DNA microarray, designated as FibroChip, was validated using targets of increasing complexity and demonstrated sensitivity and specificity. In addition, FibroChip was evaluated for its explorative and semi-quantitative potential. Differential expression of CAZyme genes was evidenced in the rumen bacterium Fibrobacter succinogenes S85 grown on wheat straw or cellobiose. FibroChip was used to identify the expressed CAZyme genes from the targeted families in the rumen of a cow fed a mixed diet based on grass silage. Among expressed genes, those encoding GH43, GH5, and GH10 families were the most represented. Most of the F. succinogenes genes detected by the FibroChip were also detected following RNA-seq analysis of RNA transcripts obtained from the rumen fluid sample. Use of the FibroChip also indicated that transcripts of fiber degrading enzymes derived from eukaryotes (protozoa and anaerobic fungi) represented a significant proportion of the total microbial mRNA pool. FibroChip represents a reliable and high-throughput tool that enables researchers to monitor active members of fiber degradation in the rumen.

  4. MITIE: Simultaneous RNA-Seq-based transcript identification and quantification in multiple samples.

    Science.gov (United States)

    Behr, Jonas; Kahles, André; Zhong, Yi; Sreedharan, Vipin T; Drewe, Philipp; Rätsch, Gunnar

    2013-10-15

    High-throughput sequencing of mRNA (RNA-Seq) has led to tremendous improvements in the detection of expressed genes and reconstruction of RNA transcripts. However, the extensive dynamic range of gene expression, technical limitations and biases, as well as the observed complexity of the transcriptional landscape, pose profound computational challenges for transcriptome reconstruction. We present the novel framework MITIE (Mixed Integer Transcript IdEntification) for simultaneous transcript reconstruction and quantification. We define a likelihood function based on the negative binomial distribution, use a regularization approach to select a few transcripts collectively explaining the observed read data and show how to find the optimal solution using Mixed Integer Programming. MITIE can (i) take advantage of known transcripts, (ii) reconstruct and quantify transcripts simultaneously in multiple samples, and (iii) resolve the location of multi-mapping reads. It is designed for genome- and assembly-based transcriptome reconstruction. We present an extensive study based on realistic simulated RNA-Seq data. When compared with state-of-the-art approaches, MITIE proves to be significantly more sensitive and overall more accurate. Moreover, MITIE yields substantial performance gains when used with multiple samples. We applied our system to 38 Drosophila melanogaster modENCODE RNA-Seq libraries and estimated the sensitivity of reconstructing omitted transcript annotations and the specificity with respect to annotated transcripts. Our results corroborate that a well-motivated objective paired with appropriate optimization techniques leads to significant improvements over the state-of-the-art in transcriptome reconstruction. MITIE is implemented in C++ and is available from http://bioweb.me/mitie under the GPL license.
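
    A likelihood based on the negative binomial distribution can be sketched as below. The abstract does not give MITIE's exact parametrization, so this example assumes the common mean/dispersion form with mean `mu` and dispersion `r`; the function names and the read counts are hypothetical.

```python
import math

def nb_log_pmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu, dispersion r."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu))
            + k * math.log(mu / (r + mu)))

def log_likelihood(counts, expected, r):
    """Sum of per-position log-probabilities of observed read counts."""
    return sum(nb_log_pmf(k, mu, r) for k, mu in zip(counts, expected))

# Hypothetical read counts at three positions and two competing expected coverages
observed = [10, 12, 9]
good_fit = log_likelihood(observed, [10.0, 10.0, 10.0], r=5.0)
poor_fit = log_likelihood(observed, [2.0, 2.0, 2.0], r=5.0)
```

    A transcript model whose expected coverage matches the observed counts scores a higher log-likelihood (`good_fit > poor_fit`); a regularizer would then penalize the number of transcripts used to reach that fit.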

  5. Meta-analytic approach to the accurate prediction of secreted virulence effectors in gram-negative bacteria

    Directory of Open Access Journals (Sweden)

    Sato Yoshiharu

    2011-11-01

    Background: Many pathogens use a type III secretion system to translocate virulence proteins (called effectors) in order to adapt to the host environment. To date, many prediction tools for effector identification have been developed. However, these tools are insufficiently accurate for producing a list of putative effectors that can be applied directly for labor-intensive experimental verification. This also suggests that important features of effectors have yet to be fully characterized. Results: In this study, we have constructed an accurate approach to predicting secreted virulence effectors from Gram-negative bacteria. This consists of a support vector machine-based discriminant analysis followed by simple criteria-based filtering. The accuracy was assessed by estimating the average number of true positives in the top-20 ranking in the genome-wide screening. In the validation, 10 sets of 20 training and 20 testing examples were randomly selected from 40 known effectors of Salmonella enterica serovar Typhimurium LT2. On average, the SVM portion of our system predicted 9.7 true positives from 20 testing examples in the top-20 of the prediction. Removal of the N-terminal instability, codon adaptation index and ProtParam indices decreased the score to 7.6, 8.9 and 7.9, respectively. These discrimination features suggested that the following characteristics of effectors had been uncovered: unstable N-terminus, non-optimal codon usage, hydrophilic, and less aliphatic. The secondary filtering process represented by coexpression analysis and domain distribution analysis further refined the average true positive counts to 12.3. We further confirmed that our system can correctly predict known effectors of P. syringae DC3000, strongly indicating its feasibility. Conclusions: We have successfully developed an accurate prediction system for screening effectors on a genome-wide scale. We confirmed the accuracy of our system by external validation

  6. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.

  7. Edge chipping and flexural resistance of monolithic ceramics

    Science.gov (United States)

    Zhang, Yu; Lee, James J.-W.; Srikanth, Ramanathan; Lawn, Brian R.

    2014-01-01

    Objective Test the hypothesis that monolithic ceramics can be developed with combined esthetics and superior fracture resistance to circumvent processing and performance drawbacks of traditional all-ceramic crowns and fixed-dental-prostheses consisting of a hard and strong core with an esthetic porcelain veneer. Specifically, to demonstrate that monolithic prostheses can be produced with a much reduced susceptibility to fracture. Methods Protocols were applied for quantifying resistance to chipping as well as resistance to flexural failure in two classes of dental ceramic, microstructurally-modified zirconias and lithium disilicate glass–ceramics. A sharp indenter was used to induce chips near the edges of flat-layer specimens, and the results compared with predictions from a critical load equation. The critical loads required to produce cementation surface failure in monolithic specimens bonded to dentin were computed from established flexural strength relations and the predictions validated with experimental data. Results Monolithic zirconias have superior chipping and flexural fracture resistance relative to their veneered counterparts. While they have superior esthetics, glass–ceramics exhibit lower strength but higher chip fracture resistance relative to porcelain-veneered zirconias. Significance The study suggests a promising future for new and improved monolithic ceramic restorations, with combined durability and acceptable esthetics. PMID:24139756

  8. Silicon Chip-to-Chip Mode-Division Multiplexing

    DEFF Research Database (Denmark)

    Baumann, Jan Markus; Porto da Silva, Edson; Ding, Yunhong

    2018-01-01

    A chip-to-chip mode-division multiplexing connection is demonstrated using a pair of multiplexers/demultiplexers fabricated on the silicon-on-insulator platform. Successful mode multiplexing and demultiplexing is experimentally demonstrated, using the LP01, LP11a and LP11b modes.

  9. A statistical method for the detection of alternative splicing using RNA-seq.

    Directory of Open Access Journals (Sweden)

    Liguo Wang

    2010-01-01

    Full Text Available Deep sequencing of the transcriptome (RNA-seq) provides an unprecedented opportunity to interrogate plausible mRNA splicing patterns by mapping RNA-seq reads to exon junctions (hereafter junction reads). In most previous studies, exon junctions were detected by using the quantitative information of junction reads. The quantitative criterion (e.g. a minimum of two junction reads), although straightforward and widely used, usually results in high false positive and false negative rates, owing to the complexity of the transcriptome. Here, we introduced a new metric, namely Minimal Match on Either Side of exon junction (MMES), to measure the quality of each junction read, and subsequently implemented an empirical statistical model to detect exon junctions. When applied to a large dataset (>200M reads) consisting of mouse brain, liver and muscle mRNA sequences, and using independent transcript databases as positive control, our method proved to be considerably more accurate than previous ones, especially for detecting junctions originating from low-abundance transcripts. Our results were also confirmed by real time RT-PCR assay. The MMES metric can be used either in this empirical statistical model or in other more sophisticated classifiers, such as logistic regression.
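    As a rough illustration of the MMES idea described in the abstract above, the sketch below scores each junction read by the smaller of its two flanking match counts and keeps junctions whose best read meets a cutoff. The function names, the input tuple layout, and the cutoff of 8 are hypothetical choices for illustration only; the abstract does not give the authors' exact scoring rules (for example, how mismatches are handled) or their statistical model.

```python
def mmes(left_matches: int, right_matches: int) -> int:
    """Simplified MMES: the smaller number of matched bases on either side
    of the exon junction (mismatch handling is deliberately ignored here)."""
    return min(left_matches, right_matches)


def supported_junctions(reads, min_mmes=8):
    """Return {junction_id: best MMES} for junctions whose best read
    reaches the (hypothetical) min_mmes cutoff.

    `reads` is an iterable of (junction_id, left_matches, right_matches).
    """
    best = {}
    for junction, left, right in reads:
        score = mmes(left, right)
        # Keep the highest-scoring read seen so far for each junction.
        best[junction] = max(best.get(junction, 0), score)
    return {j: s for j, s in best.items() if s >= min_mmes}


if __name__ == "__main__":
    example_reads = [
        ("J1", 30, 6),   # long left overhang, short right overhang
        ("J1", 20, 16),  # balanced read, stronger evidence for J1
        ("J2", 33, 3),   # only three bases cross the junction
    ]
    print(supported_junctions(example_reads))
```

    Under this simplification a read that barely crosses the junction contributes little evidence regardless of how many such reads exist, which is the contrast with a pure read-count criterion that the abstract draws.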

  10. Quantitative profiling of selective Sox/POU pairing on hundreds of sequences in parallel by Coop-seq.

    Science.gov (United States)

    Chang, Yiming K; Srivastava, Yogesh; Hu, Caizhen; Joyce, Adam; Yang, Xiaoxiao; Zuo, Zheng; Havranek, James J; Stormo, Gary D; Jauch, Ralf

    2017-01-25

    Cooperative binding of transcription factors is known to be important in the regulation of gene expression programs conferring cellular identities. However, current methods to measure cooperativity parameters have been laborious and therefore limited to studying only a few sequence variants at a time. We developed Coop-seq (cooperativity by sequencing) that is capable of efficiently and accurately determining the cooperativity parameters for hundreds of different DNA sequences in a single experiment. We apply Coop-seq to 12 dimer pairs from the Sox and POU families of transcription factors using 324 unique sequences with changed half-site orientation, altered spacing and discrete randomization within the binding elements. The study reveals specific dimerization profiles of different Sox factors with Oct4. By contrast, Oct4 and the three neural class III POU factors Brn2, Brn4 and Oct6 assemble with Sox2 in a surprisingly indistinguishable manner. Two novel half-site configurations can support functional Sox/Oct dimerization in addition to known composite motifs. Moreover, Coop-seq uncovers a nucleotide switch within the POU half-site when spacing is altered, which is mirrored in genomic loci bound by Sox2/Oct4 complexes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Ultrahigh-speed hybrid laser for silicon photonic integrated chips

    DEFF Research Database (Denmark)

    Chung, Il-Sug; Park, Gyeong Cheol; Ran, Qijiang

    2013-01-01

    Increasing power consumption for electrical interconnects between and inside chips is posing a real challenge to continuing the performance scaling of processors/computers as predicted by D. Moore. In recent processors, energy consumption for electrical interconnects is half of power supplied...... and will be 80% in near future. This challenge has strongly motivated replacing electrical interconnects with optical ones, even in chip-level communications [1]. These chip-level optical interconnects require quite different performance from optoelectronic devices than is required for conventional optical communications....... For a light source, the energy consumption per sending a bit is required to be

  12. A Lab-on-Chip Design for Miniature Autonomous Bio-Chemoprospecting Planetary Rovers

    Science.gov (United States)

    Santoli, S.

    The performance of the so-called 'Lab-on-Chip' devices, featuring micrometre size components and employed at present for carrying out in a very fast and economic way the extremely high number of sequence determinations required in genomic analyses, can be largely improved as to further size reduction, decrease of power consumption and reaction efficiency through development of nanofluidics and of nano-to-micro integrated systems. As is shown, such new technologies would lead to robotic, fully autonomous, microwatt consumption and complete 'laboratory on a chip' units for accurate, fast and cost-effective astrobiological and planetary exploration missions. The theory and the manufacturing technologies for the 'active chip' of a miniature bio/chemoprospecting planetary rover working on micro- and nanofluidics are investigated. The chip would include micro- and nanoreactors, integrated MEMS (MicroElectroMechanical System) components, nanoelectronics and an intracavity nanolaser for highly accurate and fast chemical analysis as an application of such recently introduced solid state devices. Nano-reactors would be able to strongly speed up reaction kinetics as a result of increased frequency of reactive collisions. The reaction dynamics may also be altered with respect to standard macroscopic reactors. A built-in miniature telemetering unit would connect a network of other similar rovers and a central, ground-based or orbiting control unit for data collection and transmission to an Earth-based unit through a powerful antenna. The development of the 'Lab-on-Chip' concept for space applications would affect the economy of space exploration missions, as the rover's 'Lab-on-Chip' development would link space missions with the ever growing terrestrial market and business concerning such devices, largely employed in modern genomics and bioinformatics, so that it would allow the recoupment of space mission costs.

  13. Getting the most out of RNA-seq data analysis

    Directory of Open Access Journals (Sweden)

    Tsung Fei Khang

    2015-10-01

    Full Text Available Background. A common research goal in transcriptome projects is to find genes that are differentially expressed in different phenotype classes. Biologists might wish to validate such gene candidates experimentally, or use them for downstream systems biology analysis. Producing a coherent differential gene expression analysis from RNA-seq count data requires an understanding of how numerous sources of variation such as the replicate size, the hypothesized biological effect size, and the specific method for making differential expression calls interact. We believe an explicit demonstration of such interactions in real RNA-seq data sets is of practical interest to biologists. Results. Using two large public RNA-seq data sets (one representing strong, and another mild, biological effect size) we simulated different replicate size scenarios, and tested the performance of several commonly-used methods for calling differentially expressed genes in each of them. We found that, when biological effect size was mild, RNA-seq experiments should focus on experimental validation of differentially expressed gene candidates. Importantly, at least triplicates must be used, and the differentially expressed genes should be called using methods with high positive predictive value (PPV), such as NOISeq or GFOLD. In contrast, when biological effect size was strong, differentially expressed genes mined from unreplicated experiments using NOISeq, ASC and GFOLD had between 30 and 50% mean PPV, an increase of more than 30-fold compared to the cases of mild biological effect size. Among methods with good PPV performance, having triplicates or more substantially improved mean PPV to over 90% for GFOLD, 60% for DESeq2, 50% for NOISeq, and 30% for edgeR. At a replicate size of six, we found DESeq2 and edgeR to be reasonable methods for calling differentially expressed genes at systems level analysis, as their PPV and sensitivity trade-off were superior to the other methods
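    The positive predictive value (PPV) that the abstract above uses to compare methods is the fraction of called genes that are truly differentially expressed. The sketch below shows that standard definition and a mean over several simulated runs; the function names and the example counts are illustrative assumptions, not values from the study.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): the share of called genes that are true hits."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("PPV is undefined when no genes are called")
    return true_positives / called


def mean_ppv(runs) -> float:
    """Average PPV over simulation runs given as (TP, FP) pairs."""
    values = [positive_predictive_value(tp, fp) for tp, fp in runs]
    if not values:
        raise ValueError("at least one run is required")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical counts from two simulated replicate scenarios.
    print(mean_ppv([(9, 1), (6, 4)]))
```

    A method can reach a high PPV simply by calling very few genes, which is why the abstract weighs PPV together with sensitivity when recommending methods.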

  14. Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications.

    Directory of Open Access Journals (Sweden)

    Xiao-Lin Wu

    Full Text Available Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus
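    To make the "locus-averaged Shannon entropy" term in the abstract above concrete, the sketch below computes the textbook Shannon entropy of a biallelic SNP from its allele frequency and averages it over the selected loci. This is only the generic definition under the assumption of biallelic markers; the abstract does not spell out the authors' exact formula, and the uniformity adjustment and the haplotype-averaged variant it mentions are not reproduced here.

```python
import math


def locus_entropy(allele_freq: float) -> float:
    """Shannon entropy in bits of a biallelic SNP with allele frequency p."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    entropy = 0.0
    for p in (allele_freq, 1.0 - allele_freq):
        if p > 0.0:  # the limit of p * log2(p) as p -> 0 is 0
            entropy -= p * math.log2(p)
    return entropy


def locus_averaged_entropy(allele_freqs) -> float:
    """Mean per-locus entropy over the SNPs chosen for a chip."""
    freqs = list(allele_freqs)
    if not freqs:
        raise ValueError("at least one SNP is required")
    return sum(locus_entropy(p) for p in freqs) / len(freqs)


if __name__ == "__main__":
    # Hypothetical allele frequencies for three candidate SNPs.
    print(locus_averaged_entropy([0.5, 0.3, 0.05]))
```

    Under this definition a SNP with allele frequency 0.5 carries the maximum of one bit and a monomorphic SNP carries none, so maximizing the average favours common, informative markers.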

  15. An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study.

    Science.gov (United States)

    Wang, Zichen; Ma'ayan, Avi

    2016-01-01

    RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at:  http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and  https://hub.docker.com/r/maayanlab/zika/.

  16. Application of LogitBoost Classifier for Traceability Using SNP Chip Data.

    Science.gov (United States)

    Kim, Kwondo; Seo, Minseok; Kang, Hyunsung; Cho, Seoae; Kim, Heebal; Seo, Kang-Seok

    2015-01-01

    Consumer attention to food safety has increased rapidly due to animal-related diseases; therefore, it is important to identify their places of origin (POO) for safety purposes. However, only a few studies have addressed this issue and focused on machine learning-based approaches. In the present study, classification analyses were performed using a customized SNP chip for POO prediction. To accomplish this, 4,122 pigs originating from 104 farms were genotyped using the SNP chip. Several factors were considered to establish the best prediction model based on these data. We also assessed the applicability of the suggested model using a kinship coefficient-filtering approach. Our results showed that the LogitBoost-based prediction model outperformed other classifiers in terms of classification performance under most conditions. Specifically, a greater level of accuracy was observed when a higher kinship-based cutoff was employed. These results demonstrated the applicability of a machine learning-based approach using SNP chip data for practical traceability.

  17. Evaluating whole transcriptome amplification for gene profiling experiments using RNA-Seq.

    Science.gov (United States)

    Faherty, Sheena L; Campbell, C Ryan; Larsen, Peter A; Yoder, Anne D

    2015-07-30

    RNA-Seq has enabled high-throughput gene expression profiling to provide insight into the functional link between genotype and phenotype. Low quantities of starting RNA can be a severe hindrance for studies that aim to utilize RNA-Seq. To mitigate this bottleneck, whole transcriptome amplification (WTA) technologies have been developed to generate sufficient sequencing targets from minute amounts of RNA. Successful WTA requires accurate replication of transcript abundance without the loss or distortion of specific mRNAs. Here, we test the efficacy of NuGEN's Ovation RNA-Seq V2 system, which uses linear isothermal amplification with a unique chimeric primer for amplification, using white adipose tissue from standard laboratory rats (Rattus norvegicus). Our goal was to investigate potential biological artifacts introduced through WTA approaches by establishing comparisons between matched raw and amplified RNA libraries derived from biological replicates. We found that 93% of expressed genes were identical between all unamplified versus matched amplified comparisons, also finding that gene density is similar across all comparisons. Our sequencing experiment and downstream bioinformatic analyses using the Tuxedo analysis pipeline resulted in the assembly of 25,543 high-quality transcripts. Libraries constructed from raw RNA and WTA samples averaged 15,298 and 15,253 expressed genes, respectively. Although significant differentially expressed genes (P < 0.05) were identified in all matched samples, each of these represents less than 0.15% of all shared genes for each comparison. Transcriptome amplification is efficient at maintaining relative transcript frequencies with no significant bias when using this NuGEN linear isothermal amplification kit under ideal laboratory conditions as presented in this study. This methodology has broad applications, from clinical and diagnostic, to field-based studies when sample acquisition, or sample preservation, methods prove

  18. Accurate cut-offs for predicting endoscopic activity and mucosal healing in Crohn's disease with fecal calprotectin

    Directory of Open Access Journals (Sweden)

    Juan María Vázquez-Morón

    Full Text Available Background: Fecal biomarkers, especially fecal calprotectin, are useful for predicting endoscopic activity in Crohn's disease; however, the cut-off point remains unclear. The aim of this paper was to analyze whether fecal calprotectin and M2 pyruvate kinase are good tools for generating highly accurate scores for the prediction of the state of endoscopic activity and mucosal healing. Methods: The simple endoscopic score for Crohn's disease and the Crohn's disease activity index were calculated for 71 patients diagnosed with Crohn's. Fecal calprotectin and M2-PK were measured by the enzyme-linked immunosorbent assay test. Results: A fecal calprotectin cut-off concentration of ≥ 170 µg/g (sensitivity 77.6%, specificity 95.5% and likelihood ratio +17.06) predicts a high probability of endoscopic activity, and a fecal calprotectin cut-off of ≤ 71 µg/g (sensitivity 95.9%, specificity 52.3% and likelihood ratio -0.08) predicts a high probability of mucosal healing. Three clinical groups were identified according to the data obtained: endoscopic activity (calprotectin ≥ 170), mucosal healing (calprotectin ≤ 71) and uncertainty (71 < calprotectin < 170), with significant differences in endoscopic values (F = 26.407, p < 0.01). Clinical activity or remission modified the probabilities of presenting endoscopic activity (100% vs 89%) or mucosal healing (75% vs 87%) in the diagnostic scores generated. M2-PK was insufficiently accurate to determine scores. Conclusions: The highly accurate scores for fecal calprotectin provide a useful tool for interpreting the probabilities of presenting endoscopic activity or mucosal healing, and are valuable in the specific clinical context.
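    The three-group rule and the likelihood ratios quoted in the abstract above can be sketched as follows. The cut-offs (71 and 170 µg/g) come from the abstract; the function names are hypothetical, and the likelihood-ratio formulas are the standard textbook definitions rather than anything the abstract states. Applying them to the rounded percentages gives roughly 17.2 and 0.08, close to the reported +17.06 and -0.08; the small gap for the positive ratio is presumably due to rounding of the published sensitivity and specificity.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


def classify_calprotectin(value_ug_per_g: float,
                          healing_cutoff: float = 71.0,
                          activity_cutoff: float = 170.0) -> str:
    """Assign a fecal calprotectin value to one of the three groups."""
    if value_ug_per_g >= activity_cutoff:
        return "endoscopic activity"
    if value_ug_per_g <= healing_cutoff:
        return "mucosal healing"
    return "uncertain"


if __name__ == "__main__":
    print(round(positive_likelihood_ratio(0.776, 0.955), 2))
    print(round(negative_likelihood_ratio(0.959, 0.523), 2))
    for value in (50, 100, 200):  # hypothetical measurements in µg/g
        print(value, classify_calprotectin(value))
```

    This is an illustration of the arithmetic only and not a clinical decision tool.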

  19. Flip chip assembly of thinned chips for hybrid pixel detector applications

    International Nuclear Information System (INIS)

    Fritzsch, T; Zoschke, K; Rothermund, M; Oppermann, H; Woehrmann, M; Ehrmann, O; Lang, K D; Huegging, F

    2014-01-01

    There is a steady trend to ultra-thin microelectronic devices. Especially for future particle detector systems a reduced readout chip thickness is required to limit the loss of tracking precision due to scattering. The reduction of silicon thickness is performed at wafer level in a two-step thinning process. To minimize the risk of wafer breakage the thinned wafer needs to be handled by a carrier during the whole process chain of wafer bumping. Another key process is the flip chip assembly of thinned readout chips onto thin sensor tiles. Besides the prevention of silicon breakage, the minimization of chip warpage is an additional task for a high-yield and reliable flip chip process. A new technology using a glass carrier wafer will be described in detail. The main advantage of this technology is the combination of a carrier support during wafer processing and the chip support during flip chip assembly. For that, a glass wafer is glue-bonded onto the backside of the thinned readout chip wafer. After the bump deposition process the glass-readout chip stack is diced in one step. Finally the glass carrier chip is released by laser illumination after flip chip assembly of the readout chip onto the sensor tile. The results of the flip chip assembly process development for the ATLAS IBL upgrade are described in more detail. The new ATLAS FEI4B chip with a size of 20 × 19 mm² is flip chip bonded with a thickness of only 150 μm, but the capability of this technology has been demonstrated on hybrid modules with a reduced readout chip thickness of down to 50 μm, which is a major step for ultra-thin electronic systems

  20. Rnnotator: an automated de novo transcriptome assembly pipeline from stranded RNA-Seq reads

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Jeffrey; Bruno, Vincent M.; Fang, Zhide; Meng, Xiandong; Blow, Matthew; Zhang, Tao; Sherlock, Gavin; Snyder, Michael; Wang, Zhong

    2010-11-19

    Background: Comprehensive annotation and quantification of transcriptomes are outstanding problems in functional genomics. While high throughput mRNA sequencing (RNA-Seq) has emerged as a powerful tool for addressing these problems, its success is dependent upon the availability and quality of reference genome sequences, thus limiting the organisms to which it can be applied. Results: Here, we describe Rnnotator, an automated software pipeline that generates transcript models by de novo assembly of RNA-Seq data without the need for a reference genome. We have applied the Rnnotator assembly pipeline to two yeast transcriptomes and compared the results to the reference gene catalogs of these organisms. The contigs produced by Rnnotator are highly accurate (95%) and reconstruct full-length genes for the majority of the existing gene models (54.3%). Furthermore, our analyses revealed many novel transcribed regions that are absent from well annotated genomes, suggesting Rnnotator serves as a complementary approach to analysis based on a reference genome for comprehensive transcriptomics. Conclusions: These results demonstrate that the Rnnotator pipeline is able to reconstruct full-length transcripts in the absence of a complete reference genome.

  1. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    International Nuclear Information System (INIS)

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; Pronobis, Wiktor; Lilienfeld, O. Anatole von; Müller, Klaus-Robert; Tkatchenko, Alexandre

    2015-01-01

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the 'holy grail' of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies

  2. Hardware support for CSP on a Java chip multiprocessor

    DEFF Research Database (Denmark)

    Gruian, Flavius; Schoeberl, Martin

    2013-01-01

    Due to memory bandwidth limitations, chip multiprocessors (CMPs) adopting the convenient shared memory model for their main memory architecture scale poorly. On-chip core-to-core communication is a solution to this problem, that can lead to further performance increase for a number of multithreaded...... applications. Programmatically, the Communicating Sequential Processes (CSPs) paradigm provides a sound computational model for such an architecture with message based communication. In this paper we explore hardware support for CSP in the context of an embedded Java CMP. The hardware support for CSP are on......-chip communication channels, implemented by a ring-based network-on-chip (NoC), to reduce the memory bandwidth pressure on the shared memory.The presented solution is scalable and also specific for our limited resources and real-time predictability requirements. CMP architectures of three to eight processors were...

  3. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data.

    Directory of Open Access Journals (Sweden)

    Brett A McKinney

    Full Text Available Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples.
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to

  4. Automatic control of air curtains with CHIPS technology; Automatische regeling van luchtgordijnen met CHIPS-technologie

    Energy Technology Data Exchange (ETDEWEB)

    Cremers, B.E. [Biddle, Kloostertille (Netherlands)

    2010-03-15

    In times of drastic automation air curtains cannot lag behind. Yet how do you control a product whose operation depends not only on its own settings but also on the conditions in which it is used? This article describes the latest development in the automation of the air curtain above an open door. The automated air curtain now has the highest separation efficiency, low energy use and optimal comfort under changing circumstances without any need for manual adjustment. CHIPS refers to Corrective Heating and Impulse Prediction System.

  5. DisoMCS: Accurately Predicting Protein Intrinsically Disordered Regions Using a Multi-Class Conservative Score Approach.

    Directory of Open Access Journals (Sweden)

    Zhiheng Wang

    Full Text Available The precise prediction of protein intrinsically disordered regions, which play a crucial role in biological procedures, is a necessary prerequisite to further the understanding of the principles and mechanisms of protein function. Here, we propose a novel predictor, DisoMCS, which is a more accurate predictor of protein intrinsically disordered regions. DisoMCS is based on an original multi-class conservative score (MCS) obtained by sequence-order/disorder alignment. Initially, near-disorder regions are defined on fragments located at both termini of an ordered region connecting a disordered region. Then the multi-class conservative score is generated by sequence alignment against a known structure database and represented as order, near-disorder and disorder conservative scores. The MCS of each amino acid has three elements: order, near-disorder and disorder profiles. Finally, the MCS is exploited as features to identify disordered regions in sequences. DisoMCS utilizes a non-redundant data set as the training set, MCS and predicted secondary structure as features, and a conditional random field as the classification algorithm. In predicted near-disorder regions a residue is classified as ordered or disordered according to the optimized decision threshold. DisoMCS was evaluated by cross-validation, large-scale prediction, independent tests and CASP (Critical Assessment of Techniques for Protein Structure Prediction) tests. All results confirmed that DisoMCS was very competitive in terms of accuracy of prediction when compared with well-established publicly available disordered region predictors. It also indicated our approach was more accurate when a query has higher homology with the knowledge database. DisoMCS is available at http://cal.tongji.edu.cn/disorder/.

  6. TRANSIT--A Software Tool for Himar1 TnSeq Analysis.

    Directory of Open Access Journals (Sweden)

    Michael A DeJesus

    2015-10-01

    TnSeq has become a popular technique for determining the essentiality of genomic regions in bacterial organisms. Several methods have been developed to analyze the wealth of data that has been obtained through TnSeq experiments. We developed a tool for analyzing Himar1 TnSeq data called TRANSIT. TRANSIT provides a graphical interface to three different statistical methods for analyzing TnSeq data. These methods cover a variety of approaches capable of identifying essential genes in individual datasets as well as comparative analysis between conditions. We demonstrate the utility of this software by analyzing TnSeq datasets of M. tuberculosis grown on glycerol and cholesterol. We show that TRANSIT can be used to discover genes which have been previously implicated for growth on these carbon sources. TRANSIT is written in Python, and thus can be run on Windows, OSX and Linux platforms. The source code is distributed under the GNU GPL v3 license and can be obtained from the following GitHub repository: https://github.com/mad-lab/transit.

  7. ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data.

    Science.gov (United States)

    Ou, Jianhong; Liu, Haibo; Yu, Jun; Kelliher, Michelle A; Castilla, Lucio H; Lawson, Nathan D; Zhu, Lihua Julie

    2018-03-01

    ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets. We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset. This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments. The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html .

  8. An Integrated Approach for RNA-seq Data Normalization.

    Science.gov (United States)

    Yang, Shengping; Mercante, Donald E; Zhang, Kun; Fang, Zhide

    2016-01-01

    DNA copy number alteration is common in many cancers. Studies have shown that insertion or deletion of DNA sequences can directly alter gene expression, and significant correlation exists between DNA copy number and gene expression. Data normalization is a critical step in the analysis of gene expression generated by RNA-seq technology. Successful normalization reduces or removes unwanted nonbiological variation in the data, while keeping meaningful information intact. However, as far as we know, no attempt has been made to adjust for the variation due to DNA copy number changes in RNA-seq data normalization. In this article, we propose an integrated approach for RNA-seq data normalization. Comparisons show that the proposed normalization can improve power for downstream differentially expressed gene detection and generate more biologically meaningful results in gene profiling. In addition, our findings show that due to the effects of copy number changes, some housekeeping genes are not always suitable internal controls for studying gene expression. Using information from DNA copy number, the integrated approach is successful in reducing noise due to both biological and nonbiological causes in RNA-seq data, thus increasing the accuracy of gene profiling.

  9. Analysis of the resistive network in a bio-inspired CMOS vision chip

    Science.gov (United States)

    Kong, Jae-Sung; Sung, Dong-Kyu; Hyun, Hyo-Young; Shin, Jang-Kyoo

    2007-12-01

    CMOS vision chips for edge detection based on a resistive circuit have recently been developed. These chips help develop neuromorphic systems with a compact size, high speed of operation, and low power dissipation. The output of the vision chip depends dominantly upon the electrical characteristics of the resistive network which consists of a resistive circuit. In this paper, the body effect of the MOSFET for current distribution in a resistive circuit is discussed with a simple model. In order to evaluate the model, two 160×120 CMOS vision chips have been fabricated by using a standard CMOS technology. The experimental results have been nicely matched with our prediction.

  10. Simple Mathematical Models Do Not Accurately Predict Early SIV Dynamics

    Directory of Open Access Journals (Sweden)

    Cecilia Noecker

    2015-03-01

    Upon infection of a new host, human immunodeficiency virus (HIV) replicates in the mucosal tissues and is generally undetectable in circulation for 1–2 weeks post-infection. Several interventions against HIV including vaccines and antiretroviral prophylaxis target virus replication at this earliest stage of infection. Mathematical models have been used to understand how HIV spreads from mucosal tissues systemically and what impact vaccination and/or antiretroviral prophylaxis has on viral eradication. Because predictions of such models have been rarely compared to experimental data, it remains unclear which processes included in these models are critical for predicting early HIV dynamics. Here we modified the “standard” mathematical model of HIV infection to include two populations of infected cells: cells that are actively producing the virus and cells that are transitioning into virus production mode. We evaluated the effects of several poorly known parameters on infection outcomes in this model and compared model predictions to experimental data on infection of non-human primates with variable doses of simian immunodeficiency virus (SIV). First, we found that the mode of virus production by infected cells (budding vs. bursting) has a minimal impact on the early virus dynamics for a wide range of model parameters, as long as the parameters are constrained to provide the observed rate of SIV load increase in the blood of infected animals. Interestingly and in contrast with previous results, we found that the bursting mode of virus production generally results in a higher probability of viral extinction than the budding mode of virus production. Second, this mathematical model was not able to accurately describe the change in experimentally determined probability of host infection with increasing viral doses. Third and finally, the model was also unable to accurately explain the decline in the time to virus detection with increasing viral

  11. Improving medical decisions for incapacitated persons: does focusing on "accurate predictions" lead to an inaccurate picture?

    Science.gov (United States)

    Kim, Scott Y H

    2014-04-01

    The Patient Preference Predictor (PPP) proposal places a high priority on the accuracy of predicting patients' preferences and finds the performance of surrogates inadequate. However, the quest to develop a highly accurate, individualized statistical model has significant obstacles. First, it will be impossible to validate the PPP beyond the limit imposed by 60%-80% reliability of people's preferences for future medical decisions--a figure no better than the known average accuracy of surrogates. Second, evidence supports the view that a sizable minority of persons may not even have preferences to predict. Third, many, perhaps most, people express their autonomy just as much by entrusting their loved ones to exercise their judgment as by desiring to specifically control future decisions. Surrogate decision making faces none of these issues and, in fact, it may be more efficient, accurate, and authoritative than is commonly assumed.

  12. The MIDAS touch for Accurately Predicting the Stress-Strain Behavior of Tantalum

    Energy Technology Data Exchange (ETDEWEB)

    Jorgensen, S. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-03-02

    Testing the behavior of metals in extreme environments is not always feasible, so material scientists use models to try and predict the behavior. To achieve accurate results it is necessary to use the appropriate model and material-specific parameters. This research evaluated the performance of six material models available in the MIDAS database [1] to determine at which temperatures and strain-rates they perform best, and to determine to which experimental data their parameters were optimized. Additionally, parameters were optimized for the Johnson-Cook model using experimental data from Lassila et al [2].

  13. Kinetic model for torrefaction of wood chips in a pilot-scale continuous reactor

    DEFF Research Database (Denmark)

    Shang, Lei; Ahrenfeldt, Jesper; Holm, Jens Kai

    2014-01-01

    accordance with the model data. In an additional step a continuous, pilot scale reactor was built to produce torrefied wood chips in large quantities. The "two-step reaction in series" model was applied to predict the mass yield of the torrefaction reaction. Parameters used for the calculation were...... at different torrefaction temperatures, it was possible to predict the HHV of torrefied wood chips from the pilot reactor. The results from this study and the presented modeling approach can be used to predict the product quality from pilot scale torrefaction reactors based on small scale experiments and could...

  14. Predicting DNA-binding proteins and binding residues by complex structure prediction and application to human proteome.

    Directory of Open Access Journals (Sweden)

    Huiying Zhao

    As more and more protein sequences are uncovered from increasingly inexpensive sequencing techniques, an urgent task is to find their functions. This work presents a highly reliable computational technique for predicting DNA-binding function at the level of protein-DNA complex structures, rather than low-resolution two-state prediction of DNA-binding as most existing techniques do. The method first predicts protein-DNA complex structure by utilizing the template-based structure prediction technique HHblits, followed by binding affinity prediction based on a knowledge-based energy function (Distance-scaled finite ideal-gas reference state for protein-DNA interactions). A leave-one-out cross validation of the method based on 179 DNA-binding and 3797 non-binding protein domains achieves a Matthews correlation coefficient (MCC) of 0.77 with high precision (94%) and high sensitivity (65%). We further found 51% sensitivity for 82 newly determined structures of DNA-binding proteins and 56% sensitivity for the human proteome. In addition, the method provides a reasonably accurate prediction of DNA-binding residues in proteins based on predicted DNA-binding complex structures. Its application to the human proteome leads to more than 300 novel DNA-binding proteins; some of these predicted structures were validated by known structures of homologous proteins in APO forms. The method [SPOT-Seq (DNA)] is available as an on-line server at http://sparks-lab.org.
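
    The three figures quoted in this record (MCC, precision, sensitivity) all derive from one confusion matrix. The sketch below shows how they relate. The counts passed in are illustrative assumptions, back-calculated so that they are roughly consistent with 179 binding and 3797 non-binding domains; they are not reported in the abstract, and the function name is hypothetical.

```python
import math

def confusion_metrics(tp, fp, fn, tn):
    """Return (precision, sensitivity, MCC) for a binary confusion matrix."""
    precision = tp / (tp + fp)        # fraction of predicted binders that are real
    sensitivity = tp / (tp + fn)      # fraction of real binders that are found
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, sensitivity, mcc

# Assumed counts for illustration only: 116 + 63 = 179 binding domains,
# 7 + 3790 = 3797 non-binding domains.
precision, sensitivity, mcc = confusion_metrics(tp=116, fp=7, fn=63, tn=3790)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} MCC={mcc:.2f}")
```

    Because MCC uses all four cells, it stays informative on a data set this imbalanced, where accuracy alone would be dominated by the non-binding class.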

  15. A novel k-mer set memory (KSM) motif representation improves regulatory variant prediction.

    Science.gov (United States)

    Guo, Yuchun; Tian, Kevin; Zeng, Haoyang; Guo, Xiaoyun; Gifford, David Kenneth

    2018-04-13

    The representation and discovery of transcription factor (TF) sequence binding specificities is critical for understanding gene regulatory networks and interpreting the impact of disease-associated noncoding genetic variants. We present a novel TF binding motif representation, the k-mer set memory (KSM), which consists of a set of aligned k-mers that are overrepresented at TF binding sites, and a new method called KMAC for de novo discovery of KSMs. We find that KSMs more accurately predict in vivo binding sites than position weight matrix (PWM) models and other more complex motif models across a large set of ChIP-seq experiments. Furthermore, KSMs outperform PWMs and more complex motif models in predicting in vitro binding sites. KMAC also identifies correct motifs in more experiments than five state-of-the-art motif discovery methods. In addition, KSM-derived features outperform both PWM and deep learning model derived sequence features in predicting differential regulatory activities of expression quantitative trait loci (eQTL) alleles. Finally, we have applied KMAC to 1600 ENCODE TF ChIP-seq data sets and created a public resource of KSM and PWM motifs. We expect that the KSM representation and KMAC method will be valuable in characterizing TF binding specificities and in interpreting the effects of noncoding genetic variations. © 2018 Guo et al.; Published by Cold Spring Harbor Laboratory Press.
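
    The idea of a set of k-mers overrepresented at binding sites can be illustrated with a toy sketch. This is not KMAC and not the KSM data structure: it ignores the alignment of k-mers that a KSM stores, and the function names, the pseudocount, the `min_ratio` threshold, and the example sequences are all assumptions made for illustration.

```python
from collections import Counter

def kmer_presence(seqs, k):
    """Count how many sequences contain each k-mer (presence, not multiplicity)."""
    counts = Counter()
    for s in seqs:
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts

def enriched_kmers(bound, background, k, min_ratio=2.0):
    """Keep k-mers whose frequency in bound sequences exceeds the background."""
    pos = kmer_presence(bound, k)
    neg = kmer_presence(background, k)
    selected = set()
    for kmer, n_pos in pos.items():
        frac_pos = n_pos / len(bound)
        # Pseudocount avoids division by zero for k-mers absent from background.
        frac_neg = (neg.get(kmer, 0) + 1) / (len(background) + 1)
        if frac_pos / frac_neg >= min_ratio:
            selected.add(kmer)
    return selected

def count_hits(seq, kmers, k):
    """Number of positions in seq that match any k-mer in the set."""
    return sum(seq[i:i + k] in kmers for i in range(len(seq) - k + 1))

# Made-up sequences: every "bound" sequence shares the 6-mer GACGTC.
bound = ["TTGACGTCAT", "AAGACGTCGG", "CCGACGTCTA"]
background = ["TTTTTTTTTT", "ACACACACAC", "GGGTTTAAAC"]
kmers = enriched_kmers(bound, background, k=6)
print(kmers, count_hits("AAGACGTCAA", kmers, 6))
```

    A PWM would summarize the same sites as independent per-position base frequencies; a k-mer set keeps whole words, so dependencies between neighbouring positions are retained.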

  16. Single-tube linear DNA amplification (LinDA) for robust ChIP-seq

    NARCIS (Netherlands)

    Shankaranarayanan, P.; Mendoza-Parra, M.A.; Walia, M.; Wang, L.; Li, N.; Trindade, L.M.; Gronemeyer, H.

    2011-01-01

    Genome-wide profiling of transcription factors based on massive parallel sequencing of immunoprecipitated chromatin (ChIP-seq) requires nanogram amounts of DNA. Here we describe a high-fidelity, single-tube linear DNA amplification method (LinDA) for ChIP-seq and reChIP-seq with picogram DNA amounts

  17. Normalization of RNA-seq data using factor analysis of control genes or samples

    Science.gov (United States)

    Risso, Davide; Ngai, John; Speed, Terence P.; Dudoit, Sandrine

    2015-01-01

    Normalization of RNA-seq data has proven essential to ensure accurate inference of expression levels. Here we show that usual normalization approaches mostly account for sequencing depth and fail to correct for library preparation and other more-complex unwanted effects. We evaluate the performance of the External RNA Control Consortium (ERCC) spike-in controls and investigate the possibility of using them directly for normalization. We show that the spike-ins are not reliable enough to be used in standard global-scaling or regression-based normalization procedures. We propose a normalization strategy, remove unwanted variation (RUV), that adjusts for nuisance technical effects by performing factor analysis on suitable sets of control genes (e.g., ERCC spike-ins) or samples (e.g., replicate libraries). Our approach leads to more-accurate estimates of expression fold-changes and tests of differential expression compared to state-of-the-art normalization methods. In particular, RUV promises to be valuable for large collaborative projects involving multiple labs, technicians, and/or platforms. PMID:25150836
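
    The core idea of factor analysis on control genes can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the RUV implementation: the function name `remove_unwanted_variation`, the plain SVD plus least-squares steps, and the simulated data are all hypothetical, and the sketch works on a log-scale samples-by-genes matrix without any count model.

```python
import numpy as np

def remove_unwanted_variation(log_expr, control_idx, k=1):
    """Estimate k nuisance factors from control genes and regress them out.

    log_expr: samples x genes matrix of log expression (assumed input layout).
    control_idx: columns assumed to carry no biological signal of interest.
    """
    Y = np.asarray(log_expr, dtype=float)
    centered = Y - Y.mean(axis=0)
    # Factor analysis step: leading left singular vectors of the control genes.
    U, _, _ = np.linalg.svd(centered[:, control_idx], full_matrices=False)
    W = U[:, :k]                                   # samples x k nuisance factors
    # Regress every gene on W and subtract the fitted nuisance component.
    alpha, *_ = np.linalg.lstsq(W, centered, rcond=None)
    return Y - W @ alpha, W

# Simulated example: a batch effect hits all genes; genes 0 and 1 are controls.
rng = np.random.default_rng(0)
batch = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
expr = 5.0 + rng.normal(scale=0.1, size=(6, 5)) + np.outer(batch, [2, 3, 1, 1.5, 0.5])
expr[:, [0, 1]] = 5.0 + np.outer(batch, [2.0, 3.0])   # controls: batch effect only
corrected, W = remove_unwanted_variation(expr, control_idx=[0, 1], k=1)
```

    One design caveat worth keeping in mind: if the factor of interest is correlated with the estimated nuisance factors, regressing them out also removes some real signal, which is why the choice of control genes or replicate samples matters.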

  18. SLC9B1 methylation predicts fetal intolerance of labor.

    Science.gov (United States)

    Knight, Anna K; Conneely, Karen N; Kilaru, Varun; Cobb, Dawayland; Payne, Jennifer L; Meilman, Samantha; Corwin, Elizabeth J; Kaminsky, Zachary A; Dunlop, Anne L; Smith, Alicia K

    2018-01-01

    Fetal intolerance of labor is a common indication for delivery by Caesarean section. Diagnosis is based on the presence of category III fetal heart rate tracing, which is an abnormal heart tracing associated with increased likelihood of fetal hypoxia and metabolic acidemia. This study analyzed data from 177 unique women who, during their prenatal visits (7-15 weeks and/or 24-32 weeks) to Atlanta area prenatal care clinics, consented to provide blood samples for DNA methylation (HumanMethylation450 BeadChip) and gene expression (Human HT-12 v4 Expression BeadChip) analyses. We focused on 57 women aged 18-36 (mean 25.4), who had DNA methylation data available from their second prenatal visit. DNA methylation patterns at CpG sites across the genome were interrogated for associations with fetal intolerance of labor. Four CpG sites (P value intolerance of labor. DNA methylation and gene expression were negatively associated when examined longitudinally during pregnancy using a linear mixed-effects model. Positive predictive values of methylation of these four sites ranged from 0.80 to 0.89, while negative predictive values ranged from 0.91 to 0.92. The four CpG sites were also associated with fetal intolerance of labor in an independent cohort (the Johns Hopkins Prospective PPD cohort). Therefore, fetal intolerance of labor could be accurately predicted from maternal blood samples obtained between 24-32 weeks gestation. Fetal intolerance of labor may be accurately predicted from maternal blood samples obtained between 24-32 weeks gestation by assessing DNA methylation patterns of SLC9B1. The identification of pregnant women at elevated risk for fetal intolerance of labor may allow for the development of targeted treatments or management plans.

  19. Can adverse maternal and perinatal outcomes be predicted when blood pressure becomes elevated? Secondary analyses from the CHIPS (Control of Hypertension In Pregnancy Study) randomized controlled trial

    NARCIS (Netherlands)

    Magee, Laura A.; von Dadelszen, Peter; Singer, Joel; Lee, Terry; Rey, Evelyne; Ross, Susan; Asztalos, Elizabeth; Murphy, Kellie E.; Menzies, Jennifer; Sanchez, Johanna; Gafni, Amiram; Gruslin, Andrée; Helewa, Michael; Hutton, Eileen; Lee, Shoo K.; Logan, Alexander G.; Ganzevoort, Wessel; Welch, Ross; Thornton, Jim G.; Moutquin, Jean Marie

    2016-01-01

    Introduction. For women with chronic or gestational hypertension in CHIPS (Control of Hypertension In Pregnancy Study, NCT01192412), we aimed to examine whether clinical predictors collected at randomization could predict adverse outcomes. Material and methods. This was a planned, secondary analysis

  20. A highly accurate predictive-adaptive method for lithium-ion battery remaining discharge energy prediction in electric vehicle applications

    International Nuclear Information System (INIS)

    Liu, Guangming; Ouyang, Minggao; Lu, Languang; Li, Jianqiu; Hua, Jianfeng

    2015-01-01

    Highlights: • An energy prediction (EP) method is introduced for battery E_RDE determination. • EP determines E_RDE through coupled prediction of future states, parameters, and output. • The PAEP combines parameter adaptation and prediction to update model parameters. • The PAEP provides improved E_RDE accuracy compared with DC and other EP methods. Abstract: In order to estimate the remaining driving range (RDR) in electric vehicles, the remaining discharge energy (E_RDE) of the applied battery system needs to be precisely predicted. Strongly affected by the load profiles, the available E_RDE varies largely in real-world applications and requires specific determination. However, the commonly-used direct calculation (DC) method might result in certain energy prediction errors by relating the E_RDE directly to the current state of charge (SOC). To enhance the E_RDE accuracy, this paper presents a battery energy prediction (EP) method based on the predictive control theory, in which a coupled prediction of future battery state variation, battery model parameter change, and voltage response is implemented on the E_RDE prediction horizon, and the E_RDE is subsequently accumulated and optimized in real time. Three EP approaches with different model parameter updating routes are introduced, and the predictive-adaptive energy prediction (PAEP) method combining the real-time parameter identification and the future parameter prediction offers the best potential. Based on a large-format lithium-ion battery, the performance of different E_RDE calculation methods is compared under various dynamic profiles. Results imply that the EP methods provide much better accuracy than the traditional DC method, and the PAEP could reduce the E_RDE error by more than 90% and guarantee the relative energy prediction error under 2%, proving to be a proper choice in online E_RDE prediction. The correlation of SOC estimation and E_RDE calculation is then discussed to illustrate the

  1. Chips with everything

    CERN Document Server

    CERN. Geneva

    2007-01-01

    In March 1972, Sir Robin Saxby gave a talk to the Royal Television Society called 'TV and Chips' about a 'state of the art' integrated circuit, containing 50 resistors and 50 transistors. Today's 'state of the art' chips contain up to a billion transistors. This enormous leap forward illustrates how dramatically the semiconductor industry has evolved in the past 34 years. The next 10 years are predicted to bring times of turbulent change for the industry, as more and more digital devices are used around the world. In this talk, Sir Robin will discuss the history of the Microchip Industry in parallel with ARM's history, demonstrating how a small European start-up can become a world player in the IT sector. He will also present his vision of important applications and developments in the next 20 years that are likely to become even more pervasive than the mobile phone is today, and will provide anecdotes and learning points from his own experience at ARM. About ARM: Sir Robin and a group of designers from Acorn...

  2. Cardinality enhancement utilizing Sequential Algorithm (SeQ) code in OCDMA system

    Directory of Open Access Journals (Sweden)

    Fazlina C. A. S.

    2017-01-01

    Optical Code Division Multiple Access (OCDMA) has become important with the increasing demand for high capacity and speed in optical network communication, because the OCDMA technique achieves high efficiency and uses the fibre bandwidth fully. In this paper we focus on the Sequential Algorithm (SeQ) code with the AND detection technique, using the Optisystem design tool. The results reveal that the SeQ code is capable of eliminating Multiple Access Interference (MAI) and improving Bit Error Rate (BER), Phase Induced Intensity Noise (PIIN) and orthogonality between users in the system. The SeQ code shows good BER performance and can accommodate 190 simultaneous users, in contrast with existing codes. Thus, the SeQ code enhances the system by about 36% and 111% compared with the FCC and DCS codes. In addition, SeQ has a good BER performance of 10^-25 at 155 Mbps in comparison with the 622 Mbps, 1 Gbps and 2 Gbps bit rates. From the plotted graph, the 155 Mbps bit rate is a suitable speed for FTTH and LAN networks. Conclusions can be drawn from the superior performance of the SeQ code. Thus, these codes offer an opportunity for better quality of service in OCDMA-based optical access networks for future generations' usage.

  3. Cardinality enhancement utilizing Sequential Algorithm (SeQ) code in OCDMA system

    Science.gov (United States)

    Fazlina, C. A. S.; Rashidi, C. B. M.; Rahman, A. K.; Aljunid, S. A.

    2017-11-01

    Optical Code Division Multiple Access (OCDMA) has become important with the increasing demand for high capacity and speed in optical network communication, because the OCDMA technique achieves high efficiency and uses the fibre bandwidth fully. In this paper we focus on the Sequential Algorithm (SeQ) code with the AND detection technique, using the Optisystem design tool. The results reveal that the SeQ code is capable of eliminating Multiple Access Interference (MAI) and improving Bit Error Rate (BER), Phase Induced Intensity Noise (PIIN) and orthogonality between users in the system. The SeQ code shows good BER performance and can accommodate 190 simultaneous users, in contrast with existing codes. Thus, the SeQ code enhances the system by about 36% and 111% compared with the FCC and DCS codes. In addition, SeQ has a good BER performance of 10^-25 at 155 Mbps in comparison with the 622 Mbps, 1 Gbps and 2 Gbps bit rates. From the plotted graph, the 155 Mbps bit rate is a suitable speed for FTTH and LAN networks. Conclusions can be drawn from the superior performance of the SeQ code. Thus, these codes offer an opportunity for better quality of service in OCDMA-based optical access networks for future generations' usage.

  4. Practical guidelines for the comprehensive analysis of ChIP-seq data.

    Directory of Open Access Journals (Sweden)

    Timothy Bailey

    Mapping the chromosomal locations of transcription factors, nucleosomes, histone modifications, chromatin remodeling enzymes, chaperones, and polymerases is one of the key tasks of modern biology, as evidenced by the Encyclopedia of DNA Elements (ENCODE) Project. To this end, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard methodology. Mapping such protein-DNA interactions in vivo using ChIP-seq presents multiple challenges not only in sample preparation and sequencing but also for computational analysis. Here, we present step-by-step guidelines for the computational analysis of ChIP-seq data. We address all the major steps in the analysis of ChIP-seq data: sequencing depth selection, quality checking, mapping, data normalization, assessment of reproducibility, peak calling, differential binding analysis, controlling the false discovery rate, peak annotation, visualization, and motif analysis. At each step in our guidelines we discuss some of the software tools most frequently used. We also highlight the challenges and problems associated with each step in ChIP-seq data analysis. We present a concise workflow for the analysis of ChIP-seq data in Figure 1 that complements and expands on the recommendations of the ENCODE and modENCODE projects. Each step in the workflow is described in detail in the following sections.
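
    One of the steps listed above, controlling the false discovery rate, can be sketched with the Benjamini-Hochberg adjustment. The abstract does not name a specific procedure, so the choice of Benjamini-Hochberg is an assumption, as are the function name and the made-up peak p-values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])   # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five candidate peaks.
peak_pvalues = [0.001, 0.008, 0.039, 0.041, 0.20]
qvalues = benjamini_hochberg(peak_pvalues)
significant = [q <= 0.05 for q in qvalues]
print(significant)
```

    Note that the third and fourth peaks have raw p-values below 0.05 but do not pass at a 5% false discovery rate, which is the practical difference between per-peak thresholds and FDR control.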

  5. Flip chip assembly of thinned chips for hybrid pixel detector applications

    CERN Document Server

    Fritzsch, T; Woehrmann, M; Rothermund, M; Huegging, F; Ehrmann, O; Oppermann, H; Lang, K.D

    2014-01-01

    There is a steady trend to ultra-thin microelectronic devices. Especially for future particle detector systems a reduced readout chip thickness is required to limit the loss of tracking precision due to scattering. The reduction of silicon thickness is performed at wafer level in a two-step thinning process. To minimize the risk of wafer breakage the thinned wafer needs to be handled by a carrier during the whole process chain of wafer bumping. Another key process is the flip chip assembly of thinned readout chips onto thin sensor tiles. Besides the prevention of silicon breakage the minimization of chip warpage is one additional task for a high yield and reliable flip chip process. A new technology using glass carrier wafer will be described in detail. The main advantage of this technology is the combination of a carrier support during wafer processing and the chip support during flip chip assembly. For that a glass wafer is glue-bonded onto the backside of the thinned readout chip wafer. After the bump depo...

  6. Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists.

    Science.gov (United States)

    Zhu, Xun; Wolfgruber, Thomas K; Tasato, Austin; Arisdakessian, Cédric; Garmire, David G; Garmire, Lana X

    2017-12-05

    Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists as they require a significant amount of bioinformatic skills. We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction. Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.

  7. Do dual-route models accurately predict reading and spelling performance in individuals with acquired alexia and agraphia?

    Science.gov (United States)

    Rapcsak, Steven Z; Henry, Maya L; Teague, Sommer L; Carnahan, Susan D; Beeson, Pélagie M

    2007-06-18

    Coltheart and co-workers [Castles, A., Bates, T. C., & Coltheart, M. (2006). John Marshall and the developmental dyslexias. Aphasiology, 20, 871-892; Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256] have demonstrated that an equation derived from dual-route theory accurately predicts reading performance in young normal readers and in children with reading impairment due to developmental dyslexia or stroke. In this paper, we present evidence that the dual-route equation and a related multiple regression model also accurately predict both reading and spelling performance in adult neurological patients with acquired alexia and agraphia. These findings provide empirical support for dual-route theories of written language processing.

  8. Versatile single-chip event sequencer for atomic physics experiments

    Science.gov (United States)

    Eyler, Edward

    2010-03-01

    A very inexpensive dsPIC microcontroller with internal 32-bit counters is used to produce a flexible timing signal generator with up to 16 TTL-compatible digital outputs, with a time resolution and accuracy of 50 ns. This time resolution is easily sufficient for event sequencing in typical experiments involving cold atoms or laser spectroscopy. This single-chip device is capable of triggered operation and can also function as a sweeping delay generator. With one additional chip it can also concurrently produce accurately timed analog ramps, and another one-chip addition allows real-time control from an external computer. Compared to an FPGA-based digital pattern generator, this design is slower but simpler and more flexible, and it can be reprogrammed using ordinary `C' code without special knowledge. I will also describe the use of the same microcontroller with additional hardware to implement a digital lock-in amplifier and PID controller for laser locking, including a simple graphics-based control unit. This work is supported in part by the NSF.

  9. Quantitative RNA-Seq analysis in non-model species: assessing transcriptome assemblies as a scaffold and the utility of evolutionary divergent genomic reference species

    Directory of Open Access Journals (Sweden)

    Hornett Emily A

    2012-08-01

    Background: How well does RNA-Seq data perform for quantitative whole gene expression analysis in the absence of a genome? This is one unanswered question facing the rapidly growing number of researchers studying non-model species. Using Homo sapiens data and resources, we compared the direct mapping of sequencing reads to predicted genes from the genome with mapping to de novo transcriptomes assembled from RNA-Seq data. Gene coverage and expression analysis was further investigated in the non-model context by using increasingly divergent genomic reference species to group assembled contigs by unique genes. Results: Eight transcriptome sets, composed of varying amounts of Illumina and 454 data, were assembled and assessed. Hybrid 454/Illumina assemblies had the highest transcriptome and individual gene coverage. Quantitative whole gene expression levels were highly similar between using a de novo hybrid assembly and the predicted genes as a scaffold, although mapping to the de novo transcriptome assembly provided data on fewer genes. Using non-target species as reference scaffolds does result in some loss of sequence and expression data, and bias and error increase with evolutionary distance. However, within a 100 million year window these effect sizes are relatively small. Conclusions: Predicted gene sets from sequenced genomes of related species can provide a powerful method for grouping RNA-Seq reads and annotating contigs. Gene expression results can be produced that are similar to results obtained using gene models derived from a high quality genome, though biased towards conserved genes. Our results demonstrate the power and limitations of conducting RNA-Seq in non-model species.

  10. GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.

    Science.gov (United States)

    Zhu, Lihua Julie; Lawrence, Michael; Gupta, Ankit; Pagès, Hervé; Kucukural, Alper; Garber, Manuel; Wolfe, Scot A

    2017-05-15

    Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs mapped GUIDE-Seq read count as well as cleavage score from a user specified off-target cleavage score prediction

  11. Discovery of Protein–lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Li, Jun-Hao; Liu, Shun; Zheng, Ling-Ling; Wu, Jie; Sun, Wen-Ju; Wang, Ze-Lin; Zhou, Hui; Qu, Liang-Hu, E-mail: lssqlh@mail.sysu.edu.cn; Yang, Jian-Hua, E-mail: lssqlh@mail.sysu.edu.cn [RNA Information Center, Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory for Biocontrol, Sun Yat-sen University, Guangzhou (China)

    2015-01-14

    Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein–lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP–lncRNA regulatory relationships. We found that a single lncRNA will generally be bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association studies data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represented an important step in identification and analysis of RBP–lncRNA interactions and showed that these interactions may play crucial roles in cancer and genetic diseases.

  12. Discovery of Protein–lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets

    International Nuclear Information System (INIS)

    Li, Jun-Hao; Liu, Shun; Zheng, Ling-Ling; Wu, Jie; Sun, Wen-Ju; Wang, Ze-Lin; Zhou, Hui; Qu, Liang-Hu; Yang, Jian-Hua

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein–lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP–lncRNA regulatory relationships. We found that a single lncRNA will generally be bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association studies data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represented an important step in identification and analysis of RBP–lncRNA interactions and showed that these interactions may play crucial roles in cancer and genetic diseases.

  13. On-chip concentration of bacteria using a 3D dielectrophoretic chip and subsequent laser-based DNA extraction in the same chip

    International Nuclear Information System (INIS)

    Cho, Yoon-Kyoung; Kim, Tae-hyeong; Lee, Jeong-Gun

    2010-01-01

    We report the on-chip concentration of bacteria using a dielectrophoretic (DEP) chip with 3D electrodes and subsequent laser-based DNA extraction in the same chip. The DEP chip has a set of interdigitated Au post electrodes with 50 µm height to generate a network of non-uniform electric fields for the efficient trapping by DEP. The metal post array was fabricated by photolithography and subsequent Ni and Au electroplating. Three model bacteria samples (Escherichia coli, Staphylococcus epidermidis, Streptococcus mutans) were tested and over 80-fold concentrations were achieved within 2 min. Subsequently, on-chip DNA extraction from the concentrated bacteria in the 3D DEP chip was performed by laser irradiation using the laser-irradiated magnetic bead system (LIMBS) in the same chip. The extracted DNA was analyzed with silicon chip-based real-time polymerase chain reaction (PCR). The total process of on-chip bacteria concentration and the subsequent DNA extraction can be completed within 10 min including the manual operation time.

  14. On-chip highly sensitive saliva glucose sensing using multilayer films composed of single-walled carbon nanotubes, gold nanoparticles, and glucose oxidase

    Directory of Open Access Journals (Sweden)

    Wenjun Zhang

    2015-06-01

    It is very important for human health to rapidly and accurately detect glucose levels in biological environments, especially for diabetes mellitus. We proposed a simple, highly sensitive, accurate, convenient, low-cost, and disposable glucose biosensor on a single chip. A working (sensor) electrode, a counter electrode, and a reference electrode are integrated on a single chip through micro-fabrication. The working electrode is functionalized through a layer-by-layer (LBL) assembly of single-walled carbon nanotubes (SWNTs) and multilayer films composed of chitosan (CS), gold nanoparticles (GNp), and glucose oxidase (GOx) to obtain high sensitivity and accuracy. The glucose sensor has the following features: (1) direct electron transfer between GOx and the electrode surface; (2) on-a-chip; (3) glucose detection down to 0.1 mg/dL (5.6 μM); (4) good sensing linearity over 0.017–0.81 mM; (5) high sensitivity (61.4 μA/mM-cm²) with a small reactive area (8 mm²); (6) fast response; (7) high reproducibility and repeatability; (8) reliable and accurate saliva glucose detection. Thus, this disposable biosensor will be an alternative for real time tracking of glucose levels from body fluids, e.g. saliva, in a noninvasive, pain-free, accurate, and continuous way. In addition to being used as a disposable glucose biosensor, it also provides a suitable platform for on-chip electrochemical sensing for other chemical agents and biomolecules.
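    The abstract mixes mass units (mg/dL) and molar units (μM, mM) for glucose. The short sketch below shows the conversion between them, as a way to check the quoted 0.1 mg/dL against 5.6 μM. The molar mass of glucose (180.16 g/mol) is a standard value supplied here, not a figure from the abstract, and the function names are illustrative.

```python
# Unit-conversion sketch. The molar mass is a standard reference value for
# C6H12O6 and is an assumption added here, not taken from the abstract.

GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16


def mg_per_dl_to_micromolar(conc_mg_per_dl, molar_mass=GLUCOSE_MOLAR_MASS_G_PER_MOL):
    """Convert a mass concentration in mg/dL to micromoles per litre."""
    grams_per_litre = conc_mg_per_dl * 1e-3 * 10.0  # mg -> g, per dL -> per L
    return grams_per_litre / molar_mass * 1e6


def millimolar_to_mg_per_dl(conc_mm, molar_mass=GLUCOSE_MOLAR_MASS_G_PER_MOL):
    """Convert a molar concentration in mmol/L to mg/dL."""
    grams_per_litre = conc_mm * 1e-3 * molar_mass
    return grams_per_litre * 1e3 / 10.0  # g -> mg, per L -> per dL


if __name__ == "__main__":
    print(f"0.1 mg/dL = {mg_per_dl_to_micromolar(0.1):.2f} uM")
    print(f"0.017 mM = {millimolar_to_mg_per_dl(0.017):.3f} mg/dL")
    print(f"0.81 mM  = {millimolar_to_mg_per_dl(0.81):.2f} mg/dL")
```

    With this molar mass, 0.1 mg/dL comes out at about 5.55 μM, which rounds to the 5.6 μM quoted in the abstract.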

  15. SeqAn An efficient, generic C++ library for sequence analysis

    Directory of Open Access Journals (Sweden)

    Rausch Tobias

    2008-01-01

    Background: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example the sequencing of the human genome [1] would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. Results: To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile to use. Conclusion: We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This leverages not only the implementation of new algorithms, but also enables a sound analysis and comparison of existing algorithms.
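    The abstract mentions comparing exact string matching algorithms but does not say which ones. As a hedged illustration of what such a comparison involves, the sketch below implements two classic algorithms in Python, a naive scan and Horspool's algorithm, and checks that they agree. This is not SeqAn code and does not use the SeqAn API; the choice of algorithms and all names are assumptions made for the example.

```python
# Illustrative sketch: two classic exact string matching algorithms.
# Not SeqAn code; algorithm choice and names are assumptions for this example.


def naive_search(text, pattern):
    """Return every start index where pattern occurs, checking each offset."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def horspool_search(text, pattern):
    """Return every start index using Horspool's bad-character shift."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift for a character = distance from its rightmost occurrence
    # (excluding the final pattern position) to the end of the pattern.
    shift = {ch: m - 1 - idx for idx, ch in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Characters absent from the table allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    genome = "GATTACAGATTACA"
    print(naive_search(genome, "TTACA"))
    print(horspool_search(genome, "TTACA"))
```

    Both functions return the same positions; Horspool's version usually inspects fewer alignments because it skips ahead based on the last character of the current window.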

  16. NGScloud: RNA-seq analysis of non-model species using cloud computing.

    Science.gov (United States)

    Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai

    2018-05-03

    RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using the cloud computing services of Amazon that permit the access to ad hoc computing infrastructure scaled according to the complexity of the experiment, so its costs and times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources, and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. Contact: unai.lopezdeheredia@upm.es.

  17. ILT based defect simulation of inspection images accurately predicts mask defect printability on wafer

    Science.gov (United States)

    Deep, Prakash; Paninjath, Sankaranarayanan; Pereira, Mark; Buck, Peter

    2016-05-01

    At advanced technology nodes, mask complexity has increased because of the large-scale use of resolution enhancement technologies (RET), which include Optical Proximity Correction (OPC), Inverse Lithography Technology (ILT) and Source Mask Optimization (SMO). The number of defects detected during inspection of such masks has increased drastically, and differentiating critical from non-critical defects has become more challenging, complex and time consuming. Because of the significant defectivity of EUVL masks and the non-availability of actinic inspection, it is important and also challenging to predict the criticality of defects for printability on wafer. This is one of the significant barriers to the adoption of EUVL for semiconductor manufacturing. Techniques to decide the criticality of defects from images captured using non-actinic inspection are desired until actinic inspection becomes available. High resolution inspection of photomask images detects many defects, which are used for process and mask qualification. Repairing all defects is not practical and probably not required; however, it is imperative to know which defects are severe enough to impact the wafer before repair. Additionally, a wafer printability check is always desired after repairing a defect. AIMS™ review is the industry standard for this; however, doing AIMS™ review for all defects is expensive and very time consuming. A fast, accurate and economical mechanism is desired that can predict defect printability on wafer from images captured using a high resolution inspection machine. Predicting defect printability from such images is challenging because the high resolution images do not correlate with actual mask contours. The challenge is increased by the use of optical conditions during inspection that differ from the actual scanner conditions, and defects found in such images do not correlate with the actual impact on wafer. Our automated defect simulation tool predicts

  18. Accurate identification of RNA editing sites from primitive sequence with deep neural networks.

    Science.gov (United States)

    Ouyang, Zhangyi; Liu, Feng; Zhao, Chenghui; Ren, Chao; An, Gaole; Mei, Chuan; Bo, Xiaochen; Shu, Wenjie

    2018-04-16

    RNA editing is a post-transcriptional RNA sequence alteration. Current methods have identified editing sites and facilitated research but require sufficient genomic annotations and prior-knowledge-based filtering steps, resulting in a cumbersome, time-consuming identification process. Moreover, these methods have limited generalizability and applicability in species with insufficient genomic annotations or in conditions of limited prior knowledge. We developed DeepRed, a deep learning-based method that identifies RNA editing from primitive RNA sequences without prior-knowledge-based filtering steps or genomic annotations. DeepRed achieved 98.1% and 97.9% area under the curve (AUC) in training and test sets, respectively. We further validated DeepRed using experimentally verified U87 cell RNA-seq data, achieving 97.9% positive predictive value (PPV). We demonstrated that DeepRed offers better prediction accuracy and computational efficiency than current methods with large-scale, mass RNA-seq data. We used DeepRed to assess the impact of multiple factors on editing identification with RNA-seq data from the Association of Biomolecular Resource Facilities and Sequencing Quality Control projects. We explored developmental RNA editing pattern changes during human early embryogenesis and evolutionary patterns in Drosophila species and the primate lineage using DeepRed. Our work illustrates DeepRed's state-of-the-art performance; it may decipher the hidden principles behind RNA editing, making editing detection convenient and effective.
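    The abstract reports performance as area under the curve (AUC) and positive predictive value (PPV). For readers unfamiliar with how these two figures are computed from classifier scores, the sketch below derives both from a small set of labels and scores. The data are invented toy values, not DeepRed outputs, and the function names are hypothetical; AUC is computed here with the pairwise-ranking formulation rather than by integrating a ROC curve.

```python
# Sketch of the two metrics named in the abstract, on made-up toy data.
# Nothing here comes from DeepRed itself; names and values are illustrative.


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def positive_predictive_value(labels, scores, threshold):
    """PPV = true positives / all sites called positive at the threshold."""
    called = [lab for lab, s in zip(labels, scores) if s >= threshold]
    if not called:
        raise ValueError("no sites called positive at this threshold")
    return sum(called) / len(called)


if __name__ == "__main__":
    toy_labels = [1, 1, 1, 0, 0, 0]               # 1 = true editing site (toy)
    toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]   # hypothetical model scores
    print(auc_from_scores(toy_labels, toy_scores))
    print(positive_predictive_value(toy_labels, toy_scores, threshold=0.5))
```

    AUC summarizes ranking quality over all thresholds, whereas PPV depends on the threshold chosen, so the two figures in the abstract answer different questions about the classifier.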

  19. Model Comparison in Subsurface Science: The DECOVALEX and Sim-SEQ Initiatives (Invited)

    Science.gov (United States)

    Birkholzer, J. T.; Mukhopadhyay, S.; Rutqvist, J.; Tsang, C.

    2013-12-01

    Building predictive models for flow and transport processes in the subsurface is a challenging task, even more so if these processes are coupled to geomechanical and/or geochemical effects. Modelers must take into consideration a multiplicity of length scales, a wide range of time scales, the coupling between processes, different model components, and the spatial variability in the value of most model input parameters (and often limited knowledge about them). Consequently, modelers have to make choices while developing their conceptual models. Such model choices may cause a wide range in the predictions made by different models and different modeling groups, even if each of the underlying simulators has been perfectly verified against appropriate benchmarks. In other words, the modeling activity itself is prone to uncertainty and bias. This uncertainty, referred to here as model selection uncertainty, forms one of the greatest sources of uncertainty for predictive modeling. In this paper, we discuss two examples of model intercomparison exercises that are currently undertaken to better understand model selection uncertainty, elucidate system behavior, inform needs for data collection and better physics parameterizations, and enhance community understanding of capabilities. The first example is the international DECOVALEX project, which was launched in 1992 by a group of countries dealing with modeling issues related to geologic disposal of radioactive waste. DECOVALEX is an acronym for DEvelopment of COupled THM models and their VALidation against Experiments. To date, the project has progressed successfully through five stages, each featuring a small number of test cases for model comparison related to coupled thermo-hydro-mechanical (THM) processes in geologic systems. The test cases are proposed and developed by the organizations participating in DECOVALEX; they typically involve results from major field and laboratory experiments. Over the past decades

  20. Prognostic breast cancer signature identified from 3D culture model accurately predicts clinical outcome across independent datasets

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Katherine J.; Patrick, Denis R.; Bissell, Mina J.; Fournier, Marcia V.

    2008-10-20

    One of the major tenets in breast cancer research is that early detection is vital for patient survival by increasing treatment options. To that end, we have previously used a novel unsupervised approach to identify a set of genes whose expression predicts prognosis of breast cancer patients. The predictive genes were selected in a well-defined three dimensional (3D) cell culture model of non-malignant human mammary epithelial cell morphogenesis as down-regulated during breast epithelial cell acinar formation and cell cycle arrest. Here we examine the ability of this gene signature (3D-signature) to predict prognosis in three independent breast cancer microarray datasets having 295, 286, and 118 samples, respectively. Our results show that the 3D-signature accurately predicts prognosis in three unrelated patient datasets. At 10 years, the probability of positive outcome was 52, 51, and 47 percent in the group with a poor-prognosis signature and 91, 75, and 71 percent in the group with a good-prognosis signature for the three datasets, respectively (Kaplan-Meier survival analysis, p<0.05). Hazard ratios for poor outcome were 5.5 (95% CI 3.0 to 12.2, p<0.0001), 2.4 (95% CI 1.6 to 3.6, p<0.0001) and 1.9 (95% CI 1.1 to 3.2, p = 0.016) and remained significant for the two larger datasets when corrected for estrogen receptor (ER) status. Hence the 3D-signature accurately predicts breast cancer outcome in both ER-positive and ER-negative tumors, though individual genes differed in their prognostic ability in the two subtypes. Genes that were prognostic in ER+ patients are AURKA, CEP55, RRM2, EPHA2, FGFBP1, and VRK1, while genes prognostic in ER- patients include ACTB, FOXM1 and SERPINE2 (Kaplan-Meier p<0.05). Multivariable Cox regression analysis in the largest dataset showed that the 3D-signature was a strong independent factor in predicting breast cancer outcome. The 3D-signature accurately predicts breast cancer outcome across multiple datasets and holds prognostic