WorldWideScience

Sample records for hypotheses indel-seq-gen version

  1. Restricted DCJ-indel model: sorting linear genomes with DCJ and indels

    Science.gov (United States)

    2012-01-01

    Background: The double-cut-and-join (DCJ) is a model that can efficiently sort one genome into another, generalizing the typical mutations (inversions, fusions, fissions, translocations) to which genomes are subject, but allowing circular chromosomes to exist at intermediate steps. In the general model, many circular chromosomes can coexist in an intermediate step. However, when the compared genomes are linear, it is more plausible to use the so-called restricted DCJ model, in which a circular chromosome is reincorporated immediately after its creation. These two consecutive DCJ operations, which create and reincorporate a circular chromosome, mimic a transposition or a block-interchange. When the compared genomes have the same content, it is known that the genomic distance for the restricted DCJ model is the same as the distance for the general model. If the genomes have unequal contents, in addition to DCJ it is necessary to consider indels, which are insertions and deletions of DNA segments. Linear-time algorithms were proposed to compute the distance and to find a sorting scenario in a general, unrestricted DCJ-indel model that considers DCJ and indels. Results: In the present work we consider the restricted DCJ-indel model for sorting linear genomes with unequal contents. We allow DCJ operations and indels with the following constraint: if a circular chromosome is created by a DCJ, it has to be reincorporated in the next step (no other DCJ or indel can be applied between the creation and the reincorporation of a circular chromosome). We then develop a sorting algorithm and give a tight upper bound for the restricted DCJ-indel distance. Conclusions: We have given a tight upper bound for the restricted DCJ-indel distance. The question of whether this bound can be reduced so that the general and the restricted DCJ-indel distances are equal remains open. PMID:23281630

  2. ScanIndel: a hybrid framework for indel detection via gapped alignment, split reads and de novo assembly.

    Science.gov (United States)

    Yang, Rendong; Nelson, Andrew C; Henzler, Christine; Thyagarajan, Bharat; Silverstein, Kevin A T

    2015-12-07

    Comprehensive identification of insertions/deletions (indels) across the full size spectrum from second generation sequencing is challenging due to the relatively short read length inherent in the technology. Different indel calling methods exist but are limited in detection to specific sizes with varying accuracy and resolution. We present ScanIndel, an integrated framework for detecting indels with multiple heuristics including gapped alignment, split reads and de novo assembly. Using simulation data, we demonstrate ScanIndel's superior sensitivity and specificity relative to several state-of-the-art indel callers across various coverage levels and indel sizes. ScanIndel yields higher predictive accuracy with lower computational cost compared with existing tools for both targeted resequencing data from tumor specimens and high coverage whole-genome sequencing data from the human NIST standard NA12878. Thus, we anticipate ScanIndel will improve indel analysis in both clinical and research settings. ScanIndel is implemented in Python, and is freely available for academic use at https://github.com/cauyrd/ScanIndel.

  3. Characterization and potential functional significance of human-chimpanzee large INDEL variation

    Directory of Open Access Journals (Sweden)

    Polavarapu Nalini

    2011-10-01

    Background: Although humans and chimpanzees have accumulated significant differences in a number of phenotypic traits since diverging from a common ancestor about six million years ago, their genomes are more than 98.5% identical at protein-coding loci. This modest degree of nucleotide divergence is not sufficient to explain the extensive phenotypic differences between the two species. It has been hypothesized that the genetic basis of the phenotypic differences lies at the level of gene regulation and is associated with the extensive insertion and deletion (INDEL) variation between the two species. To test the hypothesis that large INDELs (80 to 12,000 bp) may have contributed significantly to differences in gene regulation between the two species, we categorized human-chimpanzee INDEL variation mapping in or around genes and determined whether this variation is significantly correlated with previously determined differences in gene expression. Results: Extensive, large INDEL variation exists between the human and chimpanzee genomes. This variation is primarily attributable to retrotransposon insertions within the human lineage. There is a significant correlation between differences in gene expression and large human-chimpanzee INDEL variation mapping in genes or in proximity to them. Conclusions: The results presented herein are consistent with the hypothesis that large INDELs, particularly those associated with retrotransposons, have played a significant role in human-chimpanzee regulatory evolution.

  4. GenMAPP 2: new features and resources for pathway analysis

    Directory of Open Access Journals (Sweden)

    Dahlquist Kam D

    2007-06-01

    Background: Microarray technologies have evolved rapidly, enabling biologists to quantify genome-wide levels of gene expression, alternative splicing, and sequence variations for a variety of species. Analyzing and displaying these data present a significant challenge. Pathway-based approaches for analyzing microarray data have proven useful for presenting data and for generating testable hypotheses. Results: To address the growing needs of the microarray community we have released version 2 of Gene Map Annotator and Pathway Profiler (GenMAPP), a new GenMAPP database schema, and integrated resources for pathway analysis. We have redesigned the GenMAPP database to support multiple gene annotations and species as well as custom species database creation for a potentially unlimited number of species. We have expanded our pathway resources by utilizing homology information to translate pathway content between species and extending existing pathways with data derived from conserved protein interactions and coexpression. We have implemented a new mode of data visualization to support analysis of complex data, including time-course, single nucleotide polymorphism (SNP), and splicing. GenMAPP version 2 also offers innovative ways to display and share data by incorporating HTML export of analyses for entire sets of pathways as organized web pages. Conclusion: GenMAPP version 2 provides a means to rapidly interrogate complex experimental data for pathway-level changes in a diverse range of organisms.

  5. The missing indels: an estimate of indel variation in a human genome and analysis of factors that impede detection

    Science.gov (United States)

    Jiang, Yue; Turinsky, Andrei L.; Brudno, Michael

    2015-01-01

    With the development of High-Throughput Sequencing (HTS) thousands of human genomes have now been sequenced. Whenever different studies analyze the same genome they usually agree on the amount of single-nucleotide polymorphisms, but differ dramatically on the number of insertion and deletion variants (indels). Furthermore, there is evidence that indels are often severely under-reported. In this manuscript we derive the total number of indel variants in a human genome by combining data from different sequencing technologies, while assessing the indel detection accuracy. Our estimate of approximately 1 million indels in a Yoruban genome is much higher than the results reported in several recent HTS studies. We identify two key sources of difficulties in indel detection: the insufficient coverage, read length or alignment quality; and the presence of repeats, including short interspersed elements and homopolymers/dimers. We quantify the effect of these factors on indel detection. The quality of sequencing data plays a major role in improving indel detection by HTS methods. However, many indels exist in long homopolymers and repeats, where their detection is severely impeded. The true number of indel events is likely even higher than our current estimates, and new techniques and technologies will be required to detect them. PMID:26130710

  6. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology.

    Directory of Open Access Journals (Sweden)

    Chandra Shekhar Pareek

    RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver, with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by kompetitive allele-specific PCR (KASP™) SNP genotyping assay. The ...

  7. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology.

    Science.gov (United States)

    Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr; Czarnik, Urszula; Fraser, Leyland; Sobiech, Przemysław; Pierzchała, Mariusz; Feng, Yaping; Kadarmideen, Haja N; Kumar, Dibyendu

    2017-01-01

    RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver tissue of young bulls of the Polish Red, Polish Holstein-Friesian (HF) and Hereford breeds, and to understand the genomic variation in the three cattle breeds that may reflect differences in production traits. The RNA-seq experiment on bovine liver produced 1,071,144,072 raw paired-end reads, with an average of approximately 60 million paired-end reads per library. Breed-wise, a total of 345.06, 290.04 and 436.03 million paired-end reads were obtained from the Polish Red, Polish HF, and Hereford breeds, respectively. Burrows-Wheeler Aligner (BWA) read alignments showed that 81.35%, 82.81% and 84.21% of the mapped sequencing reads were properly paired to the Polish Red, Polish HF, and Hereford breeds, respectively. This study identified 5,641,401 SNPs and insertion and deletion (indel) positions expressed in the bovine liver, with an average of 313,411 SNPs and indels per young bull. Following the removal of the indel mutations, a total of 1,953,804, 1,527,120 and 2,053,184 raw SNPs expressed in bovine liver were identified for the Polish Red, Polish HF, and Hereford breeds, respectively. Breed-wise, three highly reliable breed-specific SNP-databases (SNP-dbs) with 31,562, 24,945 and 28,194 SNP records were constructed for the Polish Red, Polish HF, and Hereford breeds, respectively. Using a combination of stringent parameters of a minimum depth of ≥10 mapping reads that support the polymorphic nucleotide base and 100% SNP ratio, 4,368, 3,780 and 3,800 SNP records were detected in the Polish Red, Polish HF, and Hereford breeds, respectively. The SNP detections using RNA-seq data were successfully validated by kompetitive allele-specific PCR (KASP™) SNP genotyping assay. The comprehensive ...

  8. Evaluation and optimisation of indel detection workflows for ion torrent sequencing of the BRCA1 and BRCA2 genes.

    Science.gov (United States)

    Yeo, Zhen Xuan; Wong, Joshua Chee Leong; Rozen, Steven G; Lee, Ann Siew Gek

    2014-06-24

    The Ion Torrent PGM is a popular benchtop sequencer that shows promise in replacing conventional Sanger sequencing as the gold standard for mutation detection. Despite the PGM's reported high accuracy in calling single nucleotide variations, it tends to generate many false positive calls in detecting insertions and deletions (indels), which may hinder its utility for clinical genetic testing. Recently, the proprietary analytical workflow for the Ion Torrent sequencer, Torrent Suite (TS), underwent a series of upgrades. We evaluated three major upgrades of TS by calling indels in the BRCA1 and BRCA2 genes. Our analysis revealed that false negative indels could be generated by TS under both default calling parameters and parameters adjusted for maximum sensitivity. However, indel calling with the same data using the open source variant callers GATK and SAMtools showed that false negatives could be minimised with the use of appropriate bioinformatics analysis. Furthermore, we identified two variant calling measures, Quality-by-Depth (QD) and VARiation of the Width of gaps and inserts (VARW), which substantially reduced false positive indels, including non-homopolymer associated errors, without compromising sensitivity. In our best case scenario that involved the TMAP aligner and SAMtools, we achieved 100% sensitivity, 99.99% specificity and 29% False Discovery Rate (FDR) in indel calling from all 23 samples, which is a good performance for mutation screening using PGM. New versions of TS, BWA and GATK have shown improvements in indel calling sensitivity and specificity over their older counterparts. However, the variant caller of TS exhibits a lower sensitivity than GATK and SAMtools. Our findings demonstrate that although indel calling from PGM sequences may appear to be noisy at first glance, proper computational indel calling analysis is able to maximize both the sensitivity and specificity at the single base level, paving the way for the usage of this technology ...
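
    A very high specificity and a sizeable false discovery rate can coexist because the two measures use different denominators: specificity is taken over all true negatives (nearly every base screened), whereas FDR is taken over the positive calls only. The following is a minimal Python sketch of the standard definitions; the counts are hypothetical and chosen only for illustration, they are not the study's data.

```python
def calling_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, false discovery rate) for a call set."""
    sensitivity = tp / (tp + fn)   # fraction of real indels that were called
    specificity = tn / (tn + fp)   # fraction of non-indel sites left uncalled
    fdr = fp / (tp + fp)           # fraction of calls that are wrong
    return sensitivity, specificity, fdr


# Hypothetical counts (assumption): 10 real indels, all found, plus 4 false
# calls among 100,000 non-indel positions.
sens, spec, fdr = calling_metrics(tp=10, fp=4, tn=99_996, fn=0)
print(f"sensitivity={sens:.2%} specificity={spec:.3%} FDR={fdr:.1%}")
```

    With so few true indels, a handful of false calls is enough to raise the FDR substantially while barely affecting specificity.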

  9. Pathogenesis-related proteins in Brazilian wheat genotypes: protein induction and partial gene sequencing

    Directory of Open Access Journals (Sweden)

    Loreta Brandão de Freitas

    2003-06-01

    Leaves from 14 Brazilian genotypes of Triticum aestivum L. were treated with salicylic acid to induce pathogenesis-related (PR) proteins. Inter- and intracellular extracts were then obtained and investigated through polyacrylamide gel electrophoresis. Seven bands were observed. Material related to two of them (of 40 and 24 kDa) occurred in intracellular spaces only. DNA from these same genotypes was then amplified through PCR using primers developed from three sequences encoding PR proteins, and compared with previously described sequences. The fragments presented homologies to PR groups 1, 3 (chitinases), and 5 (thaumatin-like). The PR3-like sequence also showed a site characteristic of PRs induced by ethylene and a portion without homology with previous sequences. No variation among genotypes was observed, either for protein extracts or DNA sequences.

  10. The Plant Genome Integrative Explorer Resource: PlantGenIE.org.

    Science.gov (United States)

    Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R

    2015-12-01

    Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.

  11. GapCoder automates the use of indel characters in phylogenetic analysis.

    Science.gov (United States)

    Young, Nelson D; Healy, John

    2003-02-19

    Several ways of incorporating indels into phylogenetic analysis have been suggested. Simple indel coding has two strengths: (1) biological realism and (2) efficiency of analysis. In the method, each indel with different start and/or end positions is considered to be a separate character. The presence/absence of these indel characters is then added to the data set. We have written a program, GapCoder, to automate this procedure. The program can input PIR format aligned datasets, find the indels and add the indel-based characters. The output is a NEXUS format file, which includes a table showing what region each indel character is based on. If regions are excluded from analysis, this table makes it easy to identify the corresponding indel characters for exclusion. Manual implementation of the simple indel coding method can be very time-consuming, especially in data sets where indels are numerous and/or overlapping. GapCoder automates this method and is therefore particularly useful during procedures where phylogenetic analyses need to be repeated many times, such as when different alignments are being explored or when various taxon or character sets are being explored. GapCoder is currently available for Windows from http://www.home.duq.edu/~youngnd/GapCoder.
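
    The coding rule described above (one binary character per distinct start/end span) can be sketched in a few lines of Python. This is an illustrative simplification, not GapCoder's code: the function names and the toy alignment are assumptions, input is a plain dictionary rather than PIR, no NEXUS output is written, and refinements such as scoring a character as inapplicable when a longer gap spans it are omitted.

```python
import re


def find_gaps(seq, gap="-"):
    """Return the (start, end) span of every run of gap characters."""
    return [m.span() for m in re.finditer(re.escape(gap) + "+", seq)]


def code_indels(alignment):
    """Code each distinct gap span as a presence/absence character.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    Returns (spans, matrix): the sorted distinct spans, and a dict mapping
    taxon name -> string of '1' (has a gap with exactly this span) or '0'.
    """
    gaps = {name: set(find_gaps(seq)) for name, seq in alignment.items()}
    spans = sorted(set().union(*gaps.values()))
    matrix = {
        name: "".join("1" if span in gaps[name] else "0" for span in spans)
        for name in alignment
    }
    return spans, matrix


# Toy alignment (hypothetical data).
alignment = {
    "taxonA": "ACGT--ACGT",
    "taxonB": "ACGTTTACGT",
    "taxonC": "AC----ACGT",
}
spans, matrix = code_indels(alignment)
for index, span in enumerate(spans, start=1):
    print(f"indel character {index}: alignment columns {span[0] + 1}-{span[1]}")
for name, row in matrix.items():
    print(name, row)
```

    The list of spans plays the role of the table that maps each indel character back to its alignment region, which is what makes it possible to drop the matching characters when a region is excluded.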

  12. Psychometric properties of the Serbian version of the Empathy Quotient (S-EQ)

    Directory of Open Access Journals (Sweden)

    Dimitrijević Aleksandar

    2012-01-01

    In the present study we examined psychometric properties of the Serbian translation of the Empathy Quotient scale (S-EQ). The translated version of the EQ was administered to a sample of 694 high-school students. A sub-sample consisting of 375 high-school students also completed the Interpersonal Reactivity Index (IRI), another widely used empathy measure. The following statistical analyses were applied: internal consistency analysis, exploratory (EFA) and confirmatory (CFA) factor analyses, and factor congruence analysis. Correlation with IRI and gender differences were calculated to demonstrate validity of the instrument. Results show that the Serbian 40-item version of EQ has lower reliability (Cronbach’s alpha = .782) than the original. The originally proposed one-factor structure of the instrument was not confirmed. The short version with 28 items showed better reliability (alpha = .807). The three-factor solution (cognitive empathy, emotional reactivity, and social skills) showed good cross-sample stability (Tucker congruence coefficient over .8), but the results of CFA confirmed the solution proposed in the reviewed literature only partially. The mean scores are similar to those obtained in the other studies, and, as expected, women have significantly higher scores than men. Correlations with all subscales of IRI are statistically significant for the first two subscales of EQ, but not for "social skills". We concluded that the Serbian version of the "Empathy Quotient" is a useful research tool which can contribute to cross-cultural studies of empathy, although its psychometric characteristics are not as good as those obtained in the original study. We also suggest that the 28-item version should be used in preference to the original 40-item version. [Project of the Ministry of Science of the Republic of Serbia, No. 179018: Identification, measurement and development of cognitive and emotional competences important for a society oriented to European integrations]
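
    For readers unfamiliar with the reliability statistic reported above, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. A minimal Python sketch follows; the response matrix is hypothetical and is not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])                                    # number of items
    item_vars = [variance(item) for item in zip(*scores)]  # per-item variance
    total_var = variance([sum(row) for row in scores])     # variance of totals
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (assumption): four respondents, three items.
responses = [[3, 4, 3], [2, 2, 3], [4, 5, 5], [1, 2, 1]]
print(round(cronbach_alpha(responses), 3))
```

    Items that rise and fall together across respondents push the total-score variance up relative to the item variances, and alpha toward 1.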

  13. The GenABEL Project for statistical genomics [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Lennart C. Karssen

    2016-05-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the “core team”, facilitating agile statistical omics methodology development and fast dissemination.

  14. Systematic analysis of short internal indels and their impact on protein folding

    Directory of Open Access Journals (Sweden)

    Guo Jun-tao

    2010-08-01

    Background: Protein sequence insertions/deletions (indels) can be introduced during evolution or through alternative splicing (AS). Alternative splicing is an important biological phenomenon and is considered the major means of expanding structural and functional diversity in eukaryotes. Knowledge of the structural changes due to indels is critical to our understanding of the evolution of protein structure and function. In addition, it can help us probe the evolution of alternative splicing and the diversity of functional isoforms. However, little is known about the effects of indels, in particular the ones involving core secondary structures, on the folding of protein structures. The long-term goal of our study is to accurately predict the protein AS isoform structures. As a first step towards this goal, we performed a systematic analysis of the structural changes caused by short internal indels through mining highly homologous proteins in the Protein Data Bank (PDB). Results: We compiled a non-redundant dataset of short internal indels (2-40 amino acids) from highly homologous protein pairs and analyzed the sequence and structural features of the indels. We found that about one third of indel residues are in a disordered state and the majority of the residues are exposed to solvent, suggesting that these indels are generally located on the surface of proteins. Though naturally occurring indels are fewer than engineered ones in the dataset, there are no statistically significant differences in terms of amino acid frequencies and secondary structure types between the "Natural" indels and "All" indels in the dataset. Structural comparisons show that all the protein pairs with short internal indels in the dataset preserve the structural folds and about 85% of protein pairs have global RMSDs (root mean square deviations) of 2 Å or less, suggesting that protein structures tend to be conserved and can tolerate short insertions and deletions. A few pairs ...

  15. Incorporating indel information into phylogeny estimation for rapidly emerging pathogens

    Directory of Open Access Journals (Sweden)

    Suchard Marc A

    2007-03-01

    Background: Phylogenies of rapidly evolving pathogens can be difficult to resolve because of the small number of substitutions that accumulate in the short times since divergence. To improve resolution of such phylogenies we propose using insertion and deletion (indel) information in addition to substitution information. We accomplish this through joint estimation of alignment and phylogeny in a Bayesian framework, drawing inference using Markov chain Monte Carlo. Joint estimation of alignment and phylogeny sidesteps biases that stem from conditioning on a single alignment by taking into account the ensemble of near-optimal alignments. Results: We introduce a novel Markov chain transition kernel that improves computational efficiency by proposing non-local topology rearrangements and by block sampling alignment and topology parameters. In addition, we extend our previous indel model to increase biological realism by placing indels preferentially on longer branches. We demonstrate the ability of indel information to increase phylogenetic resolution in examples drawn from within-host viral sequence samples. We also demonstrate the importance of taking alignment uncertainty into account when using such information. Finally, we show that codon-based substitution models can significantly affect alignment quality and phylogenetic inference by unrealistically forcing indels to begin and end between codons. Conclusion: These results indicate that indel information can improve phylogenetic resolution of recently diverged pathogens and that alignment uncertainty should be considered in such analyses.

  16. Sequence length variation, indel costs, and congruence in sensitivity analysis

    DEFF Research Database (Denmark)

    Aagesen, Lone; Petersen, Gitte; Seberg, Ole

    2005-01-01

    The behavior of two topological and four character-based congruence measures was explored using different indel treatments in three empirical data sets, each with different alignment difficulties. The analyses were done using direct optimization within a sensitivity analysis framework in which...... the cost of indels was varied. Indels were treated either as a fifth character state, or strings of contiguous gaps were considered single events by using linear affine gap cost. Congruence consistently improved when indels were treated as single events, but no congruence measure appeared as the obviously...... preferable one. However, when combining enough data, all congruence measures clearly tended to select the same alignment cost set as the optimal one. Disagreement among congruence measures was mostly caused by a dominant fragment or a data partition that included all or most of the length variation...
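
    The difference between the two indel treatments can be illustrated by scoring the gaps in a single, already aligned sequence. The Python sketch below is an assumption-laden illustration only: the study used direct optimization rather than a fixed alignment, the function and parameter names are hypothetical, and the affine convention used here (opening cost once per run, extension cost for each additional position) is one of several in use.

```python
from itertools import groupby


def gap_run_lengths(seq, gap="-"):
    """Return the length of every run of contiguous gap characters."""
    return [
        len(list(run))
        for is_gap, run in groupby(seq, key=lambda c: c == gap)
        if is_gap
    ]


def fifth_state_cost(seq, indel_cost):
    """Gaps as a fifth character state: every gap position is charged."""
    return indel_cost * sum(gap_run_lengths(seq))


def affine_cost(seq, open_cost, extend_cost):
    """Affine treatment: opening cost per run plus extension per extra position."""
    return sum(open_cost + extend_cost * (n - 1) for n in gap_run_lengths(seq))


seq = "AC----GT-A"  # hypothetical aligned sequence with gap runs of length 4 and 1
print(fifth_state_cost(seq, indel_cost=2))             # 5 gap positions charged
print(affine_cost(seq, open_cost=2, extend_cost=0))    # 2 events charged
print(affine_cost(seq, open_cost=2, extend_cost=1))
```

    Setting the extension cost to zero makes each run of contiguous gaps count as exactly one event, regardless of its length.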

  17. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    Directory of Open Access Journals (Sweden)

    Nedenia Bonvino Stafuzza

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis was performed and revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and the metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  18. ORMAN: optimal resolution of ambiguous RNA-Seq multimappings in the presence of novel isoforms.

    Science.gov (United States)

    Dao, Phuong; Numanagić, Ibrahim; Lin, Yen-Yi; Hach, Faraz; Karakoc, Emre; Donmez, Nilgun; Collins, Colin; Eichler, Evan E; Sahinalp, S Cenk

    2014-03-01

    RNA-Seq technology promises to uncover many novel alternative splicing events, gene fusions and other variations in RNA transcripts. For accurate detection and quantification of transcripts, it is important to resolve the mapping ambiguity for those RNA-Seq reads that can be mapped to multiple loci: >17% of the reads from mouse RNA-Seq data and 50% of the reads from some plant RNA-Seq data have multiple mapping loci. In this study, we show how to resolve the mapping ambiguity in the presence of novel transcriptomic events such as exon skipping and novel indels towards accurate downstream analysis. We introduce ORMAN (Optimal Resolution of Multimapping Ambiguity of RNA-Seq Reads), which aims to compute the minimum number of potential transcript products for each gene and to assign each multimapping read to one of these transcripts based on the estimated distribution of the region covering the read. ORMAN achieves this objective through a combinatorial optimization formulation, which is solved through well-known approximation algorithms, integer linear programs and heuristics. On a simulated RNA-Seq dataset including a random subset of transcripts from the UCSC database, the performance of several state-of-the-art methods for identifying and quantifying novel transcripts, such as Cufflinks, IsoLasso and CLIIQ, is significantly improved through the use of ORMAN. Furthermore, in an experiment using real RNA-Seq reads, we show that ORMAN is able to resolve multimapping to produce coverage values that are similar to the original distribution, even in genes with highly non-uniform coverage. ORMAN is available at http://orman.sf.net

  19. A comprehensive evaluation of alignment algorithms in the context of RNA-seq.

    Directory of Open Access Journals (Sweden)

    Robert Lindner

    Transcriptome sequencing (RNA-Seq) overcomes limitations of previously used RNA quantification methods and provides one experimental framework for both high-throughput characterization and quantification of transcripts at the nucleotide level. The first step and a major challenge in the analysis of such experiments is the mapping of sequencing reads to a transcriptomic origin, including the identification of splicing events. In recent years, a large number of such mapping algorithms have been developed, all of which require algorithms for aligning a vast number of reads to genomic or transcriptomic sequences. Although the FM-index based aligner Bowtie has become a de facto standard within mapping pipelines, a much larger number of possible alignment algorithms have been developed, including other variants of FM-index based aligners. Accordingly, developers and users of RNA-seq mapping pipelines have the choice among a large number of available alignment algorithms. To provide guidance in the choice of alignment algorithms for these purposes, we evaluated the performance of 14 widely used alignment programs from three different algorithmic classes: algorithms using either hashing of the reference transcriptome, hashing of reads, or a compressed FM-index representation of the genome. Here, special emphasis was placed on both precision and recall and the performance for different read lengths and numbers of mismatches and indels in a read. Our results clearly showed the significant reduction in memory footprint and runtime provided by FM-index based aligners at a precision and recall comparable to the best hash table based aligners. Furthermore, the recently developed Bowtie 2 alignment algorithm shows a remarkable tolerance to both sequencing errors and indels, thus essentially making hash-based aligners obsolete.

  20. Traffic Generator (TrafficGen) Version 1.4.2: Users Guide

    Science.gov (United States)

    2016-06-01

    the network with Transmission Control Protocol and User Datagram Protocol Internet Protocol traffic. Each node generating network traffic in an...TrafficGen Graphical User Interface (GUI) 3 3.1 Anatomy of the User Interface 3 3.2 Scenario Configuration and MGEN Files 4 4. Working with...for public release; distribution is unlimited. vi List of Figures Fig. 1 TrafficGen user interface

  1. On peculiar Šindel sequences

    Czech Academy of Sciences Publication Activity Database

    Křížek, Michal; Somer, L.

    2010-01-01

    Roč. 17, č. 2 (2010), s. 129-140 ISSN 0972-5555 R&D Projects: GA AV ČR(CZ) IAA100190803 Institutional research plan: CEZ:AV0Z10190503 Keywords : quadratic residue * Chinese remainder theorem * primitive Šindel sequences * Prague clock sequence Subject RIV: BA - General Mathematics http://www.pphmj.com/abstract/5095.htm

  2. Analysis of the indel at the ARMS2 3′UTR in age-related macular degeneration

    Science.gov (United States)

    Wang, Gaofeng; Spencer, Kylee L.; Scott, William K.; Whitehead, Patrice; Court, Brenda L.; Ayala-Haedo, Juan; Mayo, Ping; Schwartz, Stephen G.; Kovach, Jaclyn L.; Gallins, Paul; Polk, Monica; Agarwal, Anita; Postel, Eric A.; Haines, Jonathan L.; Pericak-Vance, Margaret A.

    2010-01-01

    Controversy remains as to which gene at the chromosome 10q26 locus confers risk for age-related macular degeneration (AMD), and statistical genetic analysis is confounded by the strong linkage disequilibrium (LD) across the region. Functional analysis of related genetic variations could solve this puzzle. Recently Fritsche et al. reported that AMD is associated with unstable ARMS2 transcripts possibly caused by a complex insertion/deletion (indel; consisting of a 443 bp deletion and an adjacent 54 bp insertion) in its 3′UTR (untranslated region). To validate this indel, we sequenced our samples. We found that this indel is even more complex and is composed of two side-by-side indels separated by 17 bp: (1) a 9 bp deletion with a 10 bp insertion; (2) a 417 bp deletion with a 27 bp insertion. The indel is significantly associated with the risk of AMD, but is also in strong LD with the non-synonymous single nucleotide polymorphism (SNP) rs10490924 (A69S). We also found that ARMS2 is expressed not only in placenta and retina but also in multiple human tissues. Using quantitative PCR, we found no correlation between the indel and ARMS2 mRNA level in human retina and blood samples. The lack of functional effects of the 3′UTR indel, the amino acid substitution of rs10490924 (A69S) and the strong LD between them suggest that A69S, not the indel, is the variant that confers risk of AMD. To our knowledge, this is the first time ARMS2 has been shown to be widely expressed in human tissues. In conclusion, the indel at the 3′UTR of ARMS2 actually contains two side-by-side indels. The indels are associated with risk of AMD, but not correlated with ARMS2 mRNA level. PMID:20182747
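
    The two descriptions of the ARMS2 indel in this abstract can be checked against each other with simple arithmetic. The sketch below is our own consistency check, not a computation from the paper; the function name `net_change` is hypothetical. It assumes that the 17 bp spacer between the two indels is unchanged sequence, which the single-indel description would count as both deleted and re-inserted.

```python
def net_change(deleted_bp, inserted_bp):
    """Net change in sequence length (bp) caused by one indel."""
    return inserted_bp - deleted_bp

SPACER_BP = 17  # unchanged bases between the two side-by-side indels

# Single complex indel: 443 bp deleted, 54 bp inserted.
single = net_change(443, 54)

# Two side-by-side indels: (9 bp del, 10 bp ins) and (417 bp del, 27 bp ins).
split = net_change(9, 10) + net_change(417, 27)

# If the spacer is counted as deleted and re-inserted, the spans also agree.
deleted_span = 9 + SPACER_BP + 417
inserted_span = 10 + SPACER_BP + 27

print(single, split)                # -389 -389
print(deleted_span, inserted_span)  # 443 54
```

    Under that assumption both descriptions give the same net loss of 389 bp, so the two-indel description refines the earlier one rather than contradicting it.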

  3. An integrative approach to predicting the functional effects of small indels in non-coding regions of the human genome.

    Science.gov (United States)

    Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin

    2017-10-06

    Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.

  4. NextGen Avionics Roadmap Version 2.0

    Science.gov (United States)

    2011-09-30

    Systems Analysis (IPSA) Division has defined multiple NextGen Operational (NGOps) Levels, projecting relative performance and risk based on differing...degrees of capability improvements, as shown in Figure 4. IPSA forecasts include the most likely performance NGOps level (i.e., NGOps 3-4), as well...in the near-term. Figures 5 through 9 depict the various programs and capabilities aligned with the various NGOps levels. Factors from the IPSA

  5. Simple Detection of Large InDeLS by DHPLC: The ACE Gene as a Model

    Directory of Open Access Journals (Sweden)

    Renata Guedes Koyama

    2008-01-01

    Insertion-deletion polymorphism (InDeL) is the second most frequent type of genetic variation in the human genome. For the detection of large InDeLs, researchers usually resort to either PCR gel analysis or RFLP, but these are time-consuming and dependent on human interpretation. Therefore, a more efficient method for genotyping this kind of genetic variation is needed. In this report, we describe a method that can detect large InDeLs by DHPLC (denaturing high-performance liquid chromatography), using the angiotensin-converting enzyme (ACE) gene I/D polymorphism as a model. The InDeL targeted in this study is characterized by a 288 bp Alu element insertion (I). We used DHPLC under nondenaturing conditions to analyze the PCR product with a flow through the chromatographic column under two different gradients based on the differences between the D and I sequences. The analysis described is quick and easy, making this technique a suitable and efficient means for DHPLC users to screen InDeLs in genetic epidemiological studies.

  6. Evaluating whole transcriptome amplification for gene profiling experiments using RNA-Seq.

    Science.gov (United States)

    Faherty, Sheena L; Campbell, C Ryan; Larsen, Peter A; Yoder, Anne D

    2015-07-30

    RNA-Seq has enabled high-throughput gene expression profiling to provide insight into the functional link between genotype and phenotype. Low quantities of starting RNA can be a severe hindrance for studies that aim to utilize RNA-Seq. To mitigate this bottleneck, whole transcriptome amplification (WTA) technologies have been developed to generate sufficient sequencing targets from minute amounts of RNA. Successful WTA requires accurate replication of transcript abundance without the loss or distortion of specific mRNAs. Here, we test the efficacy of NuGEN's Ovation RNA-Seq V2 system, which uses linear isothermal amplification with a unique chimeric primer for amplification, using white adipose tissue from standard laboratory rats (Rattus norvegicus). Our goal was to investigate potential biological artifacts introduced through WTA approaches by establishing comparisons between matched raw and amplified RNA libraries derived from biological replicates. We found that 93% of expressed genes were identical between all unamplified versus matched amplified comparisons, also finding that gene density is similar across all comparisons. Our sequencing experiment and downstream bioinformatic analyses using the Tuxedo analysis pipeline resulted in the assembly of 25,543 high-quality transcripts. Libraries constructed from raw RNA and WTA samples averaged 15,298 and 15,253 expressed genes, respectively. Although significant differentially expressed genes (P < 0.05) were identified in all matched samples, each of these represents less than 0.15% of all shared genes for each comparison. Transcriptome amplification is efficient at maintaining relative transcript frequencies with no significant bias when using this NuGEN linear isothermal amplification kit under ideal laboratory conditions as presented in this study. This methodology has broad applications, from clinical and diagnostic, to field-based studies when sample acquisition, or sample preservation, methods prove

  7. Fast and sensitive detection of indels induced by precise gene targeting

    DEFF Research Database (Denmark)

    Yang, Zhang; Steentoft, Catharina; Hauge, Camilla

    2015-01-01

    The nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high throughput studies. A major limitation in application of precise gene editing lies in lack of sensitive and fast methods to detect...... and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining leading to introduction of insertions and deletions (indels) at the target site. These indels are often small and difficult and laborious to detect...

  8. GenPlay Multi-Genome, a tool to compare and analyze multiple human genomes in a graphical interface.

    Science.gov (United States)

    Lajugie, Julien; Fourel, Nicolas; Bouhassira, Eric E

    2015-01-01

    Parallel visualization of multiple individual human genomes is a complex endeavor that is rapidly gaining importance with the increasing number of personal, phased and cancer genomes that are being generated. It requires the display of variants such as SNPs, indels and structural variants that are unique to specific genomes and the introduction of multiple overlapping gaps in the reference sequence. Here, we describe GenPlay Multi-Genome, an application specifically written to visualize and analyze multiple human genomes in parallel. GenPlay Multi-Genome is ideally suited for the comparison of allele-specific expression and functional genomic data obtained from multiple phased genomes in a graphical interface with access to multiple-track operation. It also allows the analysis of data that have been aligned to custom genomes rather than to a standard reference, and can be used as a variant calling format file browser and as a tool to compare different genome assemblies, such as hg19 and hg38. GenPlay is available under the GNU public license (GPL-3) from http://genplay.einstein.yu.edu. The source code is available at https://github.com/JulienLajugie/GenPlay.

  9. SIPSMetGen: It's Not Just For Aircraft Data and ECS Anymore.

    Science.gov (United States)

    Schwab, M.

    2015-12-01

    The SIPSMetGen utility, developed for the NASA EOSDIS project under the EED contract, simplified the creation of file level metadata for the ECS System. The utility has been enhanced for ease of use, efficiency, speed and increased flexibility. The SIPSMetGen utility was originally created as a means of generating file level spatial metadata for Operation IceBridge. The first version created only ODL metadata, specific for ingest into ECS. The core strength of the utility was, and continues to be, its ability to take complex shapes and patterns of data collection point clouds from aircraft flights and simplify them to a relatively simple concave hull geo-polygon. It has been found to be a useful and easy to use tool for creating file level metadata for many other missions, both aircraft and satellite. While the original version was useful, it had its limitations. In 2014 Raytheon was tasked with enhancing SIPSMetGen. The resulting version can create ISO-compliant XML metadata, has an optimized and streamlined algorithm for creating the spatial metadata, runs faster with more consistent results, and can be configured to run multi-threaded on systems with multiple processors. The utility comes with a Java-based graphical user interface to aid in configuring and running it. The enhanced SIPSMetGen allows more diverse data sets to be archived with file level metadata. The advantage of archiving data with file level metadata is that it makes it easier for data users and scientists to find relevant data. File level metadata unlocks the power of existing archives and metadata repositories such as ECS and CMR and search and discovery utilities like Reverb and Earth Data Search. Current missions now using SIPSMetGen include: Aquarius, Measures, ARISE, and Nimbus.

  10. Study of the genetic diversity of Podosphaera xanthii using AFLP markers and ITS sequences

    Directory of Open Access Journals (Sweden)

    Erika Sayuri Naruzawa

    2011-06-01

    Melon (Cucumis melo L.) is a fruit crop widely grown in Brazil, mainly in the Northeast, where it is produced chiefly for export. Plants of the melon family, such as cucumber and squash, can be severely affected by powdery mildew, caused by Podosphaera xanthii. This fungus has several physiological races whose correct identification is important for disease management, since the use of resistant varieties is the most effective control method. However, identifying these races by the traditional practice of inoculating a differential series of melon varieties is laborious and error-prone. An alternative would therefore be to use molecular markers to determine race identity rapidly. The objective of this study was to analyze the variability among P. xanthii isolates previously classified into races, using the AFLP technique and sequencing of the ITS 5.8S region of the rDNA. The AFLP markers yielded a dendrogram in which the isolates did not separate by race, geographic origin or host of origin. This technique revealed high variability among isolates, with a maximum genetic similarity of 69% and a minimum of 23%. In contrast to the AFLP results, no variation was observed in the sequence of the ITS 5.8S region among isolates. Thus, the AFLP analysis indicated that the isolates have a heterogeneous genetic composition, although this was not evident from sequencing of the ITS region.

  11. GenGIS 2: geospatial analysis of traditional and genetic biodiversity, with new gradient algorithms and an extensible plugin framework.

    Directory of Open Access Journals (Sweden)

    Donovan H Parks

    GenGIS is free and open source software designed to integrate biodiversity data with a digital map and information about geography and habitat. While originally developed with microbial community analyses and phylogeography in mind, GenGIS has been applied to a wide range of datasets. A key feature of GenGIS is the ability to test geographic axes that can correspond to routes of migration or gradients that influence community similarity. Here we introduce GenGIS version 2, which extends the linear gradient tests introduced in the first version to allow comprehensive testing of all possible linear geographic axes. GenGIS v2 also includes a new plugin framework that supports the development and use of graphically driven analysis packages: initial plugins include implementations of linear regression and the Mantel test, calculations of alpha-diversity (e.g., Shannon Index) for all samples, and geographic visualizations of dissimilarity matrices. We have also implemented a recently published method for biomonitoring reference condition analysis (RCA), which compares observed species richness and diversity to predicted values to determine whether a given site has been impacted. The newest version of GenGIS supports vector data in addition to raster files. We demonstrate the new features of GenGIS by performing a full gradient analysis of an Australian kangaroo apple data set, by using plugins and embedded statistical commands to analyze human microbiome sample data, and by applying RCA to a set of samples from Atlantic Canada. GenGIS release versions, tutorials and documentation are freely available at http://kiwi.cs.dal.ca/GenGIS, and source code is available at https://github.com/beiko-lab/gengis.
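
    The Shannon index mentioned among the alpha-diversity plugins can be illustrated with a short sketch. This is a minimal stand-alone version of the standard formula H = -Σ p_i ln p_i, assuming per-taxon counts as input; it is not the GenGIS plugin code, and the function name `shannon_index` is our own.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # empty sample: define diversity as zero
    proportions = (c / total for c in counts if c > 0)
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical sample: counts of individuals per taxon at one site.
site_counts = [10, 10, 10, 10]
print(shannon_index(site_counts))  # equals ln(4) for four equally abundant taxa
```

    A sample dominated by a single taxon gives a value near zero, while an even sample of k taxa gives ln(k), which is why the index is often used to compare sites.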

  12. Dynamics of Indel Profiles Induced by Various CRISPR/Cas9 Delivery Methods

    DEFF Research Database (Denmark)

    Kosicki, Michael; Rajan, Sandeep S; Lorenzetti, Flaminia C

    2017-01-01

    The introduction of CRISPR/Cas9 gene editing in mammalian cells is a scientific breakthrough, which has greatly affected basic research and gene therapy. The simplicity and general access to CRISPR/Cas9 reagents has in an unprecedented manner "democratized" gene targeting in biomedical research...... approach. In this study we review the most commonly used indel detection methods and using a robust, sensitive, and cost efficient Indel Detection by Amplicon Analysis method, we have investigated the impact of the most commonly used CRISPR/Cas9 delivery formats, including lentivirus transduction, plasmid...

  13. Indel-II region deletion sizes in the white spot syndrome virus genome correlate with shrimp disease outbreaks in southern Vietnam

    NARCIS (Netherlands)

    Tran Thi Tuyet, H.; Zwart, M.P.; Phuong, N.T.; Oanh, D.T.H.; Jong, de M.C.M.; Vlak, J.M.

    2012-01-01

    Sequence comparisons of the genomes of white spot syndrome virus (WSSV) strains have identified regions containing variable-length insertions/deletions (i.e. indels). Indel-I and Indel-II, positioned between open reading frames (ORFs) 14/15 and 23/24, respectively, are the largest and the most

  14. Genetic variability of Brazilian phytoplasma and spiroplasma isolated from maize plants

    Directory of Open Access Journals (Sweden)

    Eliane Aparecida Gomes

    2004-01-01

    The objective of this work was to characterize the genetic variability of phytoplasma and Spiroplasma kunkelii isolated from maize plants showing symptoms of stunt, collected from different Brazilian geographic regions. A DNA fragment of 500 base pairs (bp) was amplified from the spiralin gene in S. kunkelii and one fragment of 1,200 bp was generated from the 16S rDNA gene in phytoplasma. The partial sequences of the spiralin gene showed similarity of 98% among the isolates of S. kunkelii analyzed. These sequences were compared with the sequence of the spiralin gene from other Spiroplasma species deposited in GenBank, resulting in a similarity varying from 76.9% to 88.1%. The 16S rDNA sequences from the phytoplasma were completely similar among the Brazilian isolates and showed up to 98% similarity with sequences already found in other phytoplasmas. Very narrow genetic variability was detected by these gene fragments within the phytoplasma and Spiroplasma analyzed. However, other genomic regions with higher polymorphic levels should be identified in order to better evaluate the genetic diversity within populations of these microorganisms.

  15. SeqVISTA: a graphical tool for sequence feature visualization and comparison

    Directory of Open Access Journals (Sweden)

    Niu Tianhua

    2003-01-01

    Background: Many readers will sympathize with the following story. You are viewing a gene sequence in Entrez, and you want to find whether it contains a particular sequence motif. You reach for the browser's "find in page" button, but those darn spaces every 10 bp get in the way. And what if the motif is on the opposite strand? Subsequently, your favorite sequence analysis software informs you that there is an interesting feature at position 13982–14013. By painstakingly counting the 10 bp blocks, you are able to examine the sequence at this location. But now you want to see what other features have been annotated close by, and this information is buried several screenfuls higher up the web page. Results: SeqVISTA presents a holistic, graphical view of features annotated on nucleotide or protein sequences. This interactive tool highlights the residues in the sequence that correspond to features chosen by the user, and allows easy searching for sequence motifs or extraction of particular subsequences. SeqVISTA is able to display results from diverse sequence analysis tools in an integrated fashion, and aims to provide much-needed unity to the bioinformatics resources scattered around the Internet. Our viewer may be launched on a GenBank record by a single click of a button installed in the web browser. Conclusion: SeqVISTA allows insights to be gained by viewing the totality of sequence annotations and predictions, which may be more revealing than the sum of their parts. SeqVISTA runs on any operating system with a Java 1.4 virtual machine. It is freely available to academic users at http://zlab.bu.edu/SeqVISTA.
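
    The two obstacles described in the story (spaces every 10 bp and motifs on the opposite strand) can be sketched in a few lines. This is an illustrative stand-alone example, not SeqVISTA code; the helper names `clean`, `reverse_complement` and `find_motif` and the sample record are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def clean(record_text):
    """Drop the whitespace and position numbers of a flat-file sequence listing."""
    return re.sub(r"[\s\d]", "", record_text).upper()

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_motif(record_text, motif):
    """Return sorted (1-based start, strand) pairs for every motif occurrence."""
    seq = clean(record_text)
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start + 1, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)

# Hypothetical record fragment with a position number and 10 bp blocks.
record = "1 acgtacgtaa ccggttaagc"
print(find_motif(record, "TAACC"))  # [(8, '+'), (13, '-')]
```

    The forward hit at position 8 spans the gap between the two 10 bp blocks, which is exactly the case a plain "find in page" search misses. A palindromic motif would be reported once per strand.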

  16. Typing of 30 insertion/deletions in Danes using the first commercial indel kit-Mentype(®) DIPplex

    DEFF Research Database (Denmark)

    Friis, Susanne Lunøe; Børsting, Claus; Rockenbauer, Eszter

    2012-01-01

    and all amplicon lengths were shorter than 160 bp. Full indel profiles were generated from as little as 100 pg of DNA. A total of 117 individuals from Danish paternity cases were successfully typed. No deviation from Hardy-Weinberg equilibrium was observed for any of the indels. The combined mean match...

  17. Evolutionary inference via the Poisson Indel Process.

    Science.gov (United States)

    Bouchard-Côté, Alexandre; Jordan, Michael I

    2013-01-22

    We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.

  18. Sentence‐Chain Based Seq2seq Model for Corpus Expansion

    Directory of Open Access Journals (Sweden)

    Euisok Chung

    2017-08-01

    This study focuses on a method for sequential data augmentation in order to alleviate data sparseness problems. Specifically, we present corpus expansion techniques for enhancing the coverage of a language model. Recent recurrent neural network studies show that a seq2seq model can be applied for addressing language generation issues; it has the ability to generate new sentences from given input sentences. We present a method of corpus expansion using a sentence‐chain based seq2seq model. For training the seq2seq model, sentence chains are used as triples. The first two sentences in a triple are used for the encoder of the seq2seq model, while the last sentence becomes a target sequence for the decoder. Using only internal resources, evaluation results show an improvement of approximately 7.6% relative perplexity over a baseline language model of Korean text. Additionally, from a comparison with a previous study, the sentence chain approach reduces the size of the training data by 38.4% while generating 1.4‐times the number of n‐grams with superior performance for English text.
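
    The triple structure described above can be sketched as follows. The abstract states only that the first two sentences of a triple feed the encoder and the third is the decoder target; how triples are drawn from a corpus is not stated, so the sliding window used here is an assumption, and the function name `sentence_chain_triples` is hypothetical.

```python
def sentence_chain_triples(sentences):
    """Build ((encoder_sentence_1, encoder_sentence_2), decoder_target) pairs.

    Assumption: triples are taken with a sliding window of three
    consecutive sentences; the abstract does not specify this.
    """
    return [
        ((sentences[i], sentences[i + 1]), sentences[i + 2])
        for i in range(len(sentences) - 2)
    ]

# Hypothetical four-sentence document.
document = ["s1", "s2", "s3", "s4"]
for encoder_input, target in sentence_chain_triples(document):
    print(encoder_input, "->", target)
```

    A document with fewer than three sentences yields no triples under this scheme.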

  19. Big Bang à Genève - French version only

    CERN Multimedia

    2005-01-01

    This is the last lecture in the series organized by the Physics Section of the University of Geneva to mark the International Year of Physics. For the grand finale, the Physics Section has chosen the big boom of the Big Bang. Entitled "Big Bang à Genève", the lecture given by Laurent Chevalier of the French institute CEA Saclay will discuss the experiments being prepared at CERN with the LHC. Their aim is to reproduce and analyze the conditions that prevailed at the origin of the Universe, just after the Big Bang. The talk will describe in simple terms the techniques used for this exploration, which will begin in 2007. Together with the audience, Laurent Chevalier will ask what new phenomena physicists hope to discover in this unexplored world. Like the previous ones, the lecture will begin with a demonstration of cosmic-ray detection in the lecture hall and the use of these signals to create "cosmic music", in collaboration with Pr...

  20. ATACseqQC: a Bioconductor package for post-alignment quality assessment of ATAC-seq data.

    Science.gov (United States)

    Ou, Jianhong; Liu, Haibo; Yu, Jun; Kelliher, Michelle A; Castilla, Lucio H; Lawson, Nathan D; Zhu, Lihua Julie

    2018-03-01

    ATAC-seq (Assays for Transposase-Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. Compared to earlier methods for assaying chromatin accessibility, ATAC-seq is faster and easier to perform, does not require cross-linking, has higher signal to noise ratio, and can be performed on small cell numbers. However, to ensure a successful ATAC-seq experiment, step-by-step quality assurance processes, including both wet lab quality control and in silico quality assessment, are essential. While several tools have been developed or adopted for assessing read quality, identifying nucleosome occupancy and accessible regions from ATAC-seq data, none of the tools provide a comprehensive set of functionalities for preprocessing and quality assessment of aligned ATAC-seq datasets. We have developed a Bioconductor package, ATACseqQC, for easily generating various diagnostic plots to help researchers quickly assess the quality of their ATAC-seq data. In addition, this package contains functions to preprocess aligned ATAC-seq data for subsequent peak calling. Here we demonstrate the utilities of our package using 25 publicly available ATAC-seq datasets from four studies. We also provide guidelines on what the diagnostic plots should look like for an ideal ATAC-seq dataset. This software package has been used successfully for preprocessing and assessing several in-house and public ATAC-seq datasets. Diagnostic plots generated by this package will facilitate the quality assessment of ATAC-seq data, and help researchers to evaluate their own ATAC-seq experiments as well as select high-quality ATAC-seq datasets from public repositories such as GEO to avoid generating hypotheses or drawing conclusions from low-quality ATAC-seq experiments. The software, source code, and documentation are freely available as a Bioconductor package at https://bioconductor.org/packages/release/bioc/html/ATACseqQC.html .

  1. Simultaneous indel events in intron 7 of the branched-chain α-ketoacid dehydrogenase E1a (BCKDHA) gene in Madura cattle

    Directory of Open Access Journals (Sweden)

    Asri Febriana

    2015-08-01

    Madura cattle are one of the Indonesian local cattle breeds, derived from crosses between Zebu cattle (Bos indicus) and banteng (Bos javanicus). Branched-chain α-ketoacid dehydrogenase (BCKDH) is one of the main enzyme complexes in the inner mitochondrial membrane and metabolizes the branched-chain amino acids (BCAA), i.e. valine, leucine and isoleucine. The diversity of the nucleotide sequences of the genes largely determines the efficiency of the encoded enzyme. This paper aimed to determine the nucleotide variation in intron 7, exon 8 and intron 8 of the BCKDHA gene in Madura cattle. The study was conducted on three Madura cattle used for bull racing (karapan), beauty contests (sonok) and beef production. The analysis showed that variation was higher in the introns than in the exon. Simultaneous indels were found at base positions 34 and 68 in sonok cattle. In addition, the C266T variant was found in beef cattle. These variants do not cause significant changes in amino acids. No mutation specific to any Madura cattle designation was found in intron 7, exon 8 or intron 8. This indicates an absence of differentiation among the Madura cattle designations attributable to selection pressure on the BCKDHA gene.

  2. The future of the genetic epidemiology of complex traits

    Directory of Open Access Journals (Sweden)

    Feitosa Mary F.

    2002-01-01

    Full Text Available Genetic epidemiology has evolved from a focus on studies of rare Mendelian diseases to the genetic analysis of complex traits. With the advent of information on the complete gene sequence across the genome of humans and other organisms, the interest of genetic epidemiology in uncovering the nature of the factors that influence these traits has become paramount. The main methods used in the study of complex diseases are presented, along with their main advantages and disadvantages. The importance of sample determination and of using appropriate phenotypes and genetic markers is discussed. As an example of the strategies mentioned, we use a study of body mass index (BMI) to illustrate a major genetic factor located on chromosome 7. In a discussion of trends in linkage studies, while recognizing that families and pedigrees will continue to be the main focus of sampling, some new and efficient types of sampling are discussed (for example, unrelated controls), in which pooled DNA samples will be universally used. Recognizing genetic heterogeneity among studies, and interpreting it, will be one of the most important features of future analyses of complex traits.

  3. SeqAn: An efficient, generic C++ library for sequence analysis

    Directory of Open Access Journals (Sweden)

    Rausch Tobias

    2008-01-01

    Full Text Available Background: The use of novel algorithmic techniques is pivotal to many important problems in life science. For example, the sequencing of the human genome would not have been possible without advanced assembly algorithms. However, owing to the high speed of technological progress and the urgent need for bioinformatics tools, there is a widening gap between state-of-the-art algorithmic techniques and the actual algorithmic components of tools that are in widespread use. Results: To remedy this trend we propose the use of SeqAn, a library of efficient data types and algorithms for sequence analysis in computational biology. SeqAn comprises implementations of existing, practical state-of-the-art algorithmic components to provide a sound basis for algorithm testing and development. In this paper we describe the design and content of SeqAn and demonstrate its use by giving two examples. In the first example we show an application of SeqAn as an experimental platform by comparing different exact string matching algorithms. The second example is a simple version of the well-known MUMmer tool rewritten in SeqAn. Results indicate that our implementation is very efficient and versatile. Conclusion: We anticipate that SeqAn greatly simplifies the rapid development of new bioinformatics tools by providing a collection of readily usable, well-designed algorithmic components which are fundamental for the field of sequence analysis. This not only facilitates the implementation of new algorithms but also enables a sound analysis and comparison of existing algorithms.
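    The exact string matching comparison mentioned in the abstract can be illustrated with a minimal sketch. The code below is not SeqAn (which is a C++ library) and does not reproduce the authors' benchmark; it is a plain Python illustration, under that assumption, of one classic exact matching algorithm (Horspool) next to a naive scan used as a reference. The function names and the toy sequence are hypothetical.

```python
def naive_search(text: str, pattern: str) -> list[int]:
    """Reference implementation: test every alignment of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def horspool_search(text: str, pattern: str) -> list[int]:
    """Horspool exact matching: return all start positions of pattern."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift table: for each character in pattern[:-1], the distance from its
    # rightmost occurrence to the last pattern position.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text character aligned with the last
        # pattern position; characters absent from the table allow a
        # full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    sequence = "GATTACAGATTACA"  # toy sequence, illustration only
    print(horspool_search(sequence, "TTAC"))
    print(naive_search(sequence, "TTAC"))
```

    Comparing the two functions on the same input is a simple way to check that a faster matcher agrees with the naive one before timing either.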

  4. SeqHound: biological sequence and structure database as a platform for bioinformatics research

    Directory of Open Access Journals (Sweden)

    Dumontier Michel

    2002-10-01

    Full Text Available Background: SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results: SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions: The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

  5. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    Full Text Available The number of published molecular phylogenetic studies is growing steadily, due in part to the advent of new techniques that allow cheap and quick DNA sequencing. Hence, there is increasing demand for relational databases with which to manage and annotate the accumulating DNA sequences, genes, voucher specimens and associated biological data. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphical user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  6. Development of novel InDel markers and genetic diversity in Chenopodium quinoa through whole-genome re-sequencing.

    Science.gov (United States)

    Zhang, Tifu; Gu, Minfeng; Liu, Yuhe; Lv, Yuanda; Zhou, Ling; Lu, Haiyan; Liang, Shuaiqiang; Bao, Huabin; Zhao, Han

    2017-09-05

    Quinoa (Chenopodium quinoa Willd.) is a nutritionally balanced crop, but its breeding improvement has been limited by the lack of information on its genetics and genomics. Therefore, it is necessary to obtain knowledge on genomic variation, population structure, and genetic diversity and to develop novel Insertion/Deletion (InDel) markers for quinoa by whole-genome re-sequencing. We re-sequenced 11 quinoa accessions and obtained a coverage depth of approximately 7× to 23× the quinoa genome. Based on the 1453-megabase (Mb) assembly from the reference accession Riobamba, 8,441,022 filtered bi-allelic single nucleotide polymorphisms (SNPs) and 842,783 filtered InDels were identified, with an estimated SNP and InDel density of 5.81 and 0.58 per kilobase (kb), respectively. From the genomic InDel variations, 85 dimorphic InDel markers were newly developed and validated. Together with the 62 simple sequence repeat (SSR) markers reported, a total of 147 markers were used for genotyping the 129 quinoa accessions. Molecular grouping analysis showed classification into two major groups, the Andean highland (composed of the northern and southern highland subgroups) and Chilean coastal, based on combined STRUCTURE, phylogenetic tree and PCA (Principal Component Analysis) analyses. Further analysis of the genetic diversity exhibited a decreasing tendency from the Chilean coast group to the Andean highland group, and the gene flow between subgroups was more frequent than that between the two subgroups and the Chilean coastal group. An analysis of molecular variance (AMOVA) attributed the majority of the variation (approximately 70%) to the diversity between the groups. This was congruent with the observation of a highly significant FST value (0.705) between the groups, demonstrating significant genetic differentiation between the Andean highland type of quinoa and the Chilean coastal type. Moreover, a core set of 16 quinoa germplasms that capture all 362 alleles was
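    The variant densities quoted in this abstract follow directly from the reported counts and assembly size. The short sketch below reproduces that arithmetic; the variable and function names are illustrative, and the only inputs are the figures stated in the abstract (1453 Mb assembly, 8,441,022 SNPs, 842,783 InDels).

```python
ASSEMBLY_MB = 1453      # reference assembly size in megabases, from the abstract
N_SNPS = 8_441_022      # filtered bi-allelic SNPs, from the abstract
N_INDELS = 842_783      # filtered InDels, from the abstract


def density_per_kb(n_variants: int, assembly_mb: float) -> float:
    """Average number of variants per kilobase of assembly (1 Mb = 1000 kb)."""
    return n_variants / (assembly_mb * 1000)


snp_density = density_per_kb(N_SNPS, ASSEMBLY_MB)
indel_density = density_per_kb(N_INDELS, ASSEMBLY_MB)

if __name__ == "__main__":
    print(f"SNP density:   {snp_density:.2f} per kb")
    print(f"InDel density: {indel_density:.2f} per kb")
```

    Rounded to two decimals, the results agree with the 5.81 and 0.58 per kb given in the abstract.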

  7. SeqWare Query Engine: storing and searching sequence data in the cloud

    Directory of Open Access Journals (Sweden)

    Merriman Barry

    2010-12-01

    Full Text Available Background: Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results: In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc.) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). Conclusions: The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters

  8. SeqWare Query Engine: storing and searching sequence data in the cloud

    Science.gov (United States)

    2010-01-01

    Background: Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. Results: In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc.) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). Conclusions: The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data

  9. SeqWare Query Engine: storing and searching sequence data in the cloud.

    Science.gov (United States)

    O'Connor, Brian D; Merriman, Barry; Nelson, Stanley F

    2010-12-21

    Since the introduction of next-generation DNA sequencers the rapid increase in sequencer throughput, and associated drop in costs, has resulted in more than a dozen human genomes being resequenced over the last few years. These efforts are merely a prelude for a future in which genome resequencing will be commonplace for both biomedical research and clinical applications. The dramatic increase in sequencer output strains all facets of computational infrastructure, especially databases and query interfaces. The advent of cloud computing, and a variety of powerful tools designed to process petascale datasets, provide a compelling solution to these ever increasing demands. In this work, we present the SeqWare Query Engine which has been created using modern cloud computing technologies and designed to support databasing information from thousands of genomes. Our backend implementation was built using the highly scalable, NoSQL HBase database from the Hadoop project. We also created a web-based frontend that provides both a programmatic and interactive query interface and integrates with widely used genome browsers and tools. Using the query engine, users can load and query variants (SNVs, indels, translocations, etc) with a rich level of annotations including coverage and functional consequences. As a proof of concept we loaded several whole genome datasets including the U87MG cell line. We also used a glioblastoma multiforme tumor/normal pair to both profile performance and provide an example of using the Hadoop MapReduce framework within the query engine. This software is open source and freely available from the SeqWare project (http://seqware.sourceforge.net). The SeqWare Query Engine provided an easy way to make the U87MG genome accessible to programmers and non-programmers alike. This enabled a faster and more open exploration of results, quicker tuning of parameters for heuristic variant calling filters, and a common data interface to simplify development of

  10. Genome-wide indel markers shared by diverse Asian rice cultivars compared to Japanese rice cultivar 'Koshihikari'

    OpenAIRE

    Yonemaru, Jun-ichi; Choi, Sun Hee; Sakai, Hiroaki; Ando, Tsuyu; Shomura, Ayahiko; Yano, Masahiro; Wu, Jianzhong; Fukuoka, Shuichi

    2015-01-01

    Insertion-deletion (indel) polymorphisms, such as simple sequence repeats, have been widely used as DNA markers to identify QTLs and genes and to facilitate rice breeding. Recently, next-generation sequencing has produced deep sequences that allow genome-wide detection of indels. These polymorphisms can potentially be used to develop high-accuracy polymerase chain reaction (PCR)-based markers. Here, re-sequencing of 5 indica, 2 aus, and 3 tropical japonica cultivars and Japanese elite cultiva...

  11. Minding the gap: Frequency of indels in mtDNA control region sequence data and influence on population genetic analyses

    Science.gov (United States)

    Pearce, J.M.

    2006-01-01

    Insertions and deletions (indels) result in sequences of various lengths when homologous gene regions are compared among individuals or species. Although indels are typically phylogenetically informative, occurrence and incorporation of these characters as gaps in intraspecific population genetic data sets are rarely discussed. Moreover, the impact of gaps on estimates of fixation indices, such as FST, has not been reviewed. Here, I summarize the occurrence and population genetic signal of indels among 60 published studies that involved alignments of multiple sequences from the mitochondrial DNA (mtDNA) control region of vertebrate taxa. Among 30 studies observing indels, an average of 12% of both variable and parsimony-informative sites were composed of these sites. There was no consistent trend between levels of population differentiation and the number of gap characters in a data block. Across all studies, the average influence on estimates of ΦST was small, explaining only an additional 1.8% of among population variance (range 0.0-8.0%). Studies most likely to observe an increase in ΦST with the inclusion of gap characters were those with control region DNA appears small, dependent upon total number of variable sites in the data block, and related to species-specific characteristics and the spatial distribution of mtDNA lineages that contain indels. © 2006 Blackwell Publishing Ltd.

  12. Predicting gene regulatory networks of soybean nodulation from RNA-Seq transcriptome data.

    Science.gov (United States)

    Zhu, Mingzhu; Dahmen, Jeremy L; Stacey, Gary; Cheng, Jianlin

    2013-09-22

    High-throughput RNA sequencing (RNA-Seq) is a revolutionary technique to study the transcriptome of a cell under various conditions at a systems level. Despite the wide application of RNA-Seq techniques to generate experimental data in the last few years, few computational methods are available to analyze this huge amount of transcription data. The computational methods for constructing gene regulatory networks from RNA-Seq expression data of hundreds or even thousands of genes are particularly lacking and urgently needed. We developed an automated bioinformatics method to predict gene regulatory networks from the quantitative expression values of differentially expressed genes based on RNA-Seq transcriptome data of a cell in different stages and conditions, integrating transcriptional, genomic and gene function data. We applied the method to the RNA-Seq transcriptome data generated for soybean root hair cells in three different development stages of nodulation after rhizobium infection. The method predicted a soybean nodulation-related gene regulatory network consisting of 10 regulatory modules common for all three stages, and 24, 49 and 70 modules separately for the first, second and third stage, each containing both a group of co-expressed genes and several transcription factors collaboratively controlling their expression under different conditions. 8 of 10 common regulatory modules were validated by at least two kinds of validations, such as independent DNA binding motif analysis, gene function enrichment test, and previous experimental data in the literature. We developed a computational method to reliably reconstruct gene regulatory networks from RNA-Seq transcriptome data. The method can generate valuable hypotheses for interpreting biological data and designing biological experiments such as ChIP-Seq, RNA interference, and yeast two hybrid experiments.

  13. Pathogenesis comparison between the United States porcine epidemic diarrhoea virus prototype and S-INDEL-variant strains in conventional neonatal piglets.

    Science.gov (United States)

    Chen, Qi; Gauger, Phillip C; Stafne, Molly R; Thomas, Joseph T; Madson, Darin M; Huang, Haiyan; Zheng, Ying; Li, Ganwu; Zhang, Jianqiang

    2016-05-01

    At least two genetically different porcine epidemic diarrhoea virus (PEDV) strains have been identified in the USA: US PEDV prototype and S-INDEL-variant strains. The objective of this study was to compare the pathogenicity differences of the US PEDV prototype and S-INDEL-variant strains in conventional neonatal piglets under experimental infections. Fifty PEDV-negative 5-day-old pigs were divided into five groups of ten pigs each and were inoculated orogastrically with three US PEDV prototype isolates (IN19338/2013, NC35140/2013 and NC49469/2013), an S-INDEL-variant isolate (IL20697/2014), and virus-negative culture medium, respectively, with virus titres of 10^4 TCID50 ml^-1, 10 ml per pig. All three PEDV prototype isolates tested in this study, regardless of their phylogenetic clades, had similar pathogenicity and caused severe enteric disease in 5-day-old pigs as evidenced by clinical signs, faecal virus shedding, and gross and histopathological lesions. Compared with pigs inoculated with the three US PEDV prototype isolates, pigs inoculated with the S-INDEL-variant isolate had significantly diminished clinical signs, virus shedding in faeces, gross lesions in small intestines, caeca and colons, histopathological lesions in small intestines, and immunohistochemistry staining in ileum. However, the US PEDV prototype and the S-INDEL-variant strains induced similar viraemia levels in inoculated pigs. Whole genome sequences of the PEDV prototype and S-INDEL-variant strains were determined, but the molecular basis of virulence differences between these PEDV strains remains to be elucidated using a reverse genetics approach.

  14. NextGen UAS Research, Development and Demonstration Roadmap. Version 1.0

    Science.gov (United States)

    2012-03-15

    ...individual COA, UAS may operate under both Visual Flight Rules (VFR) and Instrument Flight Rules (IFR), in both special use airspace and non-segregated...“National Aeronautics Research and Development Plan,” February 2010, which cites the importance of integrating UAS into the NextGen NAS and establishes

  15. Multiplexed ChIP-Seq Using Direct Nucleosome Barcoding: A Tool for High-Throughput Chromatin Analysis.

    Science.gov (United States)

    Chabbert, Christophe D; Adjalley, Sophie H; Steinmetz, Lars M; Pelechano, Vicent

    2018-01-01

    Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) or microarray hybridization (ChIP-on-chip) are standard methods for the study of transcription factor binding sites and histone chemical modifications. However, these approaches only allow profiling of a single factor or protein modification at a time. In this chapter, we present Bar-ChIP, a higher throughput version of ChIP-Seq that relies on the direct ligation of molecular barcodes to chromatin fragments. Bar-ChIP enables the concurrent profiling of multiple DNA-protein interactions and is therefore amenable to experimental scale-up, without the need for any robotic instrumentation.

  16. Detection of genes encoding virulence factors in Vibrio bacteria

    Directory of Open Access Journals (Sweden)

    Ince Ayu Khairani Kadriah

    2011-04-01

    using bacterial isolates obtained from tiger shrimp culture in various regions of South Sulawesi and Java. In this study, specific primers were used to detect the virulence genes toxR, hemolysin (vvh), and gyrB by PCR. Of the 35 isolates obtained, 20 were found to carry virulence genes, and 8 of these carried two virulence genes. The bacterial species carrying virulence genes were V. harveyi, V. parahaemolyticus, V. mimicus, and V. campbellii

  17. Development of Insertion and Deletion Markers for Bottle Gourd Based on Restriction Site-associated DNA Sequencing Data

    Directory of Open Access Journals (Sweden)

    Xinyi WU

    2017-01-01

    Full Text Available Bottle gourd is an important cucurbit crop worldwide. To provide more molecular markers for this crop, a bioinformatic approach was employed to develop insertion-deletion (InDel) markers in bottle gourd based on restriction site-associated DNA sequencing (RAD-Seq) data. A total of 892 InDels were predicted, with lengths varying from 1 bp to 167 bp. Single-nucleotide InDels were the predominant type. To validate these InDels, PCR primers were designed for 162 loci where InDels longer than 2 bp were predicted. A total of 112 InDels were found to be polymorphic among the 9 bottle gourd accessions under investigation. The rate of prediction accuracy was thus at a high level of 72.7%. DNA fingerprinting of 4 cultivars was performed using 8 selected InDel markers, demonstrating the usefulness of these markers.

  18. An 8bp indel in exon 1 of Ghrelin gene associated with chicken growth.

    Science.gov (United States)

    Fang, Meixia; Nie, Qinghua; Luo, Chenglong; Zhang, Dexiang; Zhang, Xiquan

    2007-04-01

    Ghrelin, which acts as the endogenous ligand for the growth hormone secretagogue receptor (GHS-R), is a novel growth hormone (GH) releasing peptide with reported effects on food intake in chickens. In this study, an 8 bp indel polymorphism in exon 1 of the chicken Ghrelin (cGHRL) gene was genotyped in an F(2) designed full-sib population to analyze its associations with chicken growth and carcass traits. Later, the mRNA level in the proventriculus was determined by real-time PCR to reveal the expression features of the cGHRL gene. Results showed that this 8 bp indel was significantly associated with body weight at the age of 28 days (BW28) and 56 days (BW56), eviscerated weight (EW) and leg muscle weight (LMW) (PGhrelin on chicken growth were indicated by this study.

  19. Genetic diversity of citrus rootstocks based on RAPD molecular markers

    Directory of Open Access Journals (Sweden)

    Schäfer Gilmar

    2004-01-01

    Full Text Available This work aimed to characterize, using RAPD molecular markers, the genetic diversity of the rootstocks in the Citrus Collection of the Estação Experimental Agronômica of the Universidade Federal do Rio Grande do Sul (EEA/UFRGS) and of citrus rootstock accessions collected from nurseries in the Vale do Rio Caí region of the state of Rio Grande do Sul. To this end, leaves were collected from nine citrus rootstocks at EEA/UFRGS and from ten trifoliate orange (Poncirus trifoliata) accessions from nurseries. Using nine primer sequences, it was possible to separate the citrus rootstocks into two main groups, one formed by the 'Cravo' lime and the other by trifoliate orange and its hybrids, with high genetic dissimilarity between the groups. RAPD molecular markers were efficient for characterizing citrus rootstock varieties and for separating the rootstock P. trifoliata from its hybrids; they can be used to characterize mother plants, to analyze genetic variability among parents in rootstock breeding programs, and to identify the sexual or nucellar origin of trifoliate orange seedlings in commercial nurseries.

  20. NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

    Science.gov (United States)

    Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun

    2017-09-21

    High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified by current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms that were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, publicly available at GitHub (http://github.com/wulingyun/CopTea/). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.

  1. The maize glossy13 gene, cloned via BSR-Seq and Seq-walking encodes a putative ABC transporter required for the normal accumulation of epicuticular waxes.

    Directory of Open Access Journals (Sweden)

    Li Li

    Full Text Available Aerial plant surfaces are covered by epicuticular waxes that, among other purposes, serve to control water loss. Maize glossy mutants, originally identified by their "glossy" phenotypes, exhibit alterations in the accumulation of epicuticular waxes. By combining data from a BSR-Seq experiment and the newly developed Seq-Walking technology, GRMZM2G118243 was identified as a strong candidate for being the glossy13 gene. The finding that multiple EMS-induced alleles contain premature stop codons in GRMZM2G118243, and the one knockout allele of gl13, validates the hypothesis that gene GRMZM2G118243 is gl13. Consistent with this, GRMZM2G118243 is an ortholog of AtABCG32 (Arabidopsis thaliana), HvABCG31 (barley) and OsABCG31 (rice), which encode ABCG subfamily transporters involved in the trans-membrane transport of various secondary metabolites. We therefore hypothesize that gl13 is involved in the transport of epicuticular waxes onto the surfaces of seedling leaves.

  2. National Survey of Sensory Features in Children with ASD: Factor Structure of the Sensory Experience Questionnaire (3.0)

    Science.gov (United States)

    Ausderau, Karla; Sideris, John; Furlong, Melissa; Little, Lauren M.; Bulluck, John; Baranek, Grace T.

    2014-01-01

    This national online survey study characterized sensory features in 1,307 children with autism spectrum disorder (ASD) ages 2-12 years using the Sensory Experiences Questionnaire Version 3.0 (SEQ-3.0). Using the SEQ-3.0, a confirmatory factor analytic model with four substantive factors of hypothesized sensory response patterns (i.e.,…

  3. Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.

    Science.gov (United States)

    Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing

    2015-08-05

    To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validation rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validation rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. However, for the
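    The two comparison metrics named in this abstract, the rate of shared variant loci and the concordance of co-detected loci, can be sketched as set operations on two call sets. The abstract does not give exact definitions, so the sketch below makes assumptions: the shared rate is taken as loci called by both platforms divided by loci called by either, and concordance as the fraction of shared loci with identical genotype calls. The function name, data layout and toy calls are hypothetical and are not data from the study.

```python
Locus = tuple[str, int]  # (chromosome, position); assumed representation


def compare_call_sets(calls_a: dict[Locus, str],
                      calls_b: dict[Locus, str]) -> dict[str, float]:
    """Compare two variant call sets keyed by locus, with genotype values."""
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b   # loci called on both platforms
    union = loci_a | loci_b    # loci called on at least one platform
    # Shared loci where both platforms report the same genotype.
    same = sum(1 for locus in shared if calls_a[locus] == calls_b[locus])
    return {
        "shared_rate": len(shared) / len(union) if union else 0.0,
        "concordance": same / len(shared) if shared else 0.0,
    }


if __name__ == "__main__":
    # Toy call sets for illustration only.
    platform_a = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "G/G"}
    platform_b = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr3", 10): "T/A"}
    print(compare_call_sets(platform_a, platform_b))
```

    Other denominators are possible for the shared rate (for example, the loci of one platform only), so the numbers from such a sketch are comparable to published rates only when the same definition is used.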

  4. Taxonomic dissection of the genus Micrococcus: Kocuria gen. nov., Nesterenkonia gen. nov., Kytococcus gen. nov., Dermacoccus gen. nov., and Micrococcus Cohn 1872 gen. emend.

    Science.gov (United States)

    Stackebrandt, E; Koch, C; Gvozdiak, O; Schumann, P

    1995-10-01

    The results of a phylogenetic and chemotaxonomic analysis of the genus Micrococcus indicated that it is significantly heterogeneous. Except for Micrococcus lylae, no species groups phylogenetically with the type species of the genus, Micrococcus luteus. The other members of the genus form three separate phylogenetic lines which on the basis of chemotaxonomic properties can be assigned to four genera. These genera are the genus Kocuria gen. nov. for Micrococcus roseus, Micrococcus varians, and Micrococcus kristinae, described as Kocuria rosea comb. nov., Kocuria varians comb. nov., and Kocuria kristinae comb. nov., respectively; the genus Nesterenkonia gen. nov. for Micrococcus halobius, described as Nesterenkonia halobia comb. nov.; the genus Dermacoccus gen. nov. for Micrococcus nishinomiyaensis, described as Dermacoccus nishinomiyaensis comb. nov.; and the genus Kytococcus gen. nov. for Micrococcus sedentarius, described as Kytococcus sedentarius comb. nov. M. luteus and M. lylae, which are closely related phylogenetically but differ in some chemotaxonomic properties, are the only species that remain in the genus Micrococcus Cohn 1872. An emended description of the genus Micrococcus is given [corrected].

  5. MetaRNA-Seq: An Interactive Tool to Browse and Annotate Metadata from RNA-Seq Studies

    Directory of Open Access Journals (Sweden)

    Pankaj Kumar

    2015-01-01

    Full Text Available The number of RNA-Seq studies has grown in recent years. The design of RNA-Seq studies varies from very simple (e.g., two-condition case-control) to very complicated (e.g., time series involving multiple samples at each time point with separate drug treatments). Most of these publicly available RNA-Seq studies are deposited in NCBI databases, but their metadata are scattered throughout four different databases: Sequence Read Archive (SRA), Biosample, Bioprojects, and Gene Expression Omnibus (GEO). Although the NCBI web interface is able to provide all of the metadata information, it often requires significant effort to retrieve study- or project-level information by traversing through multiple hyperlinks and going to another page. Moreover, project- and study-level metadata lack manual or automatic curation by categories, such as disease type, time series, case-control, or replicate type, which are vital to comprehending any RNA-Seq study. Here we describe “MetaRNA-Seq,” a new tool for interactively browsing, searching, and annotating RNA-Seq metadata with the capability of semiautomatic curation at the study level.

  6. Developing market class specific InDel markers from next generation sequence data in Phaseolus vulgaris L.

    Directory of Open Access Journals (Sweden)

    Samira Mafi Moghaddam

    2014-05-01

    Full Text Available Next generation sequence data provides valuable information and tools for genetic and genomic research and offers new insights useful for marker development. This data is useful for the design of accurate and user-friendly molecular tools. Common bean (Phaseolus vulgaris L.) is a diverse crop in which separate domestication events happened in each gene pool, followed by race and market class diversification that has resulted in different morphological characteristics in each commercial market class. This has led to essentially independent breeding programs within each market class, which in turn has resulted in limited within-market-class sequence variation. Sequence data from selected genotypes of five bean market classes (pinto, black, navy, and light and dark red kidney) were used to develop InDel-based markers specific to each market class. Design of the InDel markers was conducted through a combination of assembly, alignment and primer design software using 1.6x to 5.1x coverage of Illumina GAII sequence data for each of the selected genotypes. The procedure we developed for primer design is faster, more accurate, less error prone, and higher throughput than manual design. All InDel markers are easy to run and score with no need for PCR optimization. A total of 2,687 InDel markers distributed across the genome were developed. To highlight their usefulness, they were employed to construct a phylogenetic tree and a genetic map, showing that InDel markers are reliable, simple, and accurate.

  7. External conferences - Université de Genève - French version only

    CERN Multimedia

    2006-01-01

    Université de Genève, Ecole de physique, 24 quai Ernest Ansermet, 1211 Genève 4. Tel: +41 22 379 6383 (secretariat); Tel: +41 22 379 6256 (reception); Fax: +41 22 379 6922. Monday 30 October 2006. COLLOQUIUM, 5 p.m., Auditoire Stückelberg. Extrasolar planets: from unexpected properties of giant planets to the hunt for a new Earth. Dr S. Udry / Observatoire de Genève, Sauverny. Since the detection of the first 'exoplanet' orbiting a star similar to our Sun, a little over 10 years ago, nearly 200 planetary candidates have been uncovered, most of them by Doppler spectroscopy. The varied and unexpected properties of these systems will be discussed, along with the constraints they place on models of planet formation. While most of the planets discovered are gas giants resembling Jupiter, a new step was recently taken with the detection of lighter planets (10-20 Earth masses) and p...

  8. rSeqNP: a non-parametric approach for detecting differential expression and splicing from RNA-Seq data.

    Science.gov (United States)

    Shi, Yang; Chinnaiyan, Arul M; Jiang, Hui

    2015-07-01

    High-throughput sequencing of transcriptomes (RNA-Seq) has become a powerful tool to study gene expression. Here we present an R package, rSeqNP, which implements a non-parametric approach to test for differential expression and splicing from RNA-Seq data. rSeqNP uses permutation tests to assess statistical significance and can be applied to a variety of experimental designs. By combining information across isoforms, rSeqNP is able to detect more differentially expressed or spliced genes from RNA-Seq data. The R package with its source code and documentation is freely available at http://www-personal.umich.edu/~jianghui/rseqnp/. Contact: jianghui@umich.edu. Supplementary data are available at Bioinformatics online.
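
    As a rough illustration of the permutation-test idea mentioned in this abstract, the following Python sketch computes a two-sided permutation p-value for a single gene measured in two groups. It is not rSeqNP's implementation: the function name, the difference-of-means statistic, and the add-one correction are assumptions chosen for brevity, since the abstract does not state which test statistics the package uses.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration only; the statistic is an assumption.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    control = [10, 12, 11, 9, 13]
    treated = [30, 28, 33, 31, 29]
    print(permutation_p_value(control, treated))
```

    Because the labels are shuffled rather than a distribution being assumed, the same pattern extends to other statistics by swapping the line that computes `stat`.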

  9. Genetic Diversity of Myanmar and Indonesia Native Chickens Together with Two Jungle Fowl Species by Using 102 Indels Polymorphisms

    Directory of Open Access Journals (Sweden)

    Aye Aye Maw

    2012-07-01

    Full Text Available The efficiency of insertion and/or deletion (indel) polymorphisms as genetic markers was evaluated by genotyping 102 indel loci in native chicken populations from Myanmar and Indonesia as well as Red jungle fowls and Green jungle fowls from Java Island. Out of the 102 indel markers, 97 were polymorphic. The average observed and expected heterozygosities were 0.206 to 0.268 and 0.229 to 0.284 in native chicken populations and 0.003 to 0.101 and 0.012 to 0.078 in jungle fowl populations. The coefficients of genetic differentiation (Gst) of the native chicken populations from Myanmar and Indonesia were 0.041 and 0.098, respectively. Genetic variability was higher among native chicken populations than among jungle fowl populations. A high Gst value was found between native chicken populations and jungle fowl populations. A neighbor-joining tree based on genetic distance revealed that the native chickens from the two countries were genetically close to each other and remote from the Red and Green jungle fowls of Java Island.
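
    The heterozygosity and Gst figures quoted above can be made concrete with a small Python sketch. It assumes the standard textbook definitions (expected heterozygosity as one minus the sum of squared allele frequencies, and Nei's Gst = (Ht - Hs) / Ht) applied to a single locus with equally weighted populations; the abstract does not say which estimator or weighting the authors used, and the function names are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(f * f for f in allele_freqs)


def gst(populations):
    """Nei's Gst for one locus, assuming equal population weights.

    `populations` is a list of allele-frequency lists, one per population,
    with alleles listed in the same order in every population.
    """
    n = len(populations)
    # Hs: mean within-population expected heterozygosity.
    hs = sum(expected_heterozygosity(p) for p in populations) / n
    # Ht: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(column) / n for column in zip(*populations)]
    ht = expected_heterozygosity(mean_freqs)
    return 0.0 if ht == 0 else (ht - hs) / ht


if __name__ == "__main__":
    # Biallelic indel locus: frequencies of (insertion, deletion) alleles.
    print(expected_heterozygosity([0.5, 0.5]))
    print(gst([[0.8, 0.2], [0.6, 0.4]]))
```

    A Gst near 0 means the populations share similar allele frequencies, while a value near 1 means they are fixed for different alleles.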

  10. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification

    Directory of Open Access Journals (Sweden)

    Tamar Hashimshony

    2012-09-01

    Full Text Available High-throughput sequencing has allowed for unprecedented detail in gene expression analyses, yet its efficient application to single cells is challenged by the small starting amounts of RNA. We have developed CEL-Seq, a method for overcoming this limitation by barcoding and pooling samples before linearly amplifying mRNA with the use of one round of in vitro transcription. We show that CEL-Seq gives more reproducible, linear, and sensitive results than a PCR-based amplification method. We demonstrate the power of this method by studying early C. elegans embryonic development at single-cell resolution. Differential distribution of transcripts between sister cells is seen as early as the two-cell stage embryo, and zygotic expression in the somatic cell lineages is enriched for transcription factors. The robust transcriptome quantifications enabled by CEL-Seq will be useful for transcriptomic analyses of complex tissues containing populations of diverse cell types.

  11. SEQ-POINTER: Next generation, planetary spacecraft remote sensing science observation design tool

    Science.gov (United States)

    Boyer, Jeffrey S.

    1994-11-01

    Since Mariner, NASA-JPL planetary missions have been supported by ground software to plan and design remote sensing science observations. The software used by the science and sequence designers to plan and design observations has evolved with mission and technological advances. The original program, PEGASIS (Mariners 4, 6, and 7), was re-engineered as POGASIS (Mariner 9, Viking, and Mariner 10), and again later as POINTER (Voyager and Galileo). Each of these programs was developed under technological, political, and fiscal constraints which limited its adaptability to other missions and spacecraft designs. Implementation of a multi-mission tool, SEQ POINTER, under the auspices of the JPL Multimission Operations Systems Office (MOSO) is in progress. This version has been designed to address the limitations experienced on previous versions as they were being adapted to a new mission and spacecraft. The tool has been modularly designed with subroutine interface structures to support interchangeable celestial body and spacecraft definition models. The computational and graphics modules have also been designed to interface with data collected from previous spacecraft, or on-going observations, which describe the surface of each target body. These enhancements make SEQ POINTER a candidate for low-cost mission usage, when a remote sensing science observation design capability is required. The current and planned capabilities of the tool will be discussed. The presentation will also include a 5-10 minute video presentation demonstrating the capabilities of a proto-Cassini Project version that was adapted to test the tool. The work described in this abstract was performed by the Jet Propulsion Laboratory, California Institute of Technology, under contract to the National Aeronautics and Space Administration.

  12. GenBank

    OpenAIRE

    Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Rapp, Barbara A.; Wheeler, David L.

    2002-01-01

    The GenBank sequence database incorporates publicly available DNA sequences of more than 105 000 different organisms, primarily through direct submission of sequence data from individual laboratories and large-scale sequencing projects. Most submissions are made using the BankIt (web) or Sequin programs and accession numbers are assigned by GenBank staff upon receipt. Data exchange with the EMBL Data Library and the DNA Data Bank of Japan helps ensure comprehensive worldwide coverage. GenBank...

  13. GenBank

    Data.gov (United States)

    U.S. Department of Health & Human Services — GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences. GenBank is designed to provide and encourage access...

  14. Integration of ATAC-seq and RNA-seq identifies human alpha cell and beta cell signature genes.

    Science.gov (United States)

    Ackermann, Amanda M; Wang, Zhiping; Schug, Jonathan; Naji, Ali; Kaestner, Klaus H

    2016-03-01

    Although glucagon-secreting α-cells and insulin-secreting β-cells have opposing functions in regulating plasma glucose levels, the two cell types share a common developmental origin and exhibit overlapping transcriptomes and epigenomes. Notably, destruction of β-cells can stimulate repopulation via transdifferentiation of α-cells, at least in mice, suggesting plasticity between these cell fates. Furthermore, dysfunction of both α- and β-cells contributes to the pathophysiology of type 1 and type 2 diabetes, and β-cell de-differentiation has been proposed to contribute to type 2 diabetes. Our objective was to delineate the molecular properties that maintain islet cell type specification yet allow for cellular plasticity. We hypothesized that correlating cell type-specific transcriptomes with an atlas of open chromatin will identify novel genes and transcriptional regulatory elements such as enhancers involved in α- and β-cell specification and plasticity. We sorted human α- and β-cells and performed the "Assay for Transposase-Accessible Chromatin with high throughput sequencing" (ATAC-seq) and mRNA-seq, followed by integrative analysis to identify cell type-selective gene regulatory regions. We identified numerous transcripts with either α-cell- or β-cell-selective expression and discovered the cell type-selective open chromatin regions that correlate with these gene activation patterns. We confirmed cell type-selective expression on the protein level for two of the top hits from our screen. The "group specific protein" (GC; or vitamin D binding protein) was restricted to α-cells, while CHODL (chondrolectin) immunoreactivity was only present in β-cells. Furthermore, α-cell- and β-cell-selective ATAC-seq peaks were identified to overlap with known binding sites for islet transcription factors, as well as with single nucleotide polymorphisms (SNPs) previously identified as risk loci for type 2 diabetes. We have determined the genetic landscape of

  15. MultiSeq: unifying sequence and structure data for evolutionary analysis

    Directory of Open Access Journals (Sweden)

    Wright Dan

    2006-08-01

    visualization program for analyzing molecular dynamics simulations. Both are freely distributed by the NIH Resource for Macromolecular Modeling and Bioinformatics and MultiSeq is included with VMD starting with version 1.8.5. The MultiSeq website has details on how to download and use the software: http://www.scs.uiuc.edu/~schulten/multiseq/

  16. The GenABEL Project for statistical genomics.

    Science.gov (United States)

    Karssen, Lennart C; van Duijn, Cornelia M; Aulchenko, Yurii S

    2016-01-01

    Development of free/libre open source software is usually done by a community of people with an interest in the tool. For scientific software, however, this is less often the case. Most scientific software is written by only a few authors, often a student working on a thesis. Once the paper describing the tool has been published, the tool is no longer developed further and is left to its own devices. Here we describe the broad, multidisciplinary community we formed around a set of tools for statistical genomics. The GenABEL project for statistical omics actively promotes open interdisciplinary development of statistical methodology and its implementation in efficient and user-friendly software under an open source licence. The software tools developed within the project collectively make up the GenABEL suite, which currently consists of eleven tools. The open framework of the project actively encourages involvement of the community in all stages, from formulation of methodological ideas to application of software to specific data sets. A web forum is used to channel user questions and discussions, further promoting the use of the GenABEL suite. Developer discussions take place on a dedicated mailing list, and development is further supported by robust development practices including use of public version control, code review and continuous integration. Use of this open science model attracts contributions from users and developers outside the "core team", facilitating agile statistical omics methodology development and fast dissemination.

  17. Best-practices guidelines for L2PSA development and applications. Volume 2 - Best practices for the Gen II PWR, Gen II BWR L2PSAs. Extension to Gen III reactors

    International Nuclear Information System (INIS)

    Raimond, E.; Durin, T.; Rahni, N.; Meignen, R.; Cranga, M.; Pichereau, F.; Bentaib, A.; Guigueno, Y.; Loeffler, H.; Mildenberger, O.; Lajtha, G.; Santamaria, C.S.; Dienstbier, J.; Rydl, A.; Holmberg, J.E.; Lindholm, I.; Maennistoe, I.; Pauli, E.M.; Dirksen, G.; Grindon, L.; Peers, K.; Hulqvist, G.; Parozzi, F.; Polidoro, F.; Cazzoli, E.; Vitazkova, J.; Burgazzi, L.; Oury, L.; Ngatchou, C.; Siltanen, S.; Niemela, I.; Routamo, T.; Helstroem, P.; Bassi, C.; Brinkman, H.; Seidel, A.; Schubert, B.; Wohlstein, R.; Guentay, S.; Vincon, L.

    2010-01-01

    The objective of this coordinated action was to develop best practice guidelines for the performance of Level 2 PSA methodologies with a view to harmonisation at EU level and to allow meaningful and practical uncertainty evaluations in a Level 2 PSA. Specific relationships with the community in charge of nuclear reactor safety (utilities, safety authorities, vendors, and research or services companies) have been established in order to define the current needs in terms of guidelines for level 2 PSA development and applications. An international workshop was organised in Hamburg, with the support of VATTENFALL, in November 2008. The level 2 PSA experts from the ASAMPSA2 project partners have proposed some guidelines for the development and application of L2PSA based on their experience and on information available from international cooperation (EC Severe Accident network of Excellence - SARNET, IAEA standards, OECD-NEA publications and workshop) or open literature. The number of technical issues addressed in the guideline is very large and not all are covered with the same relevancy in the first version of the guideline. This version is submitted for external review in November 2010 by severe accident and PSA experts, especially from SARNET and OECD-NEA members. The feedback of the external review will be discussed during an international open workshop planned in March 2011 and all outcomes will be taken into consideration in the final version of this guideline (June 2011). The guideline includes 3 volumes: - Volume 1 - General considerations on L2PSA. - Volume 2 - Technical recommendations for Gen II and III reactors. - Volume 3 - Specific considerations for future reactor (Gen IV). The recommendations formulated in the guideline should not be considered as 'mandatory' but should help the L2PSA developers to achieve high quality studies with limited time and resources. It may also help the L2PSA reviewers by positioning one specific study in comparison with some

  18. Polimorfismo genético, terapia farmacológica e função cardíaca seqüencial em pacientes com insuficiência cardíaca / Genetic polymorphism, medical therapy and sequential cardiac function in patients with heart failure

    Directory of Open Access Journals (Sweden)

    Marco Antonio Romeo Cuoco

    2008-04-01

    Full Text Available BACKGROUND: Functional variants of the angiotensin-converting enzyme (ACE) gene may be associated with response to therapy in patients with heart failure (HF). OBJECTIVE: To test the hypothesis of differences in sequential echocardiographic evaluations of left ventricular ejection fraction in patients with HF under medical therapy, including ACE inhibitors, in relation to the insertion (I) and deletion (D) polymorphism of the ACE gene. METHODS: We studied 168 patients (mean age 43.3±10.1 years), 128 (76.2%) of whom were men, with HF and sequential echocardiograms. The I/D polymorphism was determined by polymerase chain reaction. Left ventricular ejection fraction (LVEF) was analyzed in comparison with the genotypes. More than 90% of the patients were taking ACE inhibitors. RESULTS: There was a significantly greater increase in mean LVEF in patients with the D allele than in patients with the II genotype (p = 0.01) after a mean follow-up of 38.9 months. The D allele was associated with an 8.8% increase in mean LVEF over the same period. In addition, there was a trend toward a "copy number" effect of the D allele on the increase in mean LVEF over time: a difference of 3.5% in LVEF change between patients with the II and ID genotypes (p = 0.03) and of 5% between patients with the II and DD genotypes (p = 0.02). CONCLUSION: The deletion polymorphism of the ACE gene may be associated with response to medical therapy with ACE inhibitors in patients with HF. Further controlled studies may contribute to a better understanding of genetic influences on the response to therapy.

  19. StateGEN/StateNET - A structured method to perform route comparisons

    International Nuclear Information System (INIS)

    Cashwell, J.W.; Erickson, C.M.

    1989-01-01

    StateGEN/StateNET is a modeling structure and routing algorithm designed expressly to address the needs of state and local governments to perform analyses of routing alternatives. StateGEN/StateNET is designed to permit the user to construct a network and assign attributes of interest to the network on a personal computer (PC). The completed network is then transferred via a modem to the TRANSNET system and the preferred route is determined based upon attribute weights assigned by the user. This modeling structure permits the state or local government to perform a routing analysis, such as that required by the US Department of Transportation (DOT) for Highway Route-Controlled Quantity shipments of radioactive materials, with a minimum of resources. StateGEN/StateNET provides a computerized version of the DOT guidelines or allows the user to structure their own network parameters. Sandia National Laboratories (SNL) is the Department of Energy (DOE) lead organization for transportation research and development. The DOE Office of Defense Programs has been the prime sponsor of development of models and associated databases used to analyze the impacts of the transportation of radioactive materials. The routing algorithms used in StateGEN/StateNET were based on the existing models on TRANSNET, a system which was developed to enable outside users to access analytical codes and associated data developed for the DOE

  20. StateGEN/StateNET--A structured method to perform route comparisons

    International Nuclear Information System (INIS)

    Cashwell, J.W.; Erickson, C.M.

    1989-01-01

    StateGEN/StateNET is a modelling structure and routing algorithm designed expressly to address the needs of state and local governments to perform analyses of routing alternatives. StateGEN/StateNET is designed to permit the user to construct a network and assign attributes of interest to the network on a personal computer (PC). The completed network is then transferred via a modem to the TRANSNET system (Cashwell, 1989) and the preferred route is determined based upon attribute weights assigned by the user. This modelling structure permits the state or local government to perform a routing analysis, such as that required by the US Department of Transportation (DOT) for Highway Route-Controlled Quantity shipments of radioactive materials, with a minimum of resources. StateGEN/StateNET provides a computerized version of the DOT guidelines (Cashwell, 1989) or allows the user to structure their own network parameters. Sandia National Laboratories (SNL) is the Department of Energy's (DOE) lead organization for transportation research and development. The DOE Office of Defense Programs has been the prime sponsor of development of models and associated databases used to analyze the impacts of the transportation of radioactive materials. The routing algorithms used in StateGEN/StateNET were based on the existing models on TRANSNET, a system which was developed to enable outside users to access analytical codes and associated data developed for the DOE. 2 refs

  1. SpliceSeq: a resource for analysis and visualization of RNA-Seq data on alternative splicing and its functional impacts.

    Science.gov (United States)

    Ryan, Michael C; Cleland, James; Kim, RyangGuk; Wong, Wing Chung; Weinstein, John N

    2012-09-15

    SpliceSeq is a resource for RNA-Seq data that provides a clear view of alternative splicing and identifies potential functional changes that result from splice variation. It displays intuitive visualizations and prioritized lists of results that highlight splicing events and their biological consequences. SpliceSeq unambiguously aligns reads to gene splice graphs, facilitating accurate analysis of large, complex transcript variants that cannot be adequately represented in other formats. SpliceSeq is freely available at http://bioinformatics.mdanderson.org/main/SpliceSeq:Overview. The application is a Java program that can be launched via a browser or installed locally. Local installation requires MySQL and Bowtie. mryan@insilico.us.com Supplementary data are available at Bioinformatics online.

  2. SeqReporter: automating next-generation sequencing result interpretation and reporting workflow in a clinical laboratory.

    Science.gov (United States)

    Roy, Somak; Durso, Mary Beth; Wald, Abigail; Nikiforov, Yuri E; Nikiforova, Marina N

    2014-01-01

    A wide repertoire of bioinformatics applications exists for next-generation sequencing data analysis; however, certain requirements of the clinical molecular laboratory limit their use: i) comprehensive report generation, ii) compatibility with existing laboratory information systems and computer operating system, iii) knowledgebase development, iv) quality management, and v) data security. SeqReporter is a web-based application developed using ASP.NET framework version 4.0. The client side was designed using HTML5, CSS3, and JavaScript. The server-side processing (VB.NET) relied on interaction with a customized SQL Server 2008 R2 database. Overall, 104 cases (1062 variant calls) were analyzed by SeqReporter. Each variant call was classified into one of five report levels: i) known clinical significance, ii) uncertain clinical significance, iii) pending pathologists' review, iv) synonymous and deep intronic, and v) platform and panel-specific sequence errors. SeqReporter correctly annotated and classified 99.9% (859 of 860) of sequence variants, including 68.7% synonymous single-nucleotide variants, 28.3% nonsynonymous single-nucleotide variants, 1.7% insertions, and 1.3% deletions. One variant of potential clinical significance was re-classified after pathologist review. Laboratory information system-compatible clinical reports were generated automatically. SeqReporter also facilitated quality management activities. SeqReporter is an example of a customized and well-designed informatics solution to optimize and automate the downstream analysis of clinical next-generation sequencing data. We propose it as a model that may inform the development of a comprehensive clinical informatics solution.

  3. Combined Targeted DNA Sequencing in Non-Small Cell Lung Cancer (NSCLC) Using UNCseq and NGScopy, and RNA Sequencing Using UNCqeR for the Detection of Genetic Aberrations in NSCLC.

    Directory of Open Access Journals (Sweden)

    Xiaobei Zhao

    Full Text Available The recent FDA approval of the MiSeqDx platform provides a unique opportunity to develop targeted next generation sequencing (NGS) panels for human disease, including cancer. We have developed a scalable, targeted panel-based assay termed UNCseq, which involves an NGS panel of over 200 cancer-associated genes and a standardized downstream bioinformatics pipeline for detection of single nucleotide variations (SNV) as well as small insertions and deletions (indel). In addition, we developed a novel algorithm, NGScopy, designed for samples with sparse sequencing coverage to detect large-scale copy number variations (CNV), similar to human SNP Array 6.0, as well as small-scale intragenic CNV. Overall, we applied this assay to 100 snap-frozen lung cancer specimens lacking same-patient germline DNA (07-0120 tissue cohort) and validated our results against Sanger sequencing, SNP Array, and our recently published integrated DNA-seq/RNA-seq assay, UNCqeR, where RNA-seq of same-patient tumor specimens confirmed SNV detected by DNA-seq, if RNA-seq coverage depth was adequate. In addition, we applied the UNCseq assay on an independent lung cancer tumor tissue collection with available same-patient germline DNA (11-1115 tissue cohort) and confirmed mutations using assays performed in a CLIA-certified laboratory. We conclude that UNCseq can identify SNV, indel, and CNV in tumor specimens lacking germline DNA in a cost-efficient fashion.

  4. A validated pipeline for detection of SNVs and short InDels from RNA Sequencing

    Directory of Open Access Journals (Sweden)

    Nitin Mandloi

    2017-12-01

    In this study, we have developed a pipeline to detect germline variants from RNA-seq data. The pipeline steps include pre-processing, alignment, GATK best practices for RNA-seq, and variant filtering. The pre-processing step includes base and adapter trimming and removal of contaminating reads from rRNA, tRNA, mitochondrial DNA and repeat regions. The pre-processed reads are aligned using STAR/HiSAT. After this we used GATK best practices for the RNA-seq dataset to call germline variants. We benchmarked our pipeline on NA12878 RNA-seq data downloaded from SRA (SRR1258218). After variant calling, the quality-passed variants were compared against the gold standard variants provided by the GIAB consortium. Of the total ~3.6 million high quality variants reported as gold standard variants for this sample (considering the whole genome), our pipeline identified ~58,104 variants to be expressed in RNA-seq. Our pipeline achieved more than 99% sensitivity in detection of germline variants.
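
    The benchmarking step described above, comparing called variants against a gold standard set, can be sketched in Python as follows. This is only an illustration under stated assumptions: variants are represented as hypothetical (chromosome, position, ref, alt) tuples, matching is exact, and the truth set is assumed to have already been restricted to the regions where a call is possible. The abstract does not describe how the authors performed the comparison.

```python
def sensitivity(called, truth):
    """Fraction of truth-set variants that were also called: TP / (TP + FN)."""
    truth_set = set(truth)
    if not truth_set:
        raise ValueError("truth set must not be empty")
    true_positives = len(truth_set & set(called))
    return true_positives / len(truth_set)


def precision(called, truth):
    """Fraction of called variants that are in the truth set: TP / (TP + FP)."""
    called_set = set(called)
    if not called_set:
        raise ValueError("call set must not be empty")
    true_positives = len(called_set & set(truth))
    return true_positives / len(called_set)


if __name__ == "__main__":
    # Toy variants as (chromosome, position, ref, alt); values are made up.
    truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
             ("chr2", 40, "G", "GA"), ("chr2", 900, "TC", "T")]
    called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
              ("chr2", 40, "G", "GA"), ("chr3", 7, "A", "C")]
    print(sensitivity(called, truth), precision(called, truth))
```

    Reporting precision alongside sensitivity shows whether a high detection rate comes at the cost of extra false-positive calls.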

  5. Local genetic algorithms

    OpenAIRE

    García-Martínez, Carlos; Lozano, Manuel

    2007-01-01

    Local Genetic Algorithms are procedures that iteratively refine given solutions. They differ from classical iterative improvement procedures in their use of genetic operators to perform the refinement. In this study we present a new Binary Local Genetic Algorithm based on a Steady-State Genetic Algorithm. We have compared the Binary Local Genetic Algorithm with other iterative improvement procedures from the literature. The res...

  6. MicroRNA transfection and AGO-bound CLIP-seq data sets reveal distinct determinants of miRNA action

    DEFF Research Database (Denmark)

    Wen, Jiayu; Parker, Brian J; Jacobsen, Anders

    2011-01-01

    the predictive effect of target flanking features. We observe distinct target determinants between expression-based and CLIP-based data. Target flanking features such as flanking region conservation are an important AGO-binding determinant-we hypothesize that CLIP experiments have a preference for strongly bound......Microarray expression analyses following miRNA transfection/inhibition and, more recently, Argonaute cross-linked immunoprecipitation (CLIP)-seq assays have been used to detect miRNA target sites. CLIP and expression approaches measure differing stages of miRNA functioning-initial binding of the mi...... miRNP-target interactions involving adjacent RNA-binding proteins that increase the strength of cross-linking. In contrast, seed-related features are major determinants in expression-based studies, but less so for CLIP-seq studies, and increased miRNA concentrations typical of transfection studies...

  7. Getting the most out of RNA-seq data analysis

    Directory of Open Access Journals (Sweden)

    Tsung Fei Khang

    2015-10-01

    Background. A common research goal in transcriptome projects is to find genes that are differentially expressed in different phenotype classes. Biologists might wish to validate such gene candidates experimentally, or use them for downstream systems biology analysis. Producing a coherent differential gene expression analysis from RNA-seq count data requires an understanding of how numerous sources of variation, such as the replicate size, the hypothesized biological effect size, and the specific method for making differential expression calls, interact. We believe an explicit demonstration of such interactions in real RNA-seq data sets is of practical interest to biologists. Results. Using two large public RNA-seq data sets (one representing a strong, and another a mild, biological effect size), we simulated different replicate size scenarios and tested the performance of several commonly used methods for calling differentially expressed genes in each of them. We found that, when biological effect size was mild, RNA-seq experiments should focus on experimental validation of differentially expressed gene candidates. Importantly, at least triplicates must be used, and the differentially expressed genes should be called using methods with high positive predictive value (PPV), such as NOISeq or GFOLD. In contrast, when biological effect size was strong, differentially expressed genes mined from unreplicated experiments using NOISeq, ASC and GFOLD had between 30% and 50% mean PPV, an increase of more than 30-fold compared to the cases of mild biological effect size. Among methods with good PPV performance, having triplicates or more substantially improved mean PPV to over 90% for GFOLD, 60% for DESeq2, 50% for NOISeq, and 30% for edgeR. At a replicate size of six, we found DESeq2 and edgeR to be reasonable methods for calling differentially expressed genes at systems level analysis, as their PPV and sensitivity trade-off were superior to the other methods.
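
    The comparison above rests on positive predictive value (PPV) and sensitivity. As a minimal sketch, and assuming a reference set of truly differentially expressed genes is available (the abstract does not describe how the reference was built, and the function and gene names below are hypothetical), the two measures can be computed in Python as:

```python
from typing import Iterable, Tuple


def ppv_and_sensitivity(called: Iterable[str],
                        true_de: Iterable[str]) -> Tuple[float, float]:
    """Return (PPV, sensitivity) for a set of differential expression calls.

    PPV         = true positives / all genes called
    sensitivity = true positives / all truly differentially expressed genes
    """
    called_set, true_set = set(called), set(true_de)
    tp = len(called_set & true_set)
    ppv = tp / len(called_set) if called_set else float("nan")
    sens = tp / len(true_set) if true_set else float("nan")
    return ppv, sens


# Hypothetical example: four genes called, two of them in the reference set.
ppv, sens = ppv_and_sensitivity({"g1", "g2", "g3", "g4"}, {"g1", "g2", "g5"})
```

    A method that calls few genes can score a high PPV with low sensitivity, which is why the abstract weighs the two together when recommending methods at a replicate size of six.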

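  8. Characteristics of the cDNA sequence encoding an antiviral gene from tiger prawn, Penaeus monodon (KARAKTERISTIK SEKUEN cDNA PENGKODE GEN ANTI VIRUS DARI UDANG WINDU, Penaeus monodon)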

    Directory of Open Access Journals (Sweden)

    Andi Parenrengi

    2016-11-01

    Transgenic fish technology is a modern technique with great potential for producing organisms with improved characters through recombinant DNA of target genes, including antiviral genes for improving shrimp immunity. PmAV (Penaeus monodon Anti Viral gene) is one of the antiviral genes isolated from crustacean species. This research was conducted to analyze the characteristics of the antiviral gene isolated from tiger prawn, Penaeus monodon. The antiviral gene was isolated using the Polymerase Chain Reaction (PCR) technique and then purified for sequencing. The data obtained were analyzed using Genetyx Version 7 software and the basic local alignment search tool (BLAST). The results showed that the PmAV antiviral gene was successfully isolated from tiger prawn cDNA, with a sequence length of 520 bp encoding 170 amino acids. BLAST-N showed a very high level of similarity (100%) with the antiviral gene in GenBank. The most abundant amino acid in the antiviral gene product was serine (10.00%), while the least abundant were proline and lysine, at 1.76% each. Analysis of the gene sequence and the deduced amino acids (BLAST-P) showed the presence of a C-type lectin-like domain (CTLD) similar to the C-type lectin genes isolated from several crustacean species.
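
    The reported composition figures can be checked with simple arithmetic: for a 170-residue protein, 10.00% corresponds to 17 residues and 1.76% to 3 residues (3/170 = 1.7647%). These residue counts are inferred here from the percentages and are not stated in the abstract. A minimal Python sketch of a composition calculation follows; the function name is hypothetical and the actual PmAV sequence is not reproduced.

```python
from collections import Counter
from typing import Dict


def aa_composition(protein: str) -> Dict[str, float]:
    """Return the percentage of each amino acid in a one-letter protein string."""
    seq = protein.strip().upper()
    if not seq:
        raise ValueError("empty protein sequence")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Consistency check of the reported percentages for a 170-residue protein.
serine_pct = round(100 * 17 / 170, 2)   # 10.0
proline_pct = round(100 * 3 / 170, 2)   # 1.76
```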

  9. Genetic divergence among genotypes of free-range broilers (Divergência genética entre genótipos de frangos tipo caipira)

    Directory of Open Access Journals (Sweden)

    R. C. Veloso

    2015-10-01

    ABSTRACT. The objective of this work was to assess the genetic divergence among seven genotypes of free-range broilers of the Redbro line, using performance traits and multivariate analysis techniques. A total of 840 one-day-old male chicks were used, distributed in a completely randomized design, from the following genotypes: Caboclo, Carijó, Colorpak, Gigante Negro, Pesadão Vermelho, Pescoço Pelado and Tricolor. After data consistency checks, the following variables were evaluated: average daily weight gain, average daily feed intake and feed conversion for the periods 1 to 28, 1 to 56, 1 to 70 and 1 to 84 days of age; and body weight at hatch and at 28, 56, 70 and 84 days of age. Genotype performance was evaluated by multivariate analysis of variance and Fisher's linear discriminant function, using Roy's largest eigenvalue test and Roy's union-intersection test for multiple comparisons. Genetic divergence was studied by canonical variable analysis and by Tocher's optimization method. The Caboclo and Gigante Negro genotypes had canonical means different from those of the other genotypes. The first two canonical variables explained 97.41% of the variation among genotypes. The genetic divergence among the evaluated genotypes allowed the formation of four groups: group 1 - Colorpak; group 2 - Pesadão Vermelho and Pescoço Pelado; group 3 - Carijó and Tricolor; and group 4 - Caboclo and Gigante Negro.

  10. Gen IV Materials Handbook Functionalities and Operation (2B) Handbook Version 2.0

    International Nuclear Information System (INIS)

    Ren, Weiju

    2011-01-01

    This document is prepared for navigation and operation of the Gen IV Materials Handbook, with architecture description and new user access initiation instructions. The development rationale and history of the Handbook are summarized. The major development aspects, architecture, and design principles of the Handbook are briefly introduced to provide an overview of its past evolution and future prospects. Detailed instructions are given with examples for navigating the constructed Handbook components and using the main functionalities. Procedures are provided in a step-by-step fashion for Data Upload Managers to upload reports and data files, as well as for new users to initiate Handbook access.

  11. Gen IV Materials Handbook Functionalities and Operation (4A) Handbook Version 4.0

    Energy Technology Data Exchange (ETDEWEB)

    Ren, Weiju [ORNL

    2013-09-01

    This document is prepared for navigation and operation of the Gen IV Materials Handbook, with architecture description and new user access initiation instructions. The development rationale and history of the Handbook are summarized. The major development aspects, architecture, and design principles of the Handbook are briefly introduced to provide an overview of its past evolution and future prospects. Detailed instructions are given with examples for navigating the constructed Handbook components and using the main functionalities. Procedures are provided in a step-by-step fashion for Data Upload Managers to upload reports and data files, as well as for new users to initiate Handbook access.

  12. Gen IV Materials Handbook Functionalities and Operation (2B) Handbook Version 2.0

    Energy Technology Data Exchange (ETDEWEB)

    Ren, Weiju [ORNL

    2011-08-01

    This document is prepared for navigation and operation of the Gen IV Materials Handbook, with architecture description and new user access initiation instructions. The development rationale and history of the Handbook are summarized. The major development aspects, architecture, and design principles of the Handbook are briefly introduced to provide an overview of its past evolution and future prospects. Detailed instructions are given with examples for navigating the constructed Handbook components and using the main functionalities. Procedures are provided in a step-by-step fashion for Data Upload Managers to upload reports and data files, as well as for new users to initiate Handbook access.

  13. KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses.

    Science.gov (United States)

    Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong; Jang, Jinho; Jun, JeHoon; Cho, Yun Sung; Kim, Hak-Min; Kim, Hyunho; Kim, Yumi; Chung, OkSung; Kim, Chang Geun; Lee, HyeJin; Kim, Byung Chul; Han, Kyudong; Koh, InSong; Chae, Kyun Shik; Lee, Semin; Edwards, Jeremy S; Bhak, Jong

    2018-04-04

    High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and provides a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variations.

  14. QTL-seq for rapid identification of candidate genes for flowering time in broccoli × cabbage.

    Science.gov (United States)

    Shu, Jinshuai; Liu, Yumei; Zhang, Lili; Li, Zhansheng; Fang, Zhiyuan; Yang, Limei; Zhuang, Mu; Zhang, Yangyong; Lv, Honghao

    2018-04-01

    A major QTL controlling early flowering in broccoli × cabbage was identified by marker analysis and next-generation sequencing, corresponding to the GRF6 gene conditioning flowering time in Arabidopsis. Flowering is an important agronomic trait for hybrid production in broccoli and cabbage, but the genetic mechanism underlying this process is unknown. In this study, segregation analysis with BC1P1, BC1P2, F2, and F2:3 populations derived from a cross between two inbred lines, "195" (late flowering) and "93219" (early flowering), suggested that flowering time is a quantitative trait. Next, employing a next-generation sequencing-based whole-genome QTL-seq strategy, we identified a major genomic region harboring a robust flowering time QTL using an F2 mapping population, designated Ef2.1 on cabbage chromosome 2 for early flowering. Ef2.1 was further validated by indel (insertion or deletion) marker-based classical QTL mapping, explaining 51.5% (LOD = 37.67) and 54.0% (LOD = 40.5) of the phenotypic variation in the F2 and F2:3 populations, respectively. Combined QTL-seq and classical QTL analysis narrowed down Ef2.1 to a 228-kb genomic region containing 29 genes. A cabbage gene, Bol024659, was identified in this region, which is a homolog of GRF6, a major gene regulating flowering in Arabidopsis, and was designated BolGRF6. A qRT-PCR study of the expression level of BolGRF6 revealed significantly higher expression in the early flowering genotypes. Taken together, our results provide support for BolGRF6 as a possible candidate gene for early flowering in the broccoli line 93219. The identified candidate genomic regions and genes may be useful for molecular breeding to improve broccoli and cabbage flowering times.

  15. Raça, genética & hipertensão: nova genética ou velha eugenia? Race, genetics, and hypertension: new genetics or old eugenics?

    Directory of Open Access Journals (Sweden)

    Laguardia Josué

    2005-08-01

    Statistics on the health conditions of human groups, classified according to a given racial category, are used to support scientific arguments linking a difference in phenotype to a biological essence of race. Epidemiological studies on high blood pressure illustrate the strength that genetic hypotheses can have in assigning a causative role to race. Taking these genetic explanations of the etiology of hypertension, I seek to identify the etiological presuppositions grounding the arguments that racialize this pathology, the alternative hypotheses found in the scientific literature, and the ethical aspects implicit in such studies.

  16. GenBank

    OpenAIRE

    Benson, Dennis A.; Cavanaugh, Mark; Clark, Karen; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Sayers, Eric W.

    2012-01-01

    GenBank® (http://www.ncbi.nlm.nih.gov) is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assig...

  17. PRAPI: post-transcriptional regulation analysis pipeline for Iso-Seq.

    Science.gov (United States)

    Gao, Yubang; Wang, Huiyuan; Zhang, Hangxiao; Wang, Yongsheng; Chen, Jinfeng; Gu, Lianfeng

    2018-05-01

    Single-molecule real-time (SMRT) isoform sequencing (Iso-Seq) based on the Pacific Biosciences (PacBio) platform has received increasing attention for its ability to explore full-length isoforms. Thus, comprehensive tools for Iso-Seq bioinformatics analysis are extremely useful. Here, we present a one-stop solution for Iso-Seq analysis, called PRAPI, to comprehensively analyze alternative transcription initiation (ATI), alternative splicing (AS), alternative cleavage and polyadenylation (APA), natural antisense transcripts (NAT), and circular RNAs (circRNAs). PRAPI is capable of combining Iso-Seq full-length isoforms with short read data, such as RNA-Seq or polyadenylation site sequencing (PAS-seq), for differential expression analysis of NAT, AS, APA and circRNAs. Furthermore, PRAPI can annotate new genes and correct mis-annotated genes when gene annotation is available. Finally, PRAPI generates high-quality vector graphics to visualize and highlight the Iso-Seq results. The Dockerfile of PRAPI is available at http://www.bioinfor.org/tool/PRAPI.

  18. GenBank

    OpenAIRE

    Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Sayers, Eric W.

    2008-01-01

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 300 000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank® staff upon receipt. Daily data exchange with the European Molecular Biology Labo...

  19. Phylogenetic study of Class Armophorea (Alveolata, Ciliophora) based on 18S-rDNA data

    Directory of Open Access Journals (Sweden)

    Thiago da Silva Paiva

    2013-01-01

    The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195) retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. The sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctotheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree.
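
    The sensitivity analysis above varies gap-opening and gap-extension parameters. The following Python sketch illustrates what such a parameter sweep involves; the penalty values are hypothetical, the abstract does not say how its 76 data sets were derived, and the affine cost convention used here (opening penalty charged once, extension charged for each additional position) is an assumption, since alignment programs differ on this point.

```python
from itertools import product
from typing import List, Sequence, Tuple


def affine_gap_cost(length: int, gap_open: float, gap_extend: float) -> float:
    """Cost of one gap of `length` positions under an affine penalty.

    Assumed convention: gap_open for the first position, gap_extend for
    each of the remaining (length - 1) positions.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + gap_extend * (length - 1)


def parameter_grid(opens: Sequence[float],
                   extends: Sequence[float]) -> List[Tuple[float, float]]:
    """All (gap_open, gap_extend) combinations to align and analyse in turn."""
    return list(product(opens, extends))


# Hypothetical sweep: each combination would yield one alignment (one data set).
grid = parameter_grid([5.0, 10.0, 15.0], [0.1, 0.5])
```

    Under this convention a low extension penalty relative to the opening penalty favours fewer, longer indels, which is one way alignment parameters can change the inferred tree topology.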

  20. [Association analysis of SNP-63 and indel-19 variant in the calpain-10 gene with polycystic ovary syndrome in women of reproductive age].

    Science.gov (United States)

    Flores-Martínez, Silvia Esperanza; Castro-Martínez, Anna Gabriela; López-Quintero, Andrés; García-Zapién, Alejandra Guadalupe; Torres-Rodríguez, Ruth Noemí; Sánchez-Corona, José

    2015-01-01

    Polycystic ovary syndrome is a complex and heterogeneous disease involving both reproductive and metabolic problems. A genetic predisposition has been suggested in the etiology of this syndrome. The identification of the calpain-10 gene (CAPN10) as the first candidate gene for type 2 diabetes mellitus has focused interest on investigating its possible relation with polycystic ovary syndrome, because this syndrome is associated with hyperinsulinemia and insulin resistance, two metabolic abnormalities associated with type 2 diabetes mellitus. The objective was to investigate whether there is an association between SNP-63 and the indel-19 variant of the CAPN10 gene and polycystic ovary syndrome in women of reproductive age. This study included 101 women (55 with polycystic ovary syndrome and 46 without polycystic ovary syndrome). The indel-19 variant was identified by electrophoresis of PCR-amplified fragments, and SNP-63 by PCR-RFLP. The allele and genotype frequencies of the two variants did not differ significantly between women with polycystic ovary syndrome and the control group. Haplotype 21 (defined by the insertion allele of the indel-19 variant and the C allele of SNP-63) was found with higher frequency in both study groups and was more frequent in the polycystic ovary syndrome group; however, this difference was not statistically significant (p = 0.8353). The results suggest that SNP-63 and the indel-19 variant of the CAPN10 gene do not represent a risk factor for polycystic ovary syndrome in our patient group. Copyright © 2015. Published by Masson Doyma México S.A.
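
    The abstract compares allele frequencies between cases and controls but reports neither the counts nor the test used. As an illustration only, a Pearson chi-square test on a 2 × 2 table of allele counts can be sketched in Python as below; the counts in the example are hypothetical and are not data from this study.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for the table

                  allele 1   allele 2
        cases        a          b
        controls     c          d

    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts, for illustration only.
chi2, p = chi_square_2x2(20, 10, 10, 20)
```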

  1. GenBank

    OpenAIRE

    Benson, Dennis A.; Karsch-Mizrachi, Ilene; Lipman, David J.; Ostell, James; Wheeler, David L.

    2006-01-01

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan...

  2. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline [version 2; referees: 5 approved]

    Directory of Open Access Journals (Sweden)

    Yunshun Chen

    2016-08-01

    In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package, and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.

  3. An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Zichen Wang

    2016-07-01

    RNA-seq analysis is becoming a standard method for global gene expression profiling. However, running open and standard RNA-seq analysis pipelines remains challenging for non-experts because of the large size of the raw data files and the hardware requirements of the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that, when knocked out in mice, induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns of mothers infected with the virus during pregnancy. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.

  4. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor [version 2; referees: 1 approved, 4 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Aaron T.L. Lun

    2016-10-01

    Single-cell RNA sequencing (scRNA-seq) is widely used to profile the transcriptome of individual cells. This provides biological resolution that cannot be matched by bulk RNA sequencing, at the cost of increased technical noise and data complexity. The differences between scRNA-seq and bulk RNA-seq data mean that the analysis of the former cannot be performed by recycling bioinformatics pipelines for the latter. Rather, dedicated single-cell methods are required at various steps to exploit the cellular resolution while accounting for technical noise. This article describes a computational workflow for low-level analyses of scRNA-seq data, based primarily on software packages from the open-source Bioconductor project. It covers basic steps including quality control, data exploration and normalization, as well as more complex procedures such as cell cycle phase assignment, identification of highly variable and correlated genes, clustering into subpopulations and marker gene detection. Analyses were demonstrated on gene-level count data from several publicly available datasets involving haematopoietic stem cells, brain-derived cells, T-helper cells and mouse embryonic stem cells. This will provide a range of usage scenarios from which readers can construct their own analysis pipelines.

  5. Making the most of RNA-seq: Pre-processing sequencing data with Opossum for reliable SNP variant detection [version 2; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Laura Oikkonen

    2017-03-01

    Identifying variants from RNA-seq (transcriptome sequencing) data is a cost-effective and versatile complement to whole-exome (WES) and whole-genome sequencing (WGS) analysis. RNA-seq is primarily considered a method of gene expression analysis, but it can also be used to detect DNA variants in expressed regions of the genome. However, current variant callers do not generally behave well with RNA-seq data due to reads encompassing intronic regions. We have developed a software programme called Opossum to address this problem. Opossum pre-processes RNA-seq reads prior to variant calling, and although it has been designed to work specifically with Platypus, it can be used equally well with other variant callers such as GATK HaplotypeCaller. In this work, we show that using Opossum in conjunction with either Platypus or GATK HaplotypeCaller maintains precision and improves the sensitivity for SNP detection compared to the GATK Best Practices pipeline. In addition, using it in combination with Platypus offers a substantial reduction in run times compared to the GATK pipeline, so it is ideal when only limited time or computational resources are available.

  6. The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data.

    Science.gov (United States)

    Ambrosini, Giovanna; Dreos, René; Kumar, Sunil; Bucher, Philipp

    2016-11-18

    ChIP-seq and related high-throughput chromatin profiling assays generate ever increasing volumes of highly valuable biological data. To make sense of it, biologists need versatile, efficient and user-friendly tools for access, visualization and integrative analysis of such data. Here we present the ChIP-Seq command line tools and web server, implementing basic algorithms for ChIP-seq data analysis starting with a read alignment file. The tools are optimized for memory efficiency and speed, thus allowing for processing of large data volumes on inexpensive hardware. The web interface provides access to a large database of public data. The ChIP-Seq tools have a modular and interoperable design in that the output from one application can serve as input to another one. Complex and innovative tasks can thus be achieved by running several tools in a cascade. The various ChIP-Seq command line tools and web services either complement or compare favorably to related bioinformatics resources in terms of computational efficiency, ease of access to public data and interoperability with other web-based tools. The ChIP-Seq server is accessible at http://ccg.vital-it.ch/chipseq/ .

  7. RNA-Seq-Based Transcript Structure Analysis with TrBorderExt.

    Science.gov (United States)

    Wang, Yejun; Sun, Ming-An; White, Aaron P

    2018-01-01

    RNA-Seq has become a routine strategy for genome-wide gene expression comparisons in bacteria. Despite lower resolution in transcript border parsing compared with dRNA-Seq, TSS-EMOTE, Cappable-seq, Term-seq, and others, directional RNA-Seq still illustrates its advantages: low cost, quantification and transcript border analysis with a medium resolution (±10-20 nt). To facilitate mining of directional RNA-Seq datasets especially with respect to transcript structure analysis, we developed a tool, TrBorderExt, which can parse transcript start sites and termination sites accurately in bacteria. A detailed protocol is described in this chapter for how to use the software package step by step to identify bacterial transcript borders from raw RNA-Seq data. The package was developed with Perl and R programming languages, and is accessible freely through the website: http://www.szu-bioinf.org/TrBorderExt .

  8. Seqüência de Robin: protocolo único de tratamento Robin sequence: a single treatment protocol

    Directory of Open Access Journals (Sweden)

    Ilza L. Marques

    2005-02-01

    OBJECTIVE: To present a single protocol that might cover both the respiratory and feeding difficulties of neonates and infants with Robin sequence. SOURCES OF DATA: The article was prepared on the basis of the most recent publications available in bibliographic databases and in books that discuss the treatment of Robin sequence, especially the studies conducted at the Hospital for Rehabilitation of Craniofacial Anomalies of Universidade de São Paulo (HRAC/USP). SUMMARY OF THE FINDINGS: We present the morphological and genetic aspects of Robin sequence and concepts about nasopharyngoscopy and its clinical implications, discuss the treatment of respiratory and feeding difficulties, and present a single protocol to cover all cases of Robin sequence, regardless of their severity and complexity. CONCLUSIONS: Robin sequence is not merely an anatomical obstructive pathology to be resolved with surgical procedures. Knowledge about growth and development should be applied by a multidisciplinary team, because it enables rapid recovery of airway patency and of oral feeding capacity, often avoiding surgical procedures and their risks, especially when these are performed on neonates and small infants.

  9. Ancestry informative markers: inference of ancestry in aged bone samples using an autosomal AIM-Indel multiplex.

    Science.gov (United States)

    Romanini, Carola; Romero, Magdalena; Salado Puerto, Mercedes; Catelli, Laura; Phillips, Christopher; Pereira, Rui; Gusmão, Leonor; Vullo, Carlos

    2015-05-01

    Ancestry informative markers (AIMs) can be useful to infer ancestry proportions of the donors of forensic evidence. The probability of success typing degraded samples, such as human skeletal remains, is strongly influenced by the DNA fragment lengths that can be amplified and the presence of PCR inhibitors. Several AIM panels are available amongst the many forensic marker sets developed for genotyping degraded DNA. Using a 46 AIM Insertion Deletion (Indel) multiplex, we analyzed human skeletal remains of post mortem time ranging from 35 to 60 years from four different continents (Sub-Saharan Africa, South and Central America, East Asia and Europe) to ascertain the genetic ancestry components. Samples belonging to non-admixed individuals could be assigned to their corresponding continental group. For the remaining samples with admixed ancestry, it was possible to estimate the proportion of co-ancestry components from the four reference population groups. The 46 AIM Indel set was informative enough to efficiently estimate the proportion of ancestry even in samples yielding partial profiles, a frequent occurrence when analyzing inhibited and/or degraded DNA extracts. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  10. Divergência, variabilidade genética e desempenho agronômico em genótipos de couve | Divergence, genetic variability, and agronomic performance in kale genotypes

    OpenAIRE

    Azevedo, Alcinei Mistico

    2012-01-01

    Although kale shows great genetic variability, few studies in Brazil aim to obtain information for breeding programs in this crop. The objective of this work was therefore to characterize 30 kale genotypes on the basis of morpho-agronomic traits in order to estimate genetic divergence, the importance of the traits for divergence, agronomic performance, genetic parameters, and the correlation among the evaluated traits. The experiment was conducted at...

  11. Market share scenarios for Gen-III and Gen-IV reactors in Europe

    International Nuclear Information System (INIS)

    Roelofs, F.; Heek, A. V.; Durpel, L. V. D.

    2008-01-01

    Nuclear energy is back on the agenda worldwide as a means of meeting growing energy demand, especially the growth in electricity demand. Many objectives point towards an increased use of nuclear energy, such as minimising energy costs and reducing climate change effects. In the light of the potential renewed growth of nuclear energy, the public demands a clear view on what nuclear energy may contribute towards meeting these objectives and especially how nuclear energy may address some socio-political obstructions with respect to economics, radioactive waste, safety and proliferation of fissile materials. To address these questions, the future nuclear reactor park mix in Europe has been analysed applying an integrated dynamic process modelling technique. Various market share scenarios for nuclear energy are derived, including sub-variants with regard to the intra-nuclear options. In the analyses, it is assumed that different types of new reactors may be built, taking into account the introduction date of the considered Gen-III (e.g. EPR) and Gen-IV (e.g. SCWR, HTR, FR) reactors, and the economic evaluation of the complete fuel cycle. The assessment was undertaken using the DANESS code (Dynamic Analysis of Nuclear Energy System Strategies). The analyses show that, given the considered realistic nuclear energy demand and a limited number of available Gen-III and Gen-IV reactor types, the future European nuclear park will consist of combinations of Gen-III and Gen-IV reactors. This mix will always consist of a set of reactor types, each having its specific strengths. The analyses also highlight the triggers influencing the choice between different nuclear energy deployment scenarios. (authors)

  12. InDel polymorphisms in quantitative posttransplant chimerism evaluation

    Directory of Open Access Journals (Sweden)

    I. M. Barkhatov

    2016-01-01

    Reduction of minimal residual disease to undetectable levels is the key criterion for efficiency of allogeneic hematopoietic stem cell transplantation (alloHSCT), along with engraftment of transplanted cells with complete replacement of recipient hematopoiesis, i.e., full posttransplant chimerism. Among different approaches, molecular genetic techniques are preferable, being based on the analysis of highly polymorphic DNA sequences (short tandem repeats, STRs). However, this approach, despite its high specificity, has limited sensitivity. It therefore seems appropriate to introduce more sensitive diagnostic solutions, in particular analysis of insertion/deletion (InDel) polymorphisms followed by real-time detection of PCR products. The data obtained from analysis of several genetic markers have shown higher sensitivity of this method. However, deviations in the range of 10 to 90% in the evaluation of cell ratios indicate that this approach is feasible only for evaluating residual populations of recipient cells.

  13. COBRA-Seq: Sensitive and Quantitative Methylome Profiling

    Directory of Open Access Journals (Sweden)

    Hilal Varinli

    2015-10-01

    Combined Bisulfite Restriction Analysis (COBRA) quantifies DNA methylation at a specific locus. It does so via digestion of PCR amplicons produced from bisulfite-treated DNA, using a restriction enzyme that contains a cytosine within its recognition sequence, such as TaqI. Here, we introduce COBRA-seq, a genome-wide reduced methylome method that requires minimal DNA input (0.1–1.0 μg) and can use either PCR or linear amplification to amplify the sequencing library. Variants of COBRA-seq can be used to explore CpG-depleted as well as CpG-rich regions in vertebrate DNA. The choice of enzyme influences enrichment for specific genomic features, such as CpG-rich promoters and CpG islands, or enrichment for less CpG-dense regions such as enhancers. COBRA-seq coupled with linear amplification has the additional advantage of reduced PCR bias by producing full-length fragments at high abundance. Unlike other reduced representation methylome methods, COBRA-seq has great flexibility in the choice of enzyme and can be multiplexed and tuned to reduce sequencing costs and to interrogate different numbers of sites. Moreover, COBRA-seq is applicable to non-model organisms without a reference genome and is compatible with the investigation of non-CpG methylation by using restriction enzymes containing CpA, CpT, and CpC in their recognition site.

  14. InGen Inconsistencies: The "Dinosaurs" Of Jurassic Park May Not Be What The Corporation Claims

    Science.gov (United States)

    Haupt, R. J.; Traer, M. M.

    2017-12-01

    biotech companies have been able to do at present. These inconsistencies suggest that the actual technological assets developed by InGen might be slightly, or even significantly, different than what their promotional materials claim, and that paleontologists should be wary of using these animals as study organisms to test paleontological hypotheses.

  15. Functional Analysis of In-frame Indel ARID1A Mutations Reveals New Regulatory Mechanisms of Its Tumor Suppressor Functions

    Directory of Open Access Journals (Sweden)

    Bin Guan

    2012-10-01

    AT-rich interactive domain 1A (ARID1A) has emerged as a new tumor suppressor in which frequent somatic mutations have been identified in several types of human cancers. Although most ARID1A somatic mutations are frame-shift or nonsense mutations that contribute to mRNA decay and loss of protein expression, 5% of ARID1A mutations are in-frame insertions or deletions (indels) that involve only a small stretch of peptides. Naturally occurring in-frame indel mutations provide unique and useful models to explore the biology and regulatory role of ARID1A. In this study, we analyzed indel mutations identified in gynecological cancers to determine how these mutations affect the tumor suppressor function of ARID1A. Our results demonstrate that all in-frame mutants analyzed lost their ability to inhibit cellular proliferation or activate transcription of CDKN1A, which encodes p21, a downstream effector of ARID1A. We also showed that ARID1A is a nucleocytoplasmic protein whose stability depends on its subcellular localization. Nuclear ARID1A is less stable than cytoplasmic ARID1A because ARID1A is rapidly degraded by the ubiquitin-proteasome system in the nucleus. In-frame deletions affecting the consensus nuclear export signal reduce steady-state protein levels of ARID1A. This defect in nuclear exportation leads to nuclear retention and subsequent degradation. Our findings delineate a mechanism underlying the regulation of ARID1A subcellular distribution and protein stability and suggest that targeting the nuclear ubiquitin-proteasome system can increase the amount of the ARID1A protein in the nucleus and restore its tumor suppressor functions.

  16. From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Yunshun Chen

    2016-06-01

    In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification are conducted using the Rsubread package, and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.

  17. Making the most of RNA-seq: Pre-processing sequencing data with Opossum for reliable SNP variant detection [version 1; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Laura Oikkonen

    2017-01-01

    Identifying variants from RNA-seq (transcriptome sequencing) data is a cost-effective and versatile alternative to whole-genome sequencing. However, current variant callers do not generally behave well with RNA-seq data because reads span intronic regions. We have developed a software programme called Opossum to address this problem. Opossum pre-processes RNA-seq reads prior to variant calling, and although it has been designed to work specifically with Platypus, it can be used equally well with other variant callers such as GATK HaplotypeCaller. In this work, we show that using Opossum in conjunction with either Platypus or GATK HaplotypeCaller maintains precision and improves the sensitivity for SNP detection compared to the GATK Best Practices pipeline. In addition, using it in combination with Platypus offers a substantial reduction in run times compared to the GATK pipeline, so it is ideal when only limited time or computational resources are available.

  18. Indel Group in Genomes (IGG) Molecular Genetic Markers

    Science.gov (United States)

    Burkart-Waco, Diana; Kuppu, Sundaram; Britt, Anne; Chetelat, Roger

    2016-01-01

    Genetic markers are essential when developing or working with genetically variable populations. Indel Group in Genomes (IGG) markers are primer pairs that amplify single-locus sequences that differ in size for two or more alleles. They are attractive for their ease of use for rapid genotyping and their codominant nature. Here, we describe a heuristic algorithm that uses a k-mer-based approach to search two or more genome sequences to locate polymorphic regions suitable for designing candidate IGG marker primers. As input to the IGG pipeline software, the user provides genome sequences and the desired amplicon sizes and size differences. Primer sequences flanking polymorphic insertions/deletions are produced as output. IGG marker files for three sets of genomes, Solanum lycopersicum/Solanum pennellii, Arabidopsis (Arabidopsis thaliana) Columbia-0/Landsberg erecta-0 accessions, and S. lycopersicum/S. pennellii/Solanum tuberosum (three-way polymorphic) are included. PMID:27436831

  19. MiSeq: A Next Generation Sequencing Platform for Genomic Analysis.

    Science.gov (United States)

    Ravi, Rupesh Kanchi; Walton, Kendra; Khosroheidari, Mahdieh

    2018-01-01

    MiSeq, Illumina's integrated next generation sequencing instrument, uses reversible-terminator sequencing-by-synthesis technology to provide end-to-end sequencing solutions. The MiSeq instrument is one of the smallest benchtop sequencers that can perform onboard cluster generation, amplification, genomic DNA sequencing, and data analysis, including base calling, alignment and variant calling, in a single run. It performs both single- and paired-end runs with adjustable read lengths from 1 × 36 base pairs to 2 × 300 base pairs. A single run can produce output data of up to 15 Gb in as little as 4 h of runtime and can output up to 25 M single reads and 50 M paired-end reads. Thus, MiSeq provides an ideal platform for rapid turnaround time. MiSeq is also a cost-effective tool for various analyses focused on targeted gene sequencing (amplicon sequencing and target enrichment), metagenomics, and gene expression studies. For these reasons, MiSeq has become one of the most widely used next generation sequencing platforms. Here, we provide a protocol to prepare libraries for sequencing using the MiSeq instrument and basic guidelines for analysis of output data from the MiSeq sequencing run.
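    The quoted maximum output can be checked with simple arithmetic. The sketch below is only an illustration: it assumes that every read reaches full length and that 1 Gb means 10^9 bases, and the function name `run_yield_gb` is hypothetical rather than part of any Illumina software.

```python
def run_yield_gb(clusters, read_length_bp, paired_end=True):
    """Approximate sequencing yield in gigabases (1 Gb = 1e9 bases).

    Assumes every cluster yields full-length reads; real runs lose
    some reads to quality filtering, so this is an upper bound.
    """
    reads_per_cluster = 2 if paired_end else 1
    return clusters * reads_per_cluster * read_length_bp / 1e9


# 25 M clusters read from both ends (50 M paired-end reads) at 2 x 300 bp
max_yield = run_yield_gb(25_000_000, 300, paired_end=True)
print(f"{max_yield:.1f} Gb")
```

    Under these assumptions, 25 M clusters read in a 2 × 300 bp run give the 15 Gb figure stated in the abstract.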

  20. Causal null hypotheses of sustained treatment strategies: What can be tested with an instrumental variable?

    Science.gov (United States)

    Swanson, Sonja A; Labrecque, Jeremy; Hernán, Miguel A

    2018-05-02

    Sometimes instrumental variable methods are used to test whether a causal effect is null rather than to estimate the magnitude of a causal effect. However, when instrumental variable methods are applied to time-varying exposures, as in many Mendelian randomization studies, it is unclear what causal null hypothesis is tested. Here, we consider different versions of causal null hypotheses for time-varying exposures, show that the instrumental variable conditions alone are insufficient to test some of them, and describe additional assumptions that can be made to test a wider range of causal null hypotheses, including both sharp and average causal null hypotheses. Implications for interpretation and reporting of instrumental variable results are discussed.

  1. GenLab, Laboratorio Virtual de Genética | GenLab, a virtual genetics laboratory

    Directory of Open Access Journals (Sweden)

    Fidel Ramírez

    2000-07-01

    GenLab is the name of the software we designed, which models the meiotic process and fertilization in diploid organisms. The aim of the application is to illustrate the outcome of a given cross as realistically as possible. Sexual reproduction is modeled internally, and GenLab simply presents the results according to the number of offspring selected for a specific cross; a large number of traits can therefore be chosen for the parents, and their frequency in the offspring can be studied. The model includes a database that stores some Drosophila melanogaster loci together with their positions in centimorgans. The purpose of the model is to serve as a pedagogical and didactic tool in both universities and schools, facilitating the learning of some basic principles of genetics; it can be used with an Internet connection and a browser by visiting http://biologia.unal.edu.co/fidel.

  2. Genetic Diversity and Population Structure in Native Chicken Populations from Myanmar, Thailand and Laos by Using 102 Indels Markers

    Directory of Open Access Journals (Sweden)

    A. A. Maw

    2015-01-01

    The genetic diversity of native chicken populations from Myanmar, Thailand, and Laos was examined by using 102 insertion and/or deletion (indel) markers. Most of the indel loci were polymorphic (71% to 96%), and the genetic variability was similar in all populations. The average observed heterozygosities (HO) and expected heterozygosities (HE) ranged from 0.205 to 0.263 and 0.239 to 0.381, respectively. The coefficient of genetic differentiation (Gst) for all cumulated populations was 0.125, and the Thai native chickens showed higher Gst (0.088) than the Myanmar (0.041) and Laotian (0.024) populations. The pairwise Fst distances ranged from 0.144 to 0.308 among populations. A neighbor-joining (NJ) tree, using Nei’s genetic distance, revealed that Thai and Laotian native chicken populations were genetically close, while Myanmar native chickens were distant from the others. The native chickens from these three countries were thought to be descended from three different origins (K = 3) from STRUCTURE analysis. Genetic admixture was observed in Thai and Laotian native chickens, while admixture was absent in Myanmar native chickens.
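    As a small illustration of the two diversity statistics reported above, the following sketch computes observed and expected heterozygosity for a single locus. It assumes the plain (uncorrected) estimator HE = 1 − Σ p_i², which the abstract does not specify; the function name and the toy genotypes ("I" for insertion, "D" for deletion) are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (HO, HE) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    HO is the fraction of heterozygous individuals.
    HE is 1 - sum(p_i^2) over allele frequencies p_i (no small-sample correction).
    """
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    he = 1.0 - sum((c / total_alleles) ** 2 for c in counts.values())
    return ho, he


# Toy indel locus: four birds, two of them heterozygous
ho, he = heterozygosity([("I", "I"), ("I", "D"), ("D", "D"), ("I", "D")])
print(ho, he)
```

    Averaging such per-locus values over all markers gives population-level figures of the kind quoted in the abstract.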

  3. Characterizing and annotating the genome using RNA-seq data.

    Science.gov (United States)

    Chen, Geng; Shi, Tieliu; Shi, Leming

    2017-02-01

    Bioinformatics methods for various RNA-seq data analyses are evolving rapidly with the improvement of sequencing technologies. However, many challenges still exist in how to efficiently process RNA-seq data to obtain accurate and comprehensive results. Here we review the strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and to the transcriptome are two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases when conducting transcriptomics and genetics studies can improve the results of the corresponding analyses.

  4. Seleção de genótipos parentais de acerola com base na divergência genética multivariada | Selection of acerola parental genotypes based on multivariate genetic divergence

    Directory of Open Access Journals (Sweden)

    CARPENTIERI-PÍPOLO VALÉRIA

    2000-01-01

    This work aimed to identify and select parental genotypes of acerola (Malpighia emarginata L.) suitable for breeding programs. Nine quantitative traits of greatest agronomic importance were used to determine genetic distance and to form groups of similar accessions. Clustering by Tocher's method, based on generalized Mahalanobis distances, divided the 14 genotypes into three groups. Based on genetic divergence and the key agronomic trait (vitamin C content), the most promising crosses were those of genotype AM Mole, belonging to group III, with genotypes PR AM, N° 18, PR 17, PR 16, Eclipse, AM 22, and Dominga, all belonging to group I.

  5. GenNon-h: Generating multiple sequence alignments on nonhomogeneous phylogenetic trees

    Directory of Open Access Journals (Sweden)

    Kedzierska Anna M

    2012-08-01

    Background A number of software packages are available to generate DNA multiple sequence alignments (MSAs) evolved under continuous-time Markov processes on phylogenetic trees. On the other hand, methods of simulating a DNA MSA directly from the transition matrices do not exist. Moreover, existing software is restricted to time-reversible models and is not optimized to generate nonhomogeneous data (i.e. placing distinct substitution rates at different lineages). Results We present the first package designed to generate MSAs evolving under discrete-time Markov processes on phylogenetic trees, directly from probability substitution matrices. Based on the input model and a phylogenetic tree in the Newick format (with branch lengths measured as the expected number of substitutions per site), the algorithm produces DNA alignments of the desired length. GenNon-h is publicly available for download. Conclusion The software presented here is an efficient tool to generate DNA MSAs on a given phylogenetic tree. GenNon-h provides the user with nonstationary or nonhomogeneous phylogenetic data that are well suited for testing complex biological hypotheses and for exploring the limits of reconstruction algorithms and their robustness to such models.
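    The idea of generating an alignment directly from per-branch substitution matrices can be sketched in a few lines. This is not GenNon-h's code or interface: the tree is given as a nested dictionary rather than Newick, the matrices are supplied by hand instead of being derived from branch lengths, and all names (`simulate_alignment`, `uniform_change`) are hypothetical.

```python
import random

BASES = "ACGT"


def sample(probs, rng):
    """Draw a base index 0..3 from a probability vector."""
    return rng.choices(range(4), weights=probs, k=1)[0]


def evolve(node, parent_states, rng, alignment):
    """Propagate states down the branch leading to `node`, then recurse.

    node["matrix"] is a 4x4 row-stochastic matrix for that branch;
    each branch may carry a different matrix (nonhomogeneous process).
    """
    states = [sample(node["matrix"][s], rng) for s in parent_states]
    children = node.get("children", [])
    if not children:  # leaf: record its sequence in the alignment
        alignment[node["name"]] = "".join(BASES[s] for s in states)
    for child in children:
        evolve(child, states, rng, alignment)


def simulate_alignment(tree, root_dist, length, seed=0):
    """Return {leaf name: sequence} for a rooted tree of nested dicts."""
    rng = random.Random(seed)
    root_states = [sample(root_dist, rng) for _ in range(length)]
    alignment = {}
    for child in tree["children"]:
        evolve(child, root_states, rng, alignment)
    return alignment


def uniform_change(p):
    """Toy matrix: stay with probability 1 - p, else move to another base."""
    return [[1 - p if i == j else p / 3 for j in range(4)] for i in range(4)]


tree = {"name": "root", "children": [
    {"name": "A", "matrix": uniform_change(0.05)},
    {"name": "inner", "matrix": uniform_change(0.10), "children": [
        {"name": "B", "matrix": uniform_change(0.30)},
        {"name": "C", "matrix": uniform_change(0.02)},
    ]},
]}
msa = simulate_alignment(tree, root_dist=[0.4, 0.1, 0.1, 0.4], length=20)
for name, seq in sorted(msa.items()):
    print(name, seq)
```

    Because each site is drawn independently from the row of the branch matrix indexed by the parent state, no time-reversibility or stationarity is assumed, which is the property the abstract emphasizes.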

  6. NBLDA: negative binomial linear discriminant analysis for RNA-Seq data.

    Science.gov (United States)

    Dong, Kai; Zhao, Hongyu; Tong, Tiejun; Wan, Xiang

    2016-09-13

    RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than or equal to the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated. In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications. We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. 
    Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data.
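    A minimal sketch of a Bayes-rule classifier built on negative binomial likelihoods follows. It is a generic illustration, not the authors' method: the class means, dispersions, and priors are assumed to be given rather than estimated by the paper's plug-in rules, size-factor normalization is omitted, genes are treated as independent, and all names are hypothetical. The negative binomial is parameterized by mean mu and dispersion phi, so that the variance is mu + phi·mu².

```python
import math


def nb_log_pmf(x, mu, phi):
    """Log pmf of a negative binomial with mean mu and dispersion phi."""
    r = 1.0 / phi  # "size" parameter
    return (math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(r / (r + mu)) + x * math.log(mu / (r + mu)))


def classify(counts, class_means, dispersions, priors):
    """Assign a sample to the class with the highest posterior score.

    counts: read counts per gene for one sample
    class_means: {label: list of per-gene means}
    dispersions: per-gene dispersion, shared across classes
    priors: {label: prior probability}
    """
    scores = {}
    for label, means in class_means.items():
        score = math.log(priors[label])
        for x, mu, phi in zip(counts, means, dispersions):
            score += nb_log_pmf(x, mu, phi)  # genes treated as independent
        scores[label] = score
    return max(scores, key=scores.get), scores


# Toy example with two genes and two classes (values are made up)
class_means = {"tumor": [100.0, 10.0], "normal": [10.0, 100.0]}
label, scores = classify([90, 12], class_means,
                         dispersions=[0.1, 0.1],
                         priors={"tumor": 0.5, "normal": 0.5})
print(label)
```

    As phi approaches zero the negative binomial term approaches the Poisson log-likelihood, which is one way to see the relationship between the two classifiers mentioned in the abstract.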

  7. RNA-Seq profiling reveals novel hepatic gene expression pattern in aflatoxin B1 treated rats.

    Science.gov (United States)

    Merrick, B Alex; Phadke, Dhiral P; Auerbach, Scott S; Mav, Deepak; Stiegelmeyer, Suzy M; Shah, Ruchir R; Tice, Raymond R

    2013-01-01

    Deep sequencing was used to investigate the subchronic effects of 1 ppm aflatoxin B1 (AFB1), a potent hepatocarcinogen, on the male rat liver transcriptome prior to onset of histopathological lesions or tumors. We hypothesized RNA-Seq would reveal more differentially expressed genes (DEG) than microarray analysis, including low copy and novel transcripts related to AFB1's carcinogenic activity compared to feed controls (CTRL). Paired-end reads were mapped to the rat genome (Rn4) with TopHat and further analyzed by DESeq and Cufflinks-Cuffdiff pipelines to identify differentially expressed transcripts, new exons and unannotated transcripts. PCA and cluster analysis of DEGs showed clear separation between AFB1 and CTRL treatments and concordance among group replicates. qPCR of eight high and medium DEGs and three low DEGs showed good comparability among RNA-Seq and microarray transcripts. DESeq analysis identified 1,026 differentially expressed transcripts at greater than two-fold change (p<0.005) compared to 626 transcripts by microarray due to base pair resolution of transcripts by RNA-Seq, probe placement within transcripts or an absence of probes to detect novel transcripts, splice variants and exons. Pathway analysis among DEGs revealed signaling of Ahr, Nrf2, GSH, xenobiotic, cell cycle, extracellular matrix, and cell differentiation networks consistent with pathways leading to AFB1 carcinogenesis, including almost 200 upregulated transcripts controlled by E2f1-related pathways related to kinetochore structure, mitotic spindle assembly and tissue remodeling. We report 49 novel, differentially-expressed transcripts including confirmation by PCR-cloning of two unique, unannotated, hepatic AFB1-responsive transcripts (HAfT's) on chromosomes 1.q55 and 15.q11, overexpressed by 10 to 25-fold. Several potentially novel exons were found and exon refinements were made including AFB1 exon-specific induction of homologous family members, Ugt1a6 and Ugt1a7c. We find the

  8. A dinâmica da pesquisa em redes: avanços e desafios do seqüenciamento genético da vassoura de bruxa e do eucalipto | The dynamics of research in networks: progress and challenge in DNA sequencing of witches’ broom and eucalyptus

    Directory of Open Access Journals (Sweden)

    Eliane Dias

    2008-04-01

    Abstract Network organization has been key for genomics research. Two research projects recently conducted in Brazil, one focused on Moniliophthora (formerly Crinipellis) perniciosa, which causes witches' broom (vassoura de bruxa) disease in cocoa, and the other on eucalyptus, were selected to discuss this statement. This article analyzes the dynamics and functioning of both networks and their importance in generating knowledge and innovation, pointing out the differences in project conception and evolution and in integration between actors. The results obtained highlight the strongly positive impact of these networks and provide some guidelines for public policy directed to the development of cooperative arrangements in strategic areas. Keywords research networks, science and technology organization, genomics research, innovative dynamics

  9. AtRTD2: A Reference Transcript Dataset for accurate quantification of alternative splicing and expression changes in Arabidopsis thaliana RNA-seq data

    KAUST Repository

    Zhang, Runxuan

    2016-05-06

    Background Alternative splicing is the major post-transcriptional mechanism by which gene expression is regulated and affects a wide range of processes and responses in most eukaryotic organisms. RNA-sequencing (RNA-seq) can generate genome-wide quantification of individual transcript isoforms to identify changes in expression and alternative splicing. RNA-seq is an essential modern tool but its ability to accurately quantify transcript isoforms depends on the diversity, completeness and quality of the transcript information. Results We have developed a new Reference Transcript Dataset for Arabidopsis (AtRTD2) for RNA-seq analysis containing over 82k non-redundant transcripts, whereby 74,194 transcripts originate from 27,667 protein-coding genes. A total of 13,524 protein-coding genes have at least one alternatively spliced transcript in AtRTD2 such that about 60% of the 22,453 protein-coding, intron-containing genes in Arabidopsis undergo alternative splicing. More than 600 putative U12 introns were identified in more than 2,000 transcripts. AtRTD2 was generated from transcript assemblies of ca. 8.5 billion pairs of reads from 285 RNA-seq data sets obtained from 129 RNA-seq libraries and merged along with the previous version, AtRTD, and Araport11 transcript assemblies. AtRTD2 increases the diversity of transcripts and through application of stringent filters represents the most extensive and accurate transcript collection for Arabidopsis to date. We have demonstrated a generally good correlation of alternative splicing ratios from RNA-seq data analysed by Salmon and experimental data from high resolution RT-PCR. However, we have observed inaccurate quantification of transcript isoforms for genes with multiple transcripts which have variation in the lengths of their UTRs. This variation is not effectively corrected in RNA-seq analysis programmes and will therefore impact RNA-seq analyses generally. To address this, we have tested different genome

  10. Porcine SOX9 Gene Expression Is Influenced by an 18 bp Indel in the 5'-Untranslated Region.

    Directory of Open Access Journals (Sweden)

    Bertram Brenig

    Sex determining region Y-box 9 (SOX9) is an important regulator of sex and skeletal development and is expressed in a variety of embryonal and adult tissues. Loss or gain of function resulting from mutations within the coding region or chromosomal aberrations of the SOX9 locus lead to a plethora of detrimental phenotypes in humans and animals. One of these phenotypes is the so-called male-to-female or female-to-male sex-reversal, which has been observed in several mammals including pig, dog, cat, goat, horse, and deer. In 38,XX sex-reversal French Large White pigs, a genome-wide association study suggested SOX9 as the causal gene, although no functional mutations were identified in affected animals. However, among others, an 18 bp indel had been detected in the 5'-untranslated region of the SOX9 gene by comparing affected animals and controls. We have identified the same indel (Δ18) between position +247 bp and +266 bp downstream of the transcription start site of the porcine SOX9 gene in four other pig breeds, i.e., German Large White, Laiwu Black, Bamei, and Erhualian. These animals have been genotyped in an attempt to identify candidate genes for porcine inguinal and/or scrotal hernia. Because the 18 bp segment in the wild type 5'-UTR harbours a highly conserved cAMP-response element (CRE) half-site, we analysed its role in SOX9 expression in vitro. Competition and immunodepletion electromobility shift assays demonstrate that the CRE half-site is specifically recognized by CREB. Both binding of CREB to the wild type as well as the absence of the CRE half-site in Δ18 reduced expression efficiency in HEK293T, PK-15, and ATDC5 cells significantly. Transfection experiments of wild type and Δ18 SOX9 promoter luciferase constructs show a significant reduction of RNA and protein levels depending on the presence or absence of the 18 bp segment. Hence, the data presented here demonstrate that the 18 bp indel in the porcine SOX9 5'-UTR is of functional

  11. Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

    Science.gov (United States)

    Zhang, Wenqian; Yu, Ying; Hertwig, Falk; Thierry-Mieg, Jean; Zhang, Wenwei; Thierry-Mieg, Danielle; Wang, Jian; Furlanello, Cesare; Devanarayan, Viswanath; Cheng, Jie; Deng, Youping; Hero, Barbara; Hong, Huixiao; Jia, Meiwen; Li, Li; Lin, Simon M; Nikolsky, Yuri; Oberthuer, André; Qing, Tao; Su, Zhenqiang; Volland, Ruth; Wang, Charles; Wang, May D; Ai, Junmei; Albanese, Davide; Asgharzadeh, Shahab; Avigad, Smadar; Bao, Wenjun; Bessarabova, Marina; Brilliant, Murray H; Brors, Benedikt; Chierici, Marco; Chu, Tzu-Ming; Zhang, Jibin; Grundy, Richard G; He, Min Max; Hebbring, Scott; Kaufman, Howard L; Lababidi, Samir; Lancashire, Lee J; Li, Yan; Lu, Xin X; Luo, Heng; Ma, Xiwen; Ning, Baitang; Noguera, Rosa; Peifer, Martin; Phan, John H; Roels, Frederik; Rosswog, Carolina; Shao, Susan; Shen, Jie; Theissen, Jessica; Tonini, Gian Paolo; Vandesompele, Jo; Wu, Po-Yen; Xiao, Wenzhong; Xu, Joshua; Xu, Weihong; Xuan, Jiekun; Yang, Yong; Ye, Zhan; Dong, Zirui; Zhang, Ke K; Yin, Ye; Zhao, Chen; Zheng, Yuanting; Wolfinger, Russell D; Shi, Tieliu; Malkas, Linda H; Berthold, Frank; Wang, Jun; Tong, Weida; Shi, Leming; Peng, Zhiyu; Fischer, Matthias

    2015-06-25

    Gene expression profiling is being widely applied in cancer research to identify biomarkers for clinical endpoint prediction. Since RNA-seq provides a powerful tool for transcriptome-based applications beyond the limitations of microarrays, we sought to systematically evaluate the performance of RNA-seq-based and microarray-based classifiers in this MAQC-III/SEQC study for clinical endpoint prediction using neuroblastoma as a model. We generate gene expression profiles from 498 primary neuroblastomas using both RNA-seq and 44 k microarrays. Characterization of the neuroblastoma transcriptome by RNA-seq reveals that more than 48,000 genes and 200,000 transcripts are being expressed in this malignancy. We also find that RNA-seq provides much more detailed information on specific transcript expression patterns in clinico-genetic neuroblastoma subgroups than microarrays. To systematically compare the power of RNA-seq and microarray-based models in predicting clinical endpoints, we divide the cohort randomly into training and validation sets and develop 360 predictive models on six clinical endpoints of varying predictability. Evaluation of factors potentially affecting model performances reveals that prediction accuracies are most strongly influenced by the nature of the clinical endpoint, whereas technological platforms (RNA-seq vs. microarrays), RNA-seq data analysis pipelines, and feature levels (gene vs. transcript vs. exon-junction level) do not significantly affect performances of the models. We demonstrate that RNA-seq outperforms microarrays in determining the transcriptomic characteristics of cancer, while RNA-seq and microarray-based models perform similarly in clinical endpoint prediction. Our findings may be valuable to guide future studies on the development of gene expression-based predictive models and their implementation in clinical practice.

  12. Gene expression profiling of human breast tissue samples using SAGE-Seq.

    Science.gov (United States)

    Wu, Zhenhua Jeremy; Meyer, Clifford A; Choudhury, Sibgat; Shipitsin, Michail; Maruyama, Reo; Bessarabova, Marina; Nikolskaya, Tatiana; Sukumar, Saraswati; Schwartzman, Armin; Liu, Jun S; Polyak, Kornelia; Liu, X Shirley

    2010-12-01

    We present a powerful application of ultra high-throughput sequencing, SAGE-Seq, for the accurate quantification of normal and neoplastic mammary epithelial cell transcriptomes. We develop data analysis pipelines that allow the mapping of sense and antisense strands of mitochondrial and RefSeq genes, the normalization between libraries, and the identification of differentially expressed genes. We find that the diversity of cancer transcriptomes is significantly higher than that of normal cells. Our analysis indicates that transcript discovery plateaus at 10 million reads/sample, and suggests a minimum desired sequencing depth around five million reads. Comparison of SAGE-Seq and traditional SAGE on normal and cancerous breast tissues reveals higher sensitivity of SAGE-Seq to detect less-abundant genes, including those encoding for known breast cancer-related transcription factors and G protein-coupled receptors (GPCRs). SAGE-Seq is able to identify genes and pathways abnormally activated in breast cancer that traditional SAGE failed to call. SAGE-Seq is a powerful method for the identification of biomarkers and therapeutic targets in human disease.

  13. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data.

    Directory of Open Access Journals (Sweden)

    Brett A McKinney

    Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data, including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations, where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to
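
    The nearest-neighbor scoring idea described in this abstract can be sketched as follows. This is a simplified two-class Relief with a fixed k, not the ReliefSeq implementation: the function names, the Manhattan distance, and the "take the best score over a few k values" rule in `adaptive_k_scores` are illustrative assumptions, since the abstract does not spell out how k is optimized per gene.

```python
import numpy as np

def relief_scores(X, y, k=1):
    """Simplified two-class Relief importance scores (illustrative only).

    X: samples x features array, y: class labels.
    Each class must contain more than k samples.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    # Scale every feature to [0, 1] so per-feature differences are comparable.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    scores = np.zeros(p)
    for i in range(n):
        diff = np.abs(Xs - Xs[i])      # per-feature difference to sample i
        dist = diff.sum(axis=1)        # Manhattan distance to sample i
        same = (y == y[i])
        same[i] = False                # a sample is not its own neighbor
        other = (y != y[i])
        hits = np.argsort(np.where(same, dist, np.inf))[:k]
        misses = np.argsort(np.where(other, dist, np.inf))[:k]
        # Reward features that differ across classes and agree within a class.
        scores += diff[misses].mean(axis=0) - diff[hits].mean(axis=0)
    return scores / n

def adaptive_k_scores(X, y, ks=(1, 2)):
    """Hypothetical gene-wise adaptive-k rule: best score per feature over ks."""
    return np.max([relief_scores(X, y, k) for k in ks], axis=0)

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.00, 0.1], [0.10, 0.9], [0.05, 0.5],
     [1.00, 0.2], [0.90, 0.8], [0.95, 0.4]]
y = [0, 0, 0, 1, 1, 1]
scores = relief_scores(X, y, k=1)
adaptive = adaptive_k_scores(X, y)
```

    On the toy data the class-separating feature should receive the larger score. The full Relief-F algorithm additionally handles multiple classes and missing values, which this sketch omits.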

  14. Relationships within Cladobranchia (Gastropoda: Nudibranchia) based on RNA-Seq data: an initial investigation.

    Science.gov (United States)

    Goodheart, Jessica A; Bazinet, Adam L; Collins, Allen G; Cummings, Michael P

    2015-09-01

    Cladobranchia (Gastropoda: Nudibranchia) is a diverse (approx. 1000 species) but understudied group of sea slug molluscs. In order to fully comprehend the diversity of nudibranchs and the evolution of character traits within Cladobranchia, a solid understanding of evolutionary relationships is necessary. To date, only two direct attempts have been made to understand the evolutionary relationships within Cladobranchia, neither of which resulted in well-supported phylogenetic hypotheses. In addition to these studies, several others have addressed some of the relationships within this clade while investigating the evolutionary history of more inclusive groups (Nudibranchia and Euthyneura). However, all of the resulting phylogenetic hypotheses contain conflicting topologies within Cladobranchia. In this study, we address some of these long-standing issues regarding the evolutionary history of Cladobranchia using RNA-Seq data (transcriptomes). We sequenced 16 transcriptomes and combined these with four transcriptomes from the NCBI Sequence Read Archive. Transcript assembly using Trinity and orthology determination using HaMStR yielded 839 orthologous groups for analysis. These data provide a well-supported and almost fully resolved phylogenetic hypothesis for Cladobranchia. Our results support the monophyly of Cladobranchia and the sub-clade Aeolidida, but reject the monophyly of Dendronotida.

  15. FutureGen Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Cabe, Jim; Elliott, Mike

    2010-09-30

    This report summarizes the comprehensive siting, permitting, engineering, design, and costing activities completed by the FutureGen Industrial Alliance, the Department of Energy, and associated supporting subcontractors to develop a first of a kind near zero emissions integrated gasification combined cycle power plant and carbon capture and storage project (IGCC-CCS). With the goal to design, build, and reliably operate the first IGCC-CCS facility, FutureGen would have been the lowest emitting pulverized coal power plant in the world, while providing a timely and relevant basis for coal combustion power plants deploying carbon capture in the future. The content of this report summarizes key findings and results of applicable project evaluations; modeling, design, and engineering assessments; cost estimate reports; and schedule and risk mitigation from initiation of the FutureGen project through final flow sheet analyses including capital and operating reports completed under DOE award DE-FE0000587. This project report necessarily builds upon previously completed siting, design, and development work executed under DOE award DE-FC26-06NT4207 which included the siting process; environmental permitting, compliance, and mitigation under the National Environmental Policy Act; and development of conceptual and design basis documentation for the FutureGen plant. For completeness, the report includes as attachments the siting and design basis documents, as well as the source documentation for the following:
    • Site evaluation and selection process and environmental characterization
    • Underground Injection Control (UIC) Permit Application including well design and subsurface modeling
    • FutureGen IGCC-CCS Design Basis Document
    • Process evaluations and technology selection via Illinois Clean Coal Review Board Technical Report
    • Process flow diagrams and heat/material balance for slurry-fed gasifier configuration
    • Process flow diagrams and heat/material balance

  16. GC-Content Normalization for RNA-Seq Data

    Science.gov (United States)

    2011-01-01

    Background Transcriptome sequencing (RNA-Seq) has become the assay of choice for high-throughput studies of gene expression. However, as is the case with microarrays, major technology-related artifacts and biases affect the resulting expression measures. Normalization is therefore essential to ensure accurate inference of expression levels and subsequent analyses thereof. Results We focus on biases related to GC-content and demonstrate the existence of strong sample-specific GC-content effects on RNA-Seq read counts, which can substantially bias differential expression analysis. We propose three simple within-lane gene-level GC-content normalization approaches and assess their performance on two different RNA-Seq datasets, involving different species and experimental designs. Our methods are compared to state-of-the-art normalization procedures in terms of bias and mean squared error for expression fold-change estimation and in terms of Type I error and p-value distributions for tests of differential expression. The exploratory data analysis and normalization methods proposed in this article are implemented in the open-source Bioconductor R package EDASeq. Conclusions Our within-lane normalization procedures, followed by between-lane normalization, reduce GC-content bias and lead to more accurate estimates of expression fold-changes and tests of differential expression. Such results are crucial for the biological interpretation of RNA-Seq experiments, where downstream analyses can be sensitive to the supplied lists of genes. PMID:22177264
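
    A within-lane, gene-level GC-content adjustment of the kind described here can be sketched as follows. This is a minimal illustration assuming a median-scaling rule over equal-sized GC bins; it is not the EDASeq code, and the function name, the bin count, and the toy data are assumptions made for the example.

```python
import numpy as np

def gc_median_normalize(counts, gc, n_bins=10):
    """Rescale gene counts so every GC-content bin shares the lane-wide median.

    counts: read counts per gene for one lane; gc: GC fraction per gene.
    Bins hold (nearly) equal numbers of genes, ordered by GC content.
    """
    counts = np.asarray(counts, dtype=float)
    gc = np.asarray(gc, dtype=float)
    n = len(gc)
    order = np.argsort(gc, kind="stable")
    bins = np.empty(n, dtype=int)
    bins[order] = np.arange(n) * n_bins // n   # bin index by GC rank
    overall = np.median(counts[counts > 0])    # lane-wide median of expressed genes
    normalized = counts.copy()
    for b in range(n_bins):
        mask = bins == b
        positive = counts[mask & (counts > 0)]
        if positive.size == 0:
            continue                           # nothing to scale in this bin
        normalized[mask] = counts[mask] * overall / np.median(positive)
    return normalized

# Toy lane in which counts rise with GC content (a pure GC artifact).
gc = np.linspace(0.3, 0.7, 100)
counts = 100.0 * (1.0 + gc)
normalized = gc_median_normalize(counts, gc)
bin_medians = np.median(normalized.reshape(10, 10), axis=1)
```

    After scaling, each bin median equals the lane-wide median, so the systematic trend between bins is removed while differences between genes inside a bin are preserved. A between-lane step, as the abstract notes, would still be needed afterwards.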

  17. Computational Methods for ChIP-seq Data Analysis and Applications

    KAUST Repository

    Ashoor, Haitham

    2017-04-25

    The development of chromatin immunoprecipitation followed by sequencing (ChIP-seq) technology has enabled the construction of genome-wide maps of protein-DNA interaction. Such maps provide information about transcriptional regulation at the epigenetic level (histone modifications and histone variants) and at the level of transcription factor (TF) activity. This dissertation presents novel computational methods for ChIP-seq data analysis and applications. The work of this dissertation addresses four main challenges. First, I address the problem of detecting histone modifications from ChIP-seq cancer samples. The presence of copy number variations (CNVs) in cancer samples results in statistical biases that lead to inaccurate predictions when standard methods are used. To overcome this issue, I developed HMCan, an algorithm specially designed to handle ChIP-seq cancer data by accounting for the presence of CNVs. When using ChIP-seq data from cancer cells, HMCan demonstrates unbiased and accurate predictions compared to the standard state-of-the-art methods. Second, I address the problem of identifying changes in histone modifications between two ChIP-seq samples with different genetic backgrounds (for example, cancer vs. normal). In addition to CNVs, differences in antibody efficiency between samples and the presence of sample replicates are challenges for this problem. To overcome these issues, I developed the HMCan-diff algorithm as an extension to HMCan. HMCan-diff implements robust normalization methods to address the challenges listed above. HMCan-diff significantly outperforms other state-of-the-art methods on data containing cancer samples. Third, I investigate and analyze predictions of different methods for enhancer prediction based on ChIP-seq data. The analysis shows that predictions generated by different methods overlap poorly. To overcome this issue, I developed DENdb, a database that integrates enhancer predictions from different methods. DENdb also

  18. TRANSIT--A Software Tool for Himar1 TnSeq Analysis.

    Directory of Open Access Journals (Sweden)

    Michael A DeJesus

    2015-10-01

    TnSeq has become a popular technique for determining the essentiality of genomic regions in bacterial organisms. Several methods have been developed to analyze the wealth of data that has been obtained through TnSeq experiments. We developed a tool for analyzing Himar1 TnSeq data called TRANSIT. TRANSIT provides a graphical interface to three different statistical methods for analyzing TnSeq data. These methods cover a variety of approaches capable of identifying essential genes in individual datasets as well as comparative analysis between conditions. We demonstrate the utility of this software by analyzing TnSeq datasets of M. tuberculosis grown on glycerol and cholesterol. We show that TRANSIT can be used to discover genes which have been previously implicated in growth on these carbon sources. TRANSIT is written in Python, and thus can be run on Windows, OSX and Linux platforms. The source code is distributed under the GNU GPL v3 license and can be obtained from the following GitHub repository: https://github.com/mad-lab/transit.

  19. Association and Genetic Identification of Loci for Four Fruit Traits in Tomato Using InDel Markers

    Directory of Open Access Journals (Sweden)

    Xiaoxi Liu

    2017-07-01

    Tomato (Solanum lycopersicum) fruit weight (FW), soluble solid content (SSC), fruit shape and fruit color are crucial for yield, quality and consumer acceptability. In this study, a tomato association panel of 192 accessions, comprising a mixture of wild species, cherry tomato, landraces, and modern varieties collected worldwide, was genotyped with 547 InDel markers evenly distributed on 12 chromosomes and scored for FW, SSC, fruit shape index (FSI), and color parameters over 2 years with three replications each year. The association panel was sorted into two subpopulations. Linkage disequilibrium ranged from 3.0 to 47.2 Mb across 12 chromosomes. A set of 102 markers significantly (p < 1.19–1.30 × 10^-4) associated with SSC, FW, fruit shape, and fruit color was identified on 11 of the 12 chromosomes using a mixed linear model. The associations were compared with the known gene/QTLs for the same traits. Genetic analysis using F2 populations detected 14 and 4 markers significantly (p < 0.05) associated with SSC and FW, respectively. Some loci were commonly detected by both association and linkage analysis. Particularly, one novel locus for FW on chromosome 4 detected by association analysis was also identified in F2 populations. The results demonstrated that association mapping using a limited number of InDel markers and a relatively small population could not only complement and enhance previous QTL information, but also identify novel loci for marker-assisted selection of fruit traits in tomato.

  20. An Integrated Approach for RNA-seq Data Normalization.

    Science.gov (United States)

    Yang, Shengping; Mercante, Donald E; Zhang, Kun; Fang, Zhide

    2016-01-01

    DNA copy number alteration is common in many cancers. Studies have shown that insertion or deletion of DNA sequences can directly alter gene expression, and significant correlation exists between DNA copy number and gene expression. Data normalization is a critical step in the analysis of gene expression generated by RNA-seq technology. Successful normalization reduces/removes unwanted nonbiological variations in the data, while keeping meaningful information intact. However, as far as we know, no attempt has been made to adjust for the variation due to DNA copy number changes in RNA-seq data normalization. In this article, we propose an integrated approach for RNA-seq data normalization. Comparisons show that the proposed normalization can improve power for downstream differentially expressed gene detection and generate more biologically meaningful results in gene profiling. In addition, our findings show that due to the effects of copy number changes, some housekeeping genes are not always suitable internal controls for studying gene expression. Using information from DNA copy number, the integrated approach is successful in reducing noise due to both biological and nonbiological causes in RNA-seq data, thus increasing the accuracy of gene profiling.

  1. Dental Hypotheses: Seeks to Publish Hypotheses from All Areas of Dentistry

    Directory of Open Access Journals (Sweden)

    Edward F. Rossomando

    2010-07-01

    Starting a new open access journal in a rapidly growing scientific panorama is a severe challenge. However, the first issue of Dental Hypotheses is now history and even the skeptics can appreciate that Dental Hypotheses is a success - it is a journal of high quality that provides an outlet for publication of articles that encourage readers to question dental paradigms. Readers of Dental Hypotheses might have noticed that the majority of the articles published in the first issue concern clinical dentistry. However, the editors of Dental Hypotheses recognize that there are many other areas in dentistry that present challenges and that our readers may offer suggestions for their solution. Some of these challenges relate to: dental education; digital dental technology; teledentistry and access to dental care; dental practice issues, such as dental office design, dental office management, and the slow rate of acceptance of innovative technology in the dental office; and issues related to innovation and dental entrepreneurship, including intellectual property protection. The dental profession faces many challenges in many areas, and with the publication of Dental Hypotheses our profession has a venue for presentation of possible solutions. If you have developed a hypothesis that might help, please share it with your colleagues. As many have noted, the intellectual power of the global village in which we now live is formidable. The internet has provided the technology to bring us together and Dental Hypotheses has provided the venue. Please use it. New radical, speculative and non-mainstream scientific ideas are always welcome.

  2. Single-tube linear DNA amplification (LinDA) for robust ChIP-seq

    NARCIS (Netherlands)

    Shankaranarayanan, P.; Mendoza-Parra, M.A.; Walia, M.; Wang, L.; Li, N.; Trindade, L.M.; Gronemeyer, H.

    2011-01-01

    Genome-wide profiling of transcription factors based on massive parallel sequencing of immunoprecipitated chromatin (ChIP-seq) requires nanogram amounts of DNA. Here we describe a high-fidelity, single-tube linear DNA amplification method (LinDA) for ChIP-seq and reChIP-seq with picogram DNA amounts

  3. In Silico Pooling of ChIP-seq Control Experiments

    Science.gov (United States)

    Sun, Guannan; Srinivasan, Rajini; Lopez-Anido, Camila; Hung, Holly A.; Svaren, John; Keleş, Sündüz

    2014-01-01

    As next generation sequencing technologies are becoming more economical, large-scale ChIP-seq studies are enabling the investigation of the roles of transcription factor binding and epigenome on phenotypic variation. Studying such variation requires individual level ChIP-seq experiments. Standard designs for ChIP-seq experiments employ a paired control per ChIP-seq sample. Genomic coverage for control experiments is often sacrificed to increase the resources for ChIP samples. However, the quality of ChIP-enriched regions identifiable from a ChIP-seq experiment depends on the quality and the coverage of the control experiments. Insufficient coverage leads to loss of power in detecting enrichment. We investigate the effect of in silico pooling of control samples within multiple biological replicates, multiple treatment conditions, and multiple cell lines and tissues across multiple datasets with varying levels of genomic coverage. Our computational studies suggest guidelines for performing in silico pooling of control experiments. Using vast amounts of ENCODE data, we show that pairwise correlations between control samples originating from multiple biological replicates, treatments, and cell lines/tissues can be grouped into two classes representing whether or not in silico pooling leads to power gain in detecting enrichment between the ChIP and the control samples. Our findings have important implications for multiplexing samples. PMID:25380244
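
    The pooling decision described above, in which control samples are combined only when they resemble each other, might be sketched as follows. The correlation threshold of 0.9, the function name, and the binned-count representation are assumptions for illustration; the abstract does not state the criterion the authors use to separate the two classes of control pairs.

```python
import numpy as np

def pool_controls(reference, candidates, min_corr=0.9):
    """Add candidate control profiles to a reference control when they correlate.

    reference: read counts per genomic bin for the paired control sample.
    candidates: iterable of count vectors from other control experiments.
    min_corr: hypothetical Pearson-correlation cut-off for pooling.
    Returns the pooled count vector and the indices of pooled candidates.
    """
    reference = np.asarray(reference, dtype=float)
    pooled = reference.copy()
    kept = []
    for i, cand in enumerate(candidates):
        cand = np.asarray(cand, dtype=float)
        r = np.corrcoef(reference, cand)[0, 1]
        if r >= min_corr:        # similar enough: pooling adds coverage
            pooled += cand
            kept.append(i)
    return pooled, kept

reference = [1, 2, 3, 4, 5]
candidates = [[2, 4, 6, 8, 10],   # same shape as the reference, deeper coverage
              [5, 4, 3, 2, 1]]    # anti-correlated profile, left out
pooled, kept = pool_controls(reference, candidates)
```

    Summing counts raises the effective control coverage, which is the source of the power gain the abstract refers to; a dissimilar control is excluded so that it does not distort the background estimate.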

  4. Ontología para la gestión unificada de variantes y versiones de productos

    OpenAIRE

    Sonzini, María Soledad; Vegetti, Marcela

    2015-01-01

    The aim of this work is to present an ontology for managing the temporal variation of a product family through versions. The proposal makes it possible to identify the variation points, the cause, and the validity period, and to control the propagation/impact of changes. It is a generic ontology that can be integrated with different models for representing product variants. To validate the proposal, the integration of the proposed version ontology is shown...

  5. Affective stress responses during leisure time: Validity evaluation of a modified version of the Stress-Energy Questionnaire.

    Science.gov (United States)

    Hadžibajramović, Emina; Ahlborg, Gunnar; Håkansson, Carita; Lundgren-Nilsson, Åsa; Grimby-Ekman, Anna

    2015-12-01

    Psychosocial stress at work is one of the most important factors behind increasing sick-leave rates. In addition to work stressors, it is important to account for non-work-related stressors when assessing stress responses. In this study, a modified version of the Stress-Energy Questionnaire (SEQ), the SEQ during leisure time (SEQ-LT) was introduced for assessing the affective stress response during leisure time. The aim of this study was to investigate the internal construct validity of the SEQ-LT. A second aim was to define the cut-off points for the scales, which could indicate high and low levels of leisure-time stress and energy, respectively. Internal construct validity of the SEQ-LT was evaluated using a Rasch analysis. We examined the unidimensionality and other psychometric properties of the scale by the fit to the Rasch model. A criterion-based approach was used for classification into high and low stress/energy levels. The psychometric properties of the stress and energy scales of the SEQ-LT were satisfactory, having accommodated for local dependency. The cut-off point for low stress was proposed to be in the interval between 2.45 and 3.02 on the Rasch metric score; while for high stress, it was between 3.65 and 3.90. The suggested cut-off points for the low and high energy levels were values between 1.73-1.97 and 2.66-3.08, respectively. The stress and energy scale of the SEQ-LT satisfied the measurement criteria defined by the Rasch analysis and it provided a useful tool for non-work-related assessment of stress responses. We provide guidelines on how to interpret the scale values. © 2015 the Nordic Societies of Public Health.

  6. A randomized phase II/III study of adverse events between sequential (SEQ) versus simultaneous integrated boost (SIB) intensity modulated radiation therapy (IMRT) in nasopharyngeal carcinoma; preliminary result on acute adverse events.

    Science.gov (United States)

    Songthong, Anussara P; Kannarunimit, Danita; Chakkabat, Chakkapong; Lertbutsayanukul, Chawalit

    2015-08-08

    To investigate acute and late toxicities comparing sequential (SEQ-IMRT) versus simultaneous integrated boost intensity modulated radiotherapy (SIB-IMRT) in nasopharyngeal carcinoma (NPC) patients. Newly diagnosed stage I-IVB NPC patients were randomized to receive SEQ-IMRT or SIB-IMRT, with or without chemotherapy. SEQ-IMRT consisted of two sequential radiation treatment plans: 2 Gy x 25 fractions to low-risk planning target volume (PTV-LR) followed by 2 Gy x 10 fractions to high-risk planning target volume (PTV-HR). In contrast, SIB-IMRT consisted of only one treatment plan: 2.12 Gy and 1.7 Gy x 33 fractions to PTV-HR and PTV-LR, respectively. Toxicities were evaluated according to CTCAE version 4.0. Between October 2010 and November 2013, 122 eligible patients were randomized between SEQ-IMRT (54 patients) and SIB-IMRT (68 patients). With median follow-up time of 16.8 months, there was no significant difference in toxicities between the two IMRT techniques. During chemoradiation, the most common grade 3-5 acute toxicities were mucositis (15.4% vs 13.6%, SEQ vs SIB, p = 0.788) followed by dysphagia (9.6% vs 9.1%, p = 1.000) and xerostomia (9.6% vs 7.6%, p = 0.748). During the adjuvant chemotherapy period, 25.6% and 32.7% experienced grade 3 weight loss in SEQ-IMRT and SIB-IMRT (p = 0.459). One-year overall survival (OS) and progression-free survival (PFS) were 95.8% and 95.5% in SEQ-IMRT and 98% and 90.2% in SIB-IMRT, respectively (p = 0.472 for OS and 0.069 for PFS). This randomized, phase II/III trial comparing SIB-IMRT versus SEQ-IMRT in NPC showed no statistically significant difference between both IMRT techniques in terms of acute adverse events. Short-term tumor control and survival outcome were promising.
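
    The two fractionation schemes can be compared with simple arithmetic, sketched below. The calculation assumes that the high-risk volume lies inside the low-risk volume and therefore receives both sequential plans; that assumption is standard for a sequential boost but is not stated in the abstract, and the helper name is illustrative.

```python
def total_dose(dose_per_fraction_gy, fractions):
    """Total physical dose in Gy for a plan of equal-sized fractions."""
    return dose_per_fraction_gy * fractions

# SEQ-IMRT: 2 Gy x 25 to PTV-LR, then 2 Gy x 10 to PTV-HR.
seq_ptv_lr = total_dose(2.0, 25)
# Assumption: PTV-HR is contained in PTV-LR, so it receives both plans.
seq_ptv_hr = seq_ptv_lr + total_dose(2.0, 10)

# SIB-IMRT: a single plan of 33 fractions at two dose levels.
sib_ptv_hr = total_dose(2.12, 33)
sib_ptv_lr = total_dose(1.7, 33)

for name, dose in [("SEQ PTV-LR", seq_ptv_lr), ("SEQ PTV-HR", seq_ptv_hr),
                   ("SIB PTV-HR", sib_ptv_hr), ("SIB PTV-LR", sib_ptv_lr)]:
    print(f"{name}: {dose:.2f} Gy")
```

    Under that assumption the high-risk volume receives about 70 Gy in either arm (70 Gy in 35 fractions versus 69.96 Gy in 33 fractions), while the low-risk volume receives 50 Gy with SEQ-IMRT and 56.1 Gy with SIB-IMRT.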

  7. A randomized phase II/III study of adverse events between sequential (SEQ) versus simultaneous integrated boost (SIB) intensity modulated radiation therapy (IMRT) in nasopharyngeal carcinoma; preliminary result on acute adverse events

    International Nuclear Information System (INIS)

    Songthong, Anussara P.; Kannarunimit, Danita; Chakkabat, Chakkapong; Lertbutsayanukul, Chawalit

    2015-01-01

    To investigate acute and late toxicities comparing sequential (SEQ-IMRT) versus simultaneous integrated boost intensity modulated radiotherapy (SIB-IMRT) in nasopharyngeal carcinoma (NPC) patients. Newly diagnosed stage I-IVB NPC patients were randomized to receive SEQ-IMRT or SIB-IMRT, with or without chemotherapy. SEQ-IMRT consisted of two sequential radiation treatment plans: 2Gy x 25 fractions to low-risk planning target volume (PTV-LR) followed by 2Gy x 10 fractions to high-risk planning target volume (PTV-HR). In contrast, SIB-IMRT consisted of only one treatment plan: 2.12Gy and 1.7Gy x 33 fractions to PTV-HR and PTV-LR, respectively. Toxicities were evaluated according to CTCAE version 4.0. Between October 2010 and November 2013, 122 eligible patients were randomized between SEQ-IMRT (54 patients) and SIB-IMRT (68 patients). With median follow-up time of 16.8 months, there was no significant difference in toxicities between the two IMRT techniques. During chemoradiation, the most common grade 3–5 acute toxicities were mucositis (15.4 % vs 13.6 %, SEQ vs SIB, p = 0.788) followed by dysphagia (9.6 % vs 9.1 %, p = 1.000) and xerostomia (9.6 % vs 7.6 %, p = 0.748). During the adjuvant chemotherapy period, 25.6 % and 32.7 % experienced grade 3 weight loss in SEQ-IMRT and SIB-IMRT (p = 0.459). One-year overall survival (OS) and progression-free survival (PFS) were 95.8 % and 95.5 % in SEQ-IMRT and 98 % and 90.2 % in SIB-IMRT, respectively (p = 0.472 for OS and 0.069 for PFS). This randomized, phase II/III trial comparing SIB-IMRT versus SEQ-IMRT in NPC showed no statistically significant difference between both IMRT techniques in terms of acute adverse events. Short-term tumor control and survival outcome were promising

  8. Cardinality enhancement utilizing Sequential Algorithm (SeQ code in OCDMA system

    Directory of Open Access Journals (Sweden)

    Fazlina C. A. S.

    2017-01-01

    Optical Code Division Multiple Access (OCDMA) has become important with the increasing demand for high capacity and speed in optical network communication, because the OCDMA technique can achieve high efficiency and thus make full use of fibre bandwidth. In this paper we focus on the Sequential Algorithm (SeQ) code with the AND detection technique, using the Optisystem design tool. The results reveal that the SeQ code is capable of eliminating Multiple Access Interference (MAI) and improving Bit Error Rate (BER), Phase Induced Intensity Noise (PIIN), and orthogonality between users in the system. The SeQ code shows good BER performance and can accommodate 190 simultaneous users, in contrast with existing codes. Thus, the SeQ code enhances the system by about 36% and 111% relative to the FCC and DCS codes, respectively. In addition, SeQ has a good BER performance of 10^-25 at 155 Mbps in comparison with bit rates of 622 Mbps, 1 Gbps, and 2 Gbps. From the plotted graph, a bit rate of 155 Mbps is a suitable speed for FTTH and LAN networks. Conclusions can be drawn based on the superior performance of the SeQ code. Thus, these codes offer an opportunity for better quality of service in OCDMA optical access networks for future generations' usage.

  9. Cardinality enhancement utilizing Sequential Algorithm (SeQ) code in OCDMA system

    Science.gov (United States)

    Fazlina, C. A. S.; Rashidi, C. B. M.; Rahman, A. K.; Aljunid, S. A.

    2017-11-01

    Optical Code Division Multiple Access (OCDMA) has become important with the increasing demand for high capacity and speed in optical network communication, because the OCDMA technique can achieve high efficiency and thus make full use of fibre bandwidth. In this paper we focus on the Sequential Algorithm (SeQ) code with the AND detection technique, using the Optisystem design tool. The results reveal that the SeQ code is capable of eliminating Multiple Access Interference (MAI) and improving Bit Error Rate (BER), Phase Induced Intensity Noise (PIIN), and orthogonality between users in the system. The SeQ code shows good BER performance and can accommodate 190 simultaneous users, in contrast with existing codes. Thus, the SeQ code enhances the system by about 36% and 111% relative to the FCC and DCS codes, respectively. In addition, SeQ has a good BER performance of 10^-25 at 155 Mbps in comparison with bit rates of 622 Mbps, 1 Gbps, and 2 Gbps. From the plotted graph, a bit rate of 155 Mbps is a suitable speed for FTTH and LAN networks. Conclusions can be drawn based on the superior performance of the SeQ code. Thus, these codes offer an opportunity for better quality of service in OCDMA optical access networks for future generations' usage.

  10. Practical guidelines for the comprehensive analysis of ChIP-seq data.

    Directory of Open Access Journals (Sweden)

    Timothy Bailey

    Full Text Available Mapping the chromosomal locations of transcription factors, nucleosomes, histone modifications, chromatin remodeling enzymes, chaperones, and polymerases is one of the key tasks of modern biology, as evidenced by the Encyclopedia of DNA Elements (ENCODE) Project. To this end, chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is the standard methodology. Mapping such protein-DNA interactions in vivo using ChIP-seq presents multiple challenges not only in sample preparation and sequencing but also for computational analysis. Here, we present step-by-step guidelines for the computational analysis of ChIP-seq data. We address all the major steps in the analysis of ChIP-seq data: sequencing depth selection, quality checking, mapping, data normalization, assessment of reproducibility, peak calling, differential binding analysis, controlling the false discovery rate, peak annotation, visualization, and motif analysis. At each step in our guidelines we discuss some of the software tools most frequently used. We also highlight the challenges and problems associated with each step in ChIP-seq data analysis. We present a concise workflow for the analysis of ChIP-seq data in Figure 1 that complements and expands on the recommendations of the ENCODE and modENCODE projects. Each step in the workflow is described in detail in the following sections.

  11. Granatum: a graphical single-cell RNA-Seq analysis pipeline for genomics scientists.

    Science.gov (United States)

    Zhu, Xun; Wolfgruber, Thomas K; Tasato, Austin; Arisdakessian, Cédric; Garmire, David G; Garmire, Lana X

    2017-12-05

    Single-cell RNA sequencing (scRNA-Seq) is an increasingly popular platform to study heterogeneity at the single-cell level. Computational methods to process scRNA-Seq data are not very accessible to bench scientists as they require a significant amount of bioinformatic skills. We have developed Granatum, a web-based scRNA-Seq analysis pipeline to make analysis more broadly accessible to researchers. Without a single line of programming code, users can click through the pipeline, setting parameters and visualizing results via the interactive graphical interface. Granatum conveniently walks users through various steps of scRNA-Seq analysis. It has a comprehensive list of modules, including plate merging and batch-effect removal, outlier-sample removal, gene-expression normalization, imputation, gene filtering, cell clustering, differential gene expression analysis, pathway/ontology enrichment analysis, protein network interaction visualization, and pseudo-time cell series construction. Granatum enables broad adoption of scRNA-Seq technology by empowering bench scientists with an easy-to-use graphical interface for scRNA-Seq data analysis. The package is freely available for research use at http://garmiregroup.org/granatum/app.

  12. Recommendations and Requirements for GenCade Simulations

    Science.gov (United States)

    2014-08-01

    will report whether or not GenCade is enabled. If GenCade is disabled, the user will need a new license that includes GenCade...any depth but usually are not deeper than the seaward edge of the surf zone. In the same way that some shorelines are less desirable for use in...Conference, 1919–1937. ASCE. Wang, P., N. C. Kraus, and R. A. Davis. 1998. Total rate of longshore sediment transport in the surf zone: Field

  13. Discovery of Protein–lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Li, Jun-Hao; Liu, Shun; Zheng, Ling-Ling; Wu, Jie; Sun, Wen-Ju; Wang, Ze-Lin; Zhou, Hui; Qu, Liang-Hu, E-mail: lssqlh@mail.sysu.edu.cn; Yang, Jian-Hua, E-mail: lssqlh@mail.sysu.edu.cn [RNA Information Center, Key Laboratory of Gene Engineering of the Ministry of Education, State Key Laboratory for Biocontrol, Sun Yat-sen University, Guangzhou (China)

    2015-01-14

    Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein–lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP–lncRNA regulatory relationships. We found that a single lncRNA will generally be bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association study data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represents an important step in the identification and analysis of RBP–lncRNA interactions and shows that these interactions may play crucial roles in cancer and genetic diseases.

  14. Discovery of Protein–lncRNA Interactions by Integrating Large-Scale CLIP-Seq and RNA-Seq Datasets

    International Nuclear Information System (INIS)

    Li, Jun-Hao; Liu, Shun; Zheng, Ling-Ling; Wu, Jie; Sun, Wen-Ju; Wang, Ze-Lin; Zhou, Hui; Qu, Liang-Hu; Yang, Jian-Hua

    2015-01-01

    Long non-coding RNAs (lncRNAs) are emerging as important regulatory molecules in developmental, physiological, and pathological processes. However, the precise mechanisms and functions of most lncRNAs remain largely unknown. Recent advances in high-throughput sequencing of immunoprecipitated RNAs after cross-linking (CLIP-Seq) provide powerful ways to identify biologically relevant protein–lncRNA interactions. In this study, by analyzing millions of RNA-binding protein (RBP) binding sites from 117 CLIP-Seq datasets generated by 50 independent studies, we identified 22,735 RBP–lncRNA regulatory relationships. We found that a single lncRNA will generally be bound and regulated by one or multiple RBPs, the combination of which may coordinately regulate gene expression. We also revealed the expression correlation of these interaction networks by mining expression profiles of over 6000 normal and tumor samples from 14 cancer types. Our combined analysis of CLIP-Seq data and genome-wide association study data discovered hundreds of disease-related single nucleotide polymorphisms residing in the RBP binding sites of lncRNAs. Finally, we developed interactive web implementations to provide visualization, analysis, and downloading of the aforementioned large-scale datasets. Our study represents an important step in the identification and analysis of RBP–lncRNA interactions and shows that these interactions may play crucial roles in cancer and genetic diseases.

  15. RNA-Seq profiling reveals novel hepatic gene expression pattern in aflatoxin B1 treated rats.

    Directory of Open Access Journals (Sweden)

    B Alex Merrick

    Full Text Available Deep sequencing was used to investigate the subchronic effects of 1 ppm aflatoxin B1 (AFB1), a potent hepatocarcinogen, on the male rat liver transcriptome prior to onset of histopathological lesions or tumors. We hypothesized RNA-Seq would reveal more differentially expressed genes (DEG) than microarray analysis, including low copy and novel transcripts related to AFB1's carcinogenic activity compared to feed controls (CTRL). Paired-end reads were mapped to the rat genome (Rn4) with TopHat and further analyzed by DESeq and Cufflinks-Cuffdiff pipelines to identify differentially expressed transcripts, new exons and unannotated transcripts. PCA and cluster analysis of DEGs showed clear separation between AFB1 and CTRL treatments and concordance among group replicates. qPCR of eight high and medium DEGs and three low DEGs showed good comparability among RNA-Seq and microarray transcripts. DESeq analysis identified 1,026 differentially expressed transcripts at greater than two-fold change (p<0.005) compared to 626 transcripts by microarray due to base pair resolution of transcripts by RNA-Seq, probe placement within transcripts or an absence of probes to detect novel transcripts, splice variants and exons. Pathway analysis among DEGs revealed signaling of Ahr, Nrf2, GSH, xenobiotic, cell cycle, extracellular matrix, and cell differentiation networks consistent with pathways leading to AFB1 carcinogenesis, including almost 200 upregulated transcripts controlled by E2f1-related pathways related to kinetochore structure, mitotic spindle assembly and tissue remodeling. We report 49 novel, differentially-expressed transcripts including confirmation by PCR-cloning of two unique, unannotated, hepatic AFB1-responsive transcripts (HAfT's) on chromosomes 1.q55 and 15.q11, overexpressed by 10 to 25-fold. Several potentially novel exons were found and exon refinements were made including AFB1 exon-specific induction of homologous family members, Ugt1a6 and Ugt1a7c

  16. NGScloud: RNA-seq analysis of non-model species using cloud computing.

    Science.gov (United States)

    Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai

    2018-05-03

    RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using the cloud computing services of Amazon, which give access to ad hoc computing infrastructure scaled according to the complexity of the experiment, so that costs and run times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources, and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. Contact: unai.lopezdeheredia@upm.es.

  17. Unleashing Gen Y: Marketing Mars to Millennials

    Science.gov (United States)

    Leahy, Bart D.; Hidalgo, Loretta; Kloberdanz, Cassie

    2007-01-01

    Space advocates need to engage Generation Y (born 1977-1999). This outreach is necessary to recruit the next generation of scientists and engineers to explore Mars. Space advocates in the non-profit, private, and government sectors need to use a combination of technical communication, marketing, and politics, to develop messages that resonate with Gen Y. Until now, space messages have been generated by and for college-educated white males; Gen Y is much more diverse, including as much as one third minorities. Young women, too, need to be reached. My research has shown that messages emphasizing technology, fun, humor, and opportunity are the best means of reaching the Gen Y audience of 60 million (US population is 300 million). The important things space advocates must avoid are talking down to this generation, making false promises, or expecting them to "wait their turn" before they can participate. This is the MTV generation! We need to find ways of engaging Gen Y now to build a future where human beings can live and work on the planet Mars. In addition to the messages themselves, advocates need to keep up with Gen Y's social networking and use of iPods, cell phones, and the Internet. NASA and space advocacy groups can use these tools for "viral marketing," where young people share targeted space-related information via cell phones or the Internet because they like it. Overall, Gen Y is a socially dynamic and media-savvy group; advocates' space messages need to be sincere, creative, and placed in locations where Gen Y lives. Mars messages must be memorable!

  18. FamSeq: a variant calling program for family-based sequencing data using graphics processing units.

    Directory of Open Access Journals (Sweden)

    Gang Peng

    2014-10-01

    Full Text Available Various algorithms have been developed for variant calling using next-generation sequencing data, and various methods have been applied to reduce the associated false positive and false negative rates. Few variant calling programs, however, utilize the pedigree information when the family-based sequencing data are available. Here, we present a program, FamSeq, which reduces both false positive and false negative rates by incorporating the pedigree information from the Mendelian genetic model into variant calling. To accommodate variations in data complexity, FamSeq consists of four distinct implementations of the Mendelian genetic model: the Bayesian network algorithm, a graphics processing unit version of the Bayesian network algorithm, the Elston-Stewart algorithm and the Markov chain Monte Carlo algorithm. To make the software efficient and applicable to large families, we parallelized the Bayesian network algorithm that copes with pedigrees with inbreeding loops without losing calculation precision on an NVIDIA graphics processing unit. In order to compare the difference in the four methods, we applied FamSeq to pedigree sequencing data with family sizes that varied from 7 to 12. When there is no inbreeding loop in the pedigree, the Elston-Stewart algorithm gives analytical results in a short time. If there are inbreeding loops in the pedigree, we recommend the Bayesian network method, which provides exact answers. To improve the computing speed of the Bayesian network method, we parallelized the computation on a graphics processing unit. This allowed the Bayesian network method to process the whole genome sequencing data of a family of 12 individuals within two days, which was a 10-fold time reduction compared to the time required for this computation on a central processing unit.
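
    The idea of folding Mendelian inheritance into variant calling can be illustrated on a single parent-parent-child trio. The sketch below is not any of the four FamSeq implementations; it is a brute-force marginalisation over the parents' genotypes, and the genotype labels, function names, likelihood values and founder prior are all hypothetical choices made for illustration.

```python
from itertools import product

GENOTYPES = ("RR", "RA", "AA")  # hom-ref, heterozygous, hom-alt (labels assumed)

def alt_transmission_prob(genotype):
    """Probability that a parent with this genotype transmits the alt allele."""
    return genotype.count("A") / 2.0

def child_prior(mother, father):
    """Mendelian probability of each child genotype given both parents."""
    pm, pf = alt_transmission_prob(mother), alt_transmission_prob(father)
    return {
        "RR": (1 - pm) * (1 - pf),
        "RA": pm * (1 - pf) + (1 - pm) * pf,
        "AA": pm * pf,
    }

def child_posterior(lik_child, lik_mother, lik_father, founder_prior):
    """Posterior over the child's genotype, summing over parental genotypes.

    Each lik_* maps genotype -> P(reads | genotype); founder_prior maps
    genotype -> population prior for the two parents.
    """
    post = dict.fromkeys(GENOTYPES, 0.0)
    for gm, gf in product(GENOTYPES, repeat=2):
        weight = (founder_prior[gm] * lik_mother[gm]
                  * founder_prior[gf] * lik_father[gf])
        for gc, p in child_prior(gm, gf).items():
            post[gc] += weight * p * lik_child[gc]
    total = sum(post.values())
    return {g: v / total for g, v in post.items()}

# Hypothetical inputs: the child's reads slightly favour a heterozygous call,
# but both parents look confidently homozygous reference.
founder_prior = {"RR": 0.98, "RA": 0.019, "AA": 0.001}
confident_ref = {"RR": 1.0, "RA": 1e-6, "AA": 1e-12}
ambiguous_child = {"RR": 0.4, "RA": 0.6, "AA": 1e-6}
posterior = child_posterior(ambiguous_child, confident_ref, confident_ref,
                            founder_prior)
print(posterior)
```

    With these made-up numbers the pedigree evidence outweighs the child's own reads, which is the mechanism by which family information can suppress a false positive call. Enumerating all genotype combinations grows exponentially with family size, which is why larger pedigrees call for the structured algorithms named in the abstract.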

  19. Genetic aspects of adolescent idiopathic scoliosis

    Directory of Open Access Journals (Sweden)

    Marcelo Wajchenberg

    2012-09-01

    Full Text Available Adolescent idiopathic scoliosis is a common disease and its etiology remains unclear. Several hypotheses have been proposed, including the possibility of genetic transmission. Studies in the literature have examined the prevalence of the disease in certain populations, the possible modes of transmission, the location of the responsible genes and variations of certain genes (polymorphisms) that may influence the development of the deformity. This article aims to review and update the concepts of genetic influence in the etiology of adolescent idiopathic scoliosis.

  20. Analysis of allelic expression patterns in clonal somatic cells by single-cell RNA-seq.

    Science.gov (United States)

    Reinius, Björn; Mold, Jeff E; Ramsköld, Daniel; Deng, Qiaolin; Johnsson, Per; Michaëlsson, Jakob; Frisén, Jonas; Sandberg, Rickard

    2016-11-01

    Cellular heterogeneity can emerge from the expression of only one parental allele. However, it has remained controversial whether, or to what degree, random monoallelic expression of autosomal genes (aRME) is mitotically inherited (clonal) or stochastic (dynamic) in somatic cells, particularly in vivo. Here we used allele-sensitive single-cell RNA-seq on clonal primary mouse fibroblasts and freshly isolated human CD8+ T cells to dissect clonal and dynamic monoallelic expression patterns. Dynamic aRME affected a considerable portion of the cells' transcriptomes, with levels dependent on the cells' transcriptional activity. Notably, clonal aRME was detected, but it was surprisingly scarce (... aRME occurs transiently within individual cells, and patterns of aRME are thus primarily scattered throughout somatic cell populations rather than, as previously hypothesized, confined to patches of clonally related cells.

  1. Genetic diversity and population structure in Physalis peruviana and related taxa based on InDels and SNPs derived from COSII and IRG markers

    Science.gov (United States)

    Garzón-Martínez, Gina A.; Osorio-Guarín, Jaime A.; Delgadillo-Durán, Paola; Mayorga, Franklin; Enciso-Rodríguez, Felix E.; Landsman, David

    2015-01-01

    The genus Physalis is common in the Americas and includes several economically important species, among them Physalis peruviana that produces appetizing edible fruits. We studied the genetic diversity and population structure of P. peruviana and characterized 47 accessions of this species along with 13 accessions of related taxa consisting of 222 individuals from the Colombian Corporation of Agricultural Research (CORPOICA) germplasm collection, using Conserved Orthologous Sequences (COSII) and Immunity Related Genes (IRGs). In addition, 642 Single Nucleotide Polymorphism (SNPs) markers were identified and used for the genetic diversity analysis. A total of 121 alleles were detected in 24 InDels loci ranging from 2 to 9 alleles per locus, with an average of 5.04 alleles per locus. The average number of alleles in the SNP markers was two. The observed heterozygosity for P. peruviana with InDel and SNP markers was higher (0.48 and 0.59) than the expected heterozygosity (0.30 and 0.41). Interestingly, the observed heterozygosity in related taxa (0.4 and 0.12) was lower than the expected heterozygosity (0.59 and 0.25). The coefficient of population differentiation FST was 0.143 (InDels) and 0.038 (SNPs), showing a relatively low level of genetic differentiation among P. peruviana and related taxa. Higher levels of genetic variation were instead observed within populations based on the AMOVA analysis. Population structure analysis supported the presence of two main groups and PCA analysis based on SNP markers revealed two distinct clusters in the P. peruviana accessions corresponding to their state of cultivation. In this study, we identified molecular markers useful to detect genetic variation in Physalis germplasm for assisting conservation and crossbreeding strategies. PMID:26550601

  2. Genetic diversity and population structure in Physalis peruviana and related taxa based on InDels and SNPs derived from COSII and IRG markers.

    Science.gov (United States)

    Garzón-Martínez, Gina A; Osorio-Guarín, Jaime A; Delgadillo-Durán, Paola; Mayorga, Franklin; Enciso-Rodríguez, Felix E; Landsman, David; Mariño-Ramírez, Leonardo; Barrero, Luz Stella

    2015-12-01

    The genus Physalis is common in the Americas and includes several economically important species, among them Physalis peruviana that produces appetizing edible fruits. We studied the genetic diversity and population structure of P. peruviana and characterized 47 accessions of this species along with 13 accessions of related taxa consisting of 222 individuals from the Colombian Corporation of Agricultural Research (CORPOICA) germplasm collection, using Conserved Orthologous Sequences (COSII) and Immunity Related Genes (IRGs). In addition, 642 Single Nucleotide Polymorphism (SNPs) markers were identified and used for the genetic diversity analysis. A total of 121 alleles were detected in 24 InDels loci ranging from 2 to 9 alleles per locus, with an average of 5.04 alleles per locus. The average number of alleles in the SNP markers was two. The observed heterozygosity for P. peruviana with InDel and SNP markers was higher (0.48 and 0.59) than the expected heterozygosity (0.30 and 0.41). Interestingly, the observed heterozygosity in related taxa (0.4 and 0.12) was lower than the expected heterozygosity (0.59 and 0.25). The coefficient of population differentiation FST was 0.143 (InDels) and 0.038 (SNPs), showing a relatively low level of genetic differentiation among P. peruviana and related taxa. Higher levels of genetic variation were instead observed within populations based on the AMOVA analysis. Population structure analysis supported the presence of two main groups and PCA analysis based on SNP markers revealed two distinct clusters in the P. peruviana accessions corresponding to their state of cultivation. In this study, we identified molecular markers useful to detect genetic variation in Physalis germplasm for assisting conservation and crossbreeding strategies.
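
    The comparison of observed and expected heterozygosity reported above rests on two simple per-locus quantities. The sketch below shows the textbook definitions only (observed: the fraction of heterozygous individuals; expected: one minus the sum of squared allele frequencies) and is not the software used in the study. The function names and the toy genotypes are hypothetical, and no small-sample correction is applied.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    het = sum(1 for a, b in genotypes if a != b)
    return het / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies p_i (Hardy-Weinberg expectation)."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Toy diploid genotypes at one locus (made up for illustration).
sample = [("A", "B"), ("A", "B"), ("A", "A"), ("A", "B")]
ho = observed_heterozygosity(sample)
he = expected_heterozygosity(sample)
print(ho, he)
```

    In this toy sample three of four individuals are heterozygous (0.75) while the allele frequencies of 5/8 and 3/8 give an expectation of 0.46875, the same direction of excess that the abstract reports for P. peruviana.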

  3. Determination of in vivo RNA kinetics using RATE-seq.

    Science.gov (United States)

    Neymotin, Benjamin; Athanasiadou, Rodoniki; Gresham, David

    2014-10-01

    The abundance of a transcript is determined by its rate of synthesis and its rate of degradation; however, global methods for quantifying RNA abundance cannot distinguish variation in these two processes. Here, we introduce RNA approach to equilibrium sequencing (RATE-seq), which uses in vivo metabolic labeling of RNA and approach to equilibrium kinetics, to determine absolute RNA degradation and synthesis rates. RATE-seq does not disturb cellular physiology, uses straightforward normalization with exogenous spike-ins, and can be readily adapted for studies in most organisms. We demonstrate the use of RATE-seq to estimate genome-wide kinetic parameters for coding and noncoding transcripts in Saccharomyces cerevisiae. © 2014 Neymotin et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
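
    Approach-to-equilibrium kinetics can be sketched with a first-order model in which the labeled fraction of a transcript rises as f(t) = 1 - exp(-k t), where k is the degradation rate. This is a simplifying assumption made here for illustration; it ignores effects such as dilution by cell growth or delays in label uptake and is not claimed to be the fitting procedure used for RATE-seq. The function names and the synthetic time course are hypothetical.

```python
import math

def labeled_fraction(t, k):
    """Labeled fraction at time t under first-order approach to equilibrium."""
    return 1.0 - math.exp(-k * t)

def fit_rate(times, fractions):
    """Estimate k as the least-squares slope through the origin of
    -ln(1 - f) against t, which is linear in t under the assumed model."""
    num = den = 0.0
    for t, f in zip(times, fractions):
        if not 0.0 <= f < 1.0:
            continue  # points at or beyond saturation cannot be log-transformed
        num += t * -math.log(1.0 - f)
        den += t * t
    return num / den

# Synthetic, noise-free time course (minutes) for a transcript with k = 0.05/min.
times = [5, 10, 20, 40]
true_k = 0.05
fractions = [labeled_fraction(t, true_k) for t in times]

k_hat = fit_rate(times, fractions)
half_life = math.log(2) / k_hat  # minutes
print(k_hat, half_life)
```

    Under the same steady-state assumption, a synthesis rate follows as the degradation rate multiplied by the transcript's abundance, which is how one measurement series can yield both quantities.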

  4. Population model with genetic algorithms

    OpenAIRE

    Veliz Quintero, Eduardo; Rodriguez Ojeda, Luis

    2009-01-01

    For this work, "POPULATION MODEL WITH GENETIC ALGORITHMS", I investigated a branch of artificial intelligence, namely genetic algorithms. I first present in general terms the aspects surrounding genetic algorithms, starting from the need to optimize, together with their history and possible applications, and then cover in detail everything I was able to find on the theory of genetic algorithms, their mathematical foundations, types of algorithms ...

  5. Genome-wide identification and characterisation of human DNA replication origins by initiation site sequencing (ini-seq).

    Science.gov (United States)

    Langley, Alexander R; Gräf, Stefan; Smith, James C; Krude, Torsten

    2016-12-01

    Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Factors Influencing Retention of Gen Y and Non-Gen Y Teachers Working at International Schools in Asia

    Science.gov (United States)

    Fong, Hoi Wah Benny

    2018-01-01

    Quantitative studies on international-school teacher retention are few, especially studies that differentiate between Gen Y and non-Gen Y teachers. This article reports on the findings of a study that examined the relationship of job satisfaction factors to the likelihood of contract renewal by international-school teachers. Results from the study…

  7. ToNER: A tool for identifying nucleotide enrichment signals in feature-enriched RNA-seq data.

    Directory of Open Access Journals (Sweden)

    Yuttachon Promworn

    Full Text Available Biochemical methods are available for enriching 5' ends of RNAs in prokaryotes, which are employed in the differential RNA-seq (dRNA-seq) and the more recent Cappable-seq protocols. Computational methods are needed to locate RNA 5' ends from these data by statistical analysis of the enrichment. Although statistical-based analysis methods have been developed for dRNA-seq, they may not be suitable for Cappable-seq data. The more efficient enrichment method employed in Cappable-seq compared with dRNA-seq could affect data distribution and thus algorithm performance. We present Transformation of Nucleotide Enrichment Ratios (ToNER), a tool for statistical modeling of enrichment from RNA-seq data obtained from enriched and unenriched libraries. The tool calculates nucleotide enrichment scores and determines the global transformation for fitting to the normal distribution using the Box-Cox procedure. From the transformed distribution, sites of significant enrichment are identified. To increase power of detection, meta-analysis across experimental replicates is offered. We tested the tool on Cappable-seq and dRNA-seq data for identifying Escherichia coli transcript 5' ends and compared the results with those from the TSSAR tool, which is designed for analyzing dRNA-seq data. When combining results across Cappable-seq replicates, ToNER detects more known transcript 5' ends than TSSAR. In general, the transcript 5' ends detected by ToNER but not TSSAR occur in regions which cannot be locally modeled by TSSAR. ToNER uses a simple yet robust statistical modeling approach, which can be used for detecting RNA 5' ends from Cappable-seq data, in particular when combining information from experimental replicates. The ToNER tool could potentially be applied for analyzing other RNA-seq datasets in which enrichment for other structural features of RNA is employed. The program is freely available for download at ToNER webpage (http://www4a

  8. ToNER: A tool for identifying nucleotide enrichment signals in feature-enriched RNA-seq data.

    Science.gov (United States)

    Promworn, Yuttachon; Kaewprommal, Pavita; Shaw, Philip J; Intarapanich, Apichart; Tongsima, Sissades; Piriyapongsa, Jittima

    2017-01-01

    Biochemical methods are available for enriching 5' ends of RNAs in prokaryotes, which are employed in the differential RNA-seq (dRNA-seq) and the more recent Cappable-seq protocols. Computational methods are needed to locate RNA 5' ends from these data by statistical analysis of the enrichment. Although statistical-based analysis methods have been developed for dRNA-seq, they may not be suitable for Cappable-seq data. The more efficient enrichment method employed in Cappable-seq compared with dRNA-seq could affect data distribution and thus algorithm performance. We present Transformation of Nucleotide Enrichment Ratios (ToNER), a tool for statistical modeling of enrichment from RNA-seq data obtained from enriched and unenriched libraries. The tool calculates nucleotide enrichment scores and determines the global transformation for fitting to the normal distribution using the Box-Cox procedure. From the transformed distribution, sites of significant enrichment are identified. To increase power of detection, meta-analysis across experimental replicates is offered. We tested the tool on Cappable-seq and dRNA-seq data for identifying Escherichia coli transcript 5' ends and compared the results with those from the TSSAR tool, which is designed for analyzing dRNA-seq data. When combining results across Cappable-seq replicates, ToNER detects more known transcript 5' ends than TSSAR. In general, the transcript 5' ends detected by ToNER but not TSSAR occur in regions which cannot be locally modeled by TSSAR. ToNER uses a simple yet robust statistical modeling approach, which can be used for detecting RNA 5'ends from Cappable-seq data, in particular when combining information from experimental replicates. The ToNER tool could potentially be applied for analyzing other RNA-seq datasets in which enrichment for other structural features of RNA is employed. The program is freely available for download at ToNER webpage (http://www4a.biotec.or.th/GI/tools/toner) and Git
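
    The Box-Cox step mentioned above can be sketched in a few lines: the transform is (x^lambda - 1)/lambda (or ln x when lambda is 0), and lambda is chosen to maximise the profile log-likelihood under a normal model. The code below is a generic standard-library illustration, not ToNER's implementation; the grid search, the function names and the example scores are assumptions made here, and the inputs simply stand in for positive enrichment scores.

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive value x."""
    return math.log(x) if abs(lam) < 1e-12 else (x ** lam - 1.0) / lam

def profile_loglik(xs, lam):
    """Normal profile log-likelihood of lambda (additive constants dropped)."""
    ys = [box_cox(x, lam) for x in xs]
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n  # maximum-likelihood variance
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in xs)

def best_lambda(xs, grid=None):
    """Grid search for the lambda that maximises the profile log-likelihood."""
    grid = grid or [i / 100.0 for i in range(-200, 201)]
    return max(grid, key=lambda lam: profile_loglik(xs, lam))

# Hypothetical positive scores whose logarithms are symmetric about zero,
# so a log transform (lambda close to 0) should be preferred.
scores = [math.exp(z) for z in (-2, -1, -0.5, 0, 0.5, 1, 2)]
lam = best_lambda(scores)
transformed = [box_cox(x, lam) for x in scores]
print(lam)
```

    Once the scores are approximately normal after transformation, significance can be assessed against the fitted mean and standard deviation, which matches the role the abstract gives to the transformed distribution.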

  9. Methods of Nucleic Acid Transfer as a Basis for Gene Therapy (Metode Transfer Asam Nukleat sebagai Dasar Terapi Gen)

    Directory of Open Access Journals (Sweden)

    Novi Silvia Hardiany

    2017-01-01

    Full Text Available The advancement of molecular biology provides benefits in the field of medicine for developing gene therapy. The aim of gene therapy is to repair genetic damage or to replace a damaged gene with a normal gene. Gene delivery is carried out by transfection, a technique for transferring nucleic acid into eukaryotic cells either using viral vectors (known as transduction) or using non-viral methods such as chemical substances, lipids and physical methods. The viral vectors used in transduction include retrovirus, adenovirus, adeno-associated virus (AAV) and herpes simplex virus (HSV). The success of transfection is determined by various factors, which can be assessed using reporters such as green fluorescence protein (GFP). Key words: gene therapy, non-viral transfection, transduction, viral vector.

  10. Comprehensive Assessments of RNA-seq by the SEQC Consortium: FDA-Led Efforts Advance Precision Medicine

    Directory of Open Access Journals (Sweden)

    Joshua Xu

    2016-03-01

    Full Text Available Studies on gene expression in response to therapy have led to the discovery of pharmacogenomics biomarkers and advances in precision medicine. Whole transcriptome sequencing (RNA-seq) is an emerging tool for profiling gene expression and has received wide adoption in the biomedical research community. However, its value in regulatory decision making requires rigorous assessment and consensus between various stakeholders, including the research community, regulatory agencies, and industry. The FDA-led SEquencing Quality Control (SEQC) consortium has made considerable progress in this direction, and is the subject of this review. Specifically, three RNA-seq platforms (Illumina HiSeq, Life Technologies SOLiD, and Roche 454) were extensively evaluated at multiple sites to assess cross-site and cross-platform reproducibility. The results demonstrated that relative gene expression measurements were consistently comparable across labs and platforms, but not so for the measurement of absolute expression levels. As part of the quality evaluation, several studies were included to evaluate the utility of RNA-seq in clinical settings and safety assessment. The neuroblastoma study profiled tumor samples from 498 pediatric neuroblastoma patients by both microarray and RNA-seq. RNA-seq offers more utilities than microarray in determining the transcriptomic characteristics of cancer. However, RNA-seq and microarray-based models were comparable in clinical endpoint prediction, even when including additional features unique to RNA-seq beyond gene expression. The toxicogenomics study compared microarray and RNA-seq profiles of the liver samples from rats exposed to 27 different chemicals representing multiple toxicity modes of action. Cross-platform concordance was dependent on chemical treatment and transcript abundance. Though both RNA-seq and microarray are suitable for developing gene expression based predictive models with comparable prediction performance, RNA-seq

  11. GenProBiS: web server for mapping of sequence variants to protein binding sites.

    Science.gov (United States)

    Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka

    2017-07-03

    Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. An Annotation Agnostic Algorithm for Detecting Nascent RNA Transcripts in GRO-Seq.

    Science.gov (United States)

    Azofeifa, Joseph G; Allen, Mary A; Lladser, Manuel E; Dowell, Robin D

    2017-01-01

    We present a fast and simple algorithm to detect nascent RNA transcription in global nuclear run-on sequencing (GRO-seq). GRO-seq is a relatively new protocol that captures nascent transcripts from actively engaged polymerase, providing a direct read-out on bona fide transcription. Most traditional assays, such as RNA-seq, measure steady state RNA levels which are affected by transcription, post-transcriptional processing, and RNA stability. GRO-seq data, however, presents unique analysis challenges that are only beginning to be addressed. Here, we describe a new algorithm, Fast Read Stitcher (FStitch), that takes advantage of two popular machine-learning techniques, hidden Markov models and logistic regression, to classify which regions of the genome are transcribed. Given a small user-defined training set, our algorithm is accurate, robust to varying read depth, annotation agnostic, and fast. Analysis of GRO-seq data without a priori need for annotation uncovers surprising new insights into several aspects of the transcription process.

  13. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads.

    Science.gov (United States)

    Song, Li; Florea, Liliana

    2015-01-01

    Next-generation sequencing of cellular RNA (RNA-seq) is rapidly becoming the cornerstone of transcriptomic analysis. However, sequencing errors in the already short RNA-seq reads complicate bioinformatics analyses, in particular alignment and assembly. Error correction methods have been highly effective for whole-genome sequencing (WGS) reads, but are unsuitable for RNA-seq reads, owing to the variation in gene expression levels and alternative splicing. We developed a k-mer based method, Rcorrector, to correct random sequencing errors in Illumina RNA-seq reads. Rcorrector uses a De Bruijn graph to compactly represent all trusted k-mers in the input reads. Unlike WGS read correctors, which use a global threshold to determine trusted k-mers, Rcorrector computes a local threshold at every position in a read. Rcorrector has an accuracy higher than or comparable to existing methods, including the only other method (SEECER) designed for RNA-seq reads, and is more time and memory efficient. With a 5 GB memory footprint for 100 million reads, it can be run on virtually any desktop or server. The software is available free of charge under the GNU General Public License from https://github.com/mourisl/Rcorrector/.
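
    The contrast between a global and a local k-mer threshold can be sketched as follows. This is an illustration only, not Rcorrector's algorithm: it uses a plain counter instead of a De Bruijn graph, and the local rule (a k-mer is flagged when its count falls below a fraction of the highest count among neighbouring k-mers in the same read) together with the `fraction` and `window` parameters are assumptions.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurrence across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def flag_untrusted_kmers(read, counts, k, fraction=0.25, window=2):
    """Return start positions of k-mers that look untrusted under a local threshold.

    The threshold at each position is `fraction` times the largest k-mer count
    seen within `window` positions on either side in the same read, so it adapts
    to the expression level of the transcript the read came from.
    """
    kmer_counts = [counts[read[i:i + k]] for i in range(len(read) - k + 1)]
    flagged = []
    for i, c in enumerate(kmer_counts):
        lo = max(0, i - window)
        hi = min(len(kmer_counts), i + window + 1)
        local_max = max(kmer_counts[lo:hi])
        if c < fraction * local_max:
            flagged.append(i)
    return flagged
```

    With a global cutoff, every k-mer of a weakly expressed transcript would fall below the threshold; under the local rule only k-mers that are rare relative to their neighbours in the read are flagged.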

  14. SeqLib: a C++ API for rapid BAM manipulation, sequence alignment and sequence assembly.

    Science.gov (United States)

    Wala, Jeremiah; Beroukhim, Rameen

    2017-03-01

    We present SeqLib, a C++ API and command line tool that provides a rapid and user-friendly interface to BAM/SAM/CRAM files, global sequence alignment operations and sequence assembly. Four C libraries perform core operations in SeqLib: HTSlib for BAM access, BWA-MEM and BLAT for sequence alignment and Fermi for error correction and sequence assembly. Benchmarking indicates that SeqLib has lower CPU and memory requirements than leading C++ sequence analysis APIs. We demonstrate an example of how minimal SeqLib code can extract, error-correct and assemble reads from a CRAM file and then align with BWA-MEM. SeqLib also provides additional capabilities, including chromosome-aware interval queries and read plotting. Command line tools are available for performing integrated error correction, micro-assemblies and alignment. SeqLib is available on Linux and OSX for the C++98 standard and later at github.com/walaj/SeqLib. SeqLib is released under the Apache2 license. Additional capabilities for BLAT alignment are available under the BLAT license. Contact: jwala@broadinstitue.org; rameen@broadinstitute.org. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  15. Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available List Contact us Yeast Interacting Proteins Database Full Data of Yeast Interacting Proteins Database (Origin...al Version) Data detail Data name Full Data of Yeast Interacting Proteins Database (Original Version) DOI 10....18908/lsdba.nbdc00742-004 Description of data contents The entire data in the Yeast Interacting Proteins Database...eir interactions are required. Several sources including YPD (Yeast Proteome Database, Costanzo, M. C., Hoga...ematic name in the SGD (Saccharomyces Genome Database; http://www.yeastgenome.org /). Bait gene name The gen

  16. Limitations and possibilities of low cell number ChIP-seq

    Directory of Open Access Journals (Sweden)

    Gilfillan Gregor D

    2012-11-01

    Full Text Available Abstract Background Chromatin immunoprecipitation coupled with high-throughput DNA sequencing (ChIP-seq) offers high resolution, genome-wide analysis of DNA-protein interactions. However, current standard methods require abundant starting material in the range of 1–20 million cells per immunoprecipitation, and remain a bottleneck to the acquisition of biologically relevant epigenetic data. Using a ChIP-seq protocol optimised for low cell numbers (down to 100,000 cells / IP), we examined the performance of the ChIP-seq technique on a series of decreasing cell numbers. Results We present an enhanced native ChIP-seq method tailored to low cell numbers that represents a 200-fold reduction in input requirements over existing protocols. The protocol was tested over a range of starting cell numbers covering three orders of magnitude, enabling determination of the lower limit of the technique. At low input cell numbers, increased levels of unmapped and duplicate reads reduce the number of unique reads generated, and can drive up sequencing costs and affect sensitivity if ChIP is attempted from too few cells. Conclusions The optimised method presented here considerably reduces the input requirements for performing native ChIP-seq. It extends the applicability of the technique to isolated primary cells and rare cell populations (e.g. biobank samples, stem cells), and in many cases will alleviate the need for cell culture and any associated alteration of epigenetic marks. However, this study highlights a challenge inherent to ChIP-seq from low cell numbers: as cell input numbers fall, levels of unmapped sequence reads and PCR-generated duplicate reads rise. We discuss a number of solutions to overcome the effects of reducing cell number that may aid further improvements to ChIP performance.

  17. A non-parametric peak calling algorithm for DamID-Seq.

    Directory of Open Access Journals (Sweden)

    Renhua Li

    Full Text Available Protein-DNA interactions play a significant role in gene regulation and expression. In order to identify transcription factor binding sites (TFBS) of double sex (DSX), an important transcription factor in sex determination, we applied the DNA adenine methylation identification (DamID) technology to the fat body tissue of Drosophila, followed by deep sequencing (DamID-Seq). One feature of DamID-Seq data is that induced adenine methylation signals are not assured to be symmetrically distributed at TFBS, which renders the existing peak calling algorithms for ChIP-Seq, including SPP and MACS, inappropriate for DamID-Seq data. This challenged us to develop a new algorithm for peak calling. A challenge in peak calling based on sequence data is estimating the averaged behavior of background signals. We applied a bootstrap resampling method to short sequence reads in the control (Dam only). After data quality check and mapping reads to a reference genome, the peak calling procedure comprises the following steps: 1) read resampling; 2) read scaling (normalization) and computing signal-to-noise fold changes; 3) filtering; 4) calling peaks based on a statistically significant threshold. This is a non-parametric method for peak calling (NPPC). We also used irreproducible discovery rate (IDR) analysis, as well as ChIP-Seq data, to compare the peaks called by the NPPC. We identified approximately 6,000 peaks for DSX, which point to 1,225 genes related to the fat body tissue difference between female and male Drosophila. Statistical evidence from IDR analysis indicated that these peaks are reproducible across biological replicates. In addition, these peaks are comparable to those identified by use of ChIP-Seq on S2 cells, in terms of peak number, location, and peak width.

  18. A non-parametric peak calling algorithm for DamID-Seq.

    Science.gov (United States)

    Li, Renhua; Hempel, Leonie U; Jiang, Tingbo

    2015-01-01

    Protein-DNA interactions play a significant role in gene regulation and expression. In order to identify transcription factor binding sites (TFBS) of double sex (DSX), an important transcription factor in sex determination, we applied the DNA adenine methylation identification (DamID) technology to the fat body tissue of Drosophila, followed by deep sequencing (DamID-Seq). One feature of DamID-Seq data is that induced adenine methylation signals are not assured to be symmetrically distributed at TFBS, which renders the existing peak calling algorithms for ChIP-Seq, including SPP and MACS, inappropriate for DamID-Seq data. This challenged us to develop a new algorithm for peak calling. A challenge in peak calling based on sequence data is estimating the averaged behavior of background signals. We applied a bootstrap resampling method to short sequence reads in the control (Dam only). After data quality check and mapping reads to a reference genome, the peak calling procedure comprises the following steps: 1) read resampling; 2) read scaling (normalization) and computing signal-to-noise fold changes; 3) filtering; 4) calling peaks based on a statistically significant threshold. This is a non-parametric method for peak calling (NPPC). We also used irreproducible discovery rate (IDR) analysis, as well as ChIP-Seq data, to compare the peaks called by the NPPC. We identified approximately 6,000 peaks for DSX, which point to 1,225 genes related to the fat body tissue difference between female and male Drosophila. Statistical evidence from IDR analysis indicated that these peaks are reproducible across biological replicates. In addition, these peaks are comparable to those identified by use of ChIP-Seq on S2 cells, in terms of peak number, location, and peak width.
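
    The four steps listed in this abstract (resampling, scaling and fold change, filtering, thresholding) can be sketched roughly as below. This is not the NPPC implementation: fixed-size genomic bins, the pseudocount, the minimum read filter and the fixed fold-change cutoff (standing in for the statistically derived threshold the abstract mentions) are all assumptions chosen to keep the example short.

```python
import random

def bin_reads(positions, n_bins, bin_size):
    """Count read start positions per fixed-size bin."""
    counts = [0] * n_bins
    for p in positions:
        counts[min(p // bin_size, n_bins - 1)] += 1
    return counts

def bootstrap_background(control_positions, n_bins, bin_size, n_boot=200, seed=0):
    """Step 1: average per-bin counts over bootstrap resamples of the control reads."""
    rng = random.Random(seed)
    n = len(control_positions)
    totals = [0.0] * n_bins
    for _ in range(n_boot):
        sample = [control_positions[rng.randrange(n)] for _ in range(n)]
        for b, c in enumerate(bin_reads(sample, n_bins, bin_size)):
            totals[b] += c
    return [t / n_boot for t in totals]

def call_peaks(sample_positions, control_positions, n_bins, bin_size,
               min_fold=3.0, min_reads=5, pseudocount=1.0):
    """Steps 2-4: scale the background, compute fold change, filter, threshold."""
    sample = bin_reads(sample_positions, n_bins, bin_size)
    background = bootstrap_background(control_positions, n_bins, bin_size)
    scale = len(sample_positions) / len(control_positions)  # library-size scaling
    peaks = []
    for b in range(n_bins):
        fold = (sample[b] + pseudocount) / (background[b] * scale + pseudocount)
        if sample[b] >= min_reads and fold >= min_fold:  # filter, then threshold
            peaks.append(b)
    return peaks
```

    Because the background is estimated by resampling the control rather than from a parametric model, no symmetry of the signal around a binding site is assumed.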

  19. Comparison of transcriptomic landscapes of bovine embryos using RNA-Seq

    Directory of Open Access Journals (Sweden)

    Khatib Hasan

    2010-12-01

    Full Text Available Abstract Background Advances in sequencing technologies have opened a new era of high-throughput investigations. Although RNA-seq has been demonstrated in many organisms, no study has provided a comprehensive investigation of the bovine transcriptome using RNA-seq. Results In this study, we provide a deep survey of the bovine embryonic transcriptomes, the first application of RNA-seq in cattle. Embryos cultured in vitro were used as models to study early embryonic development in cattle. RNA amplified from limited amounts of starting total RNA was sequenced and mapped to the reference genome to obtain digital gene expression at single base resolution. In particular, gene expression estimates from more than 1.6 million unannotated bases in 1785 novel transcribed units were obtained. We compared the transcriptomes of embryos showing distinct developmental statuses and found genes that showed differential overall expression as well as alternative splicing. Conclusion Our study demonstrates the power of RNA-seq and provides further understanding of bovine preimplantation embryonic development at a fine scale.

  20. The psychological sequelae of torture [As seqüelas psicológicas da tortura]

    Directory of Open Access Journals (Sweden)

    Alfredo Guillermo Martín

    Full Text Available This text analyzes the psychological sequelae of torture, understood both as a State institution and as a limit experience, in several respects (the three stages of the traumatizing process, the main somatic sequelae, retraumatization). It examines the increase in psychoses, the high percentage of suicides, the difficulties of social reintegration, chronic transgenerational sequelae, and a mortality rate far above normal. A detailed analysis of issues related to the compensation of victims is developed. Appropriate diagnostic and therapeutic instruments are proposed, based on a clinical critique of PTSD, extensive personal experience, and an up-to-date international bibliography.

  1. Single nucleotide polymorphism discovery in bovine liver using RNA-seq technology

    DEFF Research Database (Denmark)

    Pareek, Chandra Shekhar; Błaszczyk, Paweł; Dziuba, Piotr

    2017-01-01

    Background RNA-seq is a useful next-generation sequencing (NGS) technology that has been widely used to understand mammalian transcriptome architecture and function. In this study, a breed-specific RNA-seq experiment was utilized to detect putative single nucleotide polymorphisms (SNPs) in liver...

  2. Substantial differences in bias between single-digest and double-digest RAD-seq libraries: A case study.

    Science.gov (United States)

    Flanagan, Sarah P; Jones, Adam G

    2018-03-01

    The trade-offs of using single-digest vs. double-digest restriction site-associated DNA sequencing (RAD-seq) protocols have been widely discussed. However, no direct empirical comparisons of the two methods have been conducted. Here, we sampled a single population of Gulf pipefish (Syngnathus scovelli) and genotyped 444 individuals using RAD-seq. Sixty individuals were subjected to single-digest RAD-seq (sdRAD-seq), and the remaining 384 individuals were genotyped using a double-digest RAD-seq (ddRAD-seq) protocol. We analysed the resulting Illumina sequencing data and compared the two genotyping methods when reads were analysed either together or separately. Coverage statistics, observed heterozygosity, and allele frequencies differed significantly between the two protocols, as did the results of selection components analysis. We also performed an in silico digestion of the Gulf pipefish genome and modelled five major sources of bias: PCR duplicates, polymorphic restriction sites, shearing bias, asymmetric sampling (i.e., genotyping fewer individuals with sdRAD-seq than with ddRAD-seq) and higher major allele frequencies. This combination of approaches allowed us to determine that polymorphic restriction sites, an asymmetric sampling scheme, mean allele frequencies and to some extent PCR duplicates all contribute to different estimates of allele frequencies between samples genotyped using sdRAD-seq versus ddRAD-seq. Our finding that sdRAD-seq and ddRAD-seq can result in different allele frequencies has implications for comparisons across studies and techniques that endeavour to identify genomewide signatures of evolutionary processes in natural populations. © 2017 John Wiley & Sons Ltd.

  3. Characterization of Romboutsia ilealis gen. nov., sp. nov., isolated from the gastro-intestinal tract of a rat, and proposal for the reclassification of five closely related members of the genus Clostridium into the genera Romboutsia gen. nov., Intestinibacter gen. nov., Terrisporobacter gen. nov. and Asaccharospora gen. nov.

    Science.gov (United States)

    Gerritsen, Jacoline; Fuentes, Susana; Grievink, Wieke; van Niftrik, Laura; Tindall, Brian J; Timmerman, Harro M; Rijkers, Ger T; Smidt, Hauke

    2014-05-01

    A Gram-positive staining, rod-shaped, non-motile, spore-forming obligately anaerobic bacterium, designated CRIBT, was isolated from the gastro-intestinal tract of a rat and characterized. The major cellular fatty acids of strain CRIBT were saturated and unsaturated straight-chain C12-C19 fatty acids, with C16:0 being the predominant fatty acid. The polar lipid profile comprised six glycolipids, four phospholipids and one lipid that did not stain with any of the specific spray reagents used. The only quinone was MK-6. The predominating cell-wall sugars were glucose and galactose. The peptidoglycan type of strain CRIBT was A1σ lanthionine-direct. The genomic DNA G+C content of strain CRIBT was 28.1 mol%. On the basis of 16S rRNA gene sequence similarity, strain CRIBT was most closely related to a number of species of the genus Clostridium, including Clostridium lituseburense (97.2%), Clostridium glycolicum (96.2%), Clostridium mayombei (96.2%), Clostridium bartlettii (96.0%) and Clostridium irregulare (95.5%). All these species show very low 16S rRNA gene sequence similarity (genus Clostridium. DNA-DNA hybridization with closely related reference strains indicated reassociation values below 32%. On the basis of phenotypic and genetic studies, a novel genus, Romboutsia gen. nov., is proposed. The novel isolate CRIBT (=DSM 25109T=NIZO 4048T) is proposed as the type strain of the type species, Romboutsia ilealis gen. nov., sp. nov., of the proposed novel genus. It is proposed that C. lituseburense is transferred to this genus as Romboutsia lituseburensis comb. nov. Furthermore, the reclassification into novel genera is proposed for C. bartlettii, as Intestinibacter bartlettii gen. nov., comb. nov. (type species of the genus), C. glycolicum, as Terrisporobacter glycolicus gen. nov., comb. nov. (type species of the genus), C. mayombei, as Terrisporobacter mayombei gen. nov., comb. nov., and C. irregulare, as Asaccharospora irregularis gen. nov., comb. nov. 
(type species

  4. J3Gen: A PRNG for Low-Cost Passive RFID

    Directory of Open Access Journals (Sweden)

    Jordi Herrera-Joancomartí

    2013-03-01

    Full Text Available Pseudorandom number generation (PRNG) is the main security tool in low-cost passive radio-frequency identification (RFID) technologies, such as EPC Gen2. We present a lightweight PRNG design for low-cost passive RFID tags, named J3Gen. J3Gen is based on a linear feedback shift register (LFSR) configured with multiple feedback polynomials. The polynomials are alternated during the generation of sequences via a physical source of randomness. J3Gen successfully handles the inherent linearity of LFSR-based PRNGs and satisfies the statistical requirements imposed by the EPC Gen2 standard. A hardware implementation of J3Gen is presented and evaluated with regard to different design parameters, defining the key-equivalence security and nonlinearity of the design. The results of a SPICE simulation confirm the power-consumption suitability of the proposal.
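
    The idea of an LFSR that alternates between several feedback polynomials can be sketched in Python as follows. This is not the J3Gen design: the 16-bit width, the two tap sets, the switching interval and the use of a seeded software generator in place of a physical randomness source are all assumptions made for illustration.

```python
import random

WIDTH = 16
# Illustrative tap sets (exponents of the feedback polynomial); these are
# example values, not the polynomials used by J3Gen.
TAP_SETS = [(16, 14, 13, 11), (16, 15, 13, 4)]

def lfsr_step(state, taps):
    """Advance a Fibonacci LFSR one step; return (new_state, output_bit)."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> (WIDTH - t)) & 1
    out = state & 1
    state = (state >> 1) | (feedback << (WIDTH - 1))
    return state, out

def generate_bits(seed, n_bits, switch_every=8, rng=None):
    """Generate bits, re-selecting the feedback polynomial every few steps.

    `rng` stands in for the physical randomness source; a seeded
    random.Random is used here only so the sketch is reproducible.
    """
    if not 0 < seed < (1 << WIDTH):
        raise ValueError("seed must be a non-zero value that fits in WIDTH bits")
    if rng is None:
        rng = random.Random(0)
    state, taps, bits = seed, TAP_SETS[0], []
    for i in range(n_bits):
        if i % switch_every == 0:
            taps = TAP_SETS[rng.randrange(len(TAP_SETS))]
        state, out = lfsr_step(state, taps)
        bits.append(out)
    return bits
```

    Both tap sets include the register's last stage, so each step is invertible and a non-zero state can never collapse to the all-zero state, whichever polynomial is active.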

  5. The impact of genetics on childhood asthma [O impacto da genética na asma infantil]

    OpenAIRE

    Pinto,Leonardo A.; Stein,Renato T.; Kabesch,Michael

    2008-01-01

    OBJECTIVE: To present the results of the most important and recent studies on the genetics of asthma. These data should help general practitioners understand the impact of genetics on this complex disorder and how genes and polymorphisms influence asthma and atopy. DATA SOURCES: Data were collected from the MEDLINE database. Genetic association studies were selected from the Genetic Association Database, a repository of genetic association studies of diseases and dis...

  6. Combining multiple ChIP-seq peak detection systems using combinatorial fusion.

    Science.gov (United States)

    Schweikert, Christina; Brown, Stuart; Tang, Zuojian; Smith, Phillip R; Hsu, D Frank

    2012-01-01

    Due to the recent rapid development of ChIP-seq technologies, which use high-throughput next-generation DNA sequencing to identify the targets of chromatin immunoprecipitation, an increasing amount of sequencing data is being generated, providing greater opportunity to analyze genome-wide protein-DNA interactions. In particular, we are interested in evaluating and enhancing computational and statistical techniques for locating protein binding sites. Many peak detection systems have been developed; in this study, we utilize the following six: CisGenome, MACS, PeakSeq, QuEST, SISSRs, and TRLocator. We define two methods to merge and rescore the regions of two peak detection systems and analyze the performance based on average precision and coverage of transcription start sites. The results indicate that ChIP-seq peak detection can be improved by fusion using score or rank combination. Our method of combination and fusion analysis would provide a means for generic assessment of available technologies and systems and assist researchers in choosing an appropriate system (or fusion method) for analyzing ChIP-seq data. This analysis offers an alternate approach for increasing true positive rates while decreasing false positive rates, and hence improving the ChIP-seq peak identification process.
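
    Score combination and rank combination of two peak callers can be sketched as below. The sketch is not the authors' method: it assumes the regions from both callers have already been merged and given shared identifiers, and the min-max normalization, simple averaging and the handling of regions reported by only one caller are illustrative choices.

```python
def normalize(scores):
    """Min-max normalize a {region_id: score} mapping to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {region: 1.0 for region in scores}
    return {region: (s - lo) / (hi - lo) for region, s in scores.items()}

def to_ranks(scores):
    """Rank regions from best (1) to worst; ties are broken by region id."""
    ordered = sorted(scores, key=lambda region: (-scores[region], region))
    return {region: i + 1 for i, region in enumerate(ordered)}

def score_combination(a, b):
    """Average the normalized scores; a region missing from one caller scores 0 there."""
    na, nb = normalize(a), normalize(b)
    return {r: (na.get(r, 0.0) + nb.get(r, 0.0)) / 2 for r in set(na) | set(nb)}

def rank_combination(a, b):
    """Average the ranks; a region missing from one caller gets a worst-plus-one rank."""
    ra, rb = to_ranks(a), to_ranks(b)
    worst = max(len(a), len(b)) + 1
    return {r: (ra.get(r, worst) + rb.get(r, worst)) / 2 for r in set(ra) | set(rb)}
```

    Higher values are better for the combined score and lower values are better for the combined rank, so the two outputs are sorted in opposite directions when producing a fused peak list.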

  7. Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data.

    Science.gov (United States)

    Yip, Shun H; Sham, Pak Chung; Wang, Junwen

    2018-02-21

    Traditional RNA sequencing (RNA-seq) allows the detection of gene expression variations between two or more cell populations through differentially expressed gene (DEG) analysis. However, genes that contribute to cell-to-cell differences are not discoverable with RNA-seq because RNA-seq samples are obtained from a mixture of cells. Single-cell RNA-seq (scRNA-seq) allows the detection of gene expression in each cell. With scRNA-seq, highly variable gene (HVG) discovery allows the detection of genes that contribute strongly to cell-to-cell variation within a homogeneous cell population, such as a population of embryonic stem cells. This analysis is implemented in many software packages. In this study, we compare seven HVG methods from six software packages, including BASiCS, Brennecke, scLVM, scran, scVEGs and Seurat. Our results demonstrate that reproducibility in HVG analysis requires a larger sample size than DEG analysis. Discrepancies between methods and potential issues in these tools are discussed and recommendations are made.

  8. Discussion: Genetic and psychological explanations of schizophrenia. Genetics of hope [Discusión: Explicaciones genéticas y psicológicas de la esquizofrenia. Genética de la esperanza]

    Directory of Open Access Journals (Sweden)

    Silvio Bolaños-Salvatierra

    2003-01-01

    Full Text Available This paper rebuts criticisms made by Raventós and Jensen of the article “Genética y comportamiento” (Genetics and behavior). Four topics were selected: 1) antipsychotics are shown to have appeared twenty years after the hereditary conception of schizophrenia; 2) the discussion is considered highly pertinent, in no way byzantine or irrelevant, because risky epistemic practices persist among behavioral-genetics researchers; 3) although no human behavior is exempt from constitutional influence, the biologicist approach has overreached in attempting to explain almost everything genetically, covertly discounting the importance of personal history; and 4) it is argued that biological research overvalues the weight of genetic anomalies relative to social history, and so only feigns caution. The paper proposes investigating hope genetically, with the aim of saturating humanity with this type of explanation so as to reach more quickly a coexistence based on tolerance and respect.

  9. GenToS: Use of Orthologous Gene Information to Prioritize Signals from Human GWAS.

    Directory of Open Access Journals (Sweden)

    Anselm S Hoppmann

    Full Text Available Genome-wide association studies (GWAS) evaluate associations between genetic variants and a trait or disease of interest free of prior biological hypotheses. GWAS require stringent correction for multiple testing, with genome-wide significance typically defined as association p-value < 5 × 10^-8. This study presents a new tool that uses external information about genes to prioritize SNP associations (GenToS). For a given list of candidate genes, GenToS calculates an appropriate statistical significance threshold and then searches for trait-associated variants in summary statistics from human GWAS. It thereby allows for identifying trait-associated genetic variants that do not meet genome-wide significance. The program additionally tests for enrichment of significant candidate gene associations in the human GWAS data compared to the number expected by chance. As proof of principle, this report used external information from a comprehensive resource of genetically manipulated and systematically phenotyped mice. Based on selected murine phenotypes for which human GWAS data for corresponding traits were publicly available, several candidate gene input lists were derived. Using GenToS for the investigation of candidate genes underlying murine skeletal phenotypes in data from a large human discovery GWAS meta-analysis of bone mineral density resulted in the identification of significantly associated variants in 29 genes. Index variants in 28 of these loci were subsequently replicated in an independent GWAS replication step, highlighting that they are true positive associations. One signal, COL11A1, has not been discovered through GWAS so far and represents a novel human candidate gene for altered bone mineral density. The number of observed genes that contained significant SNP associations in human GWAS based on murine candidate gene input lists was much greater than the number expected by chance across several complex human traits (enrichment p-value as

  10. High-specificity detection of rare alleles with Paired-End Low Error Sequencing (PELE-Seq).

    Science.gov (United States)

    Preston, Jessica L; Royall, Ariel E; Randel, Melissa A; Sikkink, Kristin L; Phillips, Patrick C; Johnson, Eric A

    2016-06-14

    Polymorphic loci exist throughout the genomes of a population and provide the raw genetic material needed for a species to adapt to changes in the environment. The minor allele frequencies of rare Single Nucleotide Polymorphisms (SNPs) within a population have been difficult to track with Next-Generation Sequencing (NGS), due to the high error rate of standard methods such as Illumina sequencing. We have developed a wet-lab protocol and variant-calling method that identifies both sequencing and PCR errors, called Paired-End Low Error Sequencing (PELE-Seq). To test the specificity and sensitivity of the PELE-Seq method, we sequenced control E. coli DNA libraries containing known rare alleles present at frequencies ranging from 0.2-0.4 % of the total reads. PELE-Seq had higher specificity and sensitivity than standard libraries. We then used PELE-Seq to characterize rare alleles in a Caenorhabditis remanei nematode worm population before and after laboratory adaptation, and found that minor and rare alleles can undergo large changes in frequency during lab-adaptation. We have developed a method of rare allele detection that mitigates both sequencing and PCR errors, called PELE-Seq. PELE-Seq was evaluated using control E. coli populations and was then used to compare a wild C. remanei population to a lab-adapted population. The PELE-Seq method is ideal for investigating the dynamics of rare alleles in a broad range of reduced-representation sequencing methods, including targeted amplicon sequencing, RAD-Seq, ddRAD, and GBS. PELE-Seq is also well-suited for whole genome sequencing of mitochondria and viruses, and for high-throughput rare mutation screens.

  11. Seq2Ref: a web server to facilitate functional interpretation

    Directory of Open Access Journals (Sweden)

    Li Wenlin

    2013-01-01

    Full Text Available Abstract Background The size of the protein sequence database has been exponentially increasing due to advances in genome sequencing. However, experimentally characterized proteins only constitute a small portion of the database, such that the majority of sequences have been annotated by computational approaches. Current automatic annotation pipelines inevitably introduce errors, making the annotations unreliable. Instead of such error-prone automatic annotations, functional interpretation should rely on annotations of ‘reference proteins’ that have been experimentally characterized or manually curated. Results The Seq2Ref server uses BLAST to detect proteins homologous to a query sequence and identifies the reference proteins among them. Seq2Ref then reports publications with experimental characterizations of the identified reference proteins that might be relevant to the query. Furthermore, a plurality-based rating system is developed to evaluate the homologous relationships and rank the reference proteins by their relevance to the query. Conclusions The reference proteins detected by our server will lend insight into proteins of unknown function and provide extensive information to develop in-depth understanding of uncharacterized proteins. Seq2Ref is available at: http://prodata.swmed.edu/seq2ref.

  12. SNP discovery in the bovine milk transcriptome using RNA-Seq technology.

    Science.gov (United States)

    Cánovas, Angela; Rincon, Gonzalo; Islas-Trejo, Alma; Wickramasinghe, Saumya; Medrano, Juan F

    2010-12-01

    High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.

  13. Nuclear-like Seq in mt Genome - RMG | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available Data detail. Data name: Nuclear-like Seq in mt Genome. DOI 10...

  14. Evaluation of PRNP Expression Based on Genotypes and Alleles of Two Indel Loci in the Medulla Oblongata of Japanese Black and Japanese Brown Cattle

    Science.gov (United States)

    Msalya, George; Shimogiri, Takeshi; Ohno, Shotaro; Okamoto, Shin; Kawabe, Kotaro; Minezawa, Mitsuru; Maeda, Yoshizane

    2011-01-01

    Background Prion protein (PrP) level plays the central role in bovine spongiform encephalopathy (BSE) susceptibility. Increasing the level of PrP decreases incubation period for this disease. Therefore, studying the expression of the cellular PrP or at least the messenger RNA might be used in selection for preventing the propagation of BSE and other prion diseases. Two insertion/deletion (indel) variations have been tentatively associated with susceptibility/resistance of cattle to classical BSE. Methodology/Principal Findings We studied the expression of each genotype at the two indel sites in Japanese Black (JB) and Japanese Brown (JBr) cattle breeds by a standard curve method of real-time PCR. Five diplotypes subdivided into two categories were selected from each breed. The two cattle breeds were considered differently. Expression of PRNP was significantly (p0.05). Conclusion Our results suggest that the del/del genotype or at least its del allele may modulate the expression of PRNP at the 23-bp locus in the medulla oblongata of these cattle breeds. PMID:21611160

  15. Genetics and leprosy [Genética e hanseníase]

    Directory of Open Access Journals (Sweden)

    Bernardo Beiguelman

    Full Text Available This paper discusses the different lines of research used to investigate the importance of human hereditary factors in determining resistance/susceptibility to infection by Mycobacterium leprae. A synthesis of these approaches made it possible to analyze the results of investigations into the association of leprosy with genetic polymorphisms, the familial distribution of leprosy, leprosy prevalence and genetic distance, concordance of leprosy in twins, and genetic studies of the Mitsuda reaction.

  16. DFI-seq identification of environment-specific gene expression in uropathogenic Escherichia coli

    DEFF Research Database (Denmark)

    Madelung, Michelle; Kronborg, Tina; Doktor, Thomas Koed

    2017-01-01

    response. We combined differential fluorescence induction (DFI) with next-generation sequencing, collectively termed DFI-seq, to identify differentially expressed genes in UPEC strain UTI89 during growth in human urine and bladder cells. RESULTS: DFI-seq eliminates the need for iterative cell sorting...... hypothetical proteins. One such gene UTI89_C5139, displayed increased adhesion and invasion of J82 cells when deleted from UPEC strain UTI89. CONCLUSIONS: We demonstrate the usefulness of DFI-seq for identification of genes required for optimal growth of UPEC in human urine, as well as potential virulence...

  17. Genetics of preeclampsia: an approach to genetic linkage studies [Genética de la preeclampsia: una aproximación a los estudios de ligamiento genético].

    Directory of Open Access Journals (Sweden)

    Nora Alejandra Zuluaga

    2004-06-01

    Full Text Available Preeclampsia is considered a public health problem because of its high prevalence. Many investigations agree that its origin is related to the interaction between genetic and environmental factors. For this reason, multiple studies have explored these genetic factors in an attempt to identify chromosomal regions and candidate genes whose variants are associated with greater susceptibility to the disease. Several association studies have identified some preeclampsia susceptibility genes, but the results have not been replicated consistently in all populations, perhaps because of the clinical and genetic complexity of the disease. Mapping of genes and chromosomal regions based on linkage analysis has shown interesting results for some markers on chromosomes 2 and 4. There are therefore high expectations for the genes located in these candidate regions, because identifying genetic risk factors could help in understanding this condition and provide clues for its prevention and treatment.

  18. rMATS: robust and flexible detection of differential alternative splicing from replicate RNA-Seq data.

    Science.gov (United States)

    Shen, Shihao; Park, Juw Won; Lu, Zhi-xiang; Lin, Lan; Henry, Michael D; Wu, Ying Nian; Zhou, Qing; Xing, Yi

    2014-12-23

    Ultra-deep RNA sequencing (RNA-Seq) has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We previously developed multivariate analysis of transcript splicing (MATS), a statistical method for detecting differential alternative splicing between two RNA-Seq samples. Here we describe a new statistical model and computer program, replicate MATS (rMATS), designed for detection of differential alternative splicing from replicate RNA-Seq data. rMATS uses a hierarchical model to simultaneously account for sampling uncertainty in individual replicates and variability among replicates. In addition to the analysis of unpaired replicates, rMATS also includes a model specifically designed for paired replicates between sample groups. The hypothesis-testing framework of rMATS is flexible and can assess the statistical significance over any user-defined magnitude of splicing change. The performance of rMATS is evaluated by the analysis of simulated and real RNA-Seq data. rMATS outperformed two existing methods for replicate RNA-Seq data in all simulation settings, and RT-PCR yielded a high validation rate (94%) in an RNA-Seq dataset of prostate cancer cell lines. Our data also provide guiding principles for designing RNA-Seq studies of alternative splicing. We demonstrate that it is essential to incorporate biological replicates in the study design. Of note, pooling RNAs or merging RNA-Seq data from multiple replicates is not an effective approach to account for variability, and the result is particularly sensitive to outliers. The rMATS source code is freely available at rnaseq-mats.sourceforge.net/. As the popularity of RNA-Seq continues to grow, we expect rMATS will be useful for studies of alternative splicing in diverse RNA-Seq projects.

  19. Divergence and genetic variability among superior rubber tree genotypes

    Directory of Open Access Journals (Sweden)

    Lígia Regina Lima Gouvêa

    2010-02-01

    Full Text Available The objective of this work was to estimate the genetic variability and divergence among 22 superior rubber tree (Hevea sp.) genotypes of the IAC 400 series. Univariate and multivariate analyses were performed using eight quantitative traits (descriptors), including yield. In the univariate analyses, the estimated parameters were: genetic and environmental variances; genetic and environmental coefficients of variation; and the variation index. The Mahalanobis generalized distance, the Tocher agglomerative method and canonical variables were used for the multivariate analyses. In the univariate analyses, variability was verified among the genotypes for all the variables evaluated. The Tocher method grouped the genotypes into 11 clusters of dissimilarity. The first four canonical variables explained 87.93% of the cumulative variation. The highest genetic variability was found in rubber yield-related traits, which contributed the most to the genetic divergence. The most divergent pairs of genotypes are suggested for crossbreeding. The genotypes evaluated are suitable for breeding and may be used to continue the IAC rubber tree breeding program.

  20. iTAR: a web server for identifying target genes of transcription factors using ChIP-seq or ChIP-chip data.

    Science.gov (United States)

    Yang, Chia-Chun; Andrews, Erik H; Chen, Min-Hsuan; Wang, Wan-Yu; Chen, Jeremy J W; Gerstein, Mark; Liu, Chun-Chi; Cheng, Chao

    2016-08-12

    Chromatin immunoprecipitation followed by massively parallel DNA sequencing (ChIP-seq) or microarray hybridization (ChIP-chip) has been widely used to determine the genomic occupation of transcription factors (TFs). We have previously developed a probabilistic method, called TIP (Target Identification from Profiles), to identify TF target genes using ChIP-seq/ChIP-chip data. To achieve high specificity, TIP applies a conservative method to estimate significance of target genes, with the trade-off being a relatively low sensitivity of target gene identification compared to other methods. Additionally, TIP's output does not render binding-peak locations or intensity, information highly useful for visualization and general experimental biological use, while the variability of ChIP-seq/ChIP-chip file formats has made input into TIP more difficult than desired. To improve upon these facets, here we present a refined TIP with key extensions. First, it implements a Gaussian mixture model for p-value estimation, increasing target gene identification sensitivity and more accurately capturing the shape of TF binding profile distributions. Second, it enables the incorporation of TF binding-peak data by identifying their locations in significant target gene promoter regions and quantifies their strengths. Finally, for full ease of implementation we have incorporated it into a web server ( http://syslab3.nchu.edu.tw/iTAR/ ) that enables flexibility of input file format, can be used across multiple species and genome assembly versions, and is freely available for public use. The web server additionally performs GO enrichment analysis for the identified target genes to reveal the potential function of the corresponding TF. The iTAR web server provides a user-friendly interface and supports target gene identification in seven species, ranging from yeast to human. To facilitate investigating the quality of ChIP-seq/ChIP-chip data, the web server generates the chart of the

  1. Structure and genetic diversity in Holstein cows from Antioquia using a polymorphism of the bGH gene [Estructura y diversidad genética en vacas Holstein de Antioquia usando un polimorfismo del gen bGH]

    Directory of Open Access Journals (Sweden)

    Juan Rincon F.

    2013-03-01

    Full Text Available Objective. To determine the allelic and genotypic frequencies of the intron 3 polymorphism of the bGH gene and to estimate several population-structure parameters in Holstein cattle. Materials and methods. The study included 1366 Holstein cows from 120 herds in 11 municipalities of the department of Antioquia. DNA was extracted by the salting-out method, and genotyping was performed by PCR-RFLP. Genetic diversity was determined by comparing heterozygosities. Hardy-Weinberg (HW) equilibrium and genetic differentiation among populations were assessed with the Arlequin 2.0 software, and allelic and genotypic frequencies were evaluated with the SAS® statistical package. Results. The genotypic frequencies found were 0.764 (+/+), 0.223 (+/-) and 0.013 (-/-), and the allelic frequencies were 0.876 (+) and 0.124 (-). No deviations from Hardy-Weinberg equilibrium were found in any of the subpopulations. Genetic diversity, determined by comparing heterozygosities, was relatively low between populations but not within them. The FST value for the whole population was 0.0068 and significant (p<0.05); some pairwise FST values were also significant, ranging from 0.0 to 0.13. The FIT and FIS statistics were not significant. Conclusions. The bGH gene is an interesting candidate for evaluating economically important traits, since it does not appear to have been subjected to direct selection and shows intermediate variability in the populations, with significant genetic differentiation between municipalities resulting from their different production systems and access to biotechnologies.
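
    The relationship between the genotypic and allelic frequencies reported above can be checked with a short calculation. The sketch below is illustrative only: it uses the three genotype frequencies from the abstract, derives the allele frequencies by simple allele counting, and computes the genotype proportions expected under Hardy-Weinberg equilibrium. The variable names are assumptions made for this example, and it does not reproduce the per-subpopulation tests the authors ran in Arlequin.

```python
# Genotype frequencies for the bGH intron 3 polymorphism, as reported in the abstract.
genotype_freq = {"+/+": 0.764, "+/-": 0.223, "-/-": 0.013}

# Allele counting: each homozygote carries two copies of its allele,
# each heterozygote carries one copy of each allele.
p = genotype_freq["+/+"] + genotype_freq["+/-"] / 2  # frequency of the (+) allele
q = genotype_freq["-/-"] + genotype_freq["+/-"] / 2  # frequency of the (-) allele

# Genotype proportions expected under Hardy-Weinberg equilibrium.
expected = {"+/+": p * p, "+/-": 2 * p * q, "-/-": q * q}

print(f"p(+) = {p:.4f}, q(-) = {q:.4f}")
for genotype, observed in genotype_freq.items():
    print(f"{genotype}: observed {observed:.3f}, expected {expected[genotype]:.3f}")
```

    The derived allele frequencies (0.8755 and 0.1245) agree with the reported 0.876 and 0.124 to within rounding, and the expected heterozygote proportion (about 0.218) is close to the observed 0.223, which is consistent with the reported absence of deviation from Hardy-Weinberg equilibrium.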

  2. Next Gen One Portal Usability Evaluation

    Science.gov (United States)

    Cross, E. V., III; Perera, J. S.; Hanson, A. M.; English, K.; Vu, L.; Amonette, W.

    2018-01-01

    Each exercise device on the International Space Station (ISS) has a unique, customized software system interface with unique layouts / hierarchy, and operational principles that require significant crew training. Furthermore, the software programs are not adaptable and provide no real-time feedback or motivation to enhance the exercise experience and/or prevent injuries. Additionally, the graphical user interfaces (GUI) of these systems present information through multiple layers resulting in difficulty navigating to the desired screens and functions. These limitations of current exercise device GUI's lead to increased crew time spent on initiating, loading, performing exercises, logging data and exiting the system. To address these limitations a Next Generation One Portal (NextGen One Portal) Crew Countermeasure System (CMS) was developed, which utilizes the latest industry guidelines in GUI designs to provide an intuitive ease of use approach (i.e., 80% of the functionality gained within 5-10 minutes of initial use without/limited formal training required). This is accomplished by providing a consistent interface using common software to reduce crew training, increase efficiency & user satisfaction while also reducing development & maintenance costs. Results from the usability evaluations showed the NextGen One Portal UI having greater efficiency, learnability, memorability, usability and overall user experience than the current Advanced Resistive Exercise Device (ARED) UI used by astronauts on ISS. Specifically, the design of the One-Portal UI as an app interface similar to those found on the Apple and Google's App Store, assisted many of the participants in grasping the concepts of the interface with minimum training. Although the NextGen One-Portal UI was shown to be an overall better interface, observations by the test facilitators noted specific exercise tasks appeared to have a significant impact on the NextGen One-Portal UI efficiency. Future updates to

  3. Public health, genetics and ethics [Salud pública, genética y ética]

    Directory of Open Access Journals (Sweden)

    Kottow Miguel H

    2002-01-01

    Full Text Available Genetic research has expanded enormously in recent decades, with therapeutic repercussions that remain uncertain. Traditional bioethical analysis of complex genetic practices has been insufficient because it rests on research ethics and on principlist bioethics. The most important ethical problems in genetics are collective in nature and must be addressed through social-ethical reflection whose scope is broader than the interpersonal agenda of principlism. Topics such as genetic screening, ownership issues, gene manipulation and resource allocation must all be subjected to thinking inspired by the requirements of citizenship, by the common good, and by a definition of the role of the State in overseeing genetic activities and protecting the population. The aim of the study is to show that the broad field of ethics and genetics has greater relevance in the social sphere than in the clinical one. The aim of the paper is to point out that principlist bioethics has emphasized the individual ethical problems arising from genetic intervention, at the cost of marginalizing its important social repercussions.

  4. Public health, genetics and ethics [Salud pública, genética y ética]

    Directory of Open Access Journals (Sweden)

    Miguel H Kottow

    2002-10-01

    Full Text Available Genetic research has expanded enormously in recent decades, with therapeutic repercussions that remain uncertain. Traditional bioethical analysis of complex genetic practices has been insufficient because it rests on research ethics and on principlist bioethics. The most important ethical problems in genetics are collective in nature and must be addressed through social-ethical reflection whose scope is broader than the interpersonal agenda of principlism. Topics such as genetic screening, ownership issues, gene manipulation and resource allocation must all be subjected to thinking inspired by the requirements of citizenship, by the common good, and by a definition of the role of the State in overseeing genetic activities and protecting the population. The aim of the study is to show that the broad field of ethics and genetics has greater relevance in the social sphere than in the clinical one. The aim of the paper is to point out that principlist bioethics has emphasized the individual ethical problems arising from genetic intervention, at the cost of marginalizing its important social repercussions.

  5. Mapping RNA-seq Reads with STAR.

    Science.gov (United States)

    Dobin, Alexander; Gingeras, Thomas R

    2015-09-03

    Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates, providing scalability for emerging sequencing technologies. STAR generates output files that can be used for many downstream analyses such as transcript/gene expression quantification, differential gene expression, novel isoform reconstruction, and signal visualization. In this unit, we describe computational protocols that produce various output files, use different RNA-seq datatypes, and utilize different mapping strategies. STAR is open source software that can be run on Unix, Linux, or Mac OS X systems. Copyright © 2015 John Wiley & Sons, Inc.
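
    As a rough illustration of the two-step workflow the unit describes (build a genome index, then align reads), the sketch below assembles STAR command lines in Python without executing them. The file and directory names are hypothetical placeholders, and the thread count and read length are assumptions for this example. The options shown (`--runMode genomeGenerate`, `--sjdbGTFfile`, `--sjdbOverhang`, `--outSAMtype`, `--quantMode GeneCounts`) are standard STAR options, but the exact protocol parameters recommended by the authors are not given in the abstract.

```python
import shlex


def star_index_cmd(genome_dir, fasta, gtf, read_length=100, threads=8):
    """Build the STAR command that generates a genome index.

    --sjdbOverhang is conventionally set to (read length - 1).
    """
    return [
        "STAR", "--runMode", "genomeGenerate",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--sjdbGTFfile", gtf,
        "--sjdbOverhang", str(read_length - 1),
    ]


def star_align_cmd(genome_dir, reads, out_prefix, threads=8):
    """Build the STAR command that aligns reads and writes a sorted BAM.

    `reads` holds one FASTQ path (single-end) or two (paired-end).
    """
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--quantMode", "GeneCounts",
        "--outFileNamePrefix", out_prefix,
    ]


# Hypothetical paths, used only to show what the assembled commands look like.
index_cmd = star_index_cmd("star_index", "genome.fa", "annotation.gtf")
align_cmd = star_align_cmd("star_index", ["sample_R1.fastq", "sample_R2.fastq"], "sample_")
print(shlex.join(index_cmd))
print(shlex.join(align_cmd))
```

    Building the argument lists separately from running them keeps the example side-effect free; the lists could be passed to `subprocess.run` on a system where STAR is installed.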

  6. Optimal trading strategy for GenCo in LMP-based and bilateral ...

    African Journals Online (AJOL)

    cboonchu

    GenCo) ... In Li and Shahidehpour (2005), a game-based bidding strategy for GenCos with ..... With the different demands, dispatched levels of GenCos vary as shown in Table 6. .... optimisation, AI applications to power systems, and power system ...

  7. Guidance for RNA-seq co-expression network construction and analysis: safety in numbers.

    Science.gov (United States)

    Ballouz, S; Verleyen, W; Gillis, J

    2015-07-01

    RNA-seq co-expression analysis is in its infancy and reasonable practices remain poorly defined. We assessed a variety of RNA-seq expression data to determine factors affecting functional connectivity and topology in co-expression networks. We examine RNA-seq co-expression data generated from 1970 RNA-seq samples using a Guilt-By-Association framework, in which genes are assessed for the tendency of co-expression to reflect shared function. Minimal experimental criteria to obtain performance on par with microarrays were >20 samples with read depth >10 M per sample. While the aggregate network constructed shows good performance (area under the receiver operator characteristic curve ∼0.71), the dependency on number of experiments used is nearly identical to that present in microarrays, suggesting thousands of samples are required to obtain 'gold-standard' co-expression. We find a major topological difference between RNA-seq and microarray co-expression in the form of low overlaps between hub-like genes from each network due to changes in the correlation of expression noise within each technology. Contact: jgillis@cshl.edu or sballouz@cshl.edu. Networks are available at: http://gillislab.labsites.cshl.edu/supplements/rna-seq-networks/ and supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. SERE: single-parameter quality control and sample comparison for RNA-Seq.

    Science.gov (United States)

    Schulze, Stefan K; Kanwar, Rahul; Gölzenleuchter, Meike; Therneau, Terry M; Beutler, Andreas S

    2012-10-03

    Assessing the reliability of experimental replicates (or global alterations corresponding to different experimental conditions) is a critical step in analyzing RNA-Seq data. Pearson's correlation coefficient r has been widely used in the RNA-Seq field even though its statistical characteristics may be poorly suited to the task. Here we present a single-parameter test procedure for count data, the Simple Error Ratio Estimate (SERE), that can determine whether two RNA-Seq libraries are faithful replicates or globally different. Benchmarking shows that the interpretation of SERE is unambiguous regardless of the total read count or the range of expression differences among bins (exons or genes), a score of 1 indicating faithful replication (i.e., samples are affected only by Poisson variation of individual counts), a score of 0 indicating data duplication, and scores >1 corresponding to true global differences between RNA-Seq libraries. On the contrary the interpretation of Pearson's r is generally ambiguous and highly dependent on sequencing depth and the range of expression levels inherent to the sample (difference between lowest and highest bin count). Cohen's simple Kappa results are also ambiguous and are highly dependent on the choice of bins. For quantifying global sample differences SERE performs similarly to a measure based on the negative binomial distribution yet is simpler to compute. SERE can therefore serve as a straightforward and reliable statistical procedure for the global assessment of pairs or large groups of RNA-Seq datasets by a single statistical parameter.
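
    The behaviour described above (0 for duplicated data, about 1 for Poisson-only variation, above 1 for true global differences) is what one would expect from the square root of a Pearson-type dispersion statistic. The sketch below is written under that assumption and is not taken from the abstract: for each bin, the expected count in each sample is the bin total scaled by that sample's share of all reads, and the squared deviations relative to the expected counts are averaged over bins and over (samples - 1) degrees of freedom. The function name and input layout are choices made for this example, and the authors' implementation may differ in details such as how empty bins are handled.

```python
import math


def sere(counts):
    """Estimate a SERE-like score from a table of read counts.

    counts: list of rows, one per bin (gene or exon); each row holds one
    count per sample. Every sample is assumed to have at least one read.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    total_reads = sum(lib_sizes)

    chi_square = 0.0
    n_bins = 0
    for row in counts:
        row_total = sum(row)
        if row_total == 0:
            continue  # bins with no reads in any sample carry no information
        n_bins += 1
        for j, observed in enumerate(row):
            # Expected count if samples differ only in sequencing depth.
            expected = row_total * lib_sizes[j] / total_reads
            chi_square += (observed - expected) ** 2 / expected

    # Average over bins and degrees of freedom, then take the square root.
    return math.sqrt(chi_square / (n_bins * (n_samples - 1)))


duplicated = [[10, 10], [5, 5], [100, 100]]  # second sample is a copy of the first
different = [[100, 1], [1, 100]]             # expression pattern is reversed
print(sere(duplicated), sere(different))
```

    In this toy input the duplicated table scores exactly 0 and the reversed table scores well above 1, matching the interpretation given in the abstract.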

  9. History of genetics in Brazil: a view from the Museu da Genética of the Universidade Federal do Rio Grande do Sul [História da genética no Brasil: um olhar a partir do Museu da Genética da Universidade Federal do Rio Grande do Sul]

    Directory of Open Access Journals (Sweden)

    Vanderlei Sebastiao de Souza

    2013-06-01

    Full Text Available This article addresses the context in which the Museu da Genética (Museum of Genetics) was created in 2011 in the Department of Genetics of the Universidade Federal do Rio Grande do Sul, in Porto Alegre, and presents its structure and content. It argues that the materials made available by the Museu da Genética constitute a rich source for research on the history of genetics in Brazil (and of human population genetics in particular) from the second half of the twentieth century onward, a topic still little investigated despite the prominence of this field of knowledge in Brazil.

  10. Genetic manipulation of human beings [Manipulación genética de seres humanos]

    Directory of Open Access Journals (Sweden)

    Manuel Santos Alcántara

    2006-08-01

    Full Text Available The great progress made in genetics in recent years, particularly in relation to the deciphering of the human genome, has brought into public discussion the concrete possibility of genetically manipulating human beings. The genetic improvement or enhancement of human beings, known as eugenics, has now become technically feasible, prompting deep ethical reflection. The basic question is this: is it ethical to do what is technically possible? Do parents have the right to access genetic technology to improve the characteristics of their children? This article reviews the scientific basis of the genetic enhancement of human beings and raises the most relevant ethical questions arising from such manipulation.

  11. National program for the prevention and genetic counseling of retinoblastoma through detection of mutations in the RB gene [Programa nacional de prevención y consejería genética del retinoblastoma mediante detección de mutaciones en el gen RB].

    Directory of Open Access Journals (Sweden)

    H. Frayle

    2001-07-01

    a double inactivating mutation of the Rb gene, exclusively somatic in sporadic cases and germline plus somatic in hereditary cases. The aim of this research was to characterize mutations in the Rb gene by direct sequencing and to evaluate its usefulness in genetic counseling.

  12. National program for the prevention and genetic counseling of retinoblastoma through detection of mutations in the RB gene [Programa nacional de prevención y consejería genética del retinoblastoma mediante detección de mutaciones en el gen rb].

    OpenAIRE

    Frayle, H.; Guevara, G.

    2011-01-01

    Retinoblastoma is a rare ocular tumor diagnosed in children; 40% of cases are considered hereditary and 60% sporadic. The genetic model proposed by Knudson involves a double inactivating mutation of the Rb gene, exclusively somatic in sporadic cases and germline plus somatic in hereditary cases. The aim of this research was to characterize mutations in the Rb gene by direct sequencing and to evaluate its usefulness in genetic counseling....

  13. CLIP-seq analysis of multi-mapped reads discovers novel functional RNA regulatory sites in the human transcriptome.

    Science.gov (United States)

    Zhang, Zijun; Xing, Yi

    2017-09-19

    Crosslinking or RNA immunoprecipitation followed by sequencing (CLIP-seq or RIP-seq) allows transcriptome-wide discovery of RNA regulatory sites. As CLIP-seq/RIP-seq reads are short, existing computational tools focus on uniquely mapped reads, while reads mapped to multiple loci are discarded. We present CLAM (CLIP-seq Analysis of Multi-mapped reads). CLAM uses an expectation-maximization algorithm to assign multi-mapped reads and calls peaks combining uniquely and multi-mapped reads. To demonstrate the utility of CLAM, we applied it to a wide range of public CLIP-seq/RIP-seq datasets involving numerous splicing factors, microRNAs and m6A RNA methylation. CLAM recovered a large number of novel RNA regulatory sites inaccessible by uniquely mapped reads. The functional significance of these sites was demonstrated by consensus motif patterns and association with alternative splicing (splicing factors), transcript abundance (AGO2) and mRNA half-life (m6A). CLAM provides a useful tool to discover novel protein-RNA interactions and RNA modification sites from CLIP-seq and RIP-seq data, and reveals the significant contribution of repetitive elements to the RNA regulatory landscape of the human transcriptome. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
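
    To make the idea of assigning multi-mapped reads by expectation-maximization more concrete, here is a deliberately simplified sketch. It is not CLAM's implementation: it ignores read positions, local windows and peak calling, and simply splits each multi-mapped read among its candidate loci in proportion to the current abundance estimate of each locus, iterating until the split stabilizes. The function name, input layout and iteration count are assumptions made for this example.

```python
def em_assign(unique_counts, multi_reads, n_iter=50):
    """Fractionally assign multi-mapped reads to loci with a simple EM loop.

    unique_counts: dict mapping locus -> number of uniquely mapped reads.
    multi_reads:   list of tuples, each listing the distinct loci one read maps to.
    Returns one dict per multi-mapped read, mapping locus -> assigned fraction.
    """
    loci = set(unique_counts)
    for read in multi_reads:
        loci.update(read)

    # Start by splitting every multi-mapped read evenly among its loci.
    weights = [{locus: 1.0 / len(read) for locus in read} for read in multi_reads]

    for _ in range(n_iter):
        # M-step: locus abundance = unique reads + fractional multi-mapped reads.
        abundance = {locus: float(unique_counts.get(locus, 0)) for locus in loci}
        for read_weights in weights:
            for locus, fraction in read_weights.items():
                abundance[locus] += fraction

        # E-step: re-split each read in proportion to the current abundances.
        weights = []
        for read in multi_reads:
            total = sum(abundance[locus] for locus in read)
            weights.append({locus: abundance[locus] / total for locus in read})

    return weights


# Toy input: locus A has 9 unique reads, locus B has 1, and one read maps to both.
result = em_assign({"A": 9, "B": 1}, [("A", "B")])
print(result)
```

    In this toy input the shared read ends up assigned mostly to locus A, because A is supported by far more uniquely mapped reads; discarding the read, as unique-only pipelines do, would lose that evidence.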

  14. Justice in health and genetics [Justicia en salud y genética]

    Directory of Open Access Journals (Sweden)

    Maria Graciela De Ortuzar

    2014-06-01

    Full Text Available The expectations placed on genetic knowledge go beyond the scope of traditional medicine, because direct intervention in the natural lottery would require rethinking central concepts of justice in health: medical needs, disease, normality, and equality of opportunity in access to health. The point under debate is whether rethinking these concepts entails a radical change in theories of justice (libertarian and/or liberal), showing their obsolescence, or whether these key concepts simply need to be broadened because of structural flaws in the theories themselves. As a general hypothesis, I hold that the supposed challenges, far from undermining the foundations of theories of justice, merely expose their old structural problems. For expository reasons, I divide the presentation into three parts. In the first part, I analyze libertarian theory, studying the contradictions of the model through the impact of genetic information on private health insurance. In the second part, I develop the alternative liberal Rawlsian-Danielsian proposal of a public insurance model, evaluating the implications of genetics through a critique of its biological concept of disease and its restriction of access to health care to natural needs. In the third part, I present a comprehensive model of basic needs and capabilities, encompassing prevention, treatment, and morally permissible enhancement (genetic and non-genetic). My main contribution is the development of this comprehensive normative model of needs and capabilities for the joint regulation of genetic information and gene therapy together with other health problems.

  15. Network-Based Isoform Quantification with RNA-Seq Data for Cancer Transcriptome Analysis.

    Directory of Open Access Journals (Sweden)

    Wei Zhang

    2015-12-01

    Full Text Available High-throughput mRNA sequencing (RNA-Seq) is widely used for transcript quantification of gene isoforms. Since RNA-Seq data alone is often not sufficient to accurately identify the read origins from the isoforms for quantification, we propose to explore protein domain-domain interactions as prior knowledge for integrative analysis with RNA-Seq data. We introduce a Network-based method for RNA-Seq-based Transcript Quantification (Net-RSTQ) to integrate protein domain-domain interaction network with short read alignments for transcript abundance estimation. Based on our observation that the abundances of the neighboring isoforms by domain-domain interactions in the network are positively correlated, Net-RSTQ models the expression of the neighboring transcripts as Dirichlet priors on the likelihood of the observed read alignments against the transcripts in one gene. The transcript abundances of all the genes are then jointly estimated with alternating optimization of multiple EM problems. In simulation Net-RSTQ effectively improved isoform transcript quantifications when isoform co-expressions correlate with their interactions. qRT-PCR results on 25 multi-isoform genes in a stem cell line, an ovarian cancer cell line, and a breast cancer cell line also showed that Net-RSTQ estimated more consistent isoform proportions with RNA-Seq data. In the experiments on the RNA-Seq data in The Cancer Genome Atlas (TCGA), the transcript abundances estimated by Net-RSTQ are more informative for patient sample classification of ovarian cancer, breast cancer and lung cancer. All experimental results collectively support that Net-RSTQ is a promising approach for isoform quantification. Net-RSTQ toolbox is available at http://compbio.cs.umn.edu/Net-RSTQ/.

  16. Archeological Echocardiography: Digitization and Speckle-Tracking Analysis of Archival Echocardiograms in the HyperGEN Study

    Science.gov (United States)

    Aguilar, Frank G.; Selvaraj, Senthil; Martinez, Eva E.; Katz, Daniel H.; Beussink, Lauren; Kim, Kwang-Youn A.; Ping, Jie; Rasmussen-Torvik, Laura; Goyal, Amita; Sha, Jin; Irvin, Marguerite R.; Arnett, Donna K.; Shah, Sanjiv J.

    2015-01-01

    Background Several large epidemiologic studies and clinical trials have included echocardiography, but images were stored in analog format and these studies predated tissue Doppler imaging (TDI) and speckle-tracking echocardiography (STE). We hypothesized that digitization of analog echocardiograms, with subsequent quantification of cardiac mechanics using STE, is feasible, reproducible, accurate, and produces clinically valid results. Methods In the NHLBI HyperGEN study (N=2234), archived analog echocardiograms were digitized and subsequently analyzed using STE to obtain tissue velocities/strain. Echocardiograms were assigned quality scores and inter/intraobserver agreement was calculated. Accuracy was evaluated in (1) a separate second study (N=50) comparing prospective digital strain vs. post-hoc analog-to-digital strain; and (2) in a third study (N=95) comparing prospectively-obtained TDI e′ velocities with post-hoc STE e′ velocities. Finally, we replicated previously known associations between tissue velocities/strain, conventional echocardiographic measurements, and clinical data. Results Of the 2234 HyperGEN echocardiograms, 2150 (96.2%) underwent successful digitization and STE analysis. Inter/intraobserver agreement was high for all STE parameters, especially longitudinal strain (LS). In accuracy studies, LS performed best when comparing post-hoc STE to prospective digital STE for strain analysis. STE-derived e′ velocities correlated with, but systematically underestimated, TDI e′ velocity. Several known associations between clinical variables and cardiac mechanics were replicated in HyperGEN. We also found a novel independent inverse association between fasting glucose and LS (adjusted β = −2.4 [95% CI −3.6, −1.2]% per 1-SD increase in fasting glucose; P ...). Conclusions Archeological echocardiography, the digitization and speckle-tracking analysis of archival echocardiograms, is feasible and generates parameters of cardiac mechanics similar to contemporary studies. PMID

  17. Archeological Echocardiography: Digitization and Speckle Tracking Analysis of Archival Echocardiograms in the HyperGEN Study.

    Science.gov (United States)

    Aguilar, Frank G; Selvaraj, Senthil; Martinez, Eva E; Katz, Daniel H; Beussink, Lauren; Kim, Kwang-Youn A; Ping, Jie; Rasmussen-Torvik, Laura; Goyal, Amita; Sha, Jin; Irvin, Marguerite R; Arnett, Donna K; Shah, Sanjiv J

    2016-03-01

    Several large epidemiologic studies and clinical trials have included echocardiography, but images were stored in analog format and these studies predated tissue Doppler imaging (TDI) and speckle tracking echocardiography (STE). We hypothesized that digitization of analog echocardiograms, with subsequent quantification of cardiac mechanics using STE, is feasible, reproducible, accurate, and produces clinically valid results. In the NHLBI HyperGEN study (N = 2234), archived analog echocardiograms were digitized and subsequently analyzed using STE to obtain tissue velocities/strain. Echocardiograms were assigned quality scores and inter-/intra-observer agreement was calculated. Accuracy was evaluated in: (1) a separate second study (N = 50) comparing prospective digital strain versus post hoc analog-to-digital strain, and (2) in a third study (N = 95) comparing prospectively obtained TDI e' velocities with post hoc STE e' velocities. Finally, we replicated previously known associations between tissue velocities/strain, conventional echocardiographic measurements, and clinical data. Of the 2234 HyperGEN echocardiograms, 2150 (96.2%) underwent successful digitization and STE analysis. Inter-/intra-observer agreement was high for all STE parameters, especially longitudinal strain (LS). In accuracy studies, LS performed best when comparing post hoc STE to prospective digital STE for strain analysis. STE-derived e' velocities correlated with, but systematically underestimated, TDI e' velocity. Several known associations between clinical variables and cardiac mechanics were replicated in HyperGEN. We also found a novel independent inverse association between fasting glucose and LS (adjusted β = -2.4 [95% CI -3.6, -1.2]% per 1-SD increase in fasting glucose; P ...). Archeological echocardiography, the digitization and speckle tracking analysis of archival echocardiograms, is feasible and generates indices of cardiac mechanics similar to contemporary studies. © 2015, Wiley Periodicals, Inc.

  18. GenBank.

    OpenAIRE

    Benson, D; Lipman, D J; Ostell, J

    1993-01-01

    The GenBank sequence database has undergone an expansion in data coverage, annotation content and the development of new services for the scientific community. In addition to nucleotide sequences, data from the major protein sequence and structural databases, and from U.S. and European patents is now included in an integrated system. MEDLINE abstracts from published articles describing the sequences provide an important new source of biological annotation for sequence entries. In addition to ...

  19. Gen IV Materials Handbook Implementation Plan

    International Nuclear Information System (INIS)

    Rittenhouse, P.; Ren, W.

    2005-01-01

    A Gen IV Materials Handbook is being developed to provide an authoritative single source of highly qualified structural materials information and materials properties data for use in design and analyses of all Generation IV Reactor Systems. The Handbook will be responsive to the needs expressed by all of the principal government, national laboratory, and private company stakeholders of Gen IV Reactor Systems. The Gen IV Materials Handbook Implementation Plan provided here addresses the purpose, rationale, attributes, and benefits of the Handbook and will detail its content, format, quality assurance, applicability, and access. Structural materials, both metallic and ceramic, for all Gen IV reactor types currently supported by the Department of Energy (DOE) will be included in the Gen IV Materials Handbook. However, initial emphasis will be on materials for the Very High Temperature Reactor (VHTR). Descriptive information (e.g., chemical composition and applicable technical specifications and codes) will be provided for each material along with an extensive presentation of mechanical and physical property data including consideration of temperature, irradiation, environment, etc. effects on properties. Access to the Gen IV Materials Handbook will be internet-based with appropriate levels of control. Information and data in the Handbook will be configured to allow search by material classes, specific materials, specific information or property class, specific property, data parameters, and individual data points identified with materials parameters, test conditions, and data source. Details on all of these as well as proposed applicability and consideration of data quality classes are provided in the Implementation Plan. 
    Website development for the Handbook is divided into six phases including (1) detailed product analysis and specification, (2) simulation and design, (3) implementation and testing, (4) product release, (5) project/product evaluation, and (6) product

  20. Evaluation of the Performance of ClimGen and LARS-WG models in generating rainfall and temperature time series in rainfed research station of Sisab, Northern Khorasan

    Directory of Open Access Journals (Sweden)

    Najmeh Khalili

    2016-10-01

    Full Text Available Introduction: Much existing research on water and agriculture requires long-term statistical climate data, whereas in practice the records available from synoptic stations are quite short. The required daily climate data must therefore be generated from the limited data available. For this purpose, weather generators can be used to extend the length of the data record. Two weather generators are in particularly common use: LARS-WG and ClimGen. Different studies have shown that these two models perform differently in different regions and climates. Therefore, the output of the two models should be validated against the climate and weather conditions of the study region. Materials and Methods: The Sisab station is 35 km from the city of Bojnord in Northern Khorasan. The station was established in 1366, and meteorological data, including precipitation, have been collected regularly since then. The geographical coordinates of the station are 37º 25׳ N and 57º 38׳ E, and its elevation is 1359 m. The climate of the region is dry and cold under the Emberger method and semi-dry under the De Martonne method. In this research, the LARS-WG model, version 5.5, and the ClimGen model, version 4.4, were used to generate 500 data samples of precipitation and temperature time series. The performance of the two models was evaluated using RMSE, MAE, and CD, comparing the 30 years of collected data with the corresponding generated data. In addition, Student's t, F, and χ² tests were used to compare the statistical similarity of the generated and collected data. With these tests, the similarity of 16 statistical characteristics of the generated and collected data was investigated at the 95% confidence level. Results and Discussion: This study showed that the LARS-WG model generates precipitation data better in terms of statistical error criteria. RMSE and MAE for the data generated by LARS-WG were lower than for the ClimGen model, while the CD value of
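
    The error criteria named in this abstract can be illustrated with a minimal sketch. The abstract does not define CD; the code below assumes it means the coefficient of determination in its common 1 - SS_res/SS_tot form, which may differ from the authors' definition. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, generated):
    """Root mean square error between paired observed and generated values."""
    n = len(observed)
    return math.sqrt(sum((o - g) ** 2 for o, g in zip(observed, generated)) / n)


def mae(observed, generated):
    """Mean absolute error between paired observed and generated values."""
    n = len(observed)
    return sum(abs(o - g) for o, g in zip(observed, generated)) / n


def coefficient_of_determination(observed, generated):
    """Assumed meaning of CD: 1 - SS_res / SS_tot (not confirmed by the abstract)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - g) ** 2 for o, g in zip(observed, generated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly precipitation means in mm (illustrative only).
observed = [30.0, 42.0, 55.0, 38.0]
generated = [28.0, 45.0, 50.0, 40.0]

print(rmse(observed, generated))
print(mae(observed, generated))
print(coefficient_of_determination(observed, generated))
```

    Lower RMSE and MAE indicate generated series closer to the observations; under the assumed definition, CD values nearer 1 indicate a better match.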

  1. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Science.gov (United States)

    Cumbie, Jason S; Kimbrel, Jeffrey A; Di, Yanming; Schafer, Daniel W; Wilhelm, Larry J; Fox, Samuel E; Sullivan, Christopher M; Curzon, Aron D; Carrington, James C; Mockler, Todd C; Chang, Jeff H

    2011-01-01

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

  2. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences.

    Directory of Open Access Journals (Sweden)

    Jason S Cumbie

    Full Text Available GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

  3. Stormbow: A Cloud-Based Tool for Reads Mapping and Expression Quantification in Large-Scale RNA-Seq Studies.

    Science.gov (United States)

    Zhao, Shanrong; Prenger, Kurt; Smith, Lance

    2013-01-01

    RNA-Seq is becoming a promising replacement for microarrays in transcriptome profiling and differential gene expression studies. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow was tested by applying it to the analysis of 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Using Amazon Web Services as the infrastructure for Stormbow allows us to scale up easily to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost-effective, open-source-based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and used out of the box to process Illumina RNA-Seq datasets.
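
    As a rough illustration of what the reported per-sample figures imply for the whole test, the arithmetic can be sketched as follows. This is a simple linear projection, not a result from the paper: it assumes every one of the 178 samples resembles the 100-million-read sample described, and the function name is hypothetical.

```python
def projected_totals(n_samples, cost_per_sample_usd, hours_per_sample_range):
    """Scale per-sample figures to a whole study (simple linear projection)."""
    low_h, high_h = hours_per_sample_range
    return {
        # Total spend if every sample costs the reported average.
        "total_cost_usd": n_samples * cost_per_sample_usd,
        # Summed processing hours across samples; wall-clock time is much
        # shorter when samples are processed in parallel.
        "compute_hours": (n_samples * low_h, n_samples * high_h),
    }


totals = projected_totals(178, 3.50, (6, 8))
print(totals)
```

    Because the samples are processed in parallel, the summed hours describe total compute consumed, not elapsed time.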

  4. ITSSOIN Hypotheses

    NARCIS (Netherlands)

    Anheier, H.K.; Krlev, G.; Preuss, S.; Mildenberger, G.; Bekkers, R.H.F.P.; Brink Lund, A.

    2014-01-01

    This report brings together findings from the first ITSSOIN project working steps to formulate empirically testable hypotheses on the impact of the third sector and social innovation – in particular regarding the role of the third sector in generating social innovation but also with reference to

  5. Algoritmos para genómica comparativa

    OpenAIRE

    Figueiras, Vasco da Rocha

    2010-01-01

    With the emergence of genomics and proteomics, bioinformatics has led to some of the most significant scientific advances of the twentieth century. The Research and Development Unit of Biocant, the biotechnology park of Cantanhede, currently acts as a driving force in the development of genomics. Biocant has an important large-scale sequencer that makes it possible to store a large number of genomes, namely bacterial genomes. The proposed study reflects the need of Bio...

  6. Medicamentos genéricos y de marca-Calidad e intercambiabilidad

    Directory of Open Access Journals (Sweden)

    Rua F.

    2012-03-01

    Full Text Available Interest in generic medicines stems from the need of health systems to reduce health care spending without compromising health objectives. Their expansion and use require acceptance by the public and by professionals. They also require that some doubts about their true equivalence to the original medicines be cleared up. Since their introduction into the pharmaceutical market, there has been debate over whether they are properly researched and of high quality. Misconceptions about generics are not uncommon among professionals, in particular the supposed fact that they may contain up to 20% less active ingredient. These mistaken beliefs suggest a disadvantage in the efficacy and tolerability of generic medicines compared with their brand-name equivalents, reducing their credibility. Thus, in a survey conducted in 2008, pharmacists expressed the view that generics and brands differ in efficacy (26%), equivalence (28%) and, above all, excipient quality (46%), with a growing perception that generics differ depending on the laboratory that manufactures them (52.8%). In this article, in order to broaden knowledge about generic medicines, resolve doubts, and provide objective, clear, and rigorous information, we review possible prejudices about generics and set out the existing evidence on them, such as the bioequivalence requirements for generic products, analyzing whether bioequivalence adequately supports therapeutic equivalence and interchangeability.

  7. Sobre el significado del descubrimiento del gen FOXP2

    OpenAIRE

    Longa Martínez, Víctor Manuel

    2006-01-01

    The recent discovery of the FOXP2 gene has provided the first clear evidence of the genetic basis of language, showing an unequivocal correlation, from a genetic perspective, between a mutated version of FOXP2 and the linguistic disorders of various kinds suffered by an English family known as KE. The central aim of this paper is to discuss different aspects related to this discovery; in particular, the discussion of the significance of FOXP2 with ...

  8. An empirical test of the treatment of indels during optimization alignment based on the phylogeny of the genus Secale (Poaceae)

    DEFF Research Database (Denmark)

    Petersen, Gitte; Seberg, Ole; Aagesen, Lone

    2004-01-01

    The ability of the program POY, implementing optimization alignment, to deal with major indels is explored and discussed in connection with a phylogenetic analysis of the genus Secale based on partial Adh1 sequences. The Adh1 sequences used span exons 2-4. Nearly all variation is found in intron 2...... recovers both genera as monophyletic when knowledge of the duplication is incorporated in the analysis. The phylogenetic relationships within Secale are not clearly resolved. Subspecific taxa of Secale strictum have identical sequences and they are confined to a monophyletic group. However, the two...

  9. Distributed Generation Market Demand Model (dGen): Documentation

    Energy Technology Data Exchange (ETDEWEB)

    Sigrin, Benjamin [National Renewable Energy Lab. (NREL), Golden, CO (United States); Gleason, Michael [National Renewable Energy Lab. (NREL), Golden, CO (United States); Preus, Robert [National Renewable Energy Lab. (NREL), Golden, CO (United States); Baring-Gould, Ian [National Renewable Energy Lab. (NREL), Golden, CO (United States); Margolis, Robert [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2016-02-01

    The Distributed Generation Market Demand model (dGen) is a geospatially rich, bottom-up, market-penetration model that simulates the potential adoption of distributed energy resources (DERs) for residential, commercial, and industrial entities in the continental United States through 2050. The National Renewable Energy Laboratory (NREL) developed dGen to analyze the key factors that will affect future market demand for distributed solar, wind, storage, and other DER technologies in the United States. The new model builds off, extends, and replaces NREL's SolarDS model (Denholm et al. 2009a), which simulates the market penetration of distributed PV only. Unlike the SolarDS model, dGen can model various DER technologies under one platform--it currently can simulate the adoption of distributed solar (the dSolar module) and distributed wind (the dWind module) and link with the ReEDS capacity expansion model (Appendix C). The underlying algorithms and datasets in dGen, which improve the representation of customer decision making as well as the spatial resolution of analyses (Figure ES-1), also are improvements over SolarDS.

  10. Revision of Corallinaceae (Corallinales, Rhodophyta): recognizing Dawsoniolithon gen. nov., Parvicellularium gen. nov. and Chamberlainoideae subfam. nov. containing Chamberlainium gen. nov. and Pneophyllum.

    Science.gov (United States)

    Caragnano, Annalisa; Foetisch, Alexandra; Maneveldt, Gavin W; Millet, Laurent; Liu, Li-Chia; Lin, Showe-Mei; Rodondi, Graziella; Payri, Claude E

    2018-03-25

    A multi-gene (SSU, LSU, psbA and COI) molecular phylogeny of the family Corallinaceae (excluding the subfamilies Lithophylloideae and Corallinoideae) showed a paraphyletic grouping of six monophyletic clades. Pneophyllum and Spongites were reassessed and recircumscribed using DNA sequence data integrated with morpho-anatomical comparisons of type material and recently collected specimens. We propose Chamberlainoideae subfam. nov., including the type genus Chamberlainium gen. nov., with C. tumidum comb. nov. as the generitype, and Pneophyllum. Chamberlainium is established to include several taxa previously ascribed to Spongites, the generitype of which currently resides in Neogoniolithoideae. Additionally, we propose two new genera, Dawsoniolithon gen. nov. (Metagoniolithoideae), with D. conicum comb. nov. as the generitype, and Parvicellularium gen. nov. (subfamily incertae sedis), with P. leonardi sp. nov. as the generitype. Chamberlainoideae has no diagnostic morpho-anatomical features that enable one to assign specimens to it without DNA sequence data, and it is the first subfamily to possess both Type 1 (Chamberlainium) and Type 2 (Pneophyllum) tetra/bisporangial conceptacle roof development. Two characters distinguish Chamberlainium from Spongites: tetra/bisporangial conceptacle chamber diameter (300 μm in Spongites) and tetra/bisporangial conceptacle roof thickness (8 cells in Spongites). Two characters also distinguish Pneophyllum from Dawsoniolithon: tetra/bisporangial conceptacle roof thickness (8 cells in Dawsoniolithon) and thallus construction (dimerous in Pneophyllum vs. monomerous in Dawsoniolithon). This article is protected by copyright. All rights reserved.

  11. DNA microarray-based PCR ribotyping of Clostridium difficile.

    Science.gov (United States)

    Schneeberg, Alexander; Ehricht, Ralf; Slickers, Peter; Baier, Vico; Neubauer, Heinrich; Zimmermann, Stefan; Rabold, Denise; Lübke-Becker, Antina; Seyboldt, Christian

    2015-02-01

    This study presents a DNA microarray-based assay for fast and simple PCR ribotyping of Clostridium difficile strains. Hybridization probes were designed to query the modularly structured intergenic spacer region (ISR), which is also the template for conventional and PCR ribotyping with subsequent capillary gel electrophoresis (seq-PCR) ribotyping. The probes were derived from sequences available in GenBank as well as from theoretical ISR module combinations. A database of reference hybridization patterns was set up from a collection of 142 well-characterized C. difficile isolates representing 48 seq-PCR ribotypes. The reference hybridization patterns calculated by the arithmetic mean were compared using a similarity matrix analysis. The 48 investigated seq-PCR ribotypes revealed 27 array profiles that were clearly distinguishable. The most frequent human-pathogenic ribotypes 001, 014/020, 027, and 078/126 were discriminated by the microarray. C. difficile strains related to 078/126 (033, 045/FLI01, 078, 126, 126/FLI01, 413, 413/FLI01, 598, 620, 652, and 660) and 014/020 (014, 020, and 449) showed similar hybridization patterns, confirming their genetic relatedness, which was previously reported. A panel of 50 C. difficile field isolates was tested by seq-PCR ribotyping and the DNA microarray-based assay in parallel. Taking into account that the current version of the microarray does not discriminate some closely related seq-PCR ribotypes, all isolates were typed correctly. Moreover, seq-PCR ribotypes without reference profiles available in the database (ribotype 009 and 5 new types) were correctly recognized as new ribotypes, confirming the performance and expansion potential of the microarray. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  12. ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

    Science.gov (United States)

    Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

    2017-01-04

    The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents about a 20-fold expansion compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed a 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Strand-Specific RNA-Seq Analyses of Fruiting Body Development in Coprinopsis cinerea.

    Directory of Open Access Journals (Sweden)

    Hajime Muraguchi

    Full Text Available The basidiomycete fungus Coprinopsis cinerea is an important model system for multicellular development. Fruiting bodies of C. cinerea are typical mushrooms, which can be produced synchronously on defined media in the laboratory. To investigate the transcriptome in detail during fruiting body development, high-throughput sequencing (RNA-seq) was performed using cDNA libraries strand-specifically constructed from 13 points (stages/tissues) with two biological replicates. The reads were aligned to 14,245 predicted transcripts, and counted for forward and reverse transcripts. Differentially expressed genes (DEGs) between two adjacent points and between vegetative mycelium and each point were detected by Tag Count Comparison (TCC). To validate RNA-seq data, expression levels of selected genes were compared using RPKM values in RNA-seq data and qRT-PCR data, and DEGs detected in microarray data were examined in MA plots of RNA-seq data by TCC. We discuss events deduced from GO analysis of DEGs. In addition, we uncovered both transcription factor candidates and antisense transcripts that are likely to be involved in developmental regulation for fruiting.
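
    The RPKM values mentioned in this abstract follow a standard normalization (reads per kilobase of transcript per million mapped reads). A minimal sketch of that general formula is given below; it is not the authors' pipeline, and the function name and the example numbers are hypothetical.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Dividing by (length / 1e3) and by (total reads / 1e6) is the same as
    multiplying the raw count by 1e9 and dividing by both quantities.
    """
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical transcript: 500 reads on a 2,000 bp transcript in a library
# of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)
```

    Normalizing by both transcript length and library size is what allows expression levels to be compared across transcripts and across samples before checking them against qRT-PCR measurements.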

  14. TruSeq Stranded mRNA and Total RNA Sample Preparation Kits

    Science.gov (United States)

    Total RNA-Seq enabled by ribosomal RNA (rRNA) reduction is compatible with formalin-fixed paraffin embedded (FFPE) samples, which contain potentially critical biological information. The family of TruSeq Stranded Total RNA sample preparation kits provides a unique combination of unmatched data quality for both mRNA and whole-transcriptome analyses, robust interrogation of both standard and low-quality samples and workflows compatible with a wide range of study designs.

  15. Biometria e armazenamento de sementes de genótipos de cacaueiro

    Directory of Open Access Journals (Sweden)

    Lucimara Ribeiro Venial

    2017-03-01

    Full Text Available Genotypes of Theobroma cacao L. should be studied further in order to identify those that produce seeds that are more developed and viable after storage. The aim of this work was to study the biometry and two storage periods of seeds of cacao genotypes. Biometry was evaluated in eight cacao genotypes (treatments). Germination tests were set up in a completely randomized design, in an 8 x 2 factorial scheme (genotypes: CCN51, PH16, CEPEC2002, Ipiranga, SJ02, PS1319, TSH1188 and Comum x two storage periods: 0 and two days). The TSH1188 genotype showed the greatest length, length/width ratio, thickness, and 100-seed mass. Water uptake by freshly harvested seeds of the genotypes is slow, which is explained by their high water contents, and does not follow a triphasic pattern. Water contents decreased on average 2.3-fold in stored seeds relative to freshly harvested ones. Germination of freshly harvested seeds of the genotypes was 100%. After storage, seeds of PS1319 showed the smallest reduction in germination (39%), whereas those of PH16, CEPEC2002 and SJ02 decreased by 96%. Germination speed was higher and mean germination time was less than two days in freshly harvested seeds of PS1319, indicating that they are more tolerant to desiccation. The use of genotypes TSH1188 and PS1319 in breeding programs is suggested.

  16. DETECTION OF BACTERIAL SMALL TRANSCRIPTS FROM RNA-SEQ DATA: A COMPARATIVE ASSESSMENT.

    Science.gov (United States)

    Peña-Castillo, Lourdes; Grüell, Marc; Mulligan, Martin E; Lang, Andrew S

    2016-01-01

    Small non-coding RNAs (sRNAs) are regulatory RNA molecules that have been identified in a multitude of bacterial species and shown to control numerous cellular processes through various regulatory mechanisms. In the last decade, next generation RNA sequencing (RNA-seq) has been used for the genome-wide detection of bacterial sRNAs. Here we describe sRNA-Detect, a novel approach to identify expressed small transcripts from prokaryotic RNA-seq data. Using RNA-seq data from three bacterial species and two sequencing platforms, we performed a comparative assessment of five computational approaches for the detection of small transcripts. We demonstrate that sRNA-Detect improves upon current standalone computational approaches for identifying novel small transcripts in bacteria.

  17. Testing hypotheses in order

    OpenAIRE

    Paul R. Rosenbaum

    2008-01-01

    In certain circumstances, one wishes to test one hypothesis only if certain other hypotheses have been rejected. This ordering of hypotheses simplifies the task of controlling the probability of rejecting any true hypothesis. In an example from an observational study, a treated group is shown to be further from both of two control groups than the two control groups are from each other. Copyright 2008, Oxford University Press.
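
    The idea of testing a hypothesis only after earlier ones have been rejected can be illustrated with the textbook fixed-sequence procedure. This is a minimal sketch of that general technique and is not claimed to be the exact method of the paper: hypotheses are tested in a prespecified order, each at the full level alpha, and testing stops at the first failure to reject. The function name and the p-values are hypothetical.

```python
def fixed_sequence_test(p_values, alpha=0.05):
    """Test hypotheses in their given order, each at level alpha.

    Returns a list of booleans (True = rejected). Once one hypothesis is
    not rejected, it and all later hypotheses are left unrejected.
    """
    rejected = []
    for p in p_values:
        if p <= alpha:
            rejected.append(True)
        else:
            break  # stop at the first non-rejection
    # Everything from the stopping point onward is not rejected.
    rejected.extend([False] * (len(p_values) - len(rejected)))
    return rejected


# Hypothetical p-values listed in the prespecified testing order.
decisions = fixed_sequence_test([0.01, 0.03, 0.20, 0.001])
print(decisions)
```

    Note that the fourth hypothesis is not rejected despite its small p-value, because the third one blocked the sequence; that gatekeeping is what lets each test use the full alpha without a multiplicity correction.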

  18. Enfermedades genéticas más frecuentes en pacientes atendidos en consulta de genética clínica [Most frequent genetic diseases in patients seen in clinical genetics consultation]

    Directory of Open Access Journals (Sweden)

    Elibett Carcasés Carcasés

    2015-02-01

    Full Text Available Estimating the prevalence of genetic diseases is difficult, among other reasons, because of their rarity. A retrospective descriptive study was carried out to identify the most prevalent genetic diseases in patients seen through this program at the Provincial Center of Medical Genetics of Las Tunas, Cuba, from 1989 to July 2014. All clinical records were reviewed. Monogenic origin predominated (69 %), with dysmorphic syndromes being the most numerous and diverse, among them the neurocutaneous syndromes, which accounted for 35 %. The monogenic genetic disease with the largest number of cases was Neurofibromatosis I, with 14.4 %, and 22.2 % of the diseases were of monogenic and dysmorphic origin. Trisomy 21 accounted for 77 % of the chromosomal causes. Among conditions of multifactorial origin, major congenital defects prevailed, among them limb reduction defects (27 %

  19. Estudo do polimorfismo genético no gene p53 (códon 72) em câncer colorretal Role of the genetic polymorphism of p53 (codon 72) gene in colorectal cancer

    Directory of Open Access Journals (Sweden)

    Jacqueline Miranda de Lima

    2006-03-01

    Full Text Available RATIONALE: Genetic polymorphisms are genetic variations that can occur in coding and non-coding sequences, leading to qualitative and/or quantitative changes in the proteins concerned. p53 is the most commonly altered gene in human cancer. The polymorphism of this gene at codon 72 arises from a single-base substitution and has been associated with a higher risk of cancer. OBJECTIVE: To determine the possible association between the codon 72 polymorphism (72 arginine/proline) of the p53 gene and colorectal cancer. PATIENTS AND METHODS: In 100 patients with colorectal cancer and 100 individuals without cancer, matched for sex and age, smoking habit and alcohol use were evaluated, together with, in the case group, disease stage, degree of differentiation and disease course. The genotype (72 arginine/proline) was determined by PCR, using primers (specific nucleotide sequences). RESULTS: The homozygous arginine/arginine genotype was found in 56% of the control group and 58% of the case group. No difference was observed between the two groups. In stage IV this genotype was more frequent than in stage I (80% versus 14%). No difference was observed between genotype variations and smoking, alcohol, clinical course or degree of differentiation. CONCLUSION: The arginine/arginine genotype was the most frequent in both groups. No correlation was found between higher cancer risk and the codon 72 proline/arginine polymorphism of the p53 gene. Despite the small number of patients with advanced-stage (IV) cancer, they had a higher prevalence of the arginine/arginine genotype. BACKGROUND: Polymorphisms are genetic variations that can occur in sequences of codons, leading to defective proteins. p53 is the most commonly affected gene in human cancer. The polymorphism of this gene occurs by a substitution of a base in codon 72 and may increase the risk of cancer. AIM: To investigate the

  20. From AWE-GEN to AWE-GEN-2d: a high spatial and temporal resolution weather generator

    Science.gov (United States)

    Peleg, Nadav; Fatichi, Simone; Paschalis, Athanasios; Molnar, Peter; Burlando, Paolo

    2016-04-01

    A new weather generator, AWE-GEN-2d (Advanced WEather GENerator for 2-Dimension grid), is developed following the philosophy of combining physical and stochastic approaches to simulate meteorological variables at high spatial and temporal resolution (e.g. 2 km x 2 km and 5 min for precipitation and cloud cover, and 100 m x 100 m and 1 h for other variables: temperature, solar radiation, vapor pressure, atmospheric pressure and near-surface wind). The model is suitable for investigating the effects of climate variability and of the temporal and spatial resolution of forcing in hydrological, ecological, agricultural and geomorphological impact studies. Using appropriate parameterization the model can be used in the context of climate change. Here we present the technical structure of AWE-GEN-2d, which is a substantial evolution of four preceding models: (i) the hourly, point-scale Advanced WEather GENerator (AWE-GEN) presented by Fatichi et al. (2011, Adv. Water Resour.), (ii) the Space-Time Realizations of Areal Precipitation (STREAP) model introduced by Paschalis et al. (2013, Water Resour. Res.), (iii) the High-Resolution Synoptically conditioned Weather Generator developed by Peleg and Morin (2014, Water Resour. Res.), and (iv) the Wind-field Interpolation by Non Divergent Schemes presented by Burlando et al. (2007, Boundary-Layer Meteorol.). AWE-GEN-2d is relatively parsimonious in terms of computational demand and allows many stochastic realizations of current and projected climates to be generated efficiently. An example of model application and testing is presented with reference to a case study in the Wallis region, an area of complex orography in the Swiss Alps.

  1. Targeted sequencing of large genomic regions with CATCH-Seq.

    Directory of Open Access Journals (Sweden)

    Kenneth Day

    Full Text Available Current target enrichment systems for large-scale next-generation sequencing typically require synthetic oligonucleotides used as capture reagents to isolate sequences of interest. The majority of target enrichment reagents are focused on gene coding regions or promoters en masse. Here we introduce development of a customizable targeted capture system using biotinylated RNA probe baits transcribed from sheared bacterial artificial chromosome clone templates that enables capture of large, contiguous blocks of the genome for sequencing applications. This clone adapted template capture hybridization sequencing (CATCH-Seq) procedure can be used to capture both coding and non-coding regions of a gene, and resolve the boundaries of copy number variations within a genomic target site. Furthermore, libraries constructed with methylated adapters prior to solution hybridization also enable targeted bisulfite sequencing. We applied CATCH-Seq to diverse targets ranging in size from 125 kb to 3.5 Mb. Our approach provides a simple and cost effective alternative to other capture platforms because of template-based, enzymatic probe synthesis and the lack of oligonucleotide design costs. Given its similarity in procedure, CATCH-Seq can also be performed in parallel with commercial systems.

  2. Analysis of ChIP-seq Data in R/Bioconductor.

    Science.gov (United States)

    de Santiago, Ines; Carroll, Thomas

    2018-01-01

    The development of novel high-throughput sequencing methods for ChIP (chromatin immunoprecipitation) has provided a very powerful tool to study gene regulation in multiple conditions at unprecedented resolution and scale. Proactive quality-control and appropriate data analysis techniques are of critical importance to extract the most meaningful results from the data. Over the last years, an array of R/Bioconductor tools has been developed allowing researchers to process and analyze ChIP-seq data. This chapter provides an overview of the methods available to analyze ChIP-seq data based primarily on software packages from the open-source Bioconductor project. Protocols described in this chapter cover basic steps including data alignment, peak calling, quality control and data visualization, as well as more complex methods such as the identification of differentially bound regions and functional analyses to annotate regulatory regions. The steps in the data analysis process were demonstrated on publicly available data sets and will serve as a demonstration of the computational procedures routinely used for the analysis of ChIP-seq data in R/Bioconductor, from which readers can construct their own analysis pipelines.

  3. CMT: a constrained multi-level thresholding approach for ChIP-Seq data analysis.

    Directory of Open Access Journals (Sweden)

    Iman Rezaeian

    Full Text Available Genome-wide profiling of DNA-binding proteins using ChIP-Seq has emerged as an alternative to ChIP-chip methods. ChIP-Seq technology offers many advantages over ChIP-chip arrays, including but not limited to less noise, higher resolution, and more coverage. Several algorithms have been developed to take advantage of these abilities and find enriched regions by analyzing ChIP-Seq data. However, the complexity of analyzing various patterns of ChIP-Seq signals still needs the development of new algorithms. Most current algorithms use various heuristics to detect regions accurately. However, despite how many formulations are available, it is still difficult to accurately determine individual peaks corresponding to each binding event. We developed Constrained Multi-level Thresholding (CMT), an algorithm used to detect enriched regions on ChIP-Seq data. CMT employs a constraint-based module that can target regions within a specific range. We show that CMT has higher accuracy in detecting enriched regions (peaks) by objectively assessing its performance relative to other previously proposed peak finders. This is shown by testing three algorithms on the well-known FoxA1 Data set, four transcription factors (with a total of six antibodies) for Drosophila melanogaster and the H3K4ac antibody dataset.

  4. expVIP: a Customizable RNA-seq Data Analysis and Visualization Platform.

    Science.gov (United States)

    Borrill, Philippa; Ramirez-Gonzalez, Ricardo; Uauy, Cristobal

    2016-04-01

    The majority of transcriptome sequencing (RNA-seq) expression studies in plants remain underutilized and inaccessible due to the use of disparate transcriptome references and the lack of skills and resources to analyze and visualize these data. We have developed expVIP, an expression visualization and integration platform, which allows easy analysis of RNA-seq data combined with an intuitive and interactive interface. Users can analyze public and user-specified data sets with minimal bioinformatics knowledge using the expVIP virtual machine. This generates a custom Web browser to visualize, sort, and filter the RNA-seq data and provides outputs for differential gene expression analysis. We demonstrate expVIP's suitability for polyploid crops and evaluate its performance across a range of biologically relevant scenarios. To exemplify its use in crop research, we developed a flexible wheat (Triticum aestivum) expression browser (www.wheat-expression.com) that can be expanded with user-generated data in a local virtual machine environment. The open-access expVIP platform will facilitate the analysis of gene expression data from a wide variety of species by enabling the easy integration, visualization, and comparison of RNA-seq data across experiments. © 2016 American Society of Plant Biologists. All Rights Reserved.

  5. Linnorm: improved statistical analysis for single cell RNA-seq expression data.

    Science.gov (United States)

    Yip, Shun H; Wang, Panwen; Kocher, Jean-Pierre A; Sham, Pak Chung; Wang, Junwen

    2017-12-15

    Linnorm is a novel normalization and transformation method for the analysis of single cell RNA sequencing (scRNA-seq) data. Linnorm is developed to remove technical noises and simultaneously preserve biological variations in scRNA-seq data, such that existing statistical methods can be improved. Using real scRNA-seq data, we compared Linnorm with existing normalization methods, including NODES, SAMstrt, SCnorm, scran, DESeq and TMM. Linnorm shows advantages in speed, technical noise removal and preservation of cell heterogeneity, which can improve existing methods in the discovery of novel subtypes, pseudo-temporal ordering of cells, clustering analysis, etc. Linnorm also performs better than existing DEG analysis methods, including BASiCS, NODES, SAMstrt, Seurat and DESeq2, in false positive rate control and accuracy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. EPCGen2 Pseudorandom Number Generators: Analysis of J3Gen

    Directory of Open Access Journals (Sweden)

    Alberto Peinado

    2014-04-01

    Full Text Available This paper analyzes the cryptographic security of J3Gen, a promising pseudo random number generator for low-cost passive Radio Frequency Identification (RFID) tags. Although J3Gen has been shown to fulfill the randomness criteria set by the EPCglobal Gen2 standard and is intended for security applications, we describe here two cryptanalytic attacks that question its security claims: (i) a probabilistic attack based on solving linear equation systems; and (ii) a deterministic attack based on the decimation of the output sequence. Numerical results, supported by simulations, show that for the specific recommended values of the configurable parameters, a low number of intercepted output bits are enough to break J3Gen. We then make some recommendations that address these issues.

  7. Development of Molecular Markers Linked to Powdery Mildew Resistance Gene Pm4b by Combining SNP Discovery from Transcriptome Sequencing Data with Bulked Segregant Analysis (BSR-Seq) in Wheat.

    Science.gov (United States)

    Wu, Peipei; Xie, Jingzhong; Hu, Jinghuang; Qiu, Dan; Liu, Zhiyong; Li, Jingting; Li, Miaomiao; Zhang, Hongjun; Yang, Li; Liu, Hongwei; Zhou, Yang; Zhang, Zhongjun; Li, Hongjie

    2018-01-01

    Powdery mildew resistance gene Pm4b, originating from Triticum persicum, is effective against the prevalent Blumeria graminis f. sp. tritici (Bgt) isolates from certain regions of wheat production in China. The lack of tightly linked molecular markers with the target gene prevents the precise identification of Pm4b during the application of molecular marker-assisted selection (MAS). The strategy that combines the RNA-Seq technique and the bulked segregant analysis (BSR-Seq) was applied in an F2:3 mapping population (237 families) derived from a pair of isogenic lines VPM1/7∗Bainong 3217 F4 (carrying Pm4b) and Bainong 3217 to develop more closely linked molecular markers. RNA-Seq analysis of the two phenotypically contrasting RNA bulks prepared from the representative F2:3 families generated 20,745,939 and 25,867,480 high-quality read pairs, and 82.8 and 80.2% of them were uniquely mapped to the wheat whole genome draft assembly for the resistant and susceptible RNA bulks, respectively. Variant calling identified 283,866 raw single nucleotide polymorphisms (SNPs) and InDels between the two bulks. The SNPs that were closely associated with the powdery mildew resistance were concentrated on chromosome 2AL. Among the 84 variants that were potentially associated with the disease resistance trait, 46 variants were enriched in an about 25 Mb region at the distal end of chromosome arm 2AL. Four Pm4b-linked SNP markers were developed from these variants. Based on the sequences of Chinese Spring where these polymorphic SNPs were located, 98 SSR primer pairs were designed to develop distal markers flanking the Pm4b gene. Three SSR markers, Xics13, Xics43, and Xics76, were incorporated in the new genetic linkage map, which located Pm4b in a 3.0 cM genetic interval spanning a 6.7 Mb physical genomic region. This region had a collinear relationship with Brachypodium distachyon chromosome 5, rice chromosome 4, and sorghum chromosome 6. Seven genes associated with

  8. Development of Molecular Markers Linked to Powdery Mildew Resistance Gene Pm4b by Combining SNP Discovery from Transcriptome Sequencing Data with Bulked Segregant Analysis (BSR-Seq) in Wheat

    Directory of Open Access Journals (Sweden)

    Peipei Wu

    2018-02-01

    Full Text Available Powdery mildew resistance gene Pm4b, originating from Triticum persicum, is effective against the prevalent Blumeria graminis f. sp. tritici (Bgt) isolates from certain regions of wheat production in China. The lack of tightly linked molecular markers with the target gene prevents the precise identification of Pm4b during the application of molecular marker-assisted selection (MAS). The strategy that combines the RNA-Seq technique and the bulked segregant analysis (BSR-Seq) was applied in an F2:3 mapping population (237 families) derived from a pair of isogenic lines VPM1/7∗Bainong 3217 F4 (carrying Pm4b) and Bainong 3217 to develop more closely linked molecular markers. RNA-Seq analysis of the two phenotypically contrasting RNA bulks prepared from the representative F2:3 families generated 20,745,939 and 25,867,480 high-quality read pairs, and 82.8 and 80.2% of them were uniquely mapped to the wheat whole genome draft assembly for the resistant and susceptible RNA bulks, respectively. Variant calling identified 283,866 raw single nucleotide polymorphisms (SNPs) and InDels between the two bulks. The SNPs that were closely associated with the powdery mildew resistance were concentrated on chromosome 2AL. Among the 84 variants that were potentially associated with the disease resistance trait, 46 variants were enriched in an about 25 Mb region at the distal end of chromosome arm 2AL. Four Pm4b-linked SNP markers were developed from these variants. Based on the sequences of Chinese Spring where these polymorphic SNPs were located, 98 SSR primer pairs were designed to develop distal markers flanking the Pm4b gene. Three SSR markers, Xics13, Xics43, and Xics76, were incorporated in the new genetic linkage map, which located Pm4b in a 3.0 cM genetic interval spanning a 6.7 Mb physical genomic region. This region had a collinear relationship with Brachypodium distachyon chromosome 5, rice chromosome 4, and sorghum chromosome 6. Seven genes associated with

  9. Introducing AstroGen: The Astronomy Genealogy Project

    OpenAIRE

    Tenn, Joseph S.

    2016-01-01

    The Astronomy Genealogy Project ("AstroGen"), a project of the Historical Astronomy Division of the American Astronomical Society (AAS), will soon appear on the AAS website. Ultimately, it will list the world's astronomers with their highest degrees, theses for those who wrote them, academic advisors (supervisors), universities, and links to the astronomers or their obituaries, their theses when on-line, and more. At present the AstroGen team is working on those who earned doctorates with ast...

  10. Genética humana e sociedade [Human genetics and society]

    OpenAIRE

    Rosa, Vivian Leyser da

    2000-01-01

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro de Ciências da Educação. Analysis of the field of studies on the public understanding of science, distinguishing the cognitive-deficit and interactive models, as well as their implications for education. Study of the landscape of current advances in human genetics from the scientific, ethical and social points of view. Analysis of aspects of the teaching of human genetics in undergraduate health courses at nine universities...

  11. Manipulación genética de seres humanos [Genetic manipulation of human beings]

    OpenAIRE

    Manuel Santos Alcántara

    2006-01-01

    The great progress made in genetics in recent years, particularly in relation to the deciphering of the human genome, has brought into public discussion the concrete possibility of genetically manipulating human beings. The genetic improvement or enhancement of human beings, called eugenics, has now become a technical reality, prompting deep ethical reflection. The basic question is the following: is that which is tech...

  12. Joint modeling of ChIP-seq data via a Markov random field model

    NARCIS (Netherlands)

    Bao, Yanchun; Vinciotti, Veronica; Wit, Ernst; 't Hoen, Peter A C

    Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments have now become routine in biology for the detection of protein-binding sites. In this paper, we present a Markov random field model for the joint analysis of multiple ChIP-seq experiments. The proposed model naturally accounts for

  13. SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation.

    Directory of Open Access Journals (Sweden)

    Wei Shen

    Full Text Available FASTA and FASTQ are basic and ubiquitous formats for storing nucleotide and protein sequences. Common manipulations of FASTA/Q files include converting, searching, filtering, deduplication, splitting, shuffling, and sampling. Existing tools only implement some of these manipulations, and not particularly efficiently, and some are only available for certain operating systems. Furthermore, the complicated installation process of required packages and running environments can render these programs less user friendly. This paper describes a cross-platform ultrafast comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OSX, and can be directly used without any dependencies or pre-configurations. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations. SeqKit is open source and available on Github at https://github.com/shenwei356/seqkit.
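    To make two of the manipulations named above concrete (filtering and deduplication), here is a minimal pure-Python sketch. It is not SeqKit and does not reflect how SeqKit is implemented; the function names, the length threshold and the example records are hypothetical, and it handles only plain multi-line FASTA text.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def filter_and_dedup(records: Iterable[Record], min_len: int) -> List[Record]:
    """Keep the first record of each distinct sequence that is long enough."""
    seen = set()
    kept = []
    for header, seq in records:
        key = seq.upper()  # treat upper and lower case as the same sequence
        if len(seq) >= min_len and key not in seen:
            seen.add(key)
            kept.append((header, seq))
    return kept


# Made-up example input.
example = """>seq1 first record
ACGT
ACGT
>seq2 too short
AC
>seq3 duplicate of seq1
acgtACGT
"""
records = filter_and_dedup(parse_fasta(example.splitlines()), min_len=4)
print(records)
```

    Holding every distinct sequence in a set is fine for small files; a dedicated tool would typically hash sequences or stream records to keep memory bounded.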

  14. Quantifying the impact of inter-site heterogeneity on the distribution of ChIP-seq data

    Directory of Open Access Journals (Sweden)

    Jonathan eCairns

    2014-11-01

    Full Text Available Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) is a valuable tool for epigenetic studies. Analysis of the data arising from ChIP-seq experiments often requires implicit or explicit statistical modelling of the read counts. The simple Poisson model is attractive, but does not provide a good fit to observed ChIP-seq data. Researchers therefore often either extend to a more general model (e.g. the Negative Binomial), and/or exclude regions of the genome that do not conform to the model. Since many modelling strategies employed for ChIP-seq data reduce to fitting a mixture of Poisson distributions, we explore the problem of inferring the optimal mixing distribution. We apply the Constrained Newton Method (CNM), which suggests the Negative Binomial - Negative Binomial (NB-NB) mixture model as a candidate for modelling ChIP-seq data. We illustrate fitting the NB-NB model with an accelerated EM algorithm on four data sets from three species. Zero-inflated models have been suggested as an approach to improve model fit for ChIP-seq data. We show that the NB-NB mixture model requires no zero-inflation and suggest that in some cases the need for zero inflation is driven by the model's inability to cope with both artefactual large read counts and the frequently observed very low read counts. We see that the CNM-based approach is a useful diagnostic for the assessment of model fit and inference in ChIP-seq data and beyond. Use of the suggested NB-NB mixture model will be of value not only when calling peaks or otherwise modelling ChIP-seq data, but also when simulating data or constructing blacklists de novo.
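    One way to see why a single Poisson fits such counts poorly is to compute the moments of a mixture of Poisson distributions. The sketch below is an illustration only, not the paper's CNM or NB-NB fitting; the function name, the two rates (a low "background" rate and a higher "enriched" rate) and the weights are hypothetical. It uses the facts that a Poisson variable with rate lambda has mean lambda and second moment lambda + lambda^2.

```python
from typing import Sequence, Tuple


def poisson_mixture_moments(weights: Sequence[float], rates: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, variance) of a finite mixture of Poisson distributions."""
    total = sum(weights)
    w = [x / total for x in weights]  # normalise the mixing weights
    mean = sum(wi * lam for wi, lam in zip(w, rates))
    # E[X^2 | lambda] = lambda + lambda^2 for a Poisson variable.
    second_moment = sum(wi * (lam + lam * lam) for wi, lam in zip(w, rates))
    return mean, second_moment - mean * mean


# A single Poisson: variance equals the mean.
single = poisson_mixture_moments([1.0], [4.0])
# Equal mix of a low and a high rate with the same overall mean of 4.
mixed = poisson_mixture_moments([0.5, 0.5], [1.0, 7.0])
print(single, mixed)
```

    Both cases have mean 4, but the mixture's variance exceeds its mean by exactly the variance of the rates. That extra spread (overdispersion) is what a single Poisson cannot reproduce and what Negative Binomial type models are meant to absorb.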

  15. Estudio molecular del gen MLL en 30 pacientes con leucemias agudas Molecular study of MLL gen in 30 patients with acute leukemias

    Directory of Open Access Journals (Sweden)

    Raquel Levón Herrera

    2000-04-01

    Full Text Available Rearrangements of the MLL gene in chromosomal band 11q23 are frequent in acute leukemias (AL) in children and in secondary AL that develop after therapy with topoisomerase II inhibitors. To a lesser extent they are also seen in adults with AL. The presence of these rearrangements is considered an indicator of poor prognosis associated with unfavorable clinical outcomes, so it is very important to test for them in AL. In this work we show the preliminary results of introducing the study of the MLL gene in our country using the Southern technique. We analyzed DNA from 30 patients with AL, including children and adults, who at the time of the study were at onset or in relapse. The molecular study was performed with the FA4 probe, which is a genomic insert of the MLL gene. Only one of the 30 patients showed rearrangement bands with 2 different restriction enzymes; the rest showed the MLL gene in germline configuration. It is worth noting that the patient with the rearrangement was a child with acute myeloblastic leukemia subtype M5b, which agrees with the literature, where these rearrangements are described as closely correlated with the myelomonocytic (M4) and monocytic (M5) subtypes of acute myeloid leukemia (AML). Rearrangements of the MLL gene in the 11q23 chromosomal band are frequent in childhood acute leukemia (AL) and in secondary AL developed after therapy with topoisomerase II inhibitors. To a lesser extent they are also seen in adults with AL. The presence of these rearrangements is considered to be a poor prognosis indicator, associated with unfavourable clinical results, which is why it is very important to carry out their assessment in AL. In this paper the authors present preliminary results from the introduction of the study of the MLL gene in our country using the Southern technique. DNA from 30 patients was analyzed, including children and adults, that at the

  16. The psychometric properties of the Persian version of the metacognitions about Smoking Questionnaire among smokers.

    Science.gov (United States)

    Najafi, Mahmoud; Khosravani, Vahid; Shahhosseini, Meysam; Afshari, Amirhossein

    2018-09-01

    It has been shown that smoking may be affected by metacognitions. This study aimed to evaluate the factor structure, reliability and validity of the Persian version of the Metacognitions about Smoking Questionnaire (MSQ) among a sample of Iranian male smokers. When the English to Persian translation of the MSQ was performed, exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) were completed according to the four-factor solution of the original MSQ. Three hundred male treatment-seeking smokers (mean age = 41.37, SD = 15.90) filled out the Persian-translated version of the MSQ, the Smoking Effects Questionnaire (SEQ), and the Nicotine Dependence Syndrome Scale (NDSS). The results of EFA revealed that the Persian version of the MSQ had a four-factor structure named positive metacognitions about cognitive regulation (PM-CR), positive metacognitions about emotional regulation (PM-ER), negative metacognitions about uncontrollability (NM-U), and negative metacognitions about cognitive interference (NM-CI). The findings of CFA also indicated that the four-factor structure of the Persian version of the MSQ had appropriate fit. Validity and reliability of the Persian version of the MSQ were found to be good. Negative metacognitions about smoking predicted nicotine dependence over and above smoking outcome expectancies. Positive metacognitions about emotion regulation explained daily cigarette use independent of smoking outcome expectancies. The findings suggested that the Persian version of the MSQ had adequate psychometric properties among Iranian male treatment-seeking smokers. Copyright © 2018 Elsevier Ltd. All rights reserved.

  17. An empirical strategy to detect bacterial transcript structure from directional RNA-seq transcriptome data.

    Science.gov (United States)

    Wang, Yejun; MacKenzie, Keith D; White, Aaron P

    2015-05-07

    As sequencing costs are being lowered continuously, RNA-seq has gradually been adopted as the first choice for comparative transcriptome studies with bacteria. Unlike microarrays, RNA-seq can directly detect cDNA derived from mRNA transcripts at a single nucleotide resolution. Not only does this allow researchers to determine the absolute expression level of genes, but it also conveys information about transcript structure. Few automatic software tools have yet been established to investigate large-scale RNA-seq data for bacterial transcript structure analysis. In this study, 54 directional RNA-seq libraries from Salmonella serovar Typhimurium (S. Typhimurium) 14028s were examined for potential relationships between read mapping patterns and transcript structure. We developed an empirical method, combined with statistical tests, to automatically detect key transcript features, including transcriptional start sites (TSSs), transcriptional termination sites (TTSs) and operon organization. Using our method, we obtained 2,764 TSSs and 1,467 TTSs for 1,331 and 844 different genes, respectively. Identification of TSSs facilitated further discrimination of 215 putative sigma 38 regulons and 863 potential sigma 70 regulons. Combining the TSSs and TTSs with intergenic distance and co-expression information, we comprehensively annotated the operon organization in S. Typhimurium 14028s. Our results show that directional RNA-seq can be used to detect transcriptional borders at an acceptable resolution of ±10-20 nucleotides. Technical limitations of the RNA-seq procedure may prevent single nucleotide resolution. The automatic transcript border detection methods, statistical models and operon organization pipeline that we have described could be widely applied to RNA-seq studies in other bacteria. Furthermore, the TSSs, TTSs, operons, promoters and untranslated regions that we have defined for S. Typhimurium 14028s may constitute valuable resources that can be used for

  18. Discovery of transcription factors and regulatory regions driving in vivo tumor development by ATAC-seq and FAIRE-seq open chromatin profiling.

    Directory of Open Access Journals (Sweden)

    Kristofer Davie

    2015-02-01

    Full Text Available Genomic enhancers regulate spatio-temporal gene expression by recruiting specific combinations of transcription factors (TFs). When TFs are bound to active regulatory regions, they displace canonical nucleosomes, making these regions biochemically detectable as nucleosome-depleted regions or accessible/open chromatin. Here we ask whether open chromatin profiling can be used to identify the entire repertoire of active promoters and enhancers underlying tissue-specific gene expression during normal development and oncogenesis in vivo. To this end, we first compare two different approaches to detect open chromatin in vivo using the Drosophila eye primordium as a model system: FAIRE-seq, based on physical separation of open versus closed chromatin; and ATAC-seq, based on preferential integration of a transposon into open chromatin. We find that both methods reproducibly capture the tissue-specific chromatin activity of regulatory regions, including promoters, enhancers, and insulators. Using both techniques, we screened for regulatory regions that become ectopically active during Ras-dependent oncogenesis, and identified 3778 regions that become (over-)activated during tumor development. Next, we applied motif discovery to search for candidate transcription factors that could bind these regions and identified AP-1 and Stat92E as key regulators. We validated the importance of Stat92E in the development of the tumors by introducing a loss of function Stat92E mutant, which was sufficient to rescue the tumor phenotype. Additionally we tested if the predicted Stat92E responsive regulatory regions are genuine, using ectopic induction of JAK/STAT signaling in developing eye discs, and observed that similar chromatin changes indeed occurred. Finally, we determine that these are functionally significant regulatory changes, as nearby target genes are up- or down-regulated. In conclusion, we show that FAIRE-seq and ATAC-seq based open chromatin profiling

  19. Pipeline for the Analysis of ChIP-seq Data and New Motif Ranking Procedure

    KAUST Repository

    Ashoor, Haitham

    2011-06-01

    This thesis presents a computational methodology for ab-initio identification of transcription factor binding sites based on ChIP-seq data. This method consists of three main steps, namely ChIP-seq data processing, motif discovery and models selection. A novel method for ranking the models of motifs identified in this process is proposed. This method combines multiple factors in order to rank the provided candidate motifs. It combines the model coverage of the ChIP-seq fragments that contain motifs from which that model is built, the suitable background data made up of shuffled ChIP-seq fragments, and the p-value that resulted from evaluating the model on actual and background data. Two ChIP-seq datasets retrieved from ENCODE project are used to evaluate and demonstrate the ability of the method to predict correct TFBSs with high precision. The first dataset relates to neuron-restrictive silencer factor, NRSF, while the second one corresponds to growth-associated binding protein, GABP. The pipeline system shows high precision prediction for both datasets, as in both cases the top ranked motif closely resembles the known motifs for the respective transcription factors.

  20. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts.

    Science.gov (United States)

    Law, Charity W; Chen, Yunshun; Shi, Wei; Smyth, Gordon K

    2014-02-03

    New normal linear modeling strategies are presented for analyzing read counts from RNA-seq experiments. The voom method estimates the mean-variance relationship of the log-counts, generates a precision weight for each observation and enters these into the limma empirical Bayes analysis pipeline. This opens access for RNA-seq analysts to a large body of methodology developed for microarrays. Simulation studies show that voom performs as well or better than count-based RNA-seq methods even when the data are generated according to the assumptions of the earlier methods. Two case studies illustrate the use of linear modeling and gene set testing methods.
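
    The transformation step described above can be sketched as follows. This is a minimal illustration and not the limma implementation: the offsets (0.5 on counts, 1 on library size) and the inverse-fourth-power weight follow the published voom description rather than this abstract, the toy counts are hypothetical, the plain per-gene standard deviation stands in for the residual standard deviation from a fitted linear model, and the smoothing step that fits the trend is omitted.

```python
import math
from statistics import mean, stdev


def log_cpm(counts, lib_sizes):
    """log2 counts-per-million, offset by 0.5 (counts) and 1 (library size)."""
    return [
        [math.log2((c + 0.5) / (n + 1.0) * 1e6) for c, n in zip(row, lib_sizes)]
        for row in counts
    ]


def mean_sqrt_sd(logcpm_rows):
    """Per-gene (mean log-cpm, sqrt of standard deviation) pairs.

    These are the points to which a smooth mean-variance trend would be fitted.
    """
    return [(mean(row), math.sqrt(stdev(row))) for row in logcpm_rows]


def precision_weight(trend_sqrt_sd):
    """Turn a fitted sqrt-sd value into a precision weight (1 / variance)."""
    return trend_sqrt_sd ** -4


# Hypothetical toy data: 3 genes (rows) x 4 samples (columns).
counts = [[10, 12, 9, 11], [100, 130, 90, 110], [0, 1, 0, 2]]
lib_sizes = [sum(col) for col in zip(*counts)]
points = mean_sqrt_sd(log_cpm(counts, lib_sizes))
```

    In a full analysis the weights would come from the fitted trend evaluated at each observation's expected count, not from the raw per-gene points.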

  1. LipidSeq: a next-generation clinical resequencing panel for monogenic dyslipidemias

    Science.gov (United States)

    Johansen, Christopher T.; Dubé, Joseph B.; Loyzer, Melissa N.; MacDonald, Austin; Carter, David E.; McIntyre, Adam D.; Cao, Henian; Wang, Jian; Robinson, John F.; Hegele, Robert A.

    2014-01-01

    We report the design of a targeted resequencing panel for monogenic dyslipidemias, LipidSeq, for the purpose of replacing Sanger sequencing in the clinical detection of dyslipidemia-causing variants. We also evaluate the performance of the LipidSeq approach versus Sanger sequencing in 84 patients with a range of phenotypes including extreme blood lipid concentrations as well as additional dyslipidemias and related metabolic disorders. The panel performs well, with high concordance (95.2%) in samples with known mutations based on Sanger sequencing and a high detection rate (57.9%) of mutations likely to be causative for disease in samples not previously sequenced. Clinical implementation of LipidSeq has the potential to aid in the molecular diagnosis of patients with monogenic dyslipidemias with a high degree of speed and accuracy and at lower cost than either Sanger sequencing or whole exome sequencing. Furthermore, LipidSeq will help to provide a more focused picture of monogenic and polygenic contributors that underlie dyslipidemia while excluding the discovery of incidental pathogenic clinically actionable variants in nonmetabolism-related genes, such as oncogenes, that would otherwise be identified by a whole exome approach, thus minimizing potential ethical issues. PMID:24503134

  2. Genética em transtornos alimentares: ampliando os horizontes de pesquisa

    Directory of Open Access Journals (Sweden)

    Pinheiro Andréa Poyastro

    2006-01-01

    Full Text Available OBJECTIVE: To review the current literature on genetic research in eating disorders and to discuss issues relevant to developing a genetic research project in this area in Brazil. METHOD: The review used the Medline database, covering 1984 to May 2005, with the following search terms: "anorexia nervosa", "bulimia nervosa", "eating disorders", "binge eating disorder", "family studies", "twin studies", "molecular genetics studies". RESULTS: Current data point to a relevant contribution of genetic factors to susceptibility to anorexia and bulimia nervosa. Genetic research in admixed populations must take into account sample size, genotyping density, and population stratification. Admixture mapping makes it possible to estimate the genetic structure of these populations and to locate genes related to ethnic variation in diseases or traits of interest. CONCLUSIONS: The development of a large collaborative initiative on the genetics of eating disorders in Brazil and Latin America will make it possible to study genetic factors in eating disorders in the context of inter-ethnic groups, and to integrate a new biological perspective into the etiology of these disorders.

  3. Evaluation of Xylella fastidiosa genetic diversity by fAFPL markers Diversidade genética de Xylella fastidiosa avaliada por marcadores fAFPL

    Directory of Open Access Journals (Sweden)

    Luciano Takeshi Kishi

    2008-03-01

    Full Text Available The first phytopathogenic bacterium to have its DNA entirely sequenced has been detected and isolated from different host plants in several geographic regions. Although it causes diseases in economically important crops such as citrus, coffee, and grapevine, little is known about the genetic relationships among different strains. Currently, all strains are grouped as a single species, Xylella fastidiosa, despite colonizing different hosts that develop distinct symptoms and having different physiological and microbiological conditions. Genetic diversity among X. fastidiosa strains has been detected by a range of techniques, from culture-based to molecular methods. However, little is known about the phylogenetic relationships among Brazilian strains obtained from coffee and citrus plants. To evaluate them, fAFLP markers were used to assess genetic diversity and phylogenetic relationships among Brazilian and foreign strains. fAFLP is an efficient, highly reproducible technique that is currently used for bacterial typing and classification. The results showed that Brazilian strains are genetically diverse and that the strains in this study grouped distinctly according to host and geographical origin: citrus-coffee, temecula-grapevine-mulberry, and plum-elm.

  4. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Science.gov (United States)

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
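
    Three of the transformations named above can be illustrated with a short sketch. The exact definitions used in the study are not given in this abstract, so the choices here are assumptions: a log2 transform with a pseudocount of 1, standardization with the population standard deviation, and a rank transform that assigns tied values their average rank. The example counts are hypothetical.

```python
import math
from statistics import mean, pstdev


def log_transform(x):
    """log2(value + 1); the pseudocount of 1 is an assumed convention."""
    return [math.log2(v + 1.0) for v in x]


def standardize(x):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    m, s = mean(x), pstdev(x)
    return [(v - m) / s for v in x]


def rank_transform(x):
    """Replace each value by its 1-based rank; ties share the average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Hypothetical counts for one gene across four samples, with one extreme value.
gene = [5, 100000, 5, 20]
ranked = rank_transform(gene)
```

    The rank transform leaves the extreme value only one step above its neighbour, which shows why such transformations are insensitive to the skewness and outliers mentioned in the abstract.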

  5. Statistical modeling of isoform splicing dynamics from RNA-seq time series data.

    Science.gov (United States)

    Huang, Yuanhua; Sanguinetti, Guido

    2016-10-01

    Isoform quantification is an important goal of RNA-seq experiments, yet it remains problematic for genes with low expression or several isoforms. These difficulties may in principle be ameliorated by exploiting correlated experimental designs, such as time series or dosage response experiments. Time series RNA-seq experiments, in particular, are becoming increasingly popular, yet there are no methods that explicitly leverage the experimental design to improve isoform quantification. Here, we present DICEseq, the first isoform quantification method tailored to correlated RNA-seq experiments. DICEseq explicitly models the correlations between different RNA-seq experiments to aid the quantification of isoforms across experiments. Numerical experiments on simulated datasets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. On real datasets, our results show that DICEseq provides substantially more reproducible and robust quantifications, increasing the correlation of estimates from replicate datasets by up to 10% on genes with low or moderate expression levels (bottom third of all genes). Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Our results have strong implications for the design of RNA-seq experiments, and offer a novel tool for improved analysis of such datasets. Python code is freely available at http://diceseq.sf.net. Contact: G.Sanguinetti@ed.ac.uk. Supplementary data are available at Bioinformatics online.

  6. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Directory of Open Access Journals (Sweden)

    Isabella Zwiener

    Full Text Available Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.

  7. Resistência genética em genótipos de feijoeiro a Curtobacterium flaccumfaciens pv. flaccumfaciens Genetic resistance to Curtobacterium flaccumfaciens pv. flaccumfaciens in bean genotypes

    Directory of Open Access Journals (Sweden)

    Valmir Luiz de Souza

    2006-09-01

    Full Text Available Curtobacterium flaccumfaciens pv. flaccumfaciens (Cff), the causal agent of curtobacterium wilt in common bean (Phaseolus vulgaris), is a vascular pathogen that is difficult to control. The disease was first detected in Brazil in the 1995 rainy-season crop, in the State of São Paulo. Because the disease is difficult to control, genetic resistance has been the best option. The objective of this work was to evaluate the reaction of common bean genotypes to curtobacterium wilt, using 333 accessions from the common bean germplasm bank of the Instituto Agronômico de Campinas (IAC). Highly resistant and susceptible genotypes were then selected in order to compare colonization of the xylem vessels by Cff using scanning electron microscopy. The resistance screening indicated genetic variability among the 333 genotypes evaluated against Cff isolate Feij 2634. The materials were classified into 4 resistance groups: 29 genotypes (8.7%) behaved as highly resistant, 13 genotypes (3.9%) as resistant, 18 genotypes (5%) as moderately resistant, and 273 genotypes (81%) as susceptible. Based on these results, about 18% of the bean genotypes, from highly resistant to moderately resistant, may be useful to the breeding program as a source of genes for resistance to Cff. Scanning electron microscopy revealed, in highly resistant genotypes, several agglutinations of the bacterium enveloped by filaments and lace-like structures over pits in the xylem vessel wall, which were not seen in susceptible genotypes, suggesting the activation of structural and biochemical defense mechanisms in resistant plants.

  8. Impact of genome assembly status on ChIP-Seq and ChIP-PET data mapping

    Directory of Open Access Journals (Sweden)

    Sachs Laurent

    2009-12-01

    Full Text Available Background: ChIP-Seq and ChIP-PET can potentially be used with any genome for genome-wide profiling of protein-DNA interaction sites. Unfortunately, it is probable that most genome assemblies will never reach the quality of the human genome assembly. Therefore, it remains to be determined whether ChIP-Seq and ChIP-PET are practicable with genome sequences other than a few (e.g. human and mouse). Findings: Here, we used in silico simulations to assess the impact of completeness or fragmentation of genome assemblies on ChIP-Seq and ChIP-PET data mapping. Conclusions: Most currently published genome assemblies are suitable for mapping the short sequence tags produced by ChIP-Seq or ChIP-PET.

  9. Impact of artefact removal on ChIP quality metrics in ChIP-seq and ChIP-exo data.

    Directory of Open Access Journals (Sweden)

    Thomas Samuel Carroll

    2014-04-01

    Full Text Available With the advent of ChIP-seq multiplexing technologies and the subsequent increase in ChIP-seq throughput, the development of working standards for the quality assessment of ChIP-seq studies has received significant attention. The ENCODE consortium’s large scale analysis of transcription factor binding and epigenetic marks as well as concordant work on ChIP-seq by other laboratories has established a new generation of ChIP-seq quality control measures. The use of these metrics alongside common processing steps has however not been evaluated. In this study, we investigate the effects of blacklisting and removal of duplicated reads on established metrics of ChIP-seq quality and show that the interpretation of these metrics is highly dependent on the ChIP-seq preprocessing steps applied. Further to this we perform the first investigation of the use of these metrics for ChIP-exo data and make recommendations for the adaptation of the NSC statistic to allow for the assessment of ChIP-exo efficiency.

  10. Consideraciones genéticas sobre las dislipidemias y la aterosclerosis

    OpenAIRE

    Julio César Fernández Travieso

    2008-01-01

    The interaction between genetic and environmental factors explains many aspects of atherosclerosis, and genetic variations constitute risk markers for coronary heart disease (CHD), which ranks first among the causes of morbidity and mortality worldwide. Familial predisposition to CHD, together with rapid advances in DNA analysis techniques and the availability of human genome sequences, has guided research into genetic alterations ...

  11. Towards the integration, annotation and association of historical microarray experiments with RNA-seq.

    Science.gov (United States)

    Chavan, Shweta S; Bauer, Michael A; Peterson, Erich A; Heuck, Christoph J; Johann, Donald J

    2013-01-01

    Transcriptome analysis by microarrays has produced important advances in biomedicine. For instance in multiple myeloma (MM), microarray approaches led to the development of an effective disease subtyping via cluster assignment, and a 70 gene risk score. Both enabled an improved molecular understanding of MM, and have provided prognostic information for the purposes of clinical management. Many researchers are now transitioning to Next Generation Sequencing (NGS) approaches and RNA-seq in particular, due to its discovery-based nature, improved sensitivity, and dynamic range. Additionally, RNA-seq allows for the analysis of gene isoforms, splice variants, and novel gene fusions. Given the voluminous amounts of historical microarray data, there is now a need to associate and integrate microarray and RNA-seq data via advanced bioinformatic approaches. Custom software was developed following a model-view-controller (MVC) approach to integrate Affymetrix probe set-IDs, and gene annotation information from a variety of sources. The tool/approach employs an assortment of strategies to integrate, cross reference, and associate microarray and RNA-seq datasets. Output from a variety of transcriptome reconstruction and quantitation tools (e.g., Cufflinks) can be directly integrated, and/or associated with Affymetrix probe set data, as well as necessary gene identifiers and/or symbols from a diversity of sources. Strategies are employed to maximize the annotation and cross referencing process. Custom gene sets (e.g., MM 70 risk score (GEP-70)) can be specified, and the tool can be directly assimilated into an RNA-seq pipeline. A novel bioinformatic approach to aid in the facilitation of both annotation and association of historic microarray data, in conjunction with richer RNA-seq data, is now assisting with the study of MM cancer biology.

  12. Divergencia genética en poblaciones peruanas detectada a partir de las frecuencias haplotípicas del mtDNA y del gen nuclear MBL

    Directory of Open Access Journals (Sweden)

    Jesús H. Córdova

    2011-01-01

    Full Text Available Objectives: To advance knowledge of the origin of the Peruvian populations studied, in a phylogeographic context. Design: Population genetics study. Institutions: Laboratorio de Genética Humana, Facultad de Ciencias Biológicas, Universidad Nacional Mayor de San Marcos, and Instituto de Genética y Biología Molecular, Facultad de Medicina, Universidad San Martín de Porras, Lima, Peru. Participants: Seven Peruvian populations. Methodology: Comparative analysis of results from the study of mtDNA and the nuclear gene MBL in seven Peruvian populations, processed separately and then combined, using the program PHYLIP 3.65 to obtain FST values of genetic differentiation and to build distance trees with the UPGMA algorithm, followed by analysis of the clusters generated. Main outcome measures: Genetic trees generated. Results: Processed separately, the trees generated for each genetic marker had their own topologies, which differed from each other. Processed in combination, the resulting tree showed that the greatest genetic differentiation was found in the known islands of Lake Titicaca (Puno, Peru), namely Taquile, Amantani and Anapia. This differentiation was rated very high, with FST values of 0.3113, 0.2949 and 0.3348 relative to the populations studied outside the Department of Puno (Chachapoyas, Pucallpa and Chiclayo, respectively), as well as relative to the Uro of Puno and of Lake Titicaca itself (0.2837). Outside Puno, the Chachapoyas-Pucallpa pair of populations was the least divergent, with an FST value of 0.0108 between them, rated as small. Conclusions: The tree obtained by processing the markers via a combined matrix showed that the populations inhabiting the islands of Taquile, Amantani and Anapia diverge markedly from the remaining four

  13. Hyb-Seq: Combining Target Enrichment and Genome Skimming for Plant Phylogenomics

    Directory of Open Access Journals (Sweden)

    Kevin Weitemier

    2014-08-01

    Full Text Available Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp), followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics.

  14. Towards an International Culture: Gen Y Students and SNS?

    Science.gov (United States)

    Lichy, Jessica

    2012-01-01

    This article reports the findings of a small-scale investigation into the Internet user behaviour of generation Y (Gen Y) students, with particular reference to social networking sites. The study adds to the literature on cross-cultural Internet user behaviour with specific reference to Gen Y and social networking. It compares how a cohort of…

  15. Introducing AstroGen: the Astronomy Genealogy Project

    Science.gov (United States)

    Tenn, Joseph S.

    2016-12-01

    The Astronomy Genealogy Project (AstroGen), a project of the Historical Astronomy Division of the American Astronomical Society (AAS), will soon appear on the AAS website. Ultimately, it will list the world's astronomers with their highest degrees, theses for those who wrote them, academic advisors (supervisors), universities, and links to the astronomers or their obituaries, their theses when online, and more. At present the AstroGen team is working on those who earned doctorates with astronomy-related theses. We show what can be learned already, with just ten countries essentially completed.

  16. An electronic flight bag for NextGen avionics

    Science.gov (United States)

    Zelazo, D. Eyton

    2012-06-01

    The introduction of the Next Generation Air Transportation System (NextGen) initiative by the Federal Aviation Administration (FAA) will impose new requirements for cockpit avionics. A similar program, the Single European Sky Air Traffic Management Research (SESAR) initiative, is under way in Europe under the European Organisation for the Safety of Air Navigation (Eurocontrol). NextGen will require aircraft to utilize Automatic Dependent Surveillance-Broadcast (ADS-B) in/out technology, requiring substantial changes to existing cockpit display systems. There are two ways that aircraft operators can upgrade their aircraft in order to utilize ADS-B technology. The first is to replace existing primary flight displays with new displays that are ADS-B compatible. The second, less costly approach is to install an advanced Class 3 Electronic Flight Bag (EFB) system. Installing Class 3 EFBs in the cockpit will allow aircraft operators to adopt ADS-B technology in less time and at lower implementation cost, and will provide additional benefits to the operator. This paper describes a Class 3 EFB, the Nexis™ Flight-Intelligence System, which has been designed to give users a direct interface with NextGen avionics sensors while also providing the pilot with all the information needed to meet NextGen requirements.

  17. Clonal study of avian Escherichia coli strains by fliC conserved-DNA-sequence regions analysis Estudo clonal de Escherichia coli aviário por análise de seqüências de DNA conservadas do gene fliC

    Directory of Open Access Journals (Sweden)

    Tatiana Amabile de Campos

    2008-10-01

    Full Text Available The clonal relationship among avian Escherichia coli strains and their genetic proximity to human pathogenic E. coli, Salmonella enterica, Yersinia enterocolitica and Proteus mirabilis was determined by DNA sequencing of the conserved 5' and 3' regions of the fliC gene (the flagellin-encoding gene). Among 30 commensal avian E. coli strains and 49 pathogenic avian E. coli (APEC) strains, 24 commensal and 39 APEC strains harbored the fliC gene, with fragment sizes varying from 670 bp to 1,900 bp. Comparative analysis of these regions allowed the construction of a similarity dendrogram with two main clusters: one composed mainly of APEC strains and H-antigens from human E. coli, and another composed of commensal avian E. coli strains, S. enterica, and other H-antigens from human E. coli. Overall, this work demonstrated that fliC conserved regions may be associated with pathogenic clones of APEC strains, and also showed great similarity between APEC and H-antigens of E. coli strains isolated from humans. These data add evidence that APEC strains may pose a zoonotic risk.

  18. Uso de marcadores moleculares na análise da variabilidade genética em acerola (Malpighia emarginata D.C.)

    Directory of Open Access Journals (Sweden)

    SALLA MARIA FERNANDA SPEGIORIN

    2002-01-01

    Full Text Available Acerola (Malpighia emarginata) is a tropical fruit tree native to Central America and northern South America, of great economic and social importance because of its high vitamin C (ascorbic acid) content. Acerola orchards have preferentially been established by vegetative propagation. However, sexual propagation by seed is also used and reveals a high degree of polymorphism in the crop, making it possible to identify genotypes carrying traits of agronomic interest. Twenty-four acerola accessions from the Active Germplasm Bank of the Universidade Estadual de Londrina were analyzed using RAPD (random amplified polymorphic DNA) markers and markers obtained with simple sequence repeat (SSR) primers. A total of 164 and 73 markers were obtained with RAPD and SSR primers, respectively. The markers were analyzed using the UPGMA clustering method. Comparative analysis of the dendrograms generated with the RAPD primers and with the SSR primers showed that, while some accessions were placed in different groups, others showed the same association. However, greater polymorphism among accessions was detected with the RAPD primers. Analysis of the results revealed the high variability contained in the collection, making it possible to relate the degree of genetic similarity obtained from DNA markers to morphological traits shared among the accessions.
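
    The UPGMA clustering named in this record can be sketched in a few lines. This is a generic illustration of the algorithm, not the software used in the study. The accession labels and distances are hypothetical, and the tree is returned as nested tuples without branch lengths.

```python
from itertools import combinations


def upgma(labels, distances):
    """Build a UPGMA tree as nested tuples.

    `distances` maps frozenset({a, b}) to the distance between labels a and b
    and must contain every pair of labels.
    """
    d = dict(distances)
    sizes = {label: 1 for label in labels}  # cluster -> number of leaves
    while len(sizes) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # Distance to the new cluster is the size-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Hypothetical pairwise distances among four accessions.
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], toy)
```

    With these toy values A and B join first, C joins that pair next, and D attaches last, giving the nested grouping that a dendrogram would draw.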

  19. Finding the breech: Influence of breech presentation on mode of delivery based on timing of diagnosis, attempt at external cephalic version, and provider success with version.

    Science.gov (United States)

    Andrews, Suzanne; Leeman, Lawrence; Yonke, Nicole

    2017-09-01

    Breech presentation affects 3-4% of pregnancies at term and malpresentation is the primary indication for 10-15% of cesarean deliveries. External cephalic version is an effective intervention that can decrease the need for cesarean delivery; however, timely identification of breech presentation is required. We hypothesized that women with a fetus in a breech presentation that is diagnosed after 38 weeks' estimated gestational age have a decreased likelihood of external cephalic version attempted and an increased likelihood of cesarean delivery. This was a retrospective cohort study. A chart review was performed for 251 women with breech presentation at term presenting to our tertiary referral university hospital for external cephalic version, cesarean for breech presentation, or vaginal breech delivery. Vaginal delivery was significantly more likely (31.1% vs 12.5%; P ... external cephalic version was offered, and subsequently attempted in a greater proportion of women diagnosed before 38 weeks. External cephalic version was more successful when performed by physicians with greater procedural volume during the 3.5 year period of the study (59.1% for providers performing at least 10 procedures vs 31.3% if performing fewer than 10 procedures, P ... external cephalic version.

  20. Basic concepts of genetic programming

    Directory of Open Access Journals (Sweden)

    José Jesús Martínez Páez

    2001-04-01

    Full Text Available Genetic Programming (GP) is an offshoot of Genetic Algorithms in which the chromosomes undergoing adaptation are themselves computer programs. Specialized genetic operators, which generalize sexual recombination and mutation, are applied to the tree-structured computer programs under adaptation.
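
    The tree-based operators mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the article: the tuple representation of programs, the operator set, and the names evaluate, paths, get, replace, crossover and mutate are all assumptions chosen for illustration.

```python
import random

# Assumed representation: a program is a nested tuple (op, left, right),
# or a leaf that is either the variable 'x' or a numeric constant.
OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def evaluate(tree, x):
    """Evaluate a program tree for a given value of x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == 'x' else tree

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    """Return a copy of tree with the node at path replaced by subtree."""
    if not path:
        return subtree
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], subtree)
    return tuple(node)

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b onto a random node of a."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb))

def mutate(tree, rng):
    """Mutation: replace a randomly chosen subtree with a random leaf."""
    p = rng.choice(list(paths(tree)))
    leaf = rng.choice(['x', rng.randint(0, 9)])
    return replace(tree, p, leaf)
```

    Because both operators return new tuples, parents are never modified, which keeps a population of programs safe to reuse across generations.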

  1. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome

    Directory of Open Access Journals (Sweden)

    Dewey Colin N

    2011-08-01

    Full Text Available Abstract Background RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. Results We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. Conclusions RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost
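
    One common way to use reads that map to several transcripts is an expectation-maximization (EM) loop that splits each ambiguous read in proportion to the current abundance estimates. The sketch below is a toy illustration under strong assumptions (no transcript lengths, no sequencing-error model, no read-start distribution); it is not RSEM's actual model, and em_abundance is a hypothetical name.

```python
def em_abundance(read_hits, n_transcripts, n_iter=100):
    """Estimate relative transcript abundances from possibly ambiguous reads.

    read_hits: one list per read, holding the indices of the transcripts
               that the read aligns to (assumed input format).
    Returns a list of proportions that sums to 1.
    """
    theta = [1.0 / n_transcripts] * n_transcripts  # start from uniform abundances
    for _ in range(n_iter):
        counts = [0.0] * n_transcripts
        # E-step: share each read among its candidate transcripts
        # in proportion to their current abundance estimates.
        for hits in read_hits:
            total = sum(theta[t] for t in hits)
            for t in hits:
                counts[t] += theta[t] / total
        # M-step: renormalize the expected counts into proportions.
        n = sum(counts)
        theta = [c / n for c in counts]
    return theta
```

    With three reads unique to transcript 0 and two reads shared between transcripts 0 and 1, the shared reads are progressively assigned to transcript 0, because nothing supports transcript 1 on its own.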

  2. WaveSeq: a novel data-driven method of detecting histone modification enrichments using wavelets.

    Directory of Open Access Journals (Sweden)

    Apratim Mitra

    Full Text Available BACKGROUND: Chromatin immunoprecipitation followed by next-generation sequencing is a genome-wide analysis technique that can be used to detect various epigenetic phenomena such as transcription factor binding sites and histone modifications. Histone modification profiles can be either punctate or diffuse, which makes it difficult to distinguish regions of enrichment from background noise. With the discovery of histone marks having a wide variety of enrichment patterns, there is an urgent need for analysis methods that are robust to various data characteristics and capable of detecting a broad range of enrichment patterns. RESULTS: To address these challenges we propose WaveSeq, a novel data-driven method of detecting regions of significant enrichment in ChIP-Seq data. Our approach utilizes the wavelet transform, is free of distributional assumptions and is robust to diverse data characteristics such as low signal-to-noise ratios and broad enrichment patterns. Using publicly available datasets we showed that WaveSeq compares favorably with other published methods, exhibiting high sensitivity and precision for both punctate and diffuse enrichment regions even in the absence of a control data set. The application of our algorithm to a complex histone modification data set helped make novel functional discoveries, which further underlined its utility in such an experimental setup. CONCLUSIONS: WaveSeq is a highly sensitive method capable of accurate identification of enriched regions in a broad range of data sets. WaveSeq can detect both narrow and broad peaks with a high degree of accuracy even in low signal-to-noise ratio data sets. WaveSeq is also suited for application in complex experimental scenarios, helping make biologically relevant functional discoveries.
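
    To give a feel for what a wavelet transform does to binned read counts, here is a sketch of a single level of the unnormalized Haar transform. The abstract does not say which wavelet WaveSeq uses, so the choice of Haar and the name haar_step are assumptions made purely for illustration.

```python
def haar_step(signal):
    """One level of the unnormalized Haar transform.

    Returns (approx, detail): pairwise averages and pairwise half-differences
    of neighboring values. A trailing unpaired value is ignored.
    """
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / 2 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in pairs]
    return approx, detail
```

    Applying haar_step repeatedly to the approx output gives coarser and coarser views of the signal, so narrow peaks show up in the fine-scale detail coefficients while broad enrichment survives in the coarse approximations.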

  3. EMSAR: estimation of transcript abundance from RNA-seq data by mappability-based segmentation and reclustering.

    Science.gov (United States)

    Lee, Soohyun; Seo, Chae Hwa; Alver, Burak Han; Lee, Sanghyuk; Park, Peter J

    2015-09-03

    RNA-seq has been widely used for genome-wide expression profiling. RNA-seq data typically consists of tens of millions of short sequenced reads from different transcripts. However, due to sequence similarity among genes and among isoforms, the source of a given read is often ambiguous. Existing approaches for estimating expression levels from RNA-seq reads tend to compromise between accuracy and computational cost. We introduce a new approach for quantifying transcript abundance from RNA-seq data. EMSAR (Estimation by Mappability-based Segmentation And Reclustering) groups reads according to the set of transcripts to which they are mapped and finds maximum likelihood estimates using a joint Poisson model for each optimal set of segments of transcripts. The method uses nearly all mapped reads, including those mapped to multiple genes. With an efficient transcriptome indexing based on modified suffix arrays, EMSAR minimizes the use of CPU time and memory while achieving accuracy comparable to the best existing methods. EMSAR is a method for quantifying transcripts from RNA-seq data with high accuracy and low computational cost. EMSAR is available at https://github.com/parklab/emsar.
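
    The grouping step described above, in which reads are binned by the set of transcripts they map to, can be sketched as follows. The input format (a dict from read ID to transcript IDs) and the name group_reads are assumptions for illustration; this is not EMSAR's implementation, which works on a suffix-array index.

```python
from collections import Counter

def group_reads(alignments):
    """Count reads per distinct set of transcripts they align to.

    alignments: dict mapping a read ID to an iterable of transcript IDs
                (assumed input format).
    Returns a Counter keyed by frozenset of transcript IDs.
    """
    return Counter(frozenset(transcripts) for transcripts in alignments.values())
```

    Reads that hit the same transcripts in a different order fall into the same group, so a downstream model only needs one count per transcript set rather than one record per read.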

  4. Single-Cell RNA-Seq of Mouse Dopaminergic Neurons Informs Candidate Gene Selection for Sporadic Parkinson Disease.

    Science.gov (United States)

    Hook, Paul W; McClymont, Sarah A; Cannon, Gabrielle H; Law, William D; Morton, A Jennifer; Goff, Loyal A; McCallion, Andrew S

    2018-03-01

    Genetic variation modulating risk of sporadic Parkinson disease (PD) has been primarily explored through genome-wide association studies (GWASs). However, like many other common genetic diseases, the impacted genes remain largely unknown. Here, we used single-cell RNA-seq to characterize dopaminergic (DA) neuron populations in the mouse brain at embryonic and early postnatal time points. These data facilitated unbiased identification of DA neuron subpopulations through their unique transcriptional profiles, including a postnatal neuroblast population and substantia nigra (SN) DA neurons. We use these population-specific data to develop a scoring system to prioritize candidate genes in all 49 GWAS intervals implicated in PD risk, including genes with known PD associations and many with extensive supporting literature. As proof of principle, we confirm that the nigrostriatal pathway is compromised in Cplx1-null mice. Ultimately, this systematic approach establishes biologically pertinent candidates and testable hypotheses for sporadic PD, informing a new era of PD genetic research. Copyright © 2018 American Society of Human Genetics. All rights reserved.

  5. Cloning and molecular phylogeny of a segment of the actin-coding gene of Myrciaria dubia “camu-camu”: a candidate reference gene

    Directory of Open Access Journals (Sweden)

    Juan Carlos Castro Gómez

    2012-12-01

    Full Text Available Myrciaria dubia “camu-camu” is an Amazonian fruit tree characterized by wide variation in its vitamin C content. However, molecular genetic studies that could explain this variation are limited. Our objective was therefore to clone a segment of the actin-coding gene of M. dubia and determine its molecular phylogeny. Samples were obtained from the INIA germplasm collection. RNA was then purified, and a segment of the gene was amplified by RT-PCR with degenerate primers. Specific primers for real-time PCR were designed from the sequence obtained. The results show that a segment of the actin-coding gene of M. dubia was isolated, cloned and sequenced, and that its expression was detected in leaves, pulp and peel of M. dubia. Thus, with the support of bioinformatics tools and molecular biology techniques, we have isolated, cloned and sequenced a segment of the actin-coding gene of M. dubia. Likewise, the analyses show that the gene is expressed at similar levels in leaves, pulp and peel of M. dubia. However, further experiments are needed to verify the stability of its expression.

  6. Safety Design Criteria (SDC) for Gen-IV Sodium-cooled Fast Reactor

    International Nuclear Information System (INIS)

    Nakai, Ryodai

    2013-01-01

    SDC Development Background & Objectives:
    • Safety Design Criteria (SDC) development for Gen-IV SFR:
      – Proposed at the GIF Policy Group (PG) meeting in October 2010.
      – SDC “harmonization” is increasingly important:
        • For the realization of enhanced safety designs that meet the Gen-IV safety goals and the safety approach common to SFR systems;
        • For preparation for the forthcoming licensing in the near future;
        • Because Gen-IV SFR are progressing into the conceptual design stage.
    • The SDC is the reference criteria:
      – Of the designs of safety-related Structures, Systems & Components that are specific to the SFR system;
      – For clarifying the requisites systematically & comprehensively;
      – When the technology developers apply the basic safety approach and use the codes & standards for conceptual design of the Gen-IV SFR system.

  7. Genomic approach to the genetic diagnosis of inherited retinal dystrophies and the search for new related genes

    OpenAIRE

    González del Pozo, María

    2014-01-01

    Genetically diagnosing families affected by one of the inherited retinal dystrophies (IRD) is, from the geneticist's point of view, an arduous and complicated task, given the large number of genes and mutations reported to date. The great clinical and genetic heterogeneity that characterizes this group of diseases is undoubtedly the greatest obstacle to their genetic resolution. In this scenario, the use of increasingly powerful tools is indispensabl...

  8. Genetics and leprosy

    OpenAIRE

    Beiguelman Bernardo

    2002-01-01

    The different lines of research used to investigate the importance of human hereditary factors in determining resistance/susceptibility to infection by Mycobacterium leprae are discussed in this paper. A synthesis of these approaches made it possible to analyze the results of investigations on the association of leprosy with genetic polymorphisms, the familial distribution of leprosy, the prevalence of leprosy and genetic distance, the concordance of leprosy in twins and ...

  9. Mutation analysis of the HBV X protein gene in patients with acute hepatitis B in Manado

    OpenAIRE

    Fatimawali; Kepel, Billy

    2014-01-01

    Factors that influence the progression of chronic hepatitis B to liver cancer include mutations in the x gene. This study aimed to identify the HBV X protein gene and to analyze whether gene mutations associated with the emergence of malignant tumors in liver cirrhosis (HCC) had occurred. The study used previously designed primers for the nested PCR procedure. Nested PCR was performed on 10 patient HBV DNA samples to amplify the x gene DNA fragment, followed by ...

  10. Quartz-Seq2: a high-throughput single-cell RNA-sequencing method that effectively uses limited sequence reads.

    Science.gov (United States)

    Sasagawa, Yohei; Danno, Hiroki; Takada, Hitomi; Ebisawa, Masashi; Tanaka, Kaori; Hayashi, Tetsutaro; Kurisaki, Akira; Nikaido, Itoshi

    2018-03-09

    High-throughput single-cell RNA-seq methods assign limited unique molecular identifier (UMI) counts as gene expression values to single cells from shallow sequence reads and detect limited gene counts. We thus developed a high-throughput single-cell RNA-seq method, Quartz-Seq2, to overcome these issues. Our improvements in the reaction steps make it possible to effectively convert initial reads to UMI counts, at a rate of 30-50%, and detect more genes. To demonstrate the power of Quartz-Seq2, we analyzed approximately 10,000 transcriptomes from in vitro embryonic stem cells and an in vivo stromal vascular fraction with a limited number of reads.

  11. RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application.

    Science.gov (United States)

    D'Antonio, Mattia; D'Onorio De Meo, Paolo; Pallocca, Matteo; Picardi, Ernesto; D'Erchia, Anna Maria; Calogero, Raffaele A; Castrignanò, Tiziana; Pesole, Graziano

    2015-01-01

    The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulatory pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Through a user-friendly web interface, the RAP workflow can be suitably customized by the user and is automatically executed on our cloud computing environment. This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export

  12. Using RAD-seq to recognize sex-specific markers and sex chromosome systems.

    Science.gov (United States)

    Gamble, Tony

    2016-05-01

    Next-generation sequencing methods have initiated a revolution in molecular ecology and evolution (Tautz et al. ). Among the most impressive of these sequencing innovations is restriction site-associated DNA sequencing or RAD-seq (Baird et al. ; Andrews et al. ). RAD-seq uses the Illumina sequencing platform to sequence fragments of DNA cut by a specific restriction enzyme and can generate tens of thousands of molecular genetic markers for analysis. One of the many uses of RAD-seq data has been to identify sex-specific genetic markers, markers found in one sex but not the other (Baxter et al. ; Gamble & Zarkower ). Sex-specific markers are a powerful tool for biologists. At their most basic, they can be used to identify the sex of an individual via PCR. This is useful in cases where a species lacks obvious sexual dimorphism at some or all life history stages. For example, such tests have been important for studying sex differences in life history (Sheldon ; Mossman & Waser ), the management and breeding of endangered species (Taberlet et al. ; Griffiths & Tiwari ; Robertson et al. ) and sexing embryonic material (Hacker et al. ; Smith et al. ). Furthermore, sex-specific markers allow recognition of the sex chromosome system in cases where standard cytogenetic methods fail (Charlesworth & Mank ; Gamble & Zarkower ). Thus, species with male-specific markers have male heterogamety (XY) while species with female-specific markers have female heterogamety (ZW). In this issue, Fowler & Buonaccorsi () illustrate the ease by which RAD-seq data can generate sex-specific genetic markers in rockfish (Sebastes). Moreover, by examining RAD-seq data from two closely related rockfish species, Sebastes chrysomelas and Sebastes carnatus (Fig. ), Fowler & Buonaccorsi () uncover shared sex-specific markers and a conserved sex chromosome system. © 2016 John Wiley & Sons Ltd.
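
    The inference rule stated in this abstract (male-specific markers imply XY, female-specific markers imply ZW) can be sketched in a few lines. The data layout and the names sex_specific_markers and infer_system are hypothetical, and the strict "present in every individual of one sex, absent from all of the other" criterion is a simplifying assumption; real RAD-seq analyses need to tolerate missing data and sequencing error.

```python
def sex_specific_markers(markers_by_individual, sex_of):
    """Return (male_specific, female_specific) marker sets.

    markers_by_individual: dict mapping individual ID to a set of marker IDs.
    sex_of: dict mapping individual ID to 'M' or 'F'.
    Assumes at least one individual of each sex.
    A marker is sex-specific here if it occurs in every sampled individual
    of one sex and in none of the other sex.
    """
    males = [m for ind, m in markers_by_individual.items() if sex_of[ind] == 'M']
    females = [m for ind, m in markers_by_individual.items() if sex_of[ind] == 'F']
    male_specific = set.intersection(*males) - set.union(*females)
    female_specific = set.intersection(*females) - set.union(*males)
    return male_specific, female_specific

def infer_system(male_specific, female_specific):
    """Map the presence of sex-specific markers to a sex chromosome system."""
    if male_specific and not female_specific:
        return 'XY'  # male heterogamety
    if female_specific and not male_specific:
        return 'ZW'  # female heterogamety
    return 'undetermined'
```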

  13. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the 'true citrus fruit trees' group (Citrinae, Rutaceae) and the origin of cultivated species.

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Despite differences in morphology, the genera representing 'true citrus fruit trees' are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial 'species' of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between 'true citrus fruit trees' were clarified. Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will help further works to analyse the molecular basis of the

  14. DMS-Seq for In Vivo Genome-wide Mapping of Protein-DNA Interactions and Nucleosome Centers.

    Science.gov (United States)

    Umeyama, Taichi; Ito, Takashi

    2017-10-03

    Protein-DNA interactions provide the basis for chromatin structure and gene regulation. Comprehensive identification of protein-occupied sites is thus vital to an in-depth understanding of genome function. Dimethyl sulfate (DMS) is a chemical probe that has long been used to detect footprints of DNA-bound proteins in vitro and in vivo. Here, we describe a genomic footprinting method, dimethyl sulfate sequencing (DMS-seq), which exploits the cell-permeable nature of DMS to obviate the need for nuclear isolation. This feature makes DMS-seq simple in practice and removes the potential risk of protein re-localization during nuclear isolation. DMS-seq successfully detects transcription factors bound to cis-regulatory elements and non-canonical chromatin particles in nucleosome-free regions. Furthermore, an unexpected preference of DMS confers on DMS-seq a unique potential to directly detect nucleosome centers without using genetic manipulation. We expect that DMS-seq will serve as a characteristic method for genome-wide interrogation of in vivo protein-DNA interactions. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  15. Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data.

    Science.gov (United States)

    Li, Peipei; Piao, Yongjun; Shon, Ho Sun; Ryu, Keun Ho

    2015-10-28

    Recently, rapid improvements in technology and decreases in sequencing costs have made RNA-Seq a widely used technique to quantify gene expression levels. Various normalization approaches have been proposed, owing to the importance of normalization in the analysis of RNA-Seq data. A comparison of recently proposed normalization methods is required to generate suitable guidelines for the selection of the most appropriate approach for future experiments. In this paper, we compared eight non-abundance estimation normalization methods (RC, UQ, Med, TMM, DESeq, Q, RPKM, and ERPKM) and two abundance estimation normalization methods (RSEM and Sailfish). The experiments were based on real Illumina high-throughput RNA-Seq of 35- and 76-nucleotide sequences produced in the MAQC project and on simulated reads. Reads were mapped to the human genome obtained from the UCSC Genome Browser Database. For precise evaluation, we investigated the Spearman correlation between the normalization results from RNA-Seq and MAQC qRT-PCR values for 996 genes. Based on this work, we showed that out of the eight non-abundance estimation normalization methods, RC, UQ, Med, TMM, DESeq, and Q gave similar normalization results for all data sets. For RNA-Seq of a 35-nucleotide sequence, RPKM showed the highest correlation results, but for RNA-Seq of a 76-nucleotide sequence, it showed lower correlation than the other methods. ERPKM did not improve on the results of RPKM. Of the two abundance estimation normalization methods, for RNA-Seq of a 35-nucleotide sequence, higher correlation was obtained with Sailfish than with RSEM, which was better than without using abundance estimation methods. However, for RNA-Seq of a 76-nucleotide sequence, the results achieved by RSEM were similar to those obtained without abundance estimation methods, and were much better than those obtained with Sailfish. Furthermore, we found that adding a poly-A tail increased alignment numbers, but did not improve normalization results. Spearman correlation analysis revealed that RC, UQ
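
    Two of the quantities named in this abstract are easy to write down. The sketch below uses the standard RPKM definition (reads per kilobase of transcript per million mapped reads) and computes Spearman correlation as the Pearson correlation of average ranks. The function names are hypothetical and the code is not taken from the study.

```python
def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)

def ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

    In an evaluation like the one described, x would hold normalized RNA-Seq values and y the matching qRT-PCR values for the same genes; because only ranks matter, the two measurements need not be on the same scale.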

  16. HMCan: A method for detecting chromatin modifications in cancer samples using ChIP-seq data

    KAUST Repository

    Ashoor, Haitham; Hérault, Aurélie; Kamoun, Aurélie; Radvanyi, François; Bajic, Vladimir B.; Barillot, Emmanuel; Boeva, Valentina

    2013-01-01

    genes. Though several tools have been created to enable detection of histone marks in ChIP-seq data from normal samples, it is unclear whether these tools can be efficiently applied to ChIP-seq data generated from cancer samples. Indeed, cancer genomes

  17. Craniostenosis in twins: a genetic study

    Directory of Open Access Journals (Sweden)

    Walter Carlos Pereira

    1968-09-01

    Full Text Available The occurrence of different clinical forms of craniostenosis in twins of different sex is reported. The girl presented complete obliteration of the coronal suture and of the anterior two thirds of the sagittal suture; in the boy, only the sagittal suture was affected. The genetic study showed that craniostenosis is independent of chromosomal aberrations, indicating that it is transmitted by rare autosomal recessive genes.

  18. An InDel in the Promoter of Al-ACTIVATED MALATE TRANSPORTER9 Selected during Tomato Domestication Determines Fruit Malate Contents and Aluminum Tolerance

    Science.gov (United States)

    Wang, Xin; Hu, Tixu; Zhang, Fengxia; Wang, Bing; Li, Changxin; Yang, Tianxia; Li, Hanxia; Lu, Yongen; Ye, Zhibiao

    2017-01-01

    Deciphering the mechanism of malate accumulation in plants would contribute to a greater understanding of plant chemistry, which has implications for improving flavor quality in crop species and enhancing human health benefits. However, the regulation of malate metabolism is poorly understood in crops such as tomato (Solanum lycopersicum). Here, we integrated a metabolite-based genome-wide association study with linkage mapping and gene functional studies to characterize the genetics of malate accumulation in a global collection of tomato accessions with broad genetic diversity. We report that TFM6 (tomato fruit malate 6), which corresponds to Al-ACTIVATED MALATE TRANSPORTER9 (Sl-ALMT9 in tomato), is the major quantitative trait locus responsible for variation in fruit malate accumulation among tomato genotypes. A 3-bp indel in the promoter region of Sl-ALMT9 was linked to high fruit malate content. Further analysis indicated that this indel disrupts a W-box binding site in the Sl-ALMT9 promoter, which prevents binding of the WRKY transcription repressor Sl-WRKY42, thereby alleviating the repression of Sl-ALMT9 expression and promoting high fruit malate accumulation. Evolutionary analysis revealed that this highly expressed Sl-ALMT9 allele was selected for during tomato domestication. Furthermore, vacuole membrane-localized Sl-ALMT9 increases in abundance following Al treatment, thereby elevating malate transport and enhancing Al resistance. PMID:28814642

  19. A Virtual Reality Framework to Optimize Design, Operation and Refueling of GEN-IV Reactors

    International Nuclear Information System (INIS)

    Rizwan-uddin; Nick Karancevic; Stefano Markidis; Joel Dixon; Cheng Luo; Jared Reynolds

    2008-01-01

    Many GEN-IV candidate designs are currently under investigation. Technical issues related to material, safety and economics are being addressed at research laboratories, industry and in academia. After safety, economic feasibility is likely to be the most important criterion in the success of GEN-IV design(s). Lessons learned from the designers and operators of GEN-II (and GEN-III) reactors must play a vital role in achieving both safety and economic feasibility goals

  20. A Virtual Reality Framework to Optimize Design, Operation and Refueling of GEN-IV Reactors.

    Energy Technology Data Exchange (ETDEWEB)

    Rizwan-uddin; Nick Karancevic; Stefano Markidis; Joel Dixon; Cheng Luo; Jared Reynolds

    2008-04-23

    Many GEN-IV candidate designs are currently under investigation. Technical issues related to material, safety and economics are being addressed at research laboratories, industry and in academia. After safety, economic feasibility is likely to be the most important criterion in the success of GEN-IV design(s). Lessons learned from the designers and operators of GEN-II (and GEN-III) reactors must play a vital role in achieving both safety and economic feasibility goals.

  1. Integrated RNA-Seq and sRNA-Seq Analysis Identifies Chilling and Freezing Responsive Key Molecular Players and Pathways in Tea Plant (Camellia sinensis)

    Science.gov (United States)

    Zheng, Chao; Zhao, Lei; Wang, Yu; Shen, Jiazhi; Zhang, Yinfei; Jia, Sisi; Li, Yusheng; Ding, Zhaotang

    2015-01-01

    Tea [Camellia sinensis (L) O. Kuntze, Theaceae] is one of the most popular non-alcoholic beverages worldwide. Cold stress is one of the most severe abiotic stresses that limit tea plants’ growth, survival and geographical distribution. However, the genetic regulatory network and signaling pathways involved in cold stress responses in tea plants remain to be unearthed. Using RNA-Seq, DGE and sRNA-Seq technologies, we performed an integrative analysis of miRNA and mRNA expression profiling and their regulatory network of tea plants under chilling (4℃) and freezing (-5℃) stress. Differentially expressed (DE) miRNA and mRNA profiles were obtained based on fold change analysis; miRNAs and target mRNAs were found to show both coherent and incoherent relationships in the regulatory network. Furthermore, we compared several key pathways (e.g., ‘Photosynthesis’), GO terms (e.g., ‘response to karrikin’) and transcriptional factors (TFs, e.g., DREB1b/CBF1) which were identified as involved in the early chilling and/or freezing response of tea plants. Intriguingly, we found that karrikins, a new group of plant growth regulators, and β-primeverosidase (BPR), a key enzyme functionally relevant to the formation of tea aroma, might play an important role in both the early chilling and freezing responses of tea plants. Quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR) analysis further confirmed the results from RNA-Seq and sRNA-Seq analysis. This is the first study to simultaneously profile the expression patterns of both miRNAs and mRNAs on a genome-wide scale to elucidate the molecular mechanisms of early responses of tea plants to cold stress. In addition to gaining a deeper insight into the cold resistant characteristics of tea plants, we provide a good case study to analyse mRNA/miRNA expression and profiling of non-model plant species using next-generation sequencing technology. PMID:25901577

  2. Genetic transformation in forest species.

    OpenAIRE

    Claudia Studart-Guimarães; Cristiano Lacorte; Ana Cristina Miranda Brasileiro

    2010-01-01

    Genetic transformation, which comprises the controlled introduction of exogenous genes into the genome of a plant cell and the subsequent regeneration of the transgenic plant, has contributed to plant breeding programs by producing genotypes with new traits of interest. The breeding of forest species is limited by characteristics intrinsic to such species, such as the height of individuals and the long life cycle. Genetic transformation constitutes, ...

  3. BRAKER1: Unsupervised RNA-Seq-Based Genome Annotation with GeneMark-ET and AUGUSTUS.

    Science.gov (United States)

    Hoff, Katharina J; Lange, Simone; Lomsadze, Alexandre; Borodovsky, Mark; Stanke, Mario

    2016-03-01

    Gene finding in eukaryotic genomes is notoriously difficult to automate. The task is to design a workflow with a minimal set of tools that would reach state-of-the-art performance across a wide range of species. GeneMark-ET is a gene prediction tool that incorporates RNA-Seq data into unsupervised training and subsequently generates ab initio gene predictions. AUGUSTUS is a gene finder that usually requires supervised training and uses information from RNA-Seq reads in the prediction step. Complementary strengths of GeneMark-ET and AUGUSTUS provided motivation for designing a new combined tool for automatic gene prediction. We present BRAKER1, a pipeline for unsupervised RNA-Seq-based genome annotation that combines the advantages of GeneMark-ET and AUGUSTUS. As input, BRAKER1 requires a genome assembly file and a file in bam-format with spliced alignments of RNA-Seq reads to the genome. First, GeneMark-ET performs iterative training and generates initial gene structures. Second, AUGUSTUS uses predicted genes for training and then integrates RNA-Seq read information into final gene predictions. In our experiments, we observed that BRAKER1 was more accurate than MAKER2 when using RNA-Seq as the sole source for training and prediction. BRAKER1 does not require pre-trained parameters or a separate expert-prepared training step. BRAKER1 is available for download at http://bioinf.uni-greifswald.de/bioinf/braker/ and http://exon.gatech.edu/GeneMark/. Contact: katharina.hoff@uni-greifswald.de or borodovsky@gatech.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  4. Preserving Accuracy in GenBank

    DEFF Research Database (Denmark)

    Bidartondo, M.I.; Bruns, T. D.; Blackwell, M.

    2008-01-01

    GenBank, the public repository for nucleotide and protein sequences, is a critical resource for molecular biology, evolutionary biology, and ecology. While some attention has been drawn to sequence errors (1), common annotation errors also reduce the value of this database. In fact, for organisms...

  5. Robust Identification of Developmentally Active Endothelial Enhancers in Zebrafish Using FANS-Assisted ATAC-Seq.

    Science.gov (United States)

    Quillien, Aurelie; Abdalla, Mary; Yu, Jun; Ou, Jianhong; Zhu, Lihua Julie; Lawson, Nathan D

    2017-07-18

    Identification of tissue-specific and developmentally active enhancers provides insights into mechanisms that control gene expression during embryogenesis. However, robust detection of these regulatory elements remains challenging, especially in vertebrate genomes. Here, we apply fluorescent-activated nuclei sorting (FANS) followed by Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) to identify developmentally active endothelial enhancers in the zebrafish genome. ATAC-seq of nuclei from Tg(fli1a:egfp) y1 transgenic embryos revealed expected patterns of nucleosomal positioning at transcriptional start sites throughout the genome and association with active histone modifications. Comparison of ATAC-seq from GFP-positive and -negative nuclei identified more than 5,000 open elements specific to endothelial cells. These elements flanked genes functionally important for vascular development and that displayed endothelial-specific gene expression. Importantly, a majority of tested elements drove endothelial gene expression in zebrafish embryos. Thus, FANS-assisted ATAC-seq using transgenic zebrafish embryos provides a robust approach for genome-wide identification of active tissue-specific enhancer elements. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  6. Update History of This Database - GenLibi | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    GenLibi: Update History of This Database. 2014/03/25: GenLibi English archive site is opened. 2007/03/01: GenLibi ( http://gene.biosciencedbc.jp/ ) is opened.

  7. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Directory of Open Access Journals (Sweden)

    Sathishkumar Natarajan

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  8. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    Science.gov (United States)

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  9. SeqBox: RNAseq/ChIPseq reproducible analysis on a consumer game computer.

    Science.gov (United States)

    Beccuti, Marco; Cordero, Francesca; Arigoni, Maddalena; Panero, Riccardo; Amparore, Elvio G; Donatelli, Susanna; Calogero, Raffaele A

    2018-03-01

    Short-read sequencing technology has been in use for more than a decade. However, the analysis of RNAseq and ChIPseq data is still computationally demanding, and simple access to raw data does not guarantee that results are reproducible between laboratories. To address these two aspects, we developed SeqBox, a cheap, efficient and reproducible RNAseq/ChIPseq hardware/software solution based on the NUC6I7KYK mini-PC (an Intel consumer game computer with a fast processor and a high-performance SSD) and the Docker container platform. In SeqBox the analysis of RNAseq and ChIPseq data is supported by a friendly GUI. This gives scientists with or without scripting experience access to fast and reproducible analysis. Docker container images, the docker4seq package and the GUI are available at http://www.bioinformatica.unito.it/reproducibile.bioinformatics.html. Contact: beccuti@di.unito.it. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  10. Hyb-Seq: Combining target enrichment and genome skimming for plant phylogenomics

    Science.gov (United States)

    Weitemier, Kevin; Straub, Shannon C. K.; Cronn, Richard C.; Fishbein, Mark; Schmickl, Roswitha; McDonnell, Angela; Liston, Aaron

    2014-01-01

    • Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. • Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. • Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics. PMID:25225629

  11. Genetic diversity of the VP1/VP2 gene of canine parvovirus type 2b amplified from clinical specimens in Brazil

    Directory of Open Access Journals (Sweden)

    Cesar A. D. Pereira

    2000-10-01

    We evaluated the genetic diversity in the VP1/VP2 gene of CPV type 2b isolates from symptomatic dogs in Brazil. A total of 21 isolates collected from 1990 through 1995, previously typed as CPV2b by PCR assay, were studied. Overall we found a high degree of similarity among sequences from the different CPV clinical isolates. Genetic analysis of this selected region gave no indication of a specific Brazilian parvovirus lineage.

  12. Immune system and genetics: a different approach to antibody diversity.

    OpenAIRE

    Matta Camacho, Nubia Estela

    2011-01-01

    ABSTRACT Immunology and genetics textbooks commonly include a chapter titled "immune system and genetics"; however, the association they draw centers on how the generation of antibodies broke the "one gene, one protein" paradigm, since in antibody production one gene produces millions of proteins. The immune system has many links with genetics and heredity; this association arises because any substance or compound produced by an organi...

  13. Genetic hearing loss

    Directory of Open Access Journals (Sweden)

    Ricardo Godinho

    2003-01-01

    Progress in research on genetic hearing loss has advanced our understanding of the molecular mechanisms that govern inner ear development, function, and response to injury and aging. In the developed world, over 50% of childhood deafness is attributable to genetic causes, and even age-related hearing loss has been associated with genetic mechanisms. AIM: The objective of this review is to summarize recent knowledge in genetic hearing loss. STUDY DESIGN: Systematic review. MATERIAL AND METHODS: The literature review included articles indexed at MEDLINE (The National Library of Medicine, The National Institutes of Health - USA), focusing on publications from the past 3 years, plus the information available at the Hereditary Hearing Loss Home Page. CONCLUSION: Recent advances in the understanding of genetic hearing loss have improved our comprehension of auditory function and made diagnosis more accurate. In the future, this knowledge may also enable the development of new therapies for the genetic causes of hearing loss.

  14. Genetic algorithms

    Directory of Open Access Journals (Sweden)

    José Jesús Martínez Páez

    1998-10-01

    This technique is based on the concept of evolution through selection of the best individuals, and on the genetic operators of selection, reproduction and mutation. The idea is to define a solution space for the problem to be solved, encoded as a string of bits. This is known as the chromosome encoding, in which each bit, called a gene, has a particular meaning. Initially the algorithm randomly generates many of these strings, or beings, that is, a population, which it then confronts with an environment: the problem to be solved or the function to be optimized. This confrontation, or evaluation, to which each being is subjected yields information on how each one performed. Through random methods, but with a selection probability proportional to performance (the better the performance, the higher the probability), a new population of beings is selected that is supposedly better than the previous generation.
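    The loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the objective (counting 1-bits), the single-point crossover, and all parameter values and function names are hypothetical choices made only for the example.

```python
import random

def fitness(bits):
    # Hypothetical objective: number of 1-bits. It stands in for
    # "the function to be optimized" mentioned in the abstract.
    return sum(bits)

def roulette_select(population, scores, rng):
    # Selection probability proportional to performance.
    if sum(scores) == 0:
        return rng.choice(population)
    return rng.choices(population, weights=scores, k=1)[0]

def crossover(a, b, rng):
    # Single-point crossover (an assumed reproduction operator).
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rate, rng):
    # Flip each bit (gene) independently with probability `rate`.
    return [1 - b if rng.random() < rate else b for b in bits]

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Random initial population of bit strings (chromosomes).
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(ind) for ind in population]
        population = [
            mutate(crossover(roulette_select(population, scores, rng),
                             roulette_select(population, scores, rng), rng),
                   mutation_rate, rng)
            for _ in range(pop_size)
        ]
        generation_best = max(population, key=fitness)
        if fitness(generation_best) > fitness(best):
            best = generation_best
    return best
```

    Calling `evolve()` returns the best bit string seen across all generations; swapping `fitness` for another function changes the problem being solved without touching the loop.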

  15. ASAP: a web-based platform for the analysis and interactive visualization of single-cell RNA-seq data.

    Science.gov (United States)

    Gardeux, Vincent; David, Fabrice P A; Shajkofci, Adrian; Schwalie, Petra C; Deplancke, Bart

    2017-10-01

    Single-cell RNA-sequencing (scRNA-seq) allows whole transcriptome profiling of thousands of individual cells, enabling the molecular exploration of tissues at the cellular level. Such analytical capacity is of great interest to many research groups in the world, yet these groups often lack the expertise to handle complex scRNA-seq datasets. We developed a fully integrated, web-based platform aimed at the complete analysis of scRNA-seq data post genome alignment: from the parsing, filtering and normalization of the input count data files, to the visual representation of the data, identification of cell clusters, differentially expressed genes (including cluster-specific marker genes), and functional gene set enrichment. This Automated Single-cell Analysis Pipeline (ASAP) combines a wide range of commonly used algorithms with sophisticated visualization tools. Compared with existing scRNA-seq analysis platforms, researchers (including those lacking computational expertise) are able to interact with the data in a straightforward fashion and in real time. Furthermore, given the overlap between scRNA-seq and bulk RNA-seq analysis workflows, ASAP should conceptually be broadly applicable to any RNA-seq dataset. As a validation, we demonstrate how we can use ASAP to simply reproduce the results from a single-cell study of 91 mouse cells involving five distinct cell types. The tool is freely available at asap.epfl.ch and R/Python scripts are available at github.com/DeplanckeLab/ASAP. bart.deplancke@epfl.ch. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.

  16. Expression of the CYP19 Aromatase Gene, Estrogen, and Androgen in Patients with Aggressive Periodontitis

    Directory of Open Access Journals (Sweden)

    Dahlia Herawati

    2016-11-01

    Bone density is determined by the CYP19 aromatase gene and by the hormones estrogen and androgen. In aggressive periodontitis, alveolar bone destruction progresses rapidly and is not balanced by bone regeneration. The aim of this study was to show the expression of the CYP19 aromatase gene, estrogen, and androgen in patients with aggressive periodontitis, so that it can be taken into consideration during periodontal treatment. Methods: expression of the CYP19 aromatase gene was examined in alveolar bone specimens using immunohistochemistry, and serum estrogen and androgen were measured using Vidas: Elfa. Results: CYP19 aromatase gene expression in aggressive periodontitis showed a lower density than in non-periodontitis. Estrogen and androgen in aggressive periodontitis tended to be lower than in non-periodontitis. Conclusion: alveolar bone regeneration in aggressive periodontitis is impaired because CYP19 aromatase gene expression is low and because the estrogen and androgen hormones that take part in alveolar bone formation are insufficient.

  17. A nuclear phylogenetic analysis: SNPs, indels and SSRs deliver new insights into the relationships in the ‘true citrus fruit trees’ group (Citrinae, Rutaceae) and the origin of cultivated species

    Science.gov (United States)

    Garcia-Lor, Andres; Curk, Franck; Snoussi-Trifa, Hager; Morillon, Raphael; Ancillo, Gema; Luro, François; Navarro, Luis; Ollitrault, Patrick

    2013-01-01

    Background and Aims Despite differences in morphology, the genera representing ‘true citrus fruit trees’ are sexually compatible, and their phylogenetic relationships remain unclear. Most of the important commercial ‘species’ of Citrus are believed to be of interspecific origin. By studying polymorphisms of 27 nuclear genes, the average molecular differentiation between species was estimated and some phylogenetic relationships between ‘true citrus fruit trees’ were clarified. Methods Sanger sequencing of PCR-amplified fragments from 18 genes involved in metabolite biosynthesis pathways and nine putative genes for salt tolerance was performed for 45 genotypes of Citrus and relatives of Citrus to mine single nucleotide polymorphisms (SNPs) and indel polymorphisms. Fifty nuclear simple sequence repeats (SSRs) were also analysed. Key Results A total of 16 238 kb of DNA was sequenced for each genotype, and 1097 single nucleotide polymorphisms (SNPs) and 50 indels were identified. These polymorphisms were more valuable than SSRs for inter-taxon differentiation. Nuclear phylogenetic analysis revealed that Citrus reticulata and Fortunella form a cluster that is differentiated from the clade that includes three other basic taxa of cultivated citrus (C. maxima, C. medica and C. micrantha). These results confirm the taxonomic subdivision between the subgenera Metacitrus and Archicitrus. A few genes displayed positive selection patterns within or between species, but most of them displayed neutral patterns. The phylogenetic inheritance patterns of the analysed genes were inferred for commercial Citrus spp. Conclusions Numerous molecular polymorphisms (SNPs and indels), which are potentially useful for the analysis of interspecific genetic structures, have been identified. The nuclear phylogenetic network for Citrus and its sexually compatible relatives was consistent with the geographical origins of these genera. The positive selection observed for a few genes will

  18. Selective amplification and sequencing of cyclic phosphate-containing RNAs by the cP-RNA-seq method.

    Science.gov (United States)

    Honda, Shozo; Morichika, Keisuke; Kirino, Yohei

    2016-03-01

    RNA digestions catalyzed by many ribonucleases generate RNA fragments that contain a 2',3'-cyclic phosphate (cP) at their 3' termini. However, standard RNA-seq methods are unable to accurately capture cP-containing RNAs because the cP inhibits the adapter ligation reaction. We recently developed a method named cP-RNA-seq that is able to selectively amplify and sequence cP-containing RNAs. Here we describe the cP-RNA-seq protocol in which the 3' termini of all RNAs, except those containing a cP, are cleaved through a periodate treatment after phosphatase treatment; hence, subsequent adapter ligation and cDNA amplification steps are exclusively applied to cP-containing RNAs. cP-RNA-seq takes ∼6 d, excluding the time required for sequencing and bioinformatics analyses, which are not covered in detail in this protocol. Biochemical validation of the existence of cP in the identified RNAs takes ∼3 d. Even though the cP-RNA-seq method was developed to identify angiogenin-generating 5'-tRNA halves as a proof of principle, the method should be applicable to global identification of cP-containing RNA repertoires in various transcriptomes.

  19. ChIP-PIT: Enhancing the Analysis of ChIP-Seq Data Using Convex-Relaxed Pair-Wise Interaction Tensor Decomposition.

    Science.gov (United States)

    Zhu, Lin; Guo, Wei-Li; Deng, Su-Ping; Huang, De-Shuang

    2016-01-01

    In recent years, thanks to the efforts of individual scientists and research consortiums, a huge amount of chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) experimental data has been accumulated. Instead of investigating these datasets independently, several recent studies have convincingly demonstrated that a wealth of scientific insights can be gained by integrative analysis of ChIP-seq data. However, when used for integrative analysis, a serious drawback of the current ChIP-seq technique is that it is still expensive and time-consuming to generate ChIP-seq datasets of high standard. Most researchers are therefore unable to obtain complete ChIP-seq data for several TFs in a wide variety of cell lines, which considerably limits the understanding of transcriptional regulation patterns. In this paper, we propose a novel method called ChIP-PIT to overcome the aforementioned limitation. In ChIP-PIT, ChIP-seq data corresponding to a diverse collection of cell types, TFs and genes are fused together using the three-mode pair-wise interaction tensor (PIT) model, and the prediction of unperformed ChIP-seq experimental results is formulated as a tensor completion problem. Computationally, we propose an efficient first-order method based on extensions of the coordinate descent method to learn the optimal solution of ChIP-PIT, which makes it particularly suitable for the analysis of massive-scale ChIP-seq data. Experimental evaluation on the ENCODE data illustrates the usefulness of the proposed model.

  20. The genetics of Central American populations

    OpenAIRE

    Barrantes, Ramiro

    2005-01-01

    Central American populations have not been the subject of many genetic studies, with the exception of sporadic analyses of the variation among and within the Amerindian and African-origin groups located in the area. Nevertheless, systematic research in this direction was carried out over the last 15 years, including mestizo populations, particularly those of Costa Rica and Panama. In the Amerindians, detailed studies were made of their genetic structure and the phylogeneti...

  1. Autosomal InDel polymorphisms for population genetic structure and differentiation analysis of Chinese Kazak ethnic group

    Science.gov (United States)

    Kong, Tingting; Chen, Yahao; Guo, Yuxin; Wei, Yuanyuan; Jin, Xiaoye; Xie, Tong; Mu, Yuling; Dong, Qian; Wen, Shaoqing; Zhou, Boyan; Zhang, Li; Shen, Chunmei; Zhu, Bofeng

    2017-01-01

    In the present study, we assessed the genetic diversity of the Chinese Kazak ethnic group on the basis of 30 well-chosen autosomal insertion and deletion loci and explored the genetic relationships between the Kazak and 23 reference groups. The expected heterozygosity ranged from 0.3605 at the HLD39 locus to 0.5000 at the HLD136 locus, and the observed heterozygosity ranged from 0.3548 at the HLD39 locus to 0.5283 at the HLD136 locus. The combined power of discrimination and the combined power of exclusion for all 30 loci in the studied Kazak group were 0.999999999999128 and 0.9945, respectively. The dataset generated in this study indicated that the panel of 30 InDels was highly efficient in forensic individual identification but may not have enough power in paternity cases. The results of the interpopulation differentiations, PCA plots, phylogenetic trees and STRUCTURE analyses showed a close genetic affiliation between the Kazak and Uigur groups. PMID:28915619
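    The statistics quoted in this abstract follow from simple formulas, sketched below for a biallelic InDel locus. This is an illustration under assumptions, not the study's procedure: it takes genotype frequencies from Hardy-Weinberg proportions, whereas forensic reports often use observed genotype frequencies, and the function names and the input allele frequencies are hypothetical.

```python
from math import prod

def expected_heterozygosity(p):
    # He = 1 - sum of squared allele frequencies (two alleles: p and 1 - p).
    q = 1.0 - p
    return 1.0 - (p * p + q * q)

def power_of_discrimination(p):
    # PD = 1 - sum of squared genotype frequencies,
    # with genotypes assumed to follow Hardy-Weinberg proportions.
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return 1.0 - sum(g * g for g in genotype_freqs)

def combined_power_of_discrimination(allele_freqs):
    # Loci assumed independent: CPD = 1 - product of (1 - PD_i).
    return 1.0 - prod(1.0 - power_of_discrimination(p) for p in allele_freqs)
```

    For a biallelic locus the expected heterozygosity peaks at 0.5 when both alleles are equally frequent, which matches the upper value of 0.5000 reported for HLD136.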

  2. QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization.

    Science.gov (United States)

    Zhao, Shanrong; Xi, Li; Quan, Jie; Xi, Hualin; Zhang, Ying; von Schack, David; Vincent, Michael; Zhang, Baohong

    2016-01-08

    RNA sequencing (RNA-seq), a next-generation sequencing technique for transcriptome profiling, is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. Multiple algorithms pertinent to basic analyses have been developed, and there is an increasing need to automate the use of these tools so as to obtain results in an efficient and user-friendly manner. Increased automation and improved visualization of the results will help make the results and findings of the analyses readily available to experimental scientists. By combining the best open source tools developed for RNA-seq data analyses and the most advanced web 2.0 technologies, we have implemented QuickRNASeq, a pipeline for large-scale RNA-seq data analyses and visualization. The QuickRNASeq workflow consists of three main steps. In Step #1, each individual sample is processed, including mapping RNA-seq reads to a reference genome, counting the numbers of mapped reads, quality control of the aligned reads, and SNP (single nucleotide polymorphism) calling. Step #1 is computationally intensive, and can be processed in parallel. In Step #2, the results from individual samples are merged, and an integrated and interactive project report is generated. All analysis results in the report are accessible via a single HTML entry webpage. Step #3 is the data interpretation and presentation step. The rich visualization features implemented here allow end users to interactively explore the results of RNA-seq data analyses, and to gain more insights into RNA-seq datasets. In addition, we used a real world dataset to demonstrate the simplicity and efficiency of QuickRNASeq in RNA-seq data analyses and interactive visualizations. The seamless integration of automated capabilities with interactive visualizations in QuickRNASeq is not available in other published RNA-seq pipelines. 
The high degree

  3. The Oswestry Disability Index (version 2.1a): validation of a Dutch language version.

    Science.gov (United States)

    van Hooff, Miranda L; Spruit, Maarten; Fairbank, Jeremy C T; van Limbeek, Jacques; Jacobs, Wilco C H

    2015-01-15

    A cross-sectional study on baseline data. To translate the Oswestry Disability Index (ODI) version 2.1a into the Dutch language and to validate its use in a cohort of patients with chronic low back pain in secondary spine care. Patient-reported outcome measures (PROMs) are commonly accepted to evaluate the outcome of spine interventions. Functional status is an important outcome in spine research. The ODI is a recommended condition-specific patient-reported outcome measure used to evaluate functional status in patients with back pain. As yet, no formal translated Dutch version exists. The ODI was translated according to established guidelines. The final version was built into the electronic web-based system together with the Roland Morris Disability Questionnaire, the numeric rating scale for pain severity, the 36-Item Short Form Health Survey Questionnaire for quality of life, and the hospital anxiety and depression scale. Baseline data from 244 patients with chronic low back pain who participated in a combined physical and psychological program were used. Floor and ceiling effects, internal consistency, and the construct validity were evaluated using quality criteria. The mean ODI (standard deviation) was 39.6 (12.3); minimum 6, maximum 70. Most of the participants (88%) were moderately to severely disabled. Factor analysis determined a 1-factor structure (36% explained variance) and the homogeneity of the ODI items was shown (Cronbach α = 0.79). The construct validity is supported as all (6:6) the a priori hypotheses were confirmed. Moreover, the ODI and the Roland Morris Disability Questionnaire showed a strong significant correlation (r = 0.68, P disability among Dutch patients with chronic low back pain. This translated condition-specific patient-reported outcome measure version is recommended for use in future back pain research and to evaluate outcome of back care in the Netherlands.
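    The internal-consistency figure in this abstract (Cronbach α) can be computed as sketched below. This is a generic illustration of the standard formula, α = k/(k-1) · (1 - Σ item variances / variance of total scores), and not the study's analysis code; the function name and the input layout (one list of item scores per respondent) are assumptions.

```python
from statistics import variance

def cronbach_alpha(scores):
    # scores: one row per respondent, each row holding k item scores.
    k = len(scores[0])
    items = list(zip(*scores))                     # transpose to per-item columns
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

    Values closer to 1 indicate that the items vary together; items that are perfectly correlated with equal variance give exactly 1.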

  4. A technical assessment of the porcine ejaculated spermatozoa for a sperm-specific RNA-seq analysis.

    Science.gov (United States)

    Gòdia, Marta; Mayer, Fabiana Quoos; Nafissi, Julieta; Castelló, Anna; Rodríguez-Gil, Joan Enric; Sánchez, Armand; Clop, Alex

    2018-04-26

    The study of the boar sperm transcriptome by RNA-seq can provide relevant information on sperm quality and fertility and might contribute to animal breeding strategies. However, the analysis of spermatozoa RNA is challenging, as these cells harbor very low amounts of highly fragmented RNA and the ejaculates also contain other cell types with larger amounts of non-fragmented RNA. Here, we describe a strategy for successful boar sperm purification, RNA extraction and RNA-seq library preparation. Using these approaches our objectives were: (i) to evaluate the sperm recovery rate (SRR) after boar spermatozoa purification by density centrifugation using the non-porcine-specific commercial reagent BoviPure™; (ii) to assess the correlation between SRR and sperm quality characteristics; (iii) to evaluate the relationship between sperm cell RNA load and sperm quality traits; and (iv) to compare different library preparation kits for both total RNA-seq (SMARTer Universal Low Input RNA and TruSeq RNA Library Prep kit) and small RNA-seq (NEBNext Small RNA and TailorMix miRNA Sample Prep v2) for high-throughput sequencing. Our results show that pig SRR (~22%) is lower than in other mammalian species and that it is not significantly dependent on the sperm quality parameters analyzed in our study. Moreover, no relationship between the RNA yield per sperm cell and sperm phenotypes was found. We compared an RNA-seq library preparation kit optimized for low amounts of fragmented RNA with a standard kit designed for high amounts of high-quality input RNA and found that, for sperm, a protocol designed to work on low-quality RNA is essential. We also compared two small RNA-seq kits and did not find substantial differences in their performance. We propose the methodological workflow described for the RNA-seq screening of the boar spermatozoa transcriptome. 
FPKM: fragments per kilobase of transcript per million mapped reads; KRT1: keratin 1; miRNA: micro-RNA; miscRNA: miscellaneous
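    The FPKM abbreviation defined above translates directly into a small calculation, sketched here for illustration. The function and parameter names are hypothetical, and the example is not taken from the study's pipeline.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    # Fragments per kilobase of transcript per million mapped fragments:
    # fragments / (length in kb) / (total mapped in millions).
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)
```

    For example, 500 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments gives 500 / 2 kb / 10 million = 25 FPKM.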

  5. MOCCS: Clarifying DNA-binding motif ambiguity using ChIP-Seq data.

    Science.gov (United States)

    Ozaki, Haruka; Iwasaki, Wataru

    2016-08-01

    As a key mechanism of gene regulation, transcription factors (TFs) bind to DNA by recognizing specific short sequence patterns that are called DNA-binding motifs. A single TF can accept ambiguity within its DNA-binding motifs, which comprise both canonical (typical) and non-canonical motifs. Clarification of such DNA-binding motif ambiguity is crucial for revealing gene regulatory networks and evaluating mutations in cis-regulatory elements. Although chromatin immunoprecipitation sequencing (ChIP-seq) now provides abundant data on the genomic sequences to which a given TF binds, existing motif discovery methods are unable to directly answer whether a given TF can bind to a specific DNA-binding motif. Here, we report a method for clarifying the DNA-binding motif ambiguity, MOCCS. Given ChIP-Seq data of any TF, MOCCS comprehensively analyzes and describes every k-mer to which that TF binds. Analysis of simulated datasets revealed that MOCCS is applicable to various ChIP-Seq datasets, requiring only a few minutes per dataset. Application to the ENCODE ChIP-Seq datasets proved that MOCCS directly evaluates whether a given TF binds to each DNA-binding motif, even if known position weight matrix models do not provide sufficient information on DNA-binding motif ambiguity. Furthermore, users are not required to provide numerous parameters or background genomic sequence models that are typically unavailable. MOCCS is implemented in Perl and R and is freely available via https://github.com/yuifu/moccs. By complementing existing motif-discovery software, MOCCS will contribute to the basic understanding of how the genome controls diverse cellular processes via DNA-protein interactions. Copyright © 2016 Elsevier Ltd. All rights reserved.
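    As a point of reference for the k-mer terminology used in this abstract, the sketch below enumerates and counts every k-mer in a set of sequences, such as windows around ChIP-Seq peak centers. It is only an illustration of k-mer enumeration and is not the MOCCS algorithm, whose scoring is not described in the abstract; the function name and inputs are hypothetical.

```python
from collections import Counter

def count_kmers(sequences, k):
    # Count every overlapping k-mer, skipping any window that
    # contains a character other than A, C, G or T (for example N).
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts
```

    A motif-oriented analysis would go on to compare such counts, or the positions of each k-mer relative to the peak center, against a background expectation.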

  6. Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity.

    Science.gov (United States)

    Kim, Hui Kwon; Min, Seonwoo; Song, Myungjae; Jung, Soobin; Choi, Jae Woo; Kim, Younggwang; Lee, Sangeun; Yoon, Sungroh; Kim, Hyongbum Henry

    2018-03-01

    We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.

  7. Bayesian phylogeny analysis of vertebrate serpins illustrates evolutionary conservation of the intron and indels based six groups classification system from lampreys for ∼500 MY

    Directory of Open Access Journals (Sweden)

    Abhishek Kumar

    2015-06-01

    The serpin superfamily is characterized by proteins that fold into a conserved tertiary structure and exploit a sophisticated and irreversible suicide mechanism of inhibition. Vertebrate serpins are classified into six groups (V1–V6), based on three independent biological features: genomic organization, diagnostic amino acid sites and rare indels. However, this classification system was based on the limited number of mammalian genomes available. In this study, several non-mammalian genomes are used to validate this classification system using the powerful Bayesian phylogenetic method. This method supports the intron and indel based vertebrate classification and proves that serpins have been maintained from lampreys to humans for about 500 MY. Lampreys have fewer than 10 serpins, which expand into 36 serpins in humans. The two expanding groups V1 and V2 have SERPINB1/SERPINB6 and SERPINA8/SERPIND1 as the ancestral serpins, respectively. Large clusters of serpins are formed by local duplications of these serpins in tetrapod genomes. Interestingly, the ancestral HCII/SERPIND1 locus (nested within PIK4CA) possesses a group V4 serpin (A2APL1, homolog of α2-AP/SERPINF2) of lampreys, pointing to the possibility that group V4 might have originated from group V2. Additionally in this study, details of the phylogenetic history and genomic characteristics of vertebrate serpins are revisited.

  8. Genome-wide RNA-seq analysis of human and mouse platelet transcriptomes

    Science.gov (United States)

    Rowley, Jesse W.; Oler, Andrew J.; Tolley, Neal D.; Hunter, Benjamin N.; Low, Elizabeth N.; Nix, David A.; Yost, Christian C.; Zimmerman, Guy A.

    2011-01-01

    Inbred mice are a useful tool for studying the in vivo functions of platelets. Nonetheless, the mRNA signature of mouse platelets is not known. Here, we use paired-end next-generation RNA sequencing (RNA-seq) to characterize the polyadenylated transcriptomes of human and mouse platelets. We report that RNA-seq provides unprecedented resolution of mRNAs that are expressed across the entire human and mouse genomes. Transcript expression and abundance are often conserved between the 2 species. Several mRNAs, however, are differentially expressed in human and mouse platelets. Moreover, previously described functional disparities between mouse and human platelets are reflected in differences at the transcript level, including protease activated receptor-1, protease activated receptor-3, platelet activating factor receptor, and factor V. This suggests that RNA-seq is a useful tool for predicting differences in platelet function between mice and humans. Our next-generation sequencing analysis provides new insights into the human and murine platelet transcriptomes. The sequencing dataset will be useful in the design of mouse models of hemostasis and a catalyst for discovery of new functions of platelets. Access to the dataset is found in the “Introduction.” PMID:21596849

  9. Whole genome sequence of the emerging oomycete pathogen Pythium insidiosum strain CDC-B5653 isolated from an infected human in the USA

    Directory of Open Access Journals (Sweden)

    Marina S. Ascunce

    2016-03-01

    Pythium insidiosum ATCC 200269 strain CDC-B5653, an isolate from necrotizing lesions on the mouth and eye of a 2-year-old boy in Memphis, Tennessee, USA, was sequenced using a combination of Illumina MiSeq (300 bp paired-end, 14 million reads) and PacBio (10 Kb fragment library, 356,001 reads). The sequencing data were assembled using SPAdes version 3.1.0, yielding a total genome size of 45.6 Mb contained in 8992 contigs, N50 of 13 Kb, 57% G + C content, and 17,867 putative protein-coding genes. This Whole Genome Shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession JRHR00000000. Keywords: Oomycete, Pythium insidiosum, Pythiosis, Human emerging pathogen, Genome sequencing
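For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the example lengths are hypothetical and unrelated to the Pythium insidiosum assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length L such that contigs of
    length >= L together cover at least half of the total assembly size."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths for illustration only.
toy_n50 = n50([2, 3, 4, 5, 6])
```

With the toy lengths, the total is 20, and the two longest contigs (6 and 5) already cover 11 of it, so the N50 is 5.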

  10. Hybrid De Novo Genome Assembly Using MiSeq and SOLiD Short Read Data.

    Directory of Open Access Journals (Sweden)

    Tsutomu Ikegami

    A hybrid de novo assembly pipeline was constructed to use MiSeq and SOLiD short read data in combination. The short read data were converted to the pipeline's standard format and supplied to pipeline components such as ABySS and SOAPdenovo. The assembly pipeline proceeded through several stages, and MiSeq paired-end data, SOLiD mate-paired data, or both could be specified as input at each stage separately. The pipeline was examined on the filamentous fungus Aspergillus oryzae RIB40 by aligning the assembly results against the reference sequences. Using both the MiSeq and the SOLiD data in the hybrid assembly, the alignment length was improved by a factor of 3 to 8 compared with assemblies using either data type alone. The number of reproduced gene cluster regions encoding secondary metabolite biosyntheses (SMB) was also improved by the hybrid assemblies. These results imply that the MiSeq data with long read length are essential for constructing accurate nucleotide sequences, while the SOLiD mate-paired reads with long insertion length enhance long-range arrangements of the sequences. The pipeline was also tested on the actinomycete Streptomyces avermitilis MA-4680, whose genome is known to have high GC content. Although the quality of the SOLiD reads was too low to perform any meaningful assemblies by themselves, the alignment length to the reference was improved by a factor of 2 compared with the assembly using only the MiSeq data.

  11. Performance Evaluation of a Novel Optimization Sequential Algorithm (SeQ) Code for FTTH Network

    Directory of Open Access Journals (Sweden)

    Fazlina C.A.S.

    2017-01-01

    The SeQ code has several advantages, such as a variable cross-correlation property for any given number of users and weights, effective suppression of phase-induced intensity noise (PIIN), and a multiple access interference (MAI) cancellation property. System performance analysis at BER = 10^-9 showed that the SeQ code can achieve 1 Gbps over distances of up to 60 km.

  12. SeqAPASS: Predicting chemical susceptibility to threatened/endangered species

    Science.gov (United States)

    Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS; https://seqapass.epa.gov/seqapass/) application was devel...

  13. HyGenSys: a Flexible Process for Hydrogen and Power Production with Reduction of CO2 Emission HyGenSys : un procédé flexible de production d’hydrogène et d’électricité avec réduction des émissions de CO2

    Directory of Open Access Journals (Sweden)

    Giroudière F.

    2010-09-01

    In fact, the heat required for the steam reforming reaction comes from pressurized flue gases produced in a gas turbine instead of a conventional furnace. Thanks to this extensive thermal integration, overall efficiency is improved and natural gas consumption is reduced, which is an advantage from an economic and environmental standpoint, particularly with regard to reducing CO2 emissions. Two variants of the process are detailed, each meeting different needs. The first, called HyGenSys-0, is for the production of hydrogen for refining and petrochemicals. The second, called HyGenSys-1, enables centralized power production with pre-combustion CO2 capture. In this case, the hydrogen produced is used entirely to feed a power-generation turbine. HyGenSys-1 was developed and optimized during the CACHET project, funded by the European Community, with the objective of delivering at least 400 MW of power. The HyGenSys-0 and HyGenSys-1 versions of the process are described in detail, along with their challenges and advantages compared with existing technologies. In both cases, the core of the technology is the reactor-exchanger, whose development is also presented in detail. The reactor-exchanger design is based on an innovative arrangement of bayonet tubes that permits large-scale design and multiple heat exchange between the hot pressurized flue gas, the natural gas feed and the hydrogen-rich effluent.

  14. 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments.

    Science.gov (United States)

    Raviram, Ramya; Rocha, Pedro P; Müller, Christian L; Miraldi, Emily R; Badri, Sana; Fu, Yi; Swanzey, Emily; Proudhon, Charlotte; Snetkova, Valentina; Bonneau, Richard; Skok, Jane A

    2016-03-01

    4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.

  15. 4C-ker: A Method to Reproducibly Identify Genome-Wide Interactions Captured by 4C-Seq Experiments.

    Directory of Open Access Journals (Sweden)

    Ramya Raviram

    2016-03-01

    4C-Seq has proven to be a powerful technique to identify genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is the differences in resolution of signal across the genome that result from differences in 3D distance separation from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it can cut the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis only identify interactions in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals of different length scales. In addition, some methods also fail in experiments where chromatin fragments are generated using frequent cutter restriction enzymes. Here, we describe 4C-ker, a Hidden-Markov Model based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with different resolution enzymes.

  16. GenLocDip: A Generalized Program to Calculate and Visualize Local Electric Dipole Moments.

    Science.gov (United States)

    Groß, Lynn; Herrmann, Carmen

    2016-09-30

    Local dipole moments (i.e., dipole moments of atomic or molecular subsystems) are essential for understanding various phenomena in nanoscience, such as solvent effects on the conductance of single molecules in break junctions or the interaction between the tip and the adsorbate in atomic force microscopy. We introduce GenLocDip, a program for calculating and visualizing local dipole moments of molecular subsystems. GenLocDip currently uses the Atoms-In-Molecules (AIM) partitioning scheme and is interfaced to various AIM programs. This enables postprocessing of a variety of electronic structure output formats including cube and wavefunction files, and, in general, output from any other code capable of writing the electron density on a three-dimensional grid. It uses a modified version of Bader's and Laidig's approach for achieving origin-independence of local dipoles by referring to internal reference points which can be (but do not need to be) bond critical points (BCPs). Furthermore, the code allows the export of critical points and local dipole moments into a POVray readable input format. It is particularly designed for fragments of large systems, for which no BCPs have been calculated for computational efficiency reasons, because large interfragment distances prevent their identification, or because a local partitioning scheme different from AIM was used. The program requires only minimal user input and is written in the Fortran90 programming language. To demonstrate the capabilities of the program, examples are given for covalently and non-covalently bound systems, in particular molecular adsorbates. © 2016 Wiley Periodicals, Inc.

  17. A survey of motif finding Web tools for detecting binding site motifs in ChIP-Seq data.

    Science.gov (United States)

    Tran, Ngoc Tam L; Huang, Chun-Hsi

    2014-02-20

    ChIP-Seq (chromatin immunoprecipitation sequencing) is advantageous for motif finding because ChIP-Seq experiments narrow the search down to binding site locations. Recent motif finding tools facilitate motif detection by providing a user-friendly Web interface. In this work, we reviewed nine motif finding Web tools that are capable of detecting binding site motifs in ChIP-Seq data. We showed that each motif finding Web tool has its own advantages for detecting motifs that other tools may not discover. We recommend that users apply multiple motif finding Web tools that implement different algorithms in order to obtain significant motifs, overlapping similar motifs, and non-overlapping motifs. Finally, we provided our suggestions for the future development of motif finding Web tools that better assist researchers in finding motifs in ChIP-Seq data.

  18. RNA-seq analysis of early hepatic response to handling and confinement stress in rainbow trout.

    Directory of Open Access Journals (Sweden)

    Sixin Liu

    Fish under intensive rearing conditions experience various stressors which have negative impacts on survival, growth, reproduction and fillet quality. Identifying and characterizing the molecular mechanisms underlying stress responses will facilitate the development of strategies that aim to improve animal welfare and aquaculture production efficiency. In this study, we used RNA-seq to identify transcripts which are differentially expressed in the rainbow trout liver in response to handling and confinement stress. These stressors were selected due to their relevance in aquaculture production. Total RNA was extracted from the livers of individual fish in five tanks having eight fish each, including three tanks of fish subjected to a 3-hour handling and confinement stress and two control tanks. Equal amounts of total RNA from six individual fish were pooled by tank to create five RNA-seq libraries, which were sequenced in one lane of Illumina HiSeq 2000. Three sequencing runs were conducted to obtain a total of 491,570,566 reads, which were mapped onto the previously generated stress reference transcriptome to identify 316 differentially expressed transcripts (DETs). Twenty-one DETs were selected for qPCR to validate the RNA-seq approach. The fold changes in gene expression identified by RNA-seq and qPCR were highly correlated (R² = 0.88). Several gene ontology terms, including transcription factor activity and biological processes such as glucose metabolic process, were enriched among these DETs. Pathways involved in response to handling and confinement stress were implicated by mapping the DETs to reference pathways in the KEGG database. Raw RNA-seq reads have been submitted to the NCBI Short Read Archive under accession number SRP022881. All customized scripts described in this paper are available from Dr. Guangtu Gao or the corresponding author.
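The agreement between RNA-seq and qPCR fold changes is summarized above as a squared correlation. As a minimal sketch of how such a value could be computed, assuming paired fold-change lists (typically on a log scale) for the same transcripts, the squared Pearson correlation is shown below. The function name `r_squared` and the numbers are hypothetical and are not the study's data.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two paired lists of values,
    e.g. log fold changes measured by two different methods."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical log2 fold changes for four transcripts (illustration only).
toy_rnaseq = [1.0, 2.0, 3.0, 4.0]
toy_qpcr = [1.0, 3.0, 2.0, 4.0]
toy_r2 = r_squared(toy_rnaseq, toy_qpcr)
```

Values close to 1 indicate that the two methods rank and scale the fold changes similarly; for the toy lists the result is 0.64.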

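  19. TRANSFER GEN ANTIVIRUS PADA EMBRIO UDANG WINDU, Penaeus monodon DALAM BERBAGAI KONSENTRASI DEOXYRIBO NUCLEIC ACID [Antiviral gene transfer into tiger shrimp, Penaeus monodon, embryos at various deoxyribonucleic acid concentrations]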

    Directory of Open Access Journals (Sweden)

    Andi Parenrengi

    2011-12-01

    Transgenesis technology, particularly genetic engineering to produce disease-resistant tiger shrimp, is one strategy for addressing the disease problems affecting tiger shrimp culture. Transgenesis, specifically antiviral gene transfer into tiger shrimp, has been achieved successfully by transfection. Nevertheless, the components of this technology still need to be optimized. The concentration of gene DNA is one component of transgenesis technology that must be optimized to achieve efficient gene transfer. This study aimed to determine the optimal concentration of antiviral gene DNA for gene transfer into embryos by transfection. Tiger shrimp embryos obtained from the spawning of broodstock originating from Aceh were collected 5-10 minutes after spawning at a density of 625 eggs/2 mL. Transfection was performed using the jetPEI transfection solution with the antiviral gene DNA concentration as the treatment, namely 5, 10, and 15 µg, plus a positive control (without plasmid DNA) and a negative control (without plasmid DNA and transfection solution), each with 3 replicates. Transfected embryos were hatched in jars containing 2 L of seawater placed in a water bath. The results showed that the antiviral gene was successfully introduced into tiger shrimp embryos. Analysis of variance showed that differences in DNA concentration (5-15 µg) had no significant effect (P>0.05) on the hatching rate of tiger shrimp embryos. Gene expression analysis in tiger shrimp larvae also showed antiviral gene expression activity in all DNA concentration treatments, with antiviral gene expression in transgenic larvae higher than in the control (without transfection). The survival rates of PL-1 post-larvae obtained in this study were 12.0%, 10.0%, 10.6%, 12.3%, and 14.2% for the plasmid DNA concentration treatments of 5 µg, 10 µg, 15 µg, the positive control and the negative control, respectively, where

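  20. Mejoramiento genético acelerado de angiospermas perennes vía inducción floral por sobre-expresión del gen FT [Accelerated genetic improvement of perennial angiosperms via floral induction by overexpression of the FT gene]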

    Directory of Open Access Journals (Sweden)

    Rafael Urrea López

    2018-05-01

    Forests and jungles face the challenge of meeting the resource demands of a growing population, as well as the threat of rapid climate change, which exacerbates the magnitude and frequency of biotic and abiotic stresses. To this end, it is urgent to accelerate the genetic improvement of forest species. However, their long juvenile phases and floral asynchrony dangerously delay this process. This essay explores biotechnological advances in floral induction and their potential application in forest species. Among the genes identified and characterized as participating in the flowering signaling pathway, special attention is given to the FLOWERING LOCUS T gene, considered an integrator of signaling pathways that is highly conserved among angiosperms and that, when overexpressed through genetic engineering, can induce flowering efficiently. This novel biotechnological strategy has recently been used to segregate disease-resistance genes in less time in commercial apple and plum germplasm. It makes it possible to bypass natural barriers that have long restricted forest species mainly to improvement by selection. Among its advantages is that it can be restricted to the process rather than the product, accelerating sexual crosses without genetically modifying the progeny; it thus steers clear of the controversy surrounding the release and consumption of genetically modified organisms, and of the costs and mandatory procedures for GMOs to monitor possible risks. It is projected as a technology that can significantly accelerate the improvement of forest species.

  1. A comprehensive simulation study on classification of RNA-Seq data.

    Directory of Open Access Journals (Sweden)

    Gökmen Zararsız

    RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at the molecular level, as well as for providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. As with differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion. We conclude that, as a count

  2. Spatio-temporal model for multiple ChIP-seq experiments

    NARCIS (Netherlands)

    Ranciati, Saverio; Viroli, Cinzia; Wit, Ernst

    2015-01-01

    The increasing availability of ChIP-seq data demands advanced statistical tools to analyze the results of such experiments. The inherent features of high-throughput sequencing output call for a modelling framework that can account for the spatial dependency between neighboring regions of the

  3. Aconselhamento genético Genetic counseling

    Directory of Open Access Journals (Sweden)

    João Monteiro de Pina-Neto

    2008-08-01

    OBJECTIVE: The objective of this review of genetic counseling (GC) is to describe the current concepts and philosophical and ethical principles accepted by the great majority of countries and recommended by the World Health Organization, the stages of the process, its results and the psychological impact that a genetic disease has on a family. SOURCES: The concepts presented are based on a historical synthesis of the literature on GC from the 1930s to the present; the articles cited represent the principal published works that today underpin the theory and practice of GC. SUMMARY OF THE FINDINGS: GC is now defined as a communication process that deals with the human problems related to the occurrence of a genetic disease in a family. It is essential that health professionals understand the psychological aspects triggered by genetic disease and how these aspects can be managed. Human and medical genetics is still in a phase dominated by technical and scientific aspects, with little emphasis on studying people's emotional reactions and processes of adaptation to these diseases, which leads to clients' poor understanding of what has happened, with negative consequences for family life and for society. CONCLUSIONS: Families with genetic diseases need to be referred for GC, and professionals in this area should invest more in humanizing care, further developing the techniques of non-directive psychological GC.

  4. Estudio de la variabilidad genética en camélidos bolivianos [Study of genetic variability in Bolivian camelids]

    OpenAIRE

    Barreta Pinto, Julia

    2013-01-01

    The study of South American camelids is of great interest in Andean countries such as Peru, Bolivia, Chile and Argentina because of their considerable economic value and their importance in sustaining and developing the rural populations of those countries. Given the lack of genetic studies focused on the camelid populations that inhabit Bolivia, and the need to assess the genetic diversity of these populations, this doctoral thesis has addressed the study ...

  5. RNA-Seq Atlas of Glycine max: A guide to the soybean transcriptome

    Directory of Open Access Journals (Sweden)

    Severin Andrew J

    2010-08-01

    Background: Next generation sequencing is transforming our understanding of transcriptomes. It can determine the expression level of transcripts with a dynamic range of over six orders of magnitude from multiple tissues, developmental stages or conditions. Patterns of gene expression provide insight into functions of genes with unknown annotation. Results: The RNA Seq-Atlas presented here provides a record of high-resolution gene expression in a set of fourteen diverse tissues. Hierarchical clustering of transcriptional profiles for these tissues suggests three clades with similar profiles: aerial, underground and seed tissues. We also investigate the relationship between gene structure and gene expression and find a correlation between gene length and expression. Additionally, we find dramatic tissue-specific gene expression of both the most highly-expressed genes and the genes specific to legumes in seed development and nodule tissues. Analysis of the gene expression profiles of over 2,000 genes with preferential gene expression in seed suggests there are more than 177 genes with functional roles that are involved in the economically important seed filling process. Finally, the Seq-atlas also provides a means of evaluating existing gene model annotations for the Glycine max genome. Conclusions: This RNA-Seq atlas extends the analyses of previous gene expression atlases performed using Affymetrix GeneChip technology and provides an example of new methods to accommodate the increase in transcriptome data obtained from next generation sequencing. Data contained within this RNA-Seq atlas of Glycine max can be explored at http://www.soybase.org/soyseq.

  6. GenoGAM: genome-wide generalized additive models for ChIP-Seq analysis.

    Science.gov (United States)

    Stricker, Georg; Engelhardt, Alexander; Schulz, Daniel; Schmid, Matthias; Tresch, Achim; Gagneur, Julien

    2017-08-01

    Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein-DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-Seq data rely however on binning or sliding window techniques, for which the choice of the window and bin sizes are subjective. Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating ad hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling for type I error rate. By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html. © The Author (2017). Published by Oxford University Press. All rights reserved.

  7. LookSeq: a browser-based viewer for deep sequencing data.

    Science.gov (United States)

    Manske, Heinrich Magnus; Kwiatkowski, Dominic P

    2009-11-01

    Sequencing a genome to great depth can be highly informative about heterogeneity within an individual or a population. Here we address the problem of how to visualize the multiple layers of information contained in deep sequencing data. We propose an interactive AJAX-based web viewer for browsing large data sets of aligned sequence reads. By enabling seamless browsing and fast zooming, the LookSeq program assists the user to assimilate information at different levels of resolution, from an overview of a genomic region to fine details such as heterogeneity within the sample. A specific problem, particularly if the sample is heterogeneous, is how to depict information about structural variation. LookSeq provides a simple graphical representation of paired sequence reads that is more revealing about potential insertions and deletions than are conventional methods.

  8. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina) and Ion Torrent (Life Technologies) sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare). Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels) between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  9. Playing hide and seek with repeats in local and global de novo transcriptome assembly of short RNA-seq reads.

    Science.gov (United States)

    Lima, Leandro; Sinaimeri, Blerina; Sacomoto, Gustavo; Lopez-Maestre, Helene; Marchet, Camille; Miele, Vincent; Sagot, Marie-France; Lacroix, Vincent

    2017-01-01

    The main challenge in de novo genome assembly of DNA-seq data is certainly to deal with repeats that are longer than the reads. In de novo transcriptome assembly of RNA-seq reads, on the other hand, this problem has been underestimated so far. Even though we have fewer and shorter repeated sequences in transcriptomics, they do create ambiguities and confuse assemblers if not addressed properly. Most transcriptome assemblers of short reads are based on de Bruijn graphs (DBG) and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them. The results of this work are threefold. First, we introduce a formal model for representing high copy-number and low-divergence repeats in RNA-seq data and exploit its properties to infer a combinatorial characteristic of repeat-associated subgraphs. We show that the problem of identifying such subgraphs in a DBG is NP-complete. Second, we show that in the specific case of local assembly of alternative splicing (AS) events, we can implicitly avoid such subgraphs, and we present an efficient algorithm to enumerate AS events that are not included in repeats. Using simulated data, we show that this strategy is significantly more sensitive and precise than the previous version of KisSplice (Sacomoto et al. in WABI, pp 99-111, 1), Trinity (Grabherr et al. in Nat Biotechnol 29(7):644-652, 2), and Oases (Schulz et al. in Bioinformatics 28(8):1086-1092, 3), for the specific task of calling AS events. Third, we turn our focus to full-length transcriptome assembly, and we show that exploring the topology of DBGs can improve de novo transcriptome evaluation methods. Based on the observation that repeats create complicated regions in a DBG, and when assemblers try to traverse these regions, they can infer erroneous transcripts, we propose a measure to flag transcripts traversing such troublesome regions, thereby giving a confidence level for each transcript. The originality of our work when

  10. ASN’s actions in GEN IV reactors and Sodium Fast Reactors (SFR)

    International Nuclear Information System (INIS)

    Belot, Clotilde

    2013-01-01

    The ASN is involved in three actions concerning GEN IV: an overview of GEN IV nuclear reactor systems; a specific analysis of transmutation; and the prototype reactor ASTRID (SFR). These actions are at an early stage (no conclusions or results are available yet).

  11. Identification of innate lymphoid cells in single-cell RNA-Seq data.

    Science.gov (United States)

    Suffiotti, Madeleine; Carmona, Santiago J; Jandus, Camilla; Gfeller, David

    2017-07-01

    Innate lymphoid cells (ILCs) consist of natural killer (NK) cells and non-cytotoxic ILCs that are broadly classified into ILC1, ILC2, and ILC3 subtypes. These cells recently emerged as important early effectors of innate immunity for their roles in tissue homeostasis and inflammation. Over the last few years, ILCs have been extensively studied in mouse and human at the functional and molecular level, including gene expression profiling. However, sorting ILCs with flow cytometry for gene expression analysis is a delicate and time-consuming process. Here we propose and validate a novel framework for studying ILCs at the transcriptomic level using single-cell RNA-Seq data. Our approach combines unsupervised clustering and a new cell type classifier trained on mouse ILC gene expression data. We show that this approach can accurately identify different ILCs, especially ILC2 cells, in human lymphocyte single-cell RNA-Seq data. Our new model relies only on genes conserved across vertebrates, thereby making it in principle applicable in any vertebrate species. Considering the rapid increase in throughput of single-cell RNA-Seq technology, our work provides a computational framework for studying ILC2 cells in single-cell transcriptomic data and may help exploring their conservation in distant vertebrate species.

  12. Genetic Doping and Eugenics: Dialogues Beyond Sport

    Directory of Open Access Journals (Sweden)

    Tiago Vieira Bomtempo

    2016-01-01

    Genetic engineering has brought previously unimaginable possibilities that, not long ago, were seen only in films. From gene therapy, aimed at correcting or curing a disease, it moves on to the possibility of genetic enhancement, currently glimpsed in the world of sport in the form of genetic doping. But would genetic doping not violate the right to an unmodified genetic heritage? Even if the genetic intervention is not transmitted to descendants, there would be a genetic enhancement that would affect the athlete's genome and differentiate the athlete from other athletes and other individuals, infringing the principle of equality to the detriment of private autonomy; one could initially speak of a relationship of domination, even if only with respect to physical performance in sport. In this sense, these innovations in the field of genetic engineering raise concern about genetic manipulation in future generations, a point of discussion that is not only biomedical but also bioethical and biolegal. Thus a concern arises as to whether these new advances may affect human dignity in the face of a possible eugenics, owing to the design of persons and the consequent discrimination on the basis of a particular genetic identity. Accordingly, the aim of this article is to investigate whether genetic doping would offend the right to an unmodified genetic heritage and the rights of future generations, giving rise to a new form of eugenics by not allowing the equal exercise of fundamental freedoms. An investigation is therefore needed, based on authors in bioethics and biolaw as well as national and international legal texts on the subject. Discussion of these questions is indispensable, especially with the approach of the Summer Olympic Games in Brazil in this year, 2016.

  13. Microsatellites, genetic distances and structure of native South American populations

    Directory of Open Access Journals (Sweden)

    Demarchi, Darío Alfredo

    2009-01-01

    This study investigated the genetic relationships among 17 native South American populations with respect to 15 autosomal microsatellites (STRs), using three genetic distances (DST, DA and (δμ)²) that conform to different theoretical assumptions. Using different analytical techniques (multidimensional scaling, correlation and partial correlation of matrices), we tested whether the genetic distances reflected the interpopulation relationships expected from the geographic distribution of, or linguistic relationships among, the populations. In addition, we estimated the degree to which the different genetic distance measures were influenced by the diversity (He) of each population. The genetic maps show, mainly for DST and DA, that isolated populations with small effective size (Ne) appear as outliers, whereas populations with high Ne and greater gene flow occupy a central position, at low distance values from one another and without a defined clustering pattern. The lack of association between genetic distances and linguistic or geographic distances, together with the high negative correlation between He and mean genetic distance per population, confirms this pattern, showing that most of the interpopulation variation can be explained by the degree of intrapopulation diversity. That is, the genetic distances do not reflect phylogenetic, linguistic or geographic relationships, but rather recent demographic events such as genetic bottlenecks, founder effects or massive external migration. This can be verified by another analytical method, the Harpending and Ward model.
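
    As an illustration of two quantities named in this abstract, the sketch below computes expected heterozygosity (He) and the (δμ)² microsatellite distance. It assumes the common textbook forms (He as one minus the sum of squared allele frequencies, with no small-sample correction, and (δμ)² as the squared difference in mean allele size averaged over loci). The function names and the toy allele sizes are hypothetical and are not taken from the study.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """He = 1 - sum(p_i^2) for one locus, from a list of observed alleles.

    Assumption: no small-sample (n / (n - 1)) correction is applied.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("no alleles observed")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2 between two populations.

    pop_a and pop_b map locus name -> list of allele sizes (repeat counts).
    The distance is the squared difference of mean allele size,
    averaged over the loci the two populations share.
    """
    loci = sorted(set(pop_a) & set(pop_b))
    if not loci:
        raise ValueError("no shared loci")
    total = 0.0
    for locus in loci:
        mean_a = sum(pop_a[locus]) / len(pop_a[locus])
        mean_b = sum(pop_b[locus]) / len(pop_b[locus])
        total += (mean_a - mean_b) ** 2
    return total / len(loci)


# Toy data (hypothetical): two loci, allele sizes in repeat units.
toy_pop_a = {"locus1": [10, 12], "locus2": [5, 5]}
toy_pop_b = {"locus1": [14, 14], "locus2": [6, 6]}
toy_distance = delta_mu_squared(toy_pop_a, toy_pop_b)
toy_he = expected_heterozygosity([10, 10, 12, 12])
```

    Because (δμ)² depends only on mean allele sizes, a population with few, similar alleles (low He) can sit far from all others, which is consistent with the pattern the abstract describes for small, isolated populations.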

  14. Computational Methods for ChIP-seq Data Analysis and Applications

    KAUST Repository

    Ashoor, Haitham

    2017-01-01

    four main challenges. First, I address the problem of detecting histone modifications from ChIP-seq cancer samples. The presence of copy number variations (CNVs) in cancer samples results in statistical biases that lead to inaccurate predictions when

  15. Safer Systems: A NextGen Aviation Safety Strategic Goal

    Science.gov (United States)

    Darr, Stephen T.; Ricks, Wendell R.; Lemos, Katherine A.

    2008-01-01

    The Joint Planning and Development Office (JPDO) is charged by Congress with developing the concepts and plans for the Next Generation Air Transportation System (NextGen). The National Aviation Safety Strategic Plan (NASSP), developed by the Safety Working Group of the JPDO, focuses on establishing the goals, objectives, and strategies needed to realize the safety objectives of the NextGen Integrated Plan. The three goal areas of the NASSP are Safer Practices, Safer Systems, and Safer Worldwide. Safer Practices emphasizes an integrated, systematic approach to safety risk management through implementation of formalized Safety Management Systems (SMS) that incorporate safety data analysis processes, and the enhancement of methods for ensuring safety is an inherent characteristic of NextGen. Safer Systems emphasizes implementation of safety-enhancing technologies, which will improve safety for human-centered interfaces and enhance the safety of airborne and ground-based systems. Safer Worldwide encourages coordinating the adoption of the safer practices and safer systems technologies, policies and procedures worldwide, such that the maximum level of safety is achieved across air transportation system boundaries. This paper introduces the NASSP and its development, and focuses on the Safer Systems elements of the NASSP, which incorporates three objectives for NextGen systems: 1) provide risk reducing system interfaces, 2) provide safety enhancements for airborne systems, and 3) provide safety enhancements for ground-based systems. The goal of this paper is to expose avionics and air traffic management system developers to NASSP objectives and Safer Systems strategies.

  16. Genetic similarity between coriander genotypes using ISSR markers

    Directory of Open Access Journals (Sweden)

    Roberto de A Melo

    2011-12-01

    With the development of new cultivars, a precise genetic characterization is essential for improvement programs or for cultivar registration and protection. Molecular markers have been complementing the traditional morphological and agronomic characterization techniques because they are virtually unlimited, cover the whole genome and are not environmentally influenced. Genetic characterization constitutes the basis for studies involving estimates of genetic similarity. Therefore, the objective of the present study was to evaluate the genetic similarity between ten coriander genotypes (nine cultivars and one line) using ISSR markers. The cultivars used were: Americano, Asteca, Palmeira, Português, Santo, Supéria, Tabocas, Tapacurá, Verdão and the experimental line HTV-9299. The genetic similarity between the cultivars was estimated using 227 banded regions of ISSR molecular markers. The UBC 897 oligonucleotide generated the highest number of fragments (16), resulting in a higher polymorphism. The results indicate that the twenty-nine oligonucleotides chosen were satisfactory for detecting polymorphism. Based on the grouping analysis determined from the similarity data, there were two groups and two sub-groups. The calculated similarity for the genotypes varied from 52 to 75%. The lowest similarity was observed between Português and Verdão, at 52%. The highest similarity was found between Português and Palmeira, at 75%. The ISSR is efficient for identifying DNA polymorphism in coriander.
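
    To make the idea of band-based genetic similarity concrete, here is a minimal sketch that scores two genotypes from presence/absence band patterns. The abstract does not state which similarity coefficient was used; the Jaccard coefficient below is an assumption chosen for illustration, and the function name and band patterns are hypothetical toy values, not data from the study.

```python
def jaccard_similarity(bands_a, bands_b):
    """Jaccard similarity between two band-presence vectors (1 = band present).

    Assumption: Jaccard is used only as an example coefficient; shared
    absences (0, 0) are ignored, as is usual for dominant markers.
    """
    if len(bands_a) != len(bands_b):
        raise ValueError("band vectors must have the same length")
    shared = sum(1 for a, b in zip(bands_a, bands_b) if a and b)
    either = sum(1 for a, b in zip(bands_a, bands_b) if a or b)
    return shared / either if either else 0.0


# Toy band patterns for two hypothetical genotypes over five scored bands.
genotype_a = [1, 1, 0, 1, 0]
genotype_b = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(genotype_a, genotype_b)  # 2 shared / 4 present in either
```

    A full analysis would compute this for every pair of genotypes and pass the resulting matrix to a clustering method to obtain groups and sub-groups.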

  17. Predictive hypotheses are ineffectual in resolving complex biochemical systems.

    Science.gov (United States)

    Fry, Michael

    2018-03-20

    Scientific hypotheses may either predict particular unknown facts or accommodate previously-known data. Although affirmed predictions are intuitively more rewarding than accommodations of established facts, opinions divide whether predictive hypotheses are also epistemically superior to accommodation hypotheses. This paper examines the contribution of predictive hypotheses to discoveries of several bio-molecular systems. Having all the necessary elements of the system known beforehand, an abstract predictive hypothesis of semiconservative mode of DNA replication was successfully affirmed. However, in defining the genetic code whose biochemical basis was unclear, hypotheses were only partially effective and supplementary experimentation was required for its conclusive definition. Markedly, hypotheses were entirely inept in predicting workings of complex systems that included unknown elements. Thus, hypotheses did not predict the existence and function of mRNA, the multiple unidentified components of the protein biosynthesis machinery, or the manifold unknown constituents of the ubiquitin-proteasome system of protein breakdown. Consequently, because of their inability to envision unknown entities, predictive hypotheses did not contribute to the elucidation of these systems. Because accommodation theories remained the sole instrument to explain complex bio-molecular systems, the philosophical question of alleged advantage of predictive over accommodative hypotheses became inconsequential.

  18. Validation of Nepalese version of Utrecht Work Engagement Scale.

    Science.gov (United States)

    Panthee, Bimala; Shimazu, Akihito; Kawakami, Norito

    2014-01-01

    The objective of the current study was to examine the psychometric properties of the Nepalese version of the Utrecht Work Engagement Scale (UWES-N) in a sample of hospital nurses. Registered nurses from three hospitals in Nepal (total N=438) voluntarily completed a self-administered paper-and-pencil questionnaire. Confirmatory factor analysis revealed that the hypothesized three-factor model of the 9-item version of the UWES-N (UWES-N-9) fitted the data best. The internal consistency of the scale was acceptable. Work engagement was positively related to job satisfaction, job performance, happiness and health, and it was negatively related to psychological distress, confirming its construct validity. In conclusion, the findings of our study indicated that the UWES-N-9 has satisfactory psychometric properties and provided supportive evidence for use of the UWES-N-9 in the Nepalese context.

  19. Generic Optimization Program User Manual Version 3.0.0

    International Nuclear Information System (INIS)

    Wetter, Michael

    2009-01-01

    GenOpt is an optimization program for the minimization of a cost function that is evaluated by an external simulation program. It has been developed for optimization problems where the cost function is computationally expensive and its derivatives are not available or may not even exist. GenOpt can be coupled to any simulation program that reads its input from text files and writes its output to text files. The independent variables can be continuous variables (possibly with lower and upper bounds), discrete variables, or both, continuous and discrete variables. Constraints on dependent variables can be implemented using penalty or barrier functions. GenOpt uses parallel computing to evaluate the simulations. GenOpt has a library with local and global multi-dimensional and one-dimensional optimization algorithms, and algorithms for doing parametric runs. An algorithm interface allows adding new minimization algorithms without knowing the details of the program structure. GenOpt is written in Java so that it is platform independent. The platform independence and the general interface make GenOpt applicable to a wide range of optimization problems. GenOpt has not been designed for linear programming problems, quadratic programming problems, and problems where the gradient of the cost function is available. For such problems, as well as for other problems, special tailored software exists that is more efficient

  20. Generic Optimization Program User Manual Version 3.0.0

    Energy Technology Data Exchange (ETDEWEB)

    Wetter, Michael

    2009-05-11

    GenOpt is an optimization program for the minimization of a cost function that is evaluated by an external simulation program. It has been developed for optimization problems where the cost function is computationally expensive and its derivatives are not available or may not even exist. GenOpt can be coupled to any simulation program that reads its input from text files and writes its output to text files. The independent variables can be continuous variables (possibly with lower and upper bounds), discrete variables, or both, continuous and discrete variables. Constraints on dependent variables can be implemented using penalty or barrier functions. GenOpt uses parallel computing to evaluate the simulations. GenOpt has a library with local and global multi-dimensional and one-dimensional optimization algorithms, and algorithms for doing parametric runs. An algorithm interface allows adding new minimization algorithms without knowing the details of the program structure. GenOpt is written in Java so that it is platform independent. The platform independence and the general interface make GenOpt applicable to a wide range of optimization problems. GenOpt has not been designed for linear programming problems, quadratic programming problems, and problems where the gradient of the cost function is available. For such problems, as well as for other problems, special tailored software exists that is more efficient.
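
    The combination described here (a black-box cost function, no derivatives, and constraints on dependent variables handled by a penalty term) can be sketched as follows. This is not GenOpt's code or one of its algorithms: the coordinate search is a simple derivative-free stand-in, the "simulation" is a plain Python function rather than an external program, and all names and values are hypothetical.

```python
def with_penalty(cost, constraint, mu):
    """Return cost(x) + mu * max(0, constraint(x))**2.

    The constraint is satisfied when constraint(x) <= 0; violations are
    penalized quadratically (an assumed penalty form, for illustration).
    """
    def penalized(x):
        violation = max(0.0, constraint(x))
        return cost(x) + mu * violation ** 2
    return penalized


def coordinate_search(f, x0, step=1.0, min_step=1e-5):
    """Derivative-free minimization: try +/- step along each axis,
    keep any improvement, and halve the step when nothing improves."""
    x, fx = list(x0), f(x0)
    while step > min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            step /= 2.0
    return x, fx


def simulated_cost(x):
    """Stand-in for an expensive external simulation (hypothetical)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


def dependent_constraint(x):
    """Hypothetical constraint on a dependent variable: x0 + x1 <= 1."""
    return x[0] + x[1] - 1.0


objective = with_penalty(simulated_cost, dependent_constraint, mu=100.0)
best_x, best_f = coordinate_search(objective, [0.0, 0.0])
```

    With a finite penalty weight the result sits slightly on the infeasible side of the constraint; increasing `mu` tightens this at the price of a harder search problem.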

  1. Crystal Structure of a Eukaryotic GEN1 Resolving Enzyme Bound to DNA

    Directory of Open Access Journals (Sweden)

    Yijin Liu

    2015-12-01

    We present the crystal structure of the junction-resolving enzyme GEN1 bound to DNA at 2.5 Å resolution. The structure of the GEN1 protein reveals it to have an elaborated FEN-XPG family fold that is modified for its role in four-way junction resolution. The functional unit in the crystal is a monomer of active GEN1 bound to the product of resolution cleavage, with an extensive DNA binding interface for both helical arms. Within the crystal lattice, a GEN1 dimer interface juxtaposes two products, whereby they can be reconnected into a four-way junction, the structure of which agrees with that determined in solution. The reconnection requires some opening of the DNA structure at the center, in agreement with permanganate probing and 2-aminopurine fluorescence. The structure shows that a relaxation of the DNA structure accompanies cleavage, suggesting how second-strand cleavage is accelerated to ensure productive resolution of the junction.

  2. Version pressure feedback mechanisms for speculative versioning caches

    Science.gov (United States)

    Eichenberger, Alexandre E.; Gara, Alan; O'Brien, Kathryn M.; Ohmacht, Martin; Zhuang, Xiaotong

    2013-03-12

    Mechanisms are provided for controlling version pressure on a speculative versioning cache. Raw version pressure data is collected based on one or more threads accessing cache lines of the speculative versioning cache. One or more statistical measures of version pressure are generated based on the collected raw version pressure data. A determination is made as to whether one or more modifications to an operation of a data processing system are to be performed based on the one or more statistical measures of version pressure, the one or more modifications affecting version pressure exerted on the speculative versioning cache. An operation of the data processing system is modified based on the one or more determined modifications, in response to a determination that one or more modifications to the operation of the data processing system are to be performed, to affect the version pressure exerted on the speculative versioning cache.

  3. Massively parallel sequencing, aCGH, and RNA-Seq technologies provide a comprehensive molecular diagnosis of Fanconi anemia.

    Science.gov (United States)

    Chandrasekharappa, Settara C; Lach, Francis P; Kimble, Danielle C; Kamat, Aparna; Teer, Jamie K; Donovan, Frank X; Flynn, Elizabeth; Sen, Shurjo K; Thongthip, Supawat; Sanborn, Erica; Smogorzewska, Agata; Auerbach, Arleen D; Ostrander, Elaine A

    2013-05-30

    Current methods for detecting mutations in Fanconi anemia (FA)-suspected patients are inefficient and often miss mutations. We have applied recent advances in DNA sequencing and genomic capture to the diagnosis of FA. Specifically, we used custom molecular inversion probes or TruSeq-enrichment oligos to capture and sequence FA and related genes, including introns, from 27 samples from the International Fanconi Anemia Registry at The Rockefeller University. DNA sequencing was complemented with custom array comparative genomic hybridization (aCGH) and RNA sequencing (RNA-seq) analysis. aCGH identified deletions/duplications in 4 different FA genes. RNA-seq analysis revealed lack of allele specific expression associated with a deletion and splicing defects caused by missense, synonymous, and deep-in-intron variants. The combination of TruSeq-targeted capture, aCGH, and RNA-seq enabled us to identify the complementation group and biallelic germline mutations in all 27 families: FANCA (7), FANCB (3), FANCC (3), FANCD1 (1), FANCD2 (3), FANCF (2), FANCG (2), FANCI (1), FANCJ (2), and FANCL (3). FANCC mutations are often the cause of FA in patients of Ashkenazi Jewish (AJ) ancestry, and we identified 2 novel FANCC mutations in 2 patients of AJ ancestry. We describe here a strategy for efficient molecular diagnosis of FA.

  4. Inference of chromosomal inversion dynamics from Pool-Seq data in natural and laboratory populations of Drosophila melanogaster.

    Science.gov (United States)

    Kapun, Martin; van Schalkwyk, Hester; McAllister, Bryant; Flatt, Thomas; Schlötterer, Christian

    2014-04-01

    Sequencing of pools of individuals (Pool-Seq) represents a reliable and cost-effective approach for estimating genome-wide SNP and transposable element insertion frequencies. However, Pool-Seq does not provide direct information on haplotypes so that, for example, obtaining inversion frequencies has not been possible until now. Here, we have developed a new set of diagnostic marker SNPs for seven cosmopolitan inversions in Drosophila melanogaster that can be used to infer inversion frequencies from Pool-Seq data. We applied our novel marker set to Pool-Seq data from an experimental evolution study and from North American and Australian latitudinal clines. In the experimental evolution data, we find evidence that positive selection has driven the frequencies of In(3R)C and In(3R)Mo to increase over time. In the clinal data, we confirm the existence of frequency clines for In(2L)t, In(3L)P and In(3R)Payne in both North America and Australia and detect a previously unknown latitudinal cline for In(3R)Mo in North America. The inversion markers developed here provide a versatile and robust tool for characterizing inversion frequencies and their dynamics in Pool-Seq data from diverse D. melanogaster populations. © 2013 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
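
    The idea of inferring an inversion frequency from diagnostic marker SNPs can be sketched as below. The abstract does not spell out the estimator; averaging the per-marker frequency of the inversion-specific allele is an assumption made for illustration, and the function name and read counts are hypothetical.

```python
def inversion_frequency(marker_counts):
    """Estimate an inversion's frequency in a pooled sample.

    marker_counts: list of (inversion_allele_reads, total_reads) tuples,
    one per diagnostic marker SNP. Markers without coverage are skipped.
    Returns the mean per-marker frequency, or None if no marker is covered.
    """
    freqs = [inv / total for inv, total in marker_counts if total > 0]
    if not freqs:
        return None
    return sum(freqs) / len(freqs)


# Toy read counts (hypothetical) at four marker SNPs of one inversion.
toy_markers = [(10, 100), (20, 100), (0, 0), (30, 100)]
toy_estimate = inversion_frequency(toy_markers)
```

    Averaging over several markers dampens the sampling noise of any single SNP, which matters because each pooled read samples one chromosome at random.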

  5. Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks

    Directory of Open Access Journals (Sweden)

    Courdy Samir J

    2008-12-01

    Background: High throughput signature sequencing holds many promises, one of which is the ready identification of in vivo transcription factor binding sites, histone modifications, changes in chromatin structure and patterns of DNA methylation across entire genomes. In these experiments, chromatin immunoprecipitation is used to enrich for particular DNA sequences of interest and signature sequencing is used to map the regions to the genome (ChIP-Seq). Elucidation of these sites of DNA-protein binding/modification is proving instrumental in reconstructing networks of gene regulation and chromatin remodelling that direct development, response to cellular perturbation, and neoplastic transformation. Results: Here we present a package of algorithms and software that makes use of control input data to reduce false positives and estimate confidence in ChIP-Seq peaks. Several different methods were compared using two simulated spike-in datasets. Use of control input data and a normalized difference score were found to more than double the recovery of ChIP-Seq peaks at a 5% false discovery rate (FDR). Moreover, both a binomial p-value/q-value and an empirical FDR were found to predict the true FDR within 2–3 fold and are more reliable estimators of confidence than a global Poisson p-value. These methods were then used to reanalyze Johnson et al.'s neuron-restrictive silencer factor (NRSF) ChIP-Seq data without relying on extensive qPCR validated NRSF sites and the presence of NRSF binding motifs for setting thresholds. Conclusion: The methods developed and tested here show considerable promise for reducing false positives and estimating confidence in ChIP-Seq data without any prior knowledge of the ChIP target. They are part of a larger open source package freely available from http://useq.sourceforge.net/.
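
    One way to read "empirical FDR" with control input data is sketched below: peaks called in the control at a given score threshold are treated as false positives and compared with the number called in the ChIP sample. This is a simplified illustration, not the package's implementation; it assumes ChIP and control scores are on a comparable scale, and the function names and scores are hypothetical.

```python
def empirical_fdr(chip_scores, control_scores, threshold):
    """Empirical FDR at a threshold: control peaks passing / ChIP peaks passing.

    Returns 0.0 when no ChIP peak passes; the ratio is capped at 1.0.
    """
    chip_pass = sum(1 for s in chip_scores if s >= threshold)
    control_pass = sum(1 for s in control_scores if s >= threshold)
    if chip_pass == 0:
        return 0.0
    return min(1.0, control_pass / chip_pass)


def lowest_threshold_at_fdr(chip_scores, control_scores, target_fdr=0.05):
    """Smallest observed ChIP score whose empirical FDR is within the target.

    Returns None if no observed score meets the target.
    """
    for threshold in sorted(set(chip_scores)):
        if empirical_fdr(chip_scores, control_scores, threshold) <= target_fdr:
            return threshold
    return None


# Toy peak scores (hypothetical).
toy_chip = [50, 40, 30, 20, 10, 5]
toy_control = [12, 6, 3]
fdr_at_10 = empirical_fdr(toy_chip, toy_control, 10)    # 1 control / 5 ChIP
cutoff = lowest_threshold_at_fdr(toy_chip, toy_control)  # first score with FDR <= 0.05
```

    In practice the control library would first be scaled to the sequencing depth of the ChIP library so that the two peak counts are comparable.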

  6. A structured sparse regression method for estimating isoform expression level from multi-sample RNA-seq data.

    Science.gov (United States)

    Zhang, L; Liu, X J

    2016-06-03

    With the rapid development of next-generation high-throughput sequencing technology, RNA-seq has become a standard and important technique for transcriptome analysis. For multi-sample RNA-seq data, the existing expression estimation methods usually deal with each RNA-seq sample separately, and ignore that the read distributions are consistent across multiple samples. In the current study, we propose a structured sparse regression method, SSRSeq, to estimate isoform expression using multi-sample RNA-seq data. SSRSeq uses a non-parametric model to capture the general tendency of the non-uniform read distribution for all genes across multiple samples. Additionally, our method adds a structured sparse regularization, which not only incorporates the sparse specificity between a gene and its corresponding isoform expression levels, but also reduces the effects of noisy reads, especially for lowly expressed genes and isoforms. Four real datasets were used to evaluate our method on isoform expression estimation. Compared with other popular methods, SSRSeq reduced the variance between multiple samples, and produced more accurate isoform expression estimations, and thus more meaningful biological interpretations.

  7. Genome-Independent Identification of RNA Editing by Mutual Information (GIREMI) | Informatics Technology for Cancer Research (ITCR)

    Science.gov (United States)

    Identification of single-nucleotide variants in RNA-seq data. Current version focuses on detection of RNA editing sites without requiring genome sequence data. New version is under development to separately identify RNA editing sites and genetic variants using RNA-seq data alone.

  8. Role of SeqA and Dam in Escherichia coli gene expression: A global/microarray analysis

    DEFF Research Database (Denmark)

    Løbner-Olesen, Anders; Marinus, M.G.; Hansen, Flemming G.

    2003-01-01

    High-density oligonucleotide arrays were used to monitor global transcription patterns in Escherichia coli with various levels of Dam and SeqA proteins. Cells lacking Dam methyltransferase showed a modest increase in transcription of the genes belonging to the SOS regulon. Bacteria devoid of the SeqA protein, which preferentially binds hemimethylated DNA, were found to have a transcriptional profile almost identical to WT bacteria overexpressing Dam methyltransferase. The latter two strains differed from WT in two ways. First, the origin proximal genes were transcribed with increased frequency due to increased gene dosage. Second, chromosomal domains of high transcriptional activity alternate with regions of low activity, and our results indicate that the activity in each domain is modulated in the same way by SeqA deficiency or Dam overproduction. We suggest that the methylation status...

  9. Effect of age at castration and of genetic groups on feedlot performance and carcass traits

    Directory of Open Access Journals (Sweden)

    Euclides Filho Kepler

    2001-01-01

    Seventy-one animals from two genetic groups with different growth potentials, ½ Angus - ½ Nelore (AN) and ½ Simmental - ½ Nelore (SN), were used. These animals, part of a broad project named Projeto Cruzamento Embrapa 1, were subjected to seven castration treatments. SN animals remained 14 days longer in the feedlot in order to be slaughtered at the same degree of finish as AN animals (131 days versus 117 days, respectively). The other traits studied, slaughter weight, cold carcass weight, dressing percentage and ribeye area, were not influenced by genetic group and had, in that same order, means of 471 kg and 476 kg, 266 kg and 274 kg, 58.13% and 57.46%, and 72.71 cm² and 75.79 cm² for AN and SN, respectively. Comparisons among the means of the different treatments were made using six contrasts. Intact animals remained 25 days longer in the feedlot than those castrated at birth (136 days versus 111 days, respectively). However, these animals had a higher mean slaughter weight than that observed for animals castrated at birth (515 kg versus 463 kg, respectively). Animals castrated at birth remained longer in the feedlot than those castrated at weaning or at one year of age (111 days versus 95 days, respectively). Animals placed in the feedlot right after weaning, as expected, were those that remained longest in the feedlot (181 days). Because these were young animals, one year younger than the others, the longer time in the feedlot did not translate into higher slaughter weights (455 kg). Mean dressing percentage, regardless of genetic group and treatment, was 57.79%.

  10. GenMapDB: a database of mapped human BAC clones

    OpenAIRE

    Morley, Michael; Arcaro, Melissa; Burdick, Joshua; Yonescu, Raluca; Reid, Thomas; Kirsch, Ilan R.; Cheung, Vivian G.

    2001-01-01

    GenMapDB (http://genomics.med.upenn.edu/genmapdb) is a repository of human bacterial artificial chromosome (BAC) clones mapped by our laboratory to sequence-tagged site markers. Currently, GenMapDB contains over 3000 mapped clones that span 19 chromosomes, chromosomes 2, 4, 5, 9–22, X and Y. This database provides positional information about human BAC clones from the RPCI-11 human male BAC library. It also contains restriction fragment analysis data and end sequen...

  11. Inducing indel mutation in the SOX6 gene by zinc finger nuclease for gamma reactivation: An approach towards gene therapy of beta thalassemia.

    Science.gov (United States)

    Modares Sadeghi, Mehran; Shariati, Laleh; Hejazi, Zahra; Shahbazi, Mansoureh; Tabatabaiefar, Mohammad Amin; Khanahmad, Hossein

    2018-03-01

    β-thalassemia is a common autosomal recessive disorder characterized by a deficiency in the synthesis of β-chains. Evidence shows that increased HbF levels improve the symptoms in patients with β-thalassemia or sickle cell anemia. In this study, ZFN technology was applied to induce a mutation in the binding domain region of SOX6 to reactivate γ-globin expression. The sequences coding for ZFP arrays were designed and subcloned in TDH plus as a transfer vector. ZFN expression was confirmed using Western blot analysis. In the next step, using a site-directed mutagenesis strategy through overlap PCR, a missense mutation (D64V) was induced in the catalytic domain of the integrase gene in the packaging plasmid and verified using DNA sequencing. Then, the integrase minus lentivirus containing the ZFN cassette was packaged. K562 cells were transduced with this virus, and a mutation detection assay was performed. The indel percentage of the cells transduced with lentivirus containing ZFN was 31%. After 5 days of erythroid differentiation with 15 μg/mL cisplatin, the levels of γ-globin mRNA were sixfold in the cells treated with ZFN compared to untreated cells. In the meantime, the measurement of HbF expression levels was carried out using hemoglobin electrophoresis and showed the same results. Integrase minus lentivirus can provide a useful tool for efficient transient gene expression and help avoid disadvantages of gene targeting using the native virus. The ZFN strategy applied here to induce indels in the SOX6 gene in adult erythroid progenitors may provide a method to activate fetal hemoglobin expression in individuals with β-thalassemia. © 2017 Wiley Periodicals, Inc.

  12. High-sensitivity HLA typing by Saturated Tiling Capture Sequencing (STC-Seq).

    Science.gov (United States)

    Jiao, Yang; Li, Ran; Wu, Chao; Ding, Yibin; Liu, Yanning; Jia, Danmei; Wang, Lifeng; Xu, Xiang; Zhu, Jing; Zheng, Min; Jia, Junling

    2018-01-15

    Highly polymorphic human leukocyte antigen (HLA) genes are responsible for fine-tuning the adaptive immune system. High-resolution HLA typing is important for the treatment of autoimmune and infectious diseases. Additionally, it is routinely performed for identifying matched donors in transplantation medicine. Although many HLA typing approaches have been developed, the complexity, low efficiency and high cost of current HLA-typing assays limit their application in population-based high-throughput HLA typing for donors, which is required for creating large-scale databases for transplantation and precision medicine. Here, we present a cost-efficient Saturated Tiling Capture Sequencing (STC-Seq) approach to capturing 14 HLA class I and II genes. The highly efficient capture (an approximately 23,000-fold enrichment) of these genes allows for simplified allele calling. Tests on five genes (HLA-A/B/C/DRB1/DQB1) from 31 human samples and 351 datasets using STC-Seq showed results that were 98% consistent with the known two-field (field1 and field2) genotypes. Additionally, STC can capture genomic DNA fragments longer than 3 kb from HLA loci, making the library compatible with third-generation sequencing. STC-Seq is a highly accurate and cost-efficient method for HLA typing which can be used to facilitate the establishment of population-based HLA databases for precision and transplantation medicine.

  13. Main modelling features of the ASTEC V2.1 major version

    International Nuclear Information System (INIS)

    Chatelard, P.; Belon, S.; Bosland, L.; Carénini, L.; Coindreau, O.; Cousin, F.; Marchetto, C.; Nowack, H.; Piar, L.; Chailan, L.

    2016-01-01

    Highlights: • Recent modelling improvements of the ASTEC European severe accident code are outlined. • Key new physical models now available in the ASTEC V2.1 major version are described. • ASTEC progress towards a multi-design reactor code is illustrated for BWR and PHWR. • ASTEC strong link with the on-going EC CESAM FP7 project is emphasized. • Main remaining modelling issues (on which IRSN efforts are now directed) are given. - Abstract: A new major version of the European severe accident integral code ASTEC, developed by IRSN with some GRS support, was delivered in November 2015 to the ASTEC worldwide community. Main modelling features of this V2.1 version are summarised in this paper. In particular, the in-vessel coupling technique between the reactor coolant system thermal-hydraulics module and the core degradation module has been strongly re-engineered to remove some well-known weaknesses of the former V2.0 series. The V2.1 version also includes new core degradation models specifically addressing BWR and PHWR reactor types, as well as several other physical modelling improvements, notably on reflooding of severely damaged cores, Zircaloy oxidation under air atmosphere, corium coolability during corium concrete interaction and source term evaluation. Moreover, this V2.1 version constitutes the backbone of the CESAM FP7 project, whose final objective is to further improve ASTEC for use in Severe Accident Management analysis of the Gen.II–III nuclear power plants presently under operation or foreseen in the near future in Europe. As part of this European project, IRSN efforts to continuously improve both code numerical robustness and computing performance at plant scale as well as users’ tools are being intensified. Besides, ASTEC will continue capitalising the whole knowledge on severe accidents phenomenology by progressively keeping physical models at the state of the art through a regular feed-back from the interpretation of the current and

  14. Quantitative linking hypotheses for infant eye movements.

    Directory of Open Access Journals (Sweden)

    Daniel Yurovsky

    The study of cognitive development hinges, largely, on the analysis of infant looking. But analyses of eye gaze data require the adoption of linking hypotheses: assumptions about the relationship between observed eye movements and underlying cognitive processes. We develop a general framework for constructing, testing, and comparing these hypotheses, and thus for producing new insights into early cognitive development. We first introduce the general framework, applicable to any infant gaze experiment, and then demonstrate its utility by analyzing data from a set of experiments investigating the role of attentional cues in infant learning. The new analysis uncovers significantly more structure in these data, finding evidence of learning that was not found in standard analyses and showing an unexpected relationship between cue use and learning rate. Finally, we discuss general implications for the construction and testing of quantitative linking hypotheses. MATLAB code for sample linking hypotheses can be found on the first author's website.

  15. Pearce element ratios: A paradigm for testing hypotheses

    Science.gov (United States)

    Russell, J. K.; Nicholls, Jim; Stanley, Clifford R.; Pearce, T. H.

    Science moves forward with the development of new ideas that are encapsulated by hypotheses whose aim is to explain the structure of data sets or to expand existing theory. These hypotheses remain conjecture until they have been tested. In fact, Karl Popper advocated that a scientist's job does not finish with the creation of an idea but, rather, begins with the testing of the related hypotheses. In Popper's [1959] advocation it is implicit that there be tools with which we can test our hypotheses. Consequently, the development of rigorous tests for conceptual models plays a major role in maintaining the integrity of scientific endeavor [e.g., Greenwood, 1989].

  16. Description of Guyruita gen. nov. and two new species (Ischnocolinae, Theraphosidae)

    Directory of Open Access Journals (Sweden)

    José P.L. Guadanucci

    2007-12-01

    The genus Guyruita gen. nov. and two new species from Brazil are described. Holothele waikoshiemi (Bertani & Araújo, 2005) from Venezuela is transferred here to the new genus. Guyruita gen. nov. differs from the remaining Ischnocolinae by the following features: labium densely covered with cuspules (more than 100), intercheliceral intumescence absent, posterior sternal sigilla remote from margin, tarsal claws without teeth, tarsal scopula I-II undivided (tarsus II with a line of sparse setae, which does not divide the scopula), III-IV divided.

  17. A Novel Role of Human Holliday Junction Resolvase GEN1 in the Maintenance of Centrosome Integrity

    DEFF Research Database (Denmark)

    Gao, M.; Danielsen, Jannie Michaela Rendtlew; Wei, L.-Z.

    2012-01-01

    but not catalytic activity of GEN1 is required for preventing centrosome hyper-amplification, formation of multiple mitotic spindles, and multi-nucleation. Our findings provide novel insight into the biological functions of GEN1 by uncovering an important role of GEN1 in the regulation of centrosome integrity....

  18. SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles

    KAUST Repository

    Wong, Kachun

    2014-09-05

    Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms by multiple transcription factors. Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, we found SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We have applied SignalSpider on the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also proposed a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level. Availability and implementation: The matrix-algebra-optimized executables and source codes are available at the authors' websites: http://www.cs.toronto.edu/~wkc/SignalSpider. Contact: Supplementary information: Supplementary data are available at Bioinformatics online.

  19. Environmental Information for the U.S. Next Generation Air Transportation System (NextGen)

    Science.gov (United States)

    Murray, J.; Miner, C.; Pace, D.; Minnis, P.; Mecikalski, J.; Feltz, W.; Johnson, D.; Iskendarian, H.; Haynes, J.

    2009-09-01

    It is estimated that weather is responsible for approximately 70% of all air traffic delays and cancellations in the United States. Annually, this produces an overall economic loss of nearly $40B. The FAA and NASA have determined that weather impacts and other environmental constraints on the U.S. National Airspace System (NAS) will increase to the point of system unsustainability unless the NAS is radically transformed. A Next Generation Air Transportation System (NextGen) is planned to accommodate the anticipated demand for increased system capacity and the super-density operations that this transformation will entail. The heart of the environmental information component that is being developed for NextGen will be a 4-dimensional data cube which will include a single authoritative source comprising probabilistic weather information for NextGen Air Traffic Management (ATM) systems. Aviation weather constraints and safety hazards typically comprise meso-scale, storm-scale and microscale observables that can significantly impact both terminal and en route aviation operations. With these operational impacts in mind, functional and performance requirements for the NextGen weather system were established which require significant improvements in observation and forecasting capabilities. This will include satellite observations from geostationary and/or polar-orbiting hyperspectral sounders, multi-spectral imagers, lightning mappers, space weather monitors and other environmental observing systems. It will also require improved in situ and remotely sensed observations from ground-based and airborne systems. These observations will be used to better understand and to develop forecasting applications for convective weather, in-flight icing, turbulence, ceilings and visibility, volcanic ash, space weather and the environmental impacts of aviation. Cutting-edge collaborative research efforts and results from NASA, NOAA and the FAA which address these phenomena are summarized.

  20. HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Maher Christopher A

    2010-07-01

    Background Protein-DNA interaction constitutes a basic mechanism for the genetic regulation of target gene expression. Deciphering this mechanism has been a daunting task due to the difficulty in characterizing protein-bound DNA on a large scale. A powerful technique has recently emerged that couples chromatin immunoprecipitation (ChIP) with next-generation sequencing (ChIP-Seq). This technique provides a direct survey of the cistrome of transcription factors and other chromatin-associated proteins. In order to realize the full potential of this technique, increasingly sophisticated statistical algorithms have been developed to analyze the massive amount of data generated by this method. Results Here we introduce HPeak, a Hidden Markov model (HMM)-based Peak-finding algorithm for analyzing ChIP-Seq data to identify protein-interacting genomic regions. In contrast to the majority of available ChIP-Seq analysis software packages, HPeak is a model-based approach allowing for rigorous statistical inference. This approach enables HPeak to accurately infer genomic regions enriched with sequence reads by assuming realistic probability distributions, in conjunction with a novel weighting scheme on the sequencing read coverage. Conclusions Using biologically relevant data collections, we found that HPeak showed a higher prevalence of the expected transcription factor binding motifs in ChIP-enriched sequences relative to the control sequences when compared to other currently available ChIP-Seq analysis approaches. Additionally, in comparison to the ChIP-chip assay, ChIP-Seq provides higher resolution along with improved sensitivity and specificity of binding site detection. Additional file and the HPeak program are freely available at http://www.sph.umich.edu/csg/qin/HPeak.
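    The general idea of HMM-based detection of read-enriched regions can be illustrated with a minimal two-state sketch. This is not HPeak's actual model: the abstract does not specify its emission distributions or weighting scheme, so the Poisson emissions, the rates `lam_bg` and `lam_enriched`, the self-transition probability `p_stay`, and all function names below are assumptions chosen purely for illustration. The sketch labels fixed-width bins of read counts as background or enriched with the Viterbi algorithm.

```python
import math


def poisson_logpmf(k, lam):
    # log P(K = k) for a Poisson distribution with mean lam
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, lam_bg=1.0, lam_enriched=8.0, p_stay=0.9):
    """Label each bin 0 (background) or 1 (enriched) with the Viterbi path.

    All parameter values are illustrative assumptions, not HPeak defaults.
    """
    if not counts:
        return []
    lams = (lam_bg, lam_enriched)
    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)
    # Uniform initial state distribution
    score = [math.log(0.5) + poisson_logpmf(counts[0], lam) for lam in lams]
    backptr = []
    for k in counts[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            # Best previous state for arriving in state s
            cand = [score[r] + (log_stay if r == s else log_switch) for r in (0, 1)]
            best = 0 if cand[0] >= cand[1] else 1
            ptr.append(best)
            new_score.append(cand[best] + poisson_logpmf(k, lams[s]))
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def enriched_regions(path):
    """Return half-open (start, end) bin intervals labelled enriched."""
    regions, start = [], None
    for i, s in enumerate(path):
        if s == 1 and start is None:
            start = i
        elif s == 0 and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(path)))
    return regions
```

    For example, calling `enriched_regions(viterbi_two_state(counts))` on a list of per-bin read counts returns the bin intervals that the sketch treats as candidate peaks. A real tool would estimate the parameters from the data rather than fix them by hand.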

  1. IMG 4 version of the integrated microbial genomes comparative analysis system

    Science.gov (United States)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu). PMID:24165883

  2. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N.; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  3. Genetics, human physical performance and gene doping: common sense versus scientific reality

    Directory of Open Access Journals (Sweden)

    Rodrigo Gonçalves Dias

    2011-02-01

    Elite athletes are recognized as sporting phenomena, and the potential to reach superior levels of performance in sport is partly under the control of genes. Athletic excellence is essentially multifactorial and determined by complex interactions between environmental and genetic factors. There are approximately 10 million genetic variants scattered throughout the human genome, and a portion of these variants has been shown to influence responsiveness to physical training. Human physical performance phenotypes appear to be highly polygenic, and some studies have demonstrated the existence of rare genotype combinations in athletes. However, the mechanisms by which genes interact to enhance physical performance are unknown. Knowledge of the genes that influence trainability, together with the potential misuse of advances in gene therapy, such as the possible introduction of genes into athletes' cells, gave rise to the term gene doping, a new and condemned method of enhancing physical performance beyond physiological limits. Increases in skeletal muscle hypertrophy and in hematocrit levels are being achieved by manipulating the expression of specific genes, but most of the impressive changes were obtained in experiments with laboratory animals. Understanding the scientific results involving genetics, human physical performance and gene doping is a difficult task. To avoid the continued misinterpretation and propagation of erroneous concepts, this review intentionally discusses the scientific evidence produced to date on the topic, allowing an understanding of the current "state of the art".

  4. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data

    Directory of Open Access Journals (Sweden)

    Gokmen Zararsiz

    2017-10-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom’s precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.

  5. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    Science.gov (United States)

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
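    How precision weights might enter a nearest-shrunken-centroids step can be sketched as follows. This is a simplified illustration, not the voomNSC implementation: it assumes expression values and per-observation weights are already available (for instance from a voom-like transformation), it omits the standardization by pooled within-class standard deviation used in the full NSC method, and every function and parameter name is hypothetical. Each class centroid is a weighted mean, and its offset from the overall weighted centroid is soft-thresholded so that uninformative genes drop out.

```python
def weighted_mean(values, weights):
    # Weighted average; weights play the role of precision weights
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def soft_threshold(d, delta):
    # Shrink d toward zero by delta; anything within [-delta, delta] becomes 0
    if d > delta:
        return d - delta
    if d < -delta:
        return d + delta
    return 0.0


def shrunken_centroids(X, W, y, delta):
    """Simplified weighted shrunken centroids (illustrative assumption).

    X[i][j]: expression of gene j in sample i
    W[i][j]: precision weight of that observation
    y[i]:    class label of sample i
    Returns (overall centroid, per-class shrunken centroids, selected genes).
    """
    n_genes = len(X[0])
    classes = sorted(set(y))
    overall = [
        weighted_mean([row[j] for row in X], [w[j] for w in W])
        for j in range(n_genes)
    ]
    centroids = {}
    for c in classes:
        idx = [i for i, label in enumerate(y) if label == c]
        cent = []
        for j in range(n_genes):
            class_mean = weighted_mean(
                [X[i][j] for i in idx], [W[i][j] for i in idx]
            )
            # Shrink the class offset; a zero offset means gene j is unused
            cent.append(overall[j] + soft_threshold(class_mean - overall[j], delta))
        centroids[c] = cent
    # A gene is selected if at least one class centroid differs from the overall one
    selected = [
        j for j in range(n_genes)
        if any(centroids[c][j] != overall[j] for c in classes)
    ]
    return overall, centroids, selected
```

    Larger values of `delta` shrink more offsets to zero and therefore select fewer genes, which is the sense in which such a classifier is sparse. In practice the threshold would be chosen by cross-validation.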

  6. Electroclinical presentation and genotype-phenotype relationships in patients with Unverricht-Lundborg disease carrying compound heterozygous CSTB point and indel mutations.

    Science.gov (United States)

    Canafoglia, Laura; Gennaro, Elena; Capovilla, Giuseppe; Gobbi, Giuseppe; Boni, Antonella; Beccaria, Francesca; Viri, Maurizio; Michelucci, Roberto; Agazzi, Pamela; Assereto, Stefania; Coviello, Domenico A; Di Stefano, Maria; Rossi Sebastiano, Davide; Franceschetti, Silvana; Zara, Federico

    2012-12-01

    Unverricht-Lundborg disease (EPM1A) is frequently due to an unstable expansion of a dodecamer repeat in the CSTB gene, whereas other types of mutations are rare. EPM1A due to homozygous expansion has a rather stereotyped presentation with prominent action myoclonus. We describe eight patients with five different compound heterozygous CSTB point or indel mutations in order to highlight their particular phenotypical presentations and evaluate their genotype-phenotype relationships. We screened CSTB mutations by means of Southern blotting and the sequencing of the genomic DNA of each proband. CSTB messenger RNA (mRNA) aberrations were characterized by sequencing the complementary DNA (cDNA) of lymphoblastoid cells, and assessing the protein concentrations in the lymphoblasts. The patient evaluations included the use of a simplified myoclonus severity rating scale, multiple neurophysiologic tests, and electroencephalography (EEG)-polygraphic recordings. To highlight the particular clinical features and disease time-course in compound heterozygous patients, we compared some of their characteristics with those observed in a series of 40 patients carrying the common homozygous expansion mutation observed at the C. Besta Foundation, Milan, Italy. The eight compound heterozygous patients belong to six EPM1A families (out of 52; 11.5%) diagnosed at the Laboratory of Genetics of the Galliera Hospitals in Genoa, Italy. They segregated five different heterozygous point or indel mutations in association with the common dodecamer expansion. 
Four patients from three families had previously reported CSTB mutations (c.67-1G>C and c.168+1_18del); one had a novel nonsense mutation at the first exon (c.133C>T) leading to a premature stop codon predicting a short peptide; the other three patients from two families had a complex novel indel mutation involving the donor splice site of intron 2 (c.168+2_169+21delinsAA) and leading to an aberrant transcript with a partially retained intron

  7. Genes and proteins of Escherichia coli (GenProtEc).

    Science.gov (United States)

    Riley, M; Space, D B

    1996-01-01

    GenProtEc is a database of Escherichia coli genes and their gene products, classified by type of function and physiological role and with citations to the literature for each. Also present are data on sequence similarities among E. coli proteins with PAM values, percent identity of amino acids, length of alignment and percent aligned. The database is available as a PKZip file by ftp from mbl.edu/pub/ecoli.exe. The program runs under MS-DOS on IBM-compatible machines. GenProtEc can also be accessed through the World Wide Web at URL http://mbl.edu/html/ecoli.html.

  8. Effect of genotype x environment interaction on the genetic improvement of common bean

    OpenAIRE

    Pereira, Thayse Cristine Vieira; Schmit, Rodolfo; Haveroth, Eduardo José; Melo, Rita Carolina de; Coimbra, Jefferson Luís Meirelles; Guidolin, Altamir Frederico; Backes, Rogério Luiz

    2015-01-01

    ABSTRACT: The objective was to evaluate the components of phenotypic variance and to estimate the influence of genotype*environment interaction on grain yield in common bean. The components of phenotypic variance were estimated by restricted maximum likelihood and best linear unbiased prediction (REML/BLUP), together with the specific inference space. Evaluations were carried out in the 2006/07 to 2011/12 growing seasons in the municipality of Lages/SC. During the period, 104 genotypes ...

  9. Sequence analysis of the glutathione peroxidase gene (GPx1) for detecting oxidative stress caused by Mycobacterium tuberculosis infection

    Directory of Open Access Journals (Sweden)

    Ari Yuniastuti

    2013-02-01

    Full Text Available Glutathione is an antioxidant that plays a role in immune function and is genetically expressed by the gene sequence encoding the enzyme glutathione peroxidase (GPx1). If gene expression changes, glutathione function and susceptibility to oxidative stress also change. The study used a case-control design with blood samples. The case group consisted of blood samples from pulmonary tuberculosis patients, and the control group consisted of blood samples from healthy individuals. The glutathione peroxidase gene (GPx1) was examined by polymerase chain reaction (PCR) to visualize DNA bands in pulmonary tuberculosis patients, together with electrophoresis of the PCR-RFLP products of the GPx1 gene from the tuberculosis sample group. The results showed no significant association between GPx1 gene polymorphism (p=0.365) in tuberculosis patients and healthy individuals, so the polymorphism cannot be used as a tool for detecting susceptibility to oxidative stress in tuberculosis patients. Further research using larger samples and different ethnic populations is needed.

  10. A strand-specific RNA-Seq analysis of the transcriptome of the typhoid bacillus Salmonella typhi.

    Directory of Open Access Journals (Sweden)

    Timothy T Perkins

    2009-07-01

    Full Text Available High-density, strand-specific cDNA sequencing (ssRNA-seq) was used to analyze the transcriptome of Salmonella enterica serovar Typhi (S. Typhi). By mapping sequence data to the entire S. Typhi genome, we analyzed the transcriptome in a strand-specific manner and further defined transcribed regions encoded within prophages, pseudogenes, previously un-annotated, and 3'- or 5'-untranslated regions (UTR). An additional 40 novel candidate non-coding RNAs were identified beyond those previously annotated. Proteomic analysis was combined with transcriptome data to confirm and refine the annotation of a number of hypothetical genes. ssRNA-seq was also combined with microarray and proteome analysis to further define the S. Typhi OmpR regulon and identify novel OmpR regulated transcripts. Thus, ssRNA-seq provides a novel and powerful approach to the characterization of the bacterial transcriptome.

  11. Polymorphisms of the BoLA-DRB3.2* gene in Colombian Creole cattle

    Directory of Open Access Journals (Sweden)

    Darwin Hernández H.

    2013-10-01

    Full Text Available Objective. To characterize the polymorphism of the BoLA-DRB3.2* gene in Creole and Colombian cattle breeds. Materials and methods. In 360 DNA samples from eight Creole cattle breeds (Blanco Orejinegro, Casanareño, Costeño con Cuernos, Chino Santandereano, Caqueteño, Hartón del Valle, Romosinuano and San Martinero), two Colombian synthetic breeds (Lucerna and Velásquez) and two foreign breeds (Brahman and Holstein), the polymorphism of the BoLA-DRB3.2 gene was evaluated by molecular techniques (PCR-RFLP); the mean number of alleles (NPA), allele frequencies, expected (He) and observed (Ho) heterozygosity, Hardy-Weinberg equilibrium, genetic structure, and FST and FIS values were calculated. Results. The NPA was 14.6 ± 3.8, with Caqueteño being the breed with the highest NPA (25) and Chino Santandereano the lowest (10). Forty-one BoLA-DRB3.2* alleles were found; the most frequent were *28, *37, *24, *23, *20, *27, *8, *16, *39 (0.17, 0.11, 0.10, 0.09, 0.09, 0.07, 0.07 and 0.06, respectively). High genetic diversity was found (He = 0.878), with the highest value in Caqueteño (0.96) and the lowest in San Martinero (0.81). All breeds were in Hardy-Weinberg equilibrium, and highly significant values were found for genetic differentiation (FST = 0.044) and for the inbreeding coefficient (FIS = 0.249). Conclusions. Colombian Creole cattle have high polymorphism of the BoLA-DRB3.2* gene, reflected in the high values of NPA and genetic diversity.
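
    The diversity statistics named in this abstract can be illustrated with their textbook definitions: expected heterozygosity He = 1 - sum(p_i^2) over allele frequencies p_i, and the inbreeding coefficient FIS = 1 - Ho/He. The following is a minimal sketch under those definitions; the function names and the four-allele toy locus are hypothetical, and the abstract does not say which estimators or software the authors actually used.

```python
def expected_heterozygosity(freqs):
    """Return He = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def inbreeding_coefficient(ho, he):
    """Return FIS = 1 - Ho/He (positive values mean a heterozygote deficit)."""
    if he <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - ho / he


# Hypothetical locus with four alleles; these are NOT data from the study.
toy_freqs = [0.4, 0.3, 0.2, 0.1]
he = expected_heterozygosity(toy_freqs)   # 1 - (0.16 + 0.09 + 0.04 + 0.01), about 0.70
fis = inbreeding_coefficient(0.56, he)    # 1 - 0.56 / 0.70, about 0.20
print(round(he, 3), round(fis, 3))
```

    A locus with many alleles at similar frequencies pushes He toward 1, which is consistent with the high He values the abstract reports for a gene with 41 alleles.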

  12. Genetic characterization of rhizobia native to the coastal tablelands that are efficient in pigeon pea and cowpea crops

    Directory of Open Access Journals (Sweden)

    Fernandes Marcelo Ferreira

    2003-01-01

    Full Text Available The objective of this work was to genetically characterize seven rhizobia strains native to the coastal tablelands of Sergipe with high efficiency of biological N2 fixation in association with pigeon pea (Cajanus cajan) and cowpea (Vigna unguiculata). DNA amplification by PCR (polymerase chain reaction) with the specific BOX oligonucleotide indicated a high degree of genetic diversity, since all strains showed unique DNA profiles. The BOX-PCR analysis also revealed that this methodology is efficient for differentiating strains, but not for differentiating rhizobia species. Using RFLP (restriction fragment length polymorphism) of the DNA region coding for the 16S rRNA gene and of the intergenic region between the 16S and 23S rRNA genes, with five restriction enzymes, as well as partial sequencing of the 16S rRNA region, it was possible to classify the strains into the genera Bradyrhizobium and Rhizobium. The analyses involving the 16S rRNA region were consistent with one another, but the clustering of one of the strains differed in the analysis of the intergenic spacer. The results obtained with strain R11 indicate high genetic variability relative to the described rhizobia species, including differences at several bases of the 16S rRNA region, and may indicate a new species.

  13. Genetic variability of rubber tree annual yielding: estimates of genetic parameters and study of genotype x environment interaction

    Directory of Open Access Journals (Sweden)

    Paulo de Souza Gonçalves

    1990-01-01

    Full Text Available Nineteen rubber tree (Hevea brasiliensis Muell. Arg.) genotypes, considered the best in vigor and yield, were selected from a mature seedling population established in the experimental field of the Pindorama Experiment Station of the "Instituto Agronômico de Campinas", São Paulo State, Brazil, with the aim of studying genetic and environmental variability and the genotype x environment interaction for yield over five years. Based on the annual and joint analyses of variance, genetic parameters for yield were estimated in an attempt to quantify the genetic gain from selection and the genetic and phenotypic correlations of yields from year to year. The results of the within-year analyses of variance showed significant effects for genotypes, and the effects of the genotype x environment interaction were highly significant. Broad-sense heritability estimates at the plot-mean level were high, ranging from 0.57 to 0.77 for the second and fifth year of production, respectively. The highest percentages of genetic gain were obtained in the first and fifth year of production, 39.03 and 27.57, respectively. Genetic and phenotypic correlations between tapping years were high and significant. The high heritability and genetic gain values for the first tapping year indicate that mass selection conducted at this stage effectively provides a greater selection gain.

  14. Genetic evaluation of sires using production from complete or projected partial lactations: 3. Reliability and genetic gains

    Directory of Open Access Journals (Sweden)

    Melo Cláudio Manoel Rodrigues de

    2000-01-01

    Full Text Available To study the feasibility of using projected partial-lactation yields in evaluating the genetic merit of sires, 4595 lactations of 2254 cows, daughters of 145 sires and 1618 dams, distributed across 18 herds, with calvings between 1980 and 1997, were used. From 91, 151, 211 or 241 days of lactation, 10, 30, 50 or 70% of the lactations were projected to the observed lactation length and to 305 days. Estimates of genetic parameters were obtained with the MTDFREML system. Regardless of the trait, the model included fixed effects of herd-year, calving season and age of cow at calving, with linear and quadratic terms, and random effects of animal, permanent environment and error. The mean reliability obtained from the estimated yields (PE) ranged from 0.60 to 0.67, with that for P305 equal to 0.60. The annual genetic gain from selecting sires using PE was, on average, 24.27% greater than the annual genetic gain from P305 when lactations were projected to the observed lactation length, and 25.65% greater when lactations were projected to P305. The reliabilities obtained, as well as the annual genetic gains estimated in the genetic evaluations using PE, were similar to those obtained for milk yield up to 305 days.

  15. Prevalence of Gram-negative bacteria carrying the blaKPC gene in Colombian hospitals

    Directory of Open Access Journals (Sweden)

    Robinson Pacheco

    2014-04-01

    Full Text Available Introduction. KPC-type carbapenemase enzymes have a great capacity for dissemination, cause epidemics, and are associated with higher mortality and longer hospital stays. In Colombia they have been reported increasingly since 2007, but their hospital prevalence is unknown. Objective. To estimate the hospital prevalence of the blaKPC gene. Materials and methods. The presence of the blaKPC gene and its 'clonality' were evaluated in isolates of Enterobacteriaceae and Pseudomonas aeruginosa from hospitalized patients. Results. Of the 424 isolates evaluated during the study period, 273 met the eligibility criteria; 31.1% were positive for the blaKPC gene and, after adjusting for 'clonality', positivity was 12.8%. The blaKPC gene was found most frequently in Klebsiella pneumoniae, followed by P. aeruginosa and other Enterobacteriaceae. Although the intensive care unit contributed the largest number of isolates, the blaKPC gene was not found to be more prevalent there than in the other wards. The respiratory tract was the anatomical site of origin with the highest prevalence. There was no seasonality in the frequency of isolates carrying the blaKPC gene. Conclusion. This study revealed the high prevalence of the blaKPC gene in different microorganisms isolated in several hospitals in the country. The extraordinary capacity of the blaKPC gene to spread, the difficulties of diagnosis and the limited availability of antibiotics create a pressing need to strengthen epidemiological surveillance systems and to adjust institutional policies on the rational use of antibiotics in a timely manner, in order to contain its spread to other health institutions in the country.

  16. Illuminating choices for library prep: a comparison of library preparation methods for whole genome sequencing of Cryptococcus neoformans using Illumina HiSeq.

    Directory of Open Access Journals (Sweden)

    Johanna Rhodes

    Full Text Available The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungi Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits: the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs) to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.

  17. Illuminating choices for library prep: a comparison of library preparation methods for whole genome sequencing of Cryptococcus neoformans using Illumina HiSeq.

    Science.gov (United States)

    Rhodes, Johanna; Beale, Mathew A; Fisher, Matthew C

    2014-01-01

    The industry of next-generation sequencing is constantly evolving, with novel library preparation methods and new sequencing machines being released by the major sequencing technology companies annually. The Illumina TruSeq v2 library preparation method was the most widely used kit and the market leader; however, it has now been discontinued, and in 2013 was replaced by the TruSeq Nano and TruSeq PCR-free methods, leaving a gap in knowledge regarding which is the most appropriate library preparation method to use. Here, we used isolates from the pathogenic fungi Cryptococcus neoformans var. grubii and sequenced them using the existing TruSeq DNA v2 kit (Illumina), along with two new kits: the TruSeq Nano DNA kit (Illumina) and the NEBNext Ultra DNA kit (New England Biolabs) to provide a comparison. Compared to the original TruSeq DNA v2 kit, both newer kits gave equivalent or better sequencing data, with increased coverage. When comparing the two newer kits, we found little difference in cost and workflow, with the NEBNext Ultra both slightly cheaper and faster than the TruSeq Nano. However, the quality of data generated using the TruSeq Nano DNA kit was superior due to higher coverage at regions of low GC content, and more SNPs identified. Researchers should therefore evaluate their resources and the type of application (and hence data quality) being considered when ultimately deciding on which library prep method to use.
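
    The comparison above turns on coverage in regions of low GC content. As a rough illustration of what locating such regions involves, the sketch below computes the GC fraction of a sequence and lists fixed-size windows that fall under a cutoff. The function names, the window size and the 0.3 cutoff are assumptions made for the example; the abstract does not describe how the authors defined low-GC regions.

```python
def gc_content(seq):
    """Return the fraction of G or C bases in seq (case-insensitive); 0.0 if empty."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def low_gc_windows(seq, window, threshold):
    """Return start offsets of non-overlapping windows with GC fraction below threshold."""
    starts = []
    # A trailing partial window shorter than `window` is ignored.
    for start in range(0, len(seq) - window + 1, window):
        if gc_content(seq[start:start + window]) < threshold:
            starts.append(start)
    return starts


# Toy 30-base sequence: an AT-only window, a GC-only window, and a mixed window.
toy = "ATATATATAT" + "GCGCGCGCGC" + "ATTTAGCAAT"
print(low_gc_windows(toy, window=10, threshold=0.3))  # [0, 20]
```

    In practice the window offsets would be intersected with per-base read depth to compare kits, which is beyond this sketch.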

  18. Contributions of Sir Ronald Fisher to Statistical Genetics

    Directory of Open Access Journals (Sweden)

    Jaime Cuadros

    2004-11-01

    Full Text Available Sir Ronald Fisher (1890-1962) was a professor of genetics, and many of his statistical innovations found expression in the development of methodology in statistical genetics. However, while his contributions to mathematical statistics are easily identified, in population genetics he shared his preeminence with Sewall Wright (1889-1988) and J. B. S. Haldane (1892-1964). This paper presents some of Fisher's best contributions to the foundations of statistical genetics, and his interactions with Wright and Haldane, which contributed to the development of the subject. With modern technology, both statistical methodology and genetic information are changing. Nevertheless, many of Fisher's works remain relevant and can still serve as a basis for future research in the statistical analysis of DNA data. This author's work reflects his view of the role of statistics in scientific inference, expressed in 1949

  19. HyQue: evaluating hypotheses using Semantic Web technologies

    Directory of Open Access Journals (Sweden)

    Callahan Alison

    2011-05-01

    Full Text Available Abstract Background Key to the success of e-Science is the ability to computationally evaluate expert-composed hypotheses for validity against experimental data. Researchers face the challenge of collecting, evaluating and integrating large amounts of diverse information to compose and evaluate a hypothesis. Confronted with rapidly accumulating data, researchers currently do not have the software tools to undertake the required information integration tasks. Results We present HyQue, a Semantic Web tool for querying scientific knowledge bases with the purpose of evaluating user submitted hypotheses. HyQue features a knowledge model to accommodate diverse hypotheses structured as events and represented using Semantic Web languages (RDF/OWL). Hypothesis validity is evaluated against experimental and literature-sourced evidence through a combination of SPARQL queries and evaluation rules. Inference over OWL ontologies (for type specifications, subclass assertions and parthood relations) and retrieval of facts stored as Bio2RDF linked data provide support for a given hypothesis. We evaluate hypotheses of varying levels of detail about the genetic network controlling galactose metabolism in Saccharomyces cerevisiae to demonstrate the feasibility of deploying such semantic computing tools over a growing body of structured knowledge in Bio2RDF. Conclusions HyQue is a query-based hypothesis evaluation system that can currently evaluate hypotheses about the galactose metabolism in S. cerevisiae. Hypotheses as well as the supporting or refuting data are represented in RDF and directly linked to one another allowing scientists to browse from data to hypothesis and vice versa. HyQue hypotheses and data are available at http://semanticscience.org/projects/hyque.

  20. HyQue: evaluating hypotheses using Semantic Web technologies

    Science.gov (United States)

    2011-01-01

    Background Key to the success of e-Science is the ability to computationally evaluate expert-composed hypotheses for validity against experimental data. Researchers face the challenge of collecting, evaluating and integrating large amounts of diverse information to compose and evaluate a hypothesis. Confronted with rapidly accumulating data, researchers currently do not have the software tools to undertake the required information integration tasks. Results We present HyQue, a Semantic Web tool for querying scientific knowledge bases with the purpose of evaluating user submitted hypotheses. HyQue features a knowledge model to accommodate diverse hypotheses structured as events and represented using Semantic Web languages (RDF/OWL). Hypothesis validity is evaluated against experimental and literature-sourced evidence through a combination of SPARQL queries and evaluation rules. Inference over OWL ontologies (for type specifications, subclass assertions and parthood relations) and retrieval of facts stored as Bio2RDF linked data provide support for a given hypothesis. We evaluate hypotheses of varying levels of detail about the genetic network controlling galactose metabolism in Saccharomyces cerevisiae to demonstrate the feasibility of deploying such semantic computing tools over a growing body of structured knowledge in Bio2RDF. Conclusions HyQue is a query-based hypothesis evaluation system that can currently evaluate hypotheses about the galactose metabolism in S. cerevisiae. Hypotheses as well as the supporting or refuting data are represented in RDF and directly linked to one another allowing scientists to browse from data to hypothesis and vice versa. HyQue hypotheses and data are available at http://semanticscience.org/projects/hyque. PMID:21624158

  1. HyQue: evaluating hypotheses using Semantic Web technologies.

    Science.gov (United States)

    Callahan, Alison; Dumontier, Michel; Shah, Nigam H

    2011-05-17

    Key to the success of e-Science is the ability to computationally evaluate expert-composed hypotheses for validity against experimental data. Researchers face the challenge of collecting, evaluating and integrating large amounts of diverse information to compose and evaluate a hypothesis. Confronted with rapidly accumulating data, researchers currently do not have the software tools to undertake the required information integration tasks. We present HyQue, a Semantic Web tool for querying scientific knowledge bases with the purpose of evaluating user submitted hypotheses. HyQue features a knowledge model to accommodate diverse hypotheses structured as events and represented using Semantic Web languages (RDF/OWL). Hypothesis validity is evaluated against experimental and literature-sourced evidence through a combination of SPARQL queries and evaluation rules. Inference over OWL ontologies (for type specifications, subclass assertions and parthood relations) and retrieval of facts stored as Bio2RDF linked data provide support for a given hypothesis. We evaluate hypotheses of varying levels of detail about the genetic network controlling galactose metabolism in Saccharomyces cerevisiae to demonstrate the feasibility of deploying such semantic computing tools over a growing body of structured knowledge in Bio2RDF. HyQue is a query-based hypothesis evaluation system that can currently evaluate hypotheses about the galactose metabolism in S. cerevisiae. Hypotheses as well as the supporting or refuting data are represented in RDF and directly linked to one another allowing scientists to browse from data to hypothesis and vice versa. HyQue hypotheses and data are available at http://semanticscience.org/projects/hyque.

  2. Identification of genomic indels and structural variations using split reads

    Directory of Open Access Journals (Sweden)

    Urban Alexander E

    2011-07-01

    Full Text Available Abstract Background Recent studies have demonstrated the genetic significance of insertions, deletions, and other more complex structural variants (SVs) in the human population. With the development of the next-generation sequencing technologies, high-throughput surveys of SVs on the whole-genome level have become possible. Here we present split-read identification, calibrated (SRiC), a sequence-based method for SV detection. Results We start by mapping each read to the reference genome in standard fashion using gapped alignment. Then to identify SVs, we score each of the many initial mappings with an assessment strategy designed to take into account both sequencing and alignment errors (e.g. scoring more highly events gapped in the center of a read). All current SV calling methods have multilevel biases in their identifications due to both experimental and computational limitations (e.g. calling more deletions than insertions). A key aspect of our approach is that we calibrate all our calls against synthetic data sets generated from simulations of high-throughput sequencing (with realistic error models). This allows us to calculate sensitivity and the positive predictive value under different parameter-value scenarios and for different classes of events (e.g. long deletions vs. short insertions). We run our calculations on representative data from the 1000 Genomes Project. Coupling the observed numbers of events on chromosome 1 with the calibrations gleaned from the simulations (for different length events) allows us to construct a relatively unbiased estimate for the total number of SVs in the human genome across a wide range of length scales. We estimate in particular that an individual genome contains ~670,000 indels/SVs. Conclusions Compared with the existing read-depth and read-pair approaches for SV identification, our method can pinpoint the exact breakpoints of SV events, reveal the actual sequence content of insertions, and cover the whole
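
    The calibration idea in this abstract, measuring sensitivity and positive predictive value (PPV) on simulated data and then using them to correct observed call counts, can be sketched with the standard definitions sensitivity = TP/(TP+FN) and PPV = TP/(TP+FP). The correction used here (observed calls x PPV / sensitivity) is an assumption chosen for illustration, as are the function names and the tallies; the abstract does not give the authors' exact estimator.

```python
def sensitivity(tp, fn):
    """Fraction of simulated (true) events that were called: TP / (TP + FN)."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Fraction of calls that correspond to true events: TP / (TP + FP)."""
    return tp / (tp + fp)


def calibrated_count(observed_calls, ppv, sens):
    """Estimate true events: drop expected false calls (x PPV), add back misses (/ sensitivity)."""
    if sens <= 0:
        raise ValueError("sensitivity must be positive")
    return observed_calls * ppv / sens


# Hypothetical tallies from one simulated event class (not values from the paper).
tp, fp, fn = 80, 20, 120
sens = sensitivity(tp, fn)                 # 80 / 200 = 0.4
ppv = positive_predictive_value(tp, fp)    # 80 / 100 = 0.8
estimate = calibrated_count(1000, ppv, sens)
print(sens, ppv, round(estimate))          # roughly 2000 true events behind 1000 calls
```

    Because sensitivity and PPV differ by event class (for example long deletions versus short insertions), a correction of this kind would be applied per class before summing.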

  3. CASSys: an integrated software-system for the interactive analysis of ChIP-seq data

    Directory of Open Access Journals (Sweden)

    Alawi Malik

    2011-06-01

    Full Text Available The mapping of DNA-protein interactions is crucial for a full understanding of transcriptional regulation. Chromatin-immunoprecipitation followed by massively parallel sequencing (ChIP-seq) has become the standard technique for analyzing these interactions on a genome-wide scale. We have developed a software system called CASSys (ChIP-seq data Analysis Software System) spanning all steps of ChIP-seq data analysis. It supersedes the laborious application of several single command line tools. CASSys provides functionality ranging from quality assessment and quality control of short reads, through the mapping of reads against a reference genome (read mapping) and the detection of enriched regions (peak detection), to various follow-up analyses. The latter are accessible via a state-of-the-art web interface and can be performed interactively by the user. The follow-up analyses allow for flexible user defined association of putative interaction sites with genes, visualization of their genomic context with an integrated genome browser, the detection of putative binding motifs, the identification of over-represented Gene Ontology-terms, pathway analysis and the visualization of interaction networks. The system is client-server based, accessible via a web browser and does not require any software installation on the client side. To demonstrate CASSys’s functionality we used the system for the complete data analysis of a publicly available ChIP-seq study that investigated the role of the transcription factor estrogen receptor-α in breast cancer cells.

  4. Characterization and estimation of genetic variability of onion genotypes

    Directory of Open Access Journals (Sweden)

    Gerson Henrique Wamser

    2012-06-01

    Full Text Available This study aimed to characterize onion genotypes grown in Santa Catarina state, Brazil, and to estimate the genetic variability among them. Fifteen onion genotypes were evaluated in two environments, Ituporanga and Lages. The experimental design was randomized blocks with three replications in each environment. The traits evaluated were pseudostem length, number of leaves per pseudostem, pseudostem diameter, bulb diameter, bulb height, bulb weight, bulb height:diameter ratio, total bulb yield, bulb shape, flowering percentage and percentage of rotten bulbs. The data were subjected to multivariate analysis of variance. There was a significant genotype x environment interaction, which caused differences in the dissimilarity values at each location. A dissimilarity matrix was built using the Mahalanobis distance. The morphological and agronomic traits used were sufficient to characterize the genotypes, indicating that breeding programs have a broad genetic base for developing new cultivars.
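
    The dissimilarity matrix mentioned in this abstract is based on the Mahalanobis distance, d(x, y) = sqrt((x - y)^T S^-1 (x - y)), where S is a covariance matrix of the traits. The sketch below restricts itself to two traits so the 2x2 inverse can be written out by hand; the genotype labels, trait means and covariance values are invented for illustration, and the study itself used eleven traits with a covariance structure not given in the abstract.

```python
import math


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between two 2-trait vectors for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = x[0] - y[0], x[1] - y[1]
    d2 = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(d2)


def dissimilarity_matrix(means, cov):
    """Pairwise Mahalanobis distances between genotype mean vectors, keyed by label pair."""
    labels = sorted(means)
    return {(g, h): mahalanobis_2d(means[g], means[h], cov) for g in labels for h in labels}


# Hypothetical genotype means for (bulb diameter, bulb height), with an assumed covariance.
toy_means = {"G1": (6.0, 5.0), "G2": (5.0, 4.0), "G3": (6.0, 4.0)}
toy_cov = ((2.0, 1.0), (1.0, 2.0))
matrix = dissimilarity_matrix(toy_means, toy_cov)
print(round(matrix[("G1", "G2")], 4))
```

    Unlike plain Euclidean distance, this measure discounts differences along directions in which the traits are correlated, which is why it is a common choice for correlated morphological traits.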

  5. Comparative RNA-Seq and microarray analysis of gene expression changes in B-cell lymphomas of Canis familiaris.

    Directory of Open Access Journals (Sweden)

    Marie Mooney

    Full Text Available Comparative oncology is a developing research discipline that is being used to assist our understanding of human neoplastic diseases. Companion canines are a preferred animal oncology model due to spontaneous tumor development and similarity to human disease at the pathophysiological level. We use a paired RNA sequencing (RNA-Seq)/microarray analysis of a set of four normal canine lymph nodes and ten canine lymphoma fine needle aspirates to identify technical biases and variation between the technologies and convergence on biological disease pathways. Surrogate Variable Analysis (SVA) provides a formal multivariate analysis of the combined RNA-Seq/microarray data set. Applying SVA to the data allows us to decompose variation into contributions associated with transcript abundance, differences between the technology, and latent variation within each technology. A substantial and highly statistically significant component of the variation reflects transcript abundance, and RNA-Seq appeared more sensitive for detection of transcripts expressed at low levels. Latent random variation among RNA-Seq samples is also distinct in character from that impacting microarray samples. In particular, we observed variation between RNA-Seq samples that reflects transcript GC content. Platform-independent variable decomposition without a priori knowledge of the sources of variation using SVA represents a generalizable method for accomplishing cross-platform data analysis. We identified genes differentially expressed between normal lymph nodes of disease free dogs and a subset of the diseased dogs diagnosed with B-cell lymphoma using each technology. There is statistically significant overlap between the RNA-Seq and microarray sets of differentially expressed genes. Analysis of overlapping genes in the context of biological systems suggests elevated expression and activity of PI3K signaling in B-cell lymphoma biopsies compared with normal biopsies, consistent with
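
    One common way to judge whether two lists of differentially expressed genes overlap more than chance would allow is a one-sided hypergeometric test. The sketch below shows that calculation with the Python standard library only. It is an illustration of the general technique, not the authors' method: the abstract does not state which test was used, and the gene counts in the example are invented.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_b genes are drawn at random from `universe` genes,
    of which size_a belong to the first list (one-sided hypergeometric tail)."""
    if not (0 <= overlap <= min(size_a, size_b) <= universe):
        raise ValueError("inconsistent set sizes")
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, min(size_a, size_b) + 1)
    )
    return tail / total


# Hypothetical counts: 10,000 genes measured on both platforms, 400 called by one,
# 300 by the other, 60 shared. Chance alone would give about 400 * 300 / 10000 = 12.
p = overlap_p_value(10_000, 400, 300, 60)
print(p < 0.001)
```

    The size of the gene universe matters here: it should contain only genes measurable on both platforms, otherwise the test overstates the significance of the overlap.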

  6. Nascent-Seq reveals novel features of mouse circadian transcriptional regulation

    Science.gov (United States)

    Menet, Jerome S; Rodriguez, Joseph; Abruzzi, Katharine C; Rosbash, Michael

    2012-01-01

    A substantial fraction of the metazoan transcriptome undergoes circadian oscillations in many cells and tissues. Based on the transcription feedback loops important for circadian timekeeping, it is commonly assumed that this mRNA cycling reflects widespread transcriptional regulation. To address this issue, we directly measured the circadian dynamics of mouse liver transcription using Nascent-Seq (genome-wide sequencing of nascent RNA). Although many genes are rhythmically transcribed, many rhythmic mRNAs manifest poor transcriptional rhythms, indicating a prominent contribution of post-transcriptional regulation to circadian mRNA expression. This analysis of rhythmic transcription also showed that the rhythmic DNA binding profile of the transcription factors CLOCK and BMAL1 does not determine the transcriptional phase of most target genes. This likely reflects gene-specific collaborations of CLK:BMAL1 with other transcription factors. These insights from Nascent-Seq indicate that it should have broad applicability to many other gene expression regulatory issues. DOI: http://dx.doi.org/10.7554/eLife.00011.001 PMID:23150795

  7. GeoBoost: accelerating research involving the geospatial metadata of virus GenBank records.

    Science.gov (United States)

    Tahsin, Tasnia; Weissenbacher, Davy; O'Connor, Karen; Magge, Arjun; Scotch, Matthew; Gonzalez-Hernandez, Graciela

    2018-05-01

    GeoBoost is a command-line software package developed to address sparse or incomplete metadata in GenBank sequence records that relate to the location of the infected host (LOIH) of viruses. Given a set of GenBank accession numbers corresponding to virus GenBank records, GeoBoost extracts, integrates and normalizes geographic information reflecting the LOIH of the viruses using integrated information from GenBank metadata and related full-text publications. In addition, to facilitate probabilistic geospatial modeling, GeoBoost assigns probability scores for each possible LOIH. Binaries and resources required for running GeoBoost are packed into a single zipped file and freely available for download at https://tinyurl.com/geoboost. A video tutorial is included to help users quickly and easily install and run the software. The software is implemented in Java 1.8, and supported on MS Windows and Linux platforms. Contact: gragon@upenn.edu. Supplementary data are available at Bioinformatics online.

  8. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.7258 >orf19.7258; Contig19-2507; 88880..89851; DDI1*; response to DNA alkyl...ation; MQLTISLDHSGDIISVDVPDSLCLEDFKAYLSAETGLEASVQVLKFNGRELVGNATLSELQIHDNDLLQLSKKQVA

  9. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.1278 >orf19.1278; Contig19-10104; complement(13162...4..>132028); ; conserved hypothetical protein; truncated protein IQNNKCSGCNLKLDFPVIHFKCKHSFHQKCLSTNLIATSTESS

  10. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.4711 >orf19.4711; Contig19-10212; complement(29836...7..>300616); ; acidic repetitive protein; truncated protein DRSDYNEEDNNDFTRKLNEIQSKESNHEDLAQSEVQEGQKDEPDSVNQ

  11. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.124 >orf19.124; Contig19-10035; 67601..68698; CIC1*; protease substrate recruitment factor; MAKTRSKSAATAAATSPKASPTAAKVTKNKVTKPSTASPSKTTKTKAVKKTTTKKATPKKEEEEKK...

  12. GUIDEseq: a bioconductor package to analyze GUIDE-Seq datasets for CRISPR-Cas nucleases.

    Science.gov (United States)

    Zhu, Lihua Julie; Lawrence, Michael; Gupta, Ankit; Pagès, Hervé; Kucukural, Alper; Garber, Manuel; Wolfe, Scot A

    2017-05-15

    Genome editing technologies developed around the CRISPR-Cas9 nuclease system have facilitated the investigation of a broad range of biological questions. These nucleases also hold tremendous promise for treating a variety of genetic disorders. In the context of their therapeutic application, it is important to identify the spectrum of genomic sequences that are cleaved by a candidate nuclease when programmed with a particular guide RNA, as well as the cleavage efficiency of these sites. Powerful new experimental approaches, such as GUIDE-seq, facilitate the sensitive, unbiased genome-wide detection of nuclease cleavage sites within the genome. Flexible bioinformatics analysis tools for processing GUIDE-seq data are needed. Here, we describe an open source, open development software suite, GUIDEseq, for GUIDE-seq data analysis and annotation as a Bioconductor package in R. The GUIDEseq package provides a flexible platform with more than 60 adjustable parameters for the analysis of datasets associated with custom nuclease applications. These parameters allow data analysis to be tailored to different nuclease platforms with different length and complexity in their guide and PAM recognition sequences or their DNA cleavage position. They also enable users to customize sequence aggregation criteria, and vary peak calling thresholds that can influence the number of potential off-target sites recovered. GUIDEseq also annotates potential off-target sites that overlap with genes based on genome annotation information, as these may be the most important off-target sites for further characterization. In addition, GUIDEseq enables the comparison and visualization of off-target site overlap between different datasets for a rapid comparison of different nuclease configurations or experimental conditions. For each identified off-target, the GUIDEseq package outputs mapped GUIDE-Seq read count as well as cleavage score from a user specified off-target cleavage score prediction

  13. The Development of the Genital Psoriasis Sexual Frequency Questionnaire (GenPs-SFQ) to Assess the Impact of Genital Psoriasis on Sexual Health.

    Science.gov (United States)

    Gottlieb, Alice B; Kirby, Brian; Ryan, Caitriona; Naegeli, April N; Burge, Russel; Potts Bleakman, Alison; Anatchkova, Milena D; Cather, Jennifer

    2018-03-01

    Patient-reported outcome measures (PROs) exist for psoriasis but not genital psoriasis (GenPs). This cross-sectional, qualitative study in patients with moderate-to-severe GenPs was conducted to support development of a PRO for measuring the impact of GenPs on sexual activity and to establish content validity. The impacts of GenPs were identified in a literature review. Findings from the literature review were discussed with clinicians, and then patients with GenPs were interviewed. From the literature review, 52 articles, 44 abstracts, and 41 clinical trials met predefined search criteria. Of these, 11 concepts emerged as having theoretical support for use as measurable impacts of psoriasis symptoms on patients; these concepts included sexual functioning and general health-related quality of life (HRQoL). These concepts were confirmed and expanded upon by two clinicians who routinely care for patients with GenPs. Interviews were then conducted with GenPs patients (n = 20) to discuss the impact of GenPs on their HRQoL. Eighty percent of patients reported that GenPs impacted sexual frequency. The two-item GenPs Sexual Frequency Questionnaire (GenPs-SFQ) was developed to assess limitations on sexual activity frequency because of GenPs. Cognitive debriefing with an additional 50 patients with GenPs confirmed the utility and understandability of the GenPs-SFQ. The GenPs-SFQ may have utility in clinical trials involving GenPs patients and in routine clinical practice. Eli Lilly and Company. Plain language summary available for this article.

  14. XplorSeq: a software environment for integrated management and phylogenetic analysis of metagenomic sequence data.

    Science.gov (United States)

    Frank, Daniel N

    2008-10-07

    Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, a need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment Search Tool) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to almost any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.

  15. SeqAnt: A web service to rapidly identify and annotate DNA sequence variations

    Directory of Open Access Journals (Sweden)

    Patel Viren

    2010-09-01

    Full Text Available Background: The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. Results: SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to annotate these data completely ranged from 0.17 seconds to 28 minutes 49.8 seconds. Conclusion: SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.

  16. Discussion: Genetic and psychological explanations of schizophrenia. Genetic bases of schizophrenia: "Nurture vs. Nature"

    Directory of Open Access Journals (Sweden)

    Henriette Raventós-Vorst

    2003-01-01

    Full Text Available This article reviews the scientific evidence for the heritability of schizophrenia, its complex mode of inheritance and its possible genetic and environmental heterogeneity. The chromosomal regions that have been linked to the disease and some of the candidate genes are presented. The aim is to present the most important results in genetic research on the disease. Although it is accepted that environmental factors must be present in the etiopathogenesis of the disease, they are not examined in depth. Finally, the Lamarckian model suggested by Prof. Bolaños is discussed. The aim is to convey that there is currently no contradiction between the biological and psychological models that used to explain this disease. The modern conception unites both models: schizophrenia is considered a neurodevelopmental disease involving genetic factors, epigenetic factors and environmental insults, including psychosocial factors.

  17. SALOME PLATFORM and TetGen for Polyhedral Mesh Generation

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sang Yong; Park, Chan Eok; Kim, Shin Whan [KEPCO E and C Company, Inc., Daejeon (Korea, Republic of)

    2014-05-15

    SPACE and CUPID use unstructured meshes and therefore require a reliable mesh generation system. A combination of a CAD system and a mesh generation system is necessary to cope with a large number of cells and with complex fluid systems that contain structural materials. In the past, the CAD system Pro/Engineer and the mesh generator Pointwise were evaluated for this application, but the cost of these commercial tools is sometimes a great burden. Efforts have therefore been made to set up a mesh generation system with open source programs. TetGen has been evaluated with a focus on its application to polyhedral mesh generation. In this paper, SALOME is evaluated in conjunction with TetGen. Section 2 reviews the CAD and mesh generation capabilities of SALOME. The SALOME and TetGen codes are being integrated to construct a robust polyhedral mesh generator. Edge removal on flat surfaces and vertex reattachment to the solid are two challenging tasks. The Python scripting capability of SALOME should be fully utilized in future investigations.

  18. Nebula--a web-server for advanced ChIP-seq data analysis.

    Science.gov (United States)

    Boeva, Valentina; Lermine, Alban; Barette, Camille; Guillouf, Christel; Barillot, Emmanuel

    2012-10-01

    ChIP-seq consists of chromatin immunoprecipitation and deep sequencing of the extracted DNA fragments. It is the technique of choice for accurate characterization of the binding sites of transcription factors and other DNA-associated proteins. We present a web service, Nebula, which allows inexperienced users to perform a complete bioinformatics analysis of ChIP-seq data. Nebula was designed for both bioinformaticians and biologists. It is based on the Galaxy open source framework. Galaxy already includes a large number of functionalities for mapping reads and peak calling. We added the following to Galaxy: (i) peak calling with FindPeaks and a module for immunoprecipitation quality control, (ii) de novo motif discovery with ChIPMunk, (iii) calculation of the density and the cumulative distribution of peak locations relative to gene transcription start sites, (iv) annotation of peaks with genomic features and (v) annotation of genes with peak information. Nebula generates the graphs and the enrichment statistics at each step of the process. During Steps 3-5, Nebula optionally repeats the analysis on a control dataset and compares these results with those from the main dataset. Nebula can also incorporate gene expression (or gene modulation) data during these steps. In summary, Nebula is an innovative web service that provides an advanced ChIP-seq analysis pipeline providing ready-to-publish results. Nebula is available at http://nebula.curie.fr/ Supplementary data are available at Bioinformatics online.

  19. Genes and proteins of Escherichia coli K-12 (GenProtEC).

    Science.gov (United States)

    Riley, M

    1997-01-01

    GenProtEC is a database of Escherichia coli genes and their gene products, classified by type of function and physiological role and with citations to the literature for each. Also present are data on sequence similarities among E. coli proteins with PAM values, percent identity of amino acids, length of alignment and percent aligned. GenProtEC can also be accessed through the World Wide Web at URL http://mbl.edu/html/ecoli.html .

  20. Genetic variability of the inflammatory response: I. The -511 C/T polymorphism in the IL1β gene in different Peruvian subpopulations

    Directory of Open Access Journals (Sweden)

    Óscar Acosta

    2012-07-01

    Full Text Available The -511 cytosine/thymine (-511 C/T) polymorphism in the promoter region of the interleukin 1 beta (IL1β) gene is implicated in differential production of the cytokine and may therefore be associated with the immune-inflammatory response in obesity, dyslipidemias, heart disease, cancer, infections, and treatment with nutrients and drugs. Objectives: To establish the frequency distribution of the genotypes and alleles of the -511 C/T polymorphism of the IL1β gene in different Peruvian subpopulations. Design: Descriptive, observational, cross-sectional study. Institutions: Centro de Investigación de Bioquímica y Nutrición and Instituto de Medicina Tropical D.A. Carrión, Facultad de Medicina, UNMSM, and Centro de Genética y Biología Molecular, Facultad de Medicina, USMP, Lima, Peru. Participants: Peruvian residents. Interventions: Extraction of genomic DNA from blood or buccal epithelium samples by standard methodology from 168 individuals in 9 subpopulation groups: 23 mestizos from Lima, 33 Amazonians (20 from Pucallpa and 13 from Amazonas) and 112 Andeans (12 from Ancash, 10 from Cajamarca, 18 from Huarochirí-Lima, 25 from Puno-Taquile, 25 from Puno-Uros and 22 from Puno-Anapia). Analysis of the -511 C/T polymorphism by PCR/RFLP with specific primers and digestion with the restriction enzyme AvaI, with fragments detected by electrophoresis in 2% agarose gels and ethidium bromide staining. Main outcome measures: Genotype and allele frequencies of the IL1β gene. Results: The genotype frequencies were CC = 0.024, CT = 0.369 and TT = 0.607, consistent with Hardy-Weinberg equilibrium; the allele frequencies were C = 0.208 and T = 0.792. The frequency of the T allele, considered the mutant, was very high in the Uros of Puno (0.940) and lowest in the mestizos of Lima (0.609). Comparison of genotype (TT versus CT+CC) and allele (T versus C
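
    The reported allele frequencies and the Hardy-Weinberg statement in this record can be cross-checked from the three genotype frequencies alone. The following is a minimal sketch, not the authors' analysis: it assumes the pooled sample of 168 individuals and works from the rounded frequencies given in the abstract, and the function names are illustrative.

```python
# Illustrative cross-check of the pooled -511 C/T frequencies in the abstract.
# Assumption: n = 168 pooled individuals and the rounded genotype frequencies
# as printed; the authors' own test statistic is not given in the record.

def allele_frequencies(f_cc, f_ct, f_tt):
    """Allele frequencies at a biallelic locus from genotype frequencies."""
    p_c = f_cc + f_ct / 2  # each heterozygote carries one C allele
    p_t = f_tt + f_ct / 2  # ... and one T allele
    return p_c, p_t


def hwe_expected(p_c, p_t):
    """Expected CC, CT, TT frequencies under Hardy-Weinberg equilibrium."""
    return p_c ** 2, 2 * p_c * p_t, p_t ** 2


def hwe_chi_square(observed, n):
    """Pearson chi-square statistic (1 degree of freedom) for HWE."""
    p_c, p_t = allele_frequencies(*observed)
    expected = hwe_expected(p_c, p_t)
    return sum(n * (o - e) ** 2 / e for o, e in zip(observed, expected))


OBSERVED = (0.024, 0.369, 0.607)  # CC, CT, TT as reported
N_INDIVIDUALS = 168
CHI2_CRITICAL_DF1 = 3.841         # 5% critical value, 1 degree of freedom

p_c, p_t = allele_frequencies(*OBSERVED)
chi2 = hwe_chi_square(OBSERVED, N_INDIVIDUALS)
print(f"C = {p_c:.3f}, T = {p_t:.3f}, chi-square = {chi2:.2f}")
```

    Under these assumptions the statistic stays below the 5% critical value of 3.841, which agrees with the abstract's statement that the genotype frequencies are consistent with Hardy-Weinberg equilibrium.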

  1. Improvement of Steam Generator Reliability for GEN-IV SFR

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Seong O; Kim Se Yun; Kim, Seok Hoon; Eoh, Jae Hyuk; Lee, Hyeong Yeon; Choi, Byung Seon

    2005-11-15

    The R&D items performed in this study were selected from the R&D task 'Reliability improvement of Steam Generator' of GEN-IV SFR Component Design and BOP. Because this project deals with one of the most important issues for a GEN-IV SFR system, the domestic technical background associated with these R&D items needs to be strengthened, even within the very short period to 2005. This study provides R&D results for i) development of an assessment methodology for dissimilar metal welds and ii) development of a multi-dimensional simulation methodology for a SWR event in an SFR steam generator.

  2. Improvement of Steam Generator Reliability for GEN-IV SFR

    International Nuclear Information System (INIS)

    Kim, Seong O; Kim Se Yun; Kim, Seok Hoon; Eoh, Jae Hyuk; Lee, Hyeong Yeon; Choi, Byung Seon

    2005-11-01

    The R&D items performed in this study were selected from the R&D task 'Reliability improvement of Steam Generator' of GEN-IV SFR Component Design and BOP. Because this project deals with one of the most important issues for a GEN-IV SFR system, the domestic technical background associated with these R&D items needs to be strengthened, even within the very short period to 2005. This study provides R&D results for i) development of an assessment methodology for dissimilar metal welds and ii) development of a multi-dimensional simulation methodology for a SWR event in an SFR steam generator.

  3. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.3361 >orf19.3361; Contig19-10173; 157397..>158185;... YAT2*; carnitine acetyltransferase; gene family | truncated protein MSTYRFQETLEKLPIPDLVQTCNAYLEALKPLQTEQEHE

  4. BrAD-seq: Breath Adapter Directional sequencing: a streamlined, ultra-simple and fast library preparation protocol for strand specific mRNA library construction.

    Directory of Open Access Journals (Sweden)

    Brad Thomas Townsley

    2015-05-01

    Full Text Available Next Generation Sequencing (NGS) is driving rapid advancement in biological understanding and RNA-sequencing (RNA-seq) has become an indispensable tool for biology and medicine. There is a growing need for access to these technologies although preparation of NGS libraries remains a bottleneck to wider adoption. Here we report a novel method for the production of strand specific RNA-seq libraries utilizing inherent properties of double-stranded cDNA to capture and incorporate a sequencing adapter. Breath Adapter Directional sequencing (BrAD-seq) reduces sample handling and requires far fewer enzymatic steps than most available methods to produce high quality strand-specific RNA-seq libraries. The method we present is optimized for 3-prime Digital Gene Expression (DGE) libraries and can easily extend to full transcript coverage shotgun (SHO) type strand-specific libraries and is modularized to accommodate a diversity of RNA and DNA input materials. BrAD-seq offers a highly streamlined and inexpensive option for RNA-seq libraries.

  5. Beta-Poisson model for single-cell RNA-seq data analyses.

    Science.gov (United States)

    Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Rantalainen, Mattias; Pawitan, Yudi

    2016-07-15

    Single-cell RNA-sequencing technology allows detection of gene expression at the single-cell level. One typical feature of the data is a bimodality in the cellular distribution even for highly expressed genes, primarily caused by a proportion of non-expressing cells. The standard and the over-dispersed gamma-Poisson models that are commonly used in bulk-cell RNA-sequencing are not able to capture this property. We introduce a beta-Poisson mixture model that can capture the bimodality of the single-cell gene expression distribution. We further integrate the model into the generalized linear model framework in order to perform differential expression analyses. The whole analytical procedure is called BPSC. The results from several real single-cell RNA-seq datasets indicate that ∼90% of the transcripts are well characterized by the beta-Poisson model; the model-fit from BPSC is better than the fit of the standard gamma-Poisson model in > 80% of the transcripts. Moreover, in differential expression analyses of simulated and real datasets, BPSC performs well against edgeR, a conventional method widely used in bulk-cell RNA-sequencing data, and against scde and MAST, two recent methods specifically designed for single-cell RNA-seq data. An R package BPSC for model fitting and differential expression analyses of single-cell RNA-seq data is available under GPL-3 license at https://github.com/nghiavtr/BPSC CONTACT: yudi.pawitan@ki.se or mattias.rantalainen@ki.se Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
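
    To make the bimodality argument in this record concrete, the sketch below draws counts from a simple beta-Poisson mixture: each cell gets an activity fraction p from a Beta(alpha, beta) distribution, then a Poisson count with rate lam * p. This three-parameter form and all parameter values are assumptions chosen for illustration; they are not taken from the BPSC package, whose parameterization and fitting procedure may differ.

```python
# Illustrative beta-Poisson sampler using only the standard library.
# Assumption: count | p ~ Poisson(lam * p) with p ~ Beta(alpha, beta).
import math
import random


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for the modest rates used here."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_beta_poisson(alpha, beta, lam, rng):
    """One count from the mixture: a Beta-distributed fraction scales the rate."""
    p = rng.betavariate(alpha, beta)
    return sample_poisson(lam * p, rng)


def simulate(n_cells, alpha, beta, lam, seed=0):
    """Simulate expression counts for n_cells cells of one hypothetical gene."""
    rng = random.Random(seed)
    return [sample_beta_poisson(alpha, beta, lam, rng) for _ in range(n_cells)]


if __name__ == "__main__":
    counts = simulate(n_cells=5000, alpha=0.3, beta=0.3, lam=40.0, seed=1)
    zero_fraction = sum(c == 0 for c in counts) / len(counts)
    high_fraction = sum(c >= 30 for c in counts) / len(counts)
    print(f"zeros: {zero_fraction:.2f}, counts >= 30: {high_fraction:.2f}")
```

    With alpha and beta both below 1, the Beta draw concentrates near 0 and 1, so the simulated gene shows a sizeable group of non-expressing cells alongside a highly expressing group. A single Poisson with the same mean (lam * alpha / (alpha + beta) = 20 here) would produce essentially no zeros.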

  6. smallWig: parallel compression of RNA-seq WIG files.

    Science.gov (United States)

    Wang, Zhiying; Weissman, Tsachy; Milenkovic, Olgica

    2016-01-15

    We developed a new lossless compression method for WIG data, named smallWig, offering the best known compression rates for RNA-seq data and featuring random access functionalities that enable visualization, summary statistics analysis and fast queries from the compressed files. Our approach results in order of magnitude improvements compared with bigWig and ensures compression rates only a fraction of those produced by cWig. The key features of the smallWig algorithm are statistical data analysis and a combination of source coding methods that ensure high flexibility and make the algorithm suitable for different applications. Furthermore, for general-purpose file compression, the compression rate of smallWig approaches the empirical entropy of the tested WIG data. For compression with random query features, smallWig uses a simple block-based compression scheme that introduces only a minor overhead in the compression rate. For archival or storage space-sensitive applications, the method relies on context mixing techniques that lead to further improvements of the compression rate. Implementations of smallWig can be executed in parallel on different sets of chromosomes using multiple processors, thereby enabling desirable scaling for future transcriptome Big Data platforms. The development of next-generation sequencing technologies has led to a dramatic decrease in the cost of DNA/RNA sequencing and expression profiling. RNA-seq has emerged as an important and inexpensive technology that provides information about whole transcriptomes of various species and organisms, as well as different organs and cellular communities. The vast volume of data generated by RNA-seq experiments has significantly increased data storage costs and communication bandwidth requirements. Current compression tools for RNA-seq data such as bigWig and cWig either use general-purpose compressors (gzip) or suboptimal compression schemes that leave significant room for improvement. 
To substantiate
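
    The record states that smallWig's compression rate approaches the empirical entropy of the WIG data. As a rough illustration of that benchmark, the sketch below computes the zeroth-order empirical entropy of a sequence of coverage values, which lower-bounds the average bits per value for any code that treats values as independent symbols. This is not smallWig's algorithm, and the toy coverage values are hypothetical.

```python
# Illustrative zeroth-order empirical entropy of a symbol sequence.
# Assumption: values are treated as independent symbols; smallWig's actual
# source coding (and any context modelling) is not reproduced here.
from collections import Counter
from math import log2


def empirical_entropy(symbols):
    """Entropy in bits per symbol of the observed symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((c / total) * log2(total / c) for c in counts.values())


def entropy_bound_bits(symbols):
    """Total bits implied by the entropy bound for the whole sequence."""
    return len(symbols) * empirical_entropy(symbols)


if __name__ == "__main__":
    toy_coverage = [0, 0, 0, 0, 3, 3, 5, 5]  # hypothetical per-position values
    print(empirical_entropy(toy_coverage))    # 1.5 bits per value
    print(entropy_bound_bits(toy_coverage))   # 12.0 bits in total
```

    For the toy sequence the value 0 has frequency 1/2 and the values 3 and 5 have frequency 1/4 each, giving 0.5 * 1 + 0.25 * 2 + 0.25 * 2 = 1.5 bits per value.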

  7. Genetic epidemiology: epidemiology, genetics, or none of the above?

    Directory of Open Access Journals (Sweden)

    Aguinaldo Gonçalves

    1990-12-01

    Full Text Available In an effort to contribute to a better understanding of the identity of Genetic Epidemiology, its conception, field of activity, pertinent methods and techniques, and some instances of application are reviewed. Understood as the area concerned with the genetic factors of diseases and their environmental interactions, its field of activity is presented as consisting of two segments: a descriptive one, which deals with knowledge of the distribution of such conditions in families and populations, their impact at the collective level and their epidemiological surveillance, as well as the study of their determinants; and a second, characterized by intervention, which refers to the respective preventive measures. Despite a possible limitation in not considering all existing situations, particular attention is given to reviewing methods and techniques that can be applied convergently, drawing on genetic and epidemiological procedures. Among them, both laboratory methodologies (such as dermatoglyphics) and quantitative ones, such as heritability calculation and multivariate analysis, are highlighted as case studies. Some objects of study are taken as instances of application because specific investigations of them exist in our setting: leprosy, hydrargyrism (mercury poisoning) and schizophrenia.

  8. PASSion: a pattern growth algorithm-based pipeline for splice junction detection in paired-end RNA-Seq data.

    Science.gov (United States)

    Zhang, Yanju; Lameijer, Eric-Wubbo; 't Hoen, Peter A C; Ning, Zemin; Slagboom, P Eline; Ye, Kai

    2012-02-15

    RNA-seq is a powerful technology for the study of transcriptome profiles that uses deep-sequencing technologies. Moreover, it may be used for cellular phenotyping and help establish the etiology of diseases characterized by abnormal splicing patterns. In RNA-Seq, the exact nature of splicing events is buried in the reads that span exon-exon boundaries. The accurate and efficient mapping of these reads to the reference genome is a major challenge. We developed PASSion, a pattern growth algorithm-based pipeline for splice site detection in paired-end RNA-Seq reads. Comparing the performance of PASSion to three existing RNA-Seq analysis pipelines, TopHat, MapSplice and HMMSplicer, revealed that PASSion is competitive with these packages. Moreover, the performance of PASSion is not affected by read length and coverage. It performs better than the other three approaches when detecting junctions in highly abundant transcripts. PASSion has the ability to detect junctions that do not have known splicing motifs, which cannot be found by the other tools. In the two public RNA-Seq datasets, PASSion predicted ≈ 137,000 and 173,000 splicing events, of which on average 82% are known junctions annotated in the Ensembl transcript database and 18% are novel. In addition, our package can discover differential and shared splicing patterns among multiple samples. The code and utilities can be freely downloaded from https://trac.nbic.nl/passion and ftp://ftp.sanger.ac.uk/pub/zn1/passion.

  9. Genetic variability in Prosopis ferox (Mimosaceae)

    Directory of Open Access Journals (Sweden)

    Alicia D. Burghardt

    2004-01-01

    Full Text Available Prosopis ferox (Mimosaceae) is a spiny shrub or tree species distributed from southern Bolivia to northwestern Argentina. In Jujuy province it is found at high elevations (between 2400 and 3700 m a.s.l.). There is great morphological variability, especially in fruit dimensions and the number of seeds per fruit, both important traits because the plant is used as forage. To verify whether genetic variability also exists, an electrophoretic study was carried out on seed proteins of trees from different localities in Jujuy province. The polypeptide patterns obtained by SDS-PAGE showed 26 bands in total. Each population was characterized by its band presence-absence patterns; intrapopulation variability (polymorphism) was found in some populations, while others were genetically homogeneous. The polymorphic indices in P. ferox populations are comparable to those previously obtained in P. ruscifolia. The interpopulation genetic variability found through the electrophoretic study of seed proteins suggests the existence of ecotypes

  10. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.4748 >orf19.4748; Contig19-10215; complement(47336.....47731); MSL1*; U2 snRNA-associated protein; MPSTKRSSSTEYSHKDSKKKVKLDYVNLKPSQTLYVKNLNTKINKKILLHNLYLLFSAFGDIISINLQNGFAFIIFSNLNSATLALRNLKNQDFFDKPLVLNYAVKESKAISQEKQKLQDENDEEVMPSYE*

  11. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.2370 >orf19.2370; Contig19-10147; complement(50671..52716); DSL1*; retrogra...de ER-to-golgi transport; MPSIEQQLEDQELYLKDIEQNINKTLSKINKTTLENDNDFRKQFEEIPQDSNTTESN

  12. RNA-Seq as an Emerging Tool for Marine Dinoflagellate Transcriptome Analysis: Process and Challenges

    Directory of Open Access Journals (Sweden)

    Muhamad Afiq Akbar

    2018-01-01

    Full Text Available Dinoflagellates are a large group of marine phytoplankton, studied primarily for their symbiosis with coral reefs and their ability to form harmful algal blooms (HABs). Toxins produced by dinoflagellates during HAB events have severe negative impacts on both the economy and the health sector. However, attempts to understand dinoflagellate genomic features are hindered by their complex genome organization. Transcriptomics has been employed to understand dinoflagellate genome structure and to profile genes and gene expression. RNA-seq is one of the latest methods for transcriptomic studies. It is capable of profiling dinoflagellate transcriptomes and has several advantages, including high sensitivity, cost-effectiveness and deeper sequence coverage. This review paper discusses the current workflow of dinoflagellate RNA-seq, which starts with the extraction of high-quality RNA and is followed by cDNA sequencing on a next-generation sequencing platform, transcriptome assembly and computational analysis. Certain considerations are highlighted, such as the difficulty of dinoflagellate sequence annotation, post-transcriptional activity and the effect of RNA pooling when using RNA-seq.

  13. Genetic variability within Fusarium solani specie as revealed by PCR-fingerprinting based on PCR markers

    Directory of Open Access Journals (Sweden)

    Bereneuza Tavares Ramos Valente Brasileiro

    2004-09-01

    intraspecific variability of the F. solani isolates, without any correlation with geographic origin or substrate. Polymorphism was observed even in the conserved sequence of the rDNA locus, and the SPAR marker (GTG)5 showed the highest polymorphism. Taken together, these results may assist studies of the relationship between the variability of the genetic profile of isolates and the resistance phenotypes of particular cultivars to the diseases caused by the fungus, guiding plant breeding programs.

  14. GenBank blastx search result: AK106998 [KOME

    Lifescience Database Archive (English)

    Full Text Available AK106998 002-120-B12 AB179082.1 Macaca fascicularis testis cDNA clone: QtsA-12630, similar to human oculocer...ebrorenal syndrome of Lowe (OCRL), transcriptvariant a, mRNA, RefSeq: NM_000276.3.|PRI PRI 1e-29 +3 ...

  15. GenBank blastx search result: AK059494 [KOME

    Lifescience Database Archive (English)

    Full Text Available AK059494 001-028-H05 AB179082.1 Macaca fascicularis testis cDNA clone: QtsA-12630, similar to human oculocer...ebrorenal syndrome of Lowe (OCRL), transcriptvariant a, mRNA, RefSeq: NM_000276.3.|PRI PRI 7e-40 +2 ...

  16. GenRGenS: Software for Generating Random Genomic Sequences and Structures

    OpenAIRE

    Ponty , Yann; Termier , Michel; Denise , Alain

    2006-01-01

    International audience; GenRGenS is a software tool dedicated to randomly generating genomic sequences and structures. It handles several classes of models useful for sequence analysis, such as Markov chains, hidden Markov models, weighted context-free grammars, regular expressions and PROSITE expressions. GenRGenS is the only program that can handle weighted context-free grammars, thus allowing the user to model and to generate structured objects (such as RNA secondary structures) of any giv...

  17. Frequency of some genetic diseases in Neuropediatrics

    Directory of Open Access Journals (Sweden)

    Tatiana Zaldívar Vaillant

    2012-12-01

    Full Text Available Introduction: neurological diseases in pediatrics are diverse and arise from many causes: infectious, genetic, metabolic and degenerative, among others. Genetic diagnosis, within the clinical method in neurology, is related to etiological diagnosis. Very few publications report the frequency of neurogenetic diseases as an etiological group. Objective: to describe the frequency of some neuropediatric diseases in the Neurogenetics Clinic of the Institute of Neurology and Neurosurgery. Methods: a descriptive, prospective study was carried out in the period 2008-2010. Patients were classified by age group, and the percentage frequency was calculated for spinal muscular atrophy of childhood, Duchenne/Becker muscular dystrophy, static lesions of the central nervous system of prenatal genetic cause, and for the classification of groups by type of inheritance. Results: the study population comprised 161 patients, 72.6 % of them male, for a sex ratio of 2.5. School-age children were the largest group (37.8 %), and the mean age was 5 years. Duchenne muscular dystrophy was the most frequent disease (24.8 %). Autosomal recessive inheritance accounted for 41.40 %. The results agree with those reported in the literature. Conclusions: hereditary neuromuscular diseases and static lesions of the central nervous system of prenatal genetic cause are the most frequent reasons for requesting genetic counseling in a neurogenetics service.

  18. Incorporation of unique molecular identifiers in TruSeq adapters improves the accuracy of quantitative sequencing.

    Science.gov (United States)

    Hong, Jungeui; Gresham, David

    2017-11-01

    Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
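
    The duplicate-calling rule described in this abstract (reads count as PCR duplicates only if they map identically and carry the same UMI) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the read records, field names and the `deduplicate` helper are hypothetical, and real tools also handle UMI sequencing errors, which this sketch ignores.

```python
def deduplicate(reads):
    """Keep the first read seen for each (chrom, pos, strand, umi) key.

    Reads that share mapping coordinates but differ in UMI are kept,
    because they are assumed to come from distinct original molecules.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Hypothetical example data: r2 duplicates r1; r3 maps identically but has a different UMI.
reads = [
    {"name": "r1", "chrom": "chrI", "pos": 100, "strand": "+", "umi": "ACGT"},
    {"name": "r2", "chrom": "chrI", "pos": 100, "strand": "+", "umi": "ACGT"},
    {"name": "r3", "chrom": "chrI", "pos": 100, "strand": "+", "umi": "TTAG"},
]
unique_reads = deduplicate(reads)
```

    A coordinate-only rule would collapse all three reads into one; with the UMI in the key, r3 survives as an independent molecule.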

  19. A survey of etiologic hypotheses among testicular cancer researchers.

    Science.gov (United States)

    Stang, A; Trabert, B; Rusner, C; Poole, C; Almstrup, K; Rajpert-De Meyts, E; McGlynn, K A

    2015-01-01

    Basic research results can provide new ideas and hypotheses to be examined in epidemiological studies. We conducted a survey among testicular cancer researchers on hypotheses concerning the etiology of this malignancy. All researchers on the mailing list of Copenhagen Testis Cancer Workshops and corresponding authors of PubMed-indexed articles identified by the search term 'testicular cancer' and published within 10 years (in total 2750 recipients) were invited to respond to an e-mail-based survey. Participants of the 8th Copenhagen Testis Cancer Workshop in May 2014 were subsequently asked to rate the plausibility of the suggested etiologic hypotheses on a scale of 1 (very implausible) to 10 (very plausible). This report describes the methodology of the survey, the score distributions by individual hypotheses, hypothesis group, and the participants' major research fields, and discusses the hypotheses that scored as most plausible. We also present plans for improving the survey that may be repeated at a next international meeting of experts in testicular cancer. Overall 52 of 99 (53%) registered participants of the 8th Copenhagen Testis Cancer Workshop submitted the plausibility rating form. Fourteen of 27 hypotheses were related to exposures during pregnancy. Hypotheses with the highest mean plausibility ratings were either related to pre-natal exposures or exposures that might have an effect during pregnancy and in post-natal life. The results of the survey may be helpful for triggering more specific etiologic hypotheses that include factors related to endocrine disruption, DNA damage, inflammation, and nutrition during pregnancy. The survey results may stimulate a multidisciplinary discussion about new etiologic hypotheses of testicular cancer. Published 2014. This article is a U. S. Government work and is in the public domain in the USA.

  20. The Screening Test for Emotional Problems--Teacher-Report Version (Step-T): Studies of Reliability and Validity

    Science.gov (United States)

    Erford, Bradley T.; Butler, Caitlin; Peacock, Elizabeth

    2015-01-01

    The Screening Test for Emotional Problems-Teacher Version (STEP-T) was designed to identify students aged 7-17 years with wide-ranging emotional disturbances. Coefficients alpha and test-retest reliability were adequate for all subscales except Anxiety. The hypothesized five-factor model fit the data very well and external aspects of validity were…

  1. Selection gain in intrapopulation genetic breeding of yellow passion fruit

    Directory of Open Access Journals (Sweden)

    Willian Krause

    2012-01-01

    Full Text Available The objective of this work was to estimate the selection gain associated with agronomically important traits in the intrapopulation breeding of yellow passion fruit. The experiment was carried out in the field, in the municipality of Terra Nova do Norte, MT, with the evaluation of 111 full-sib families (FSF) and six commercial cultivars used as controls. A randomized block design was used, with three replicates and four plants per plot. The following traits were evaluated: yield; fruit length, diameter and mean weight; pulp percentage and weight; peel thickness; and soluble solids content. To verify the existence of genetic variability among genotypes, the genetic parameters of the population were estimated based on family means. The 30 genotypes with the lowest rank sum, according to the Mulamba & Mock selection index, were selected to estimate genetic gains. High mean values were observed for the traits and genetic parameters evaluated in the 26 FSF and four controls selected. The use of the selection index provides positive genetic gains in yield, pulp percentage and weight, fruit length, diameter and weight, and peel thickness.
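
    The rank-sum selection used in this record (the Mulamba & Mock index) ranks every genotype on each trait and selects those with the lowest sum of ranks. The sketch below is illustrative only: the family names, trait values and the choice of which direction is favorable for each trait are hypothetical, and ties are not given averaged ranks.

```python
def rank_sum_index(records, higher_is_better):
    """Return {genotype: sum of per-trait ranks}; rank 1 is the best value.

    records: {genotype: {trait: value}}
    higher_is_better: {trait: bool}, the favorable direction per trait (an assumption here).
    """
    totals = {genotype: 0 for genotype in records}
    for trait, higher in higher_is_better.items():
        ordered = sorted(records, key=lambda g: records[g][trait], reverse=higher)
        for rank, genotype in enumerate(ordered, start=1):
            totals[genotype] += rank
    return totals


# Hypothetical family means for two traits.
families = {
    "F1": {"yield": 30.0, "peel_thickness": 6.0},
    "F2": {"yield": 25.0, "peel_thickness": 7.0},
    "F3": {"yield": 20.0, "peel_thickness": 8.0},
}
index = rank_sum_index(families, {"yield": True, "peel_thickness": False})
selected = sorted(index, key=index.get)[:2]  # the two lowest rank sums
```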

  2. RNA-Seq for gene identification and transcript profiling of three Stevia rebaudiana genotypes.

    Science.gov (United States)

    Chen, Junwen; Hou, Kai; Qin, Peng; Liu, Hongchang; Yi, Bin; Yang, Wenting; Wu, Wei

    2014-07-07

    Stevia (Stevia rebaudiana) is an important medicinal plant that yields diterpenoid steviol glycosides (SGs). SGs are currently used in the preparation of medicines, food products and nutraceuticals because of their sweetening property (zero calories and about 300 times sweeter than sugar). Recently, some progress has been made in understanding the biosynthesis of SGs in Stevia, but little is known about the molecular mechanisms underlying this process. Additionally, the genomics of Stevia, a non-model species, remains uncharacterized. The recent advent of RNA-Seq, a next generation sequencing technology, provides an opportunity to expand the identification of Stevia genes through in-depth transcript profiling. We present a comprehensive landscape of the transcriptome profiles of three genotypes of Stevia with divergent SG compositions characterized using RNA-seq. 191,590,282 high-quality reads were generated and then assembled into 171,837 transcripts with an average sequence length of 969 base pairs. A total of 80,160 unigenes were annotated, and 14,211 of the unique sequences were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes. Gene sequences of all enzymes known to be involved in SG synthesis were examined. A total of 143 UDP-glucosyltransferase (UGT) unigenes were identified, some of which might be involved in SG biosynthesis. The expression patterns of eight of these genes were further confirmed by RT-QPCR. RNA-seq analysis identified candidate genes encoding enzymes responsible for the biosynthesis of SGs in Stevia, a non-model plant without a reference genome. The transcriptome data from this study yielded new insights into the process of SG accumulation in Stevia. Our results demonstrate that RNA-Seq can be successfully used for gene identification and transcript profiling in a non-model species.

  3. Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data

    Directory of Open Access Journals (Sweden)

    Duan Jialei

    2012-08-01

    Full Text Available Abstract Background Rapid advances in next-generation sequencing methods have provided new opportunities for transcriptome sequencing (RNA-Seq). The unprecedented sequencing depth provided by RNA-Seq makes it a powerful and cost-efficient method for transcriptome study, and it has been widely used in model organisms and non-model organisms to identify and quantify RNA. For non-model organisms lacking well-defined genomes, de novo assembly is typically required for downstream RNA-Seq analyses, including SNP discovery and identification of genes differentially expressed by phenotypes. Although RNA-Seq has been successfully used to sequence many non-model organisms, the results of de novo assembly from short reads can still be improved by using recent bioinformatic developments. Results In this study, we used 212.6 million paired-end reads, which accounted for 16.2 Gb, to assemble the hexaploid wheat transcriptome. Two state-of-the-art assemblers, Trinity and Trans-ABySS, which use the single and multiple k-mer methods, respectively, were used, and the whole de novo assembly process was divided into the following four steps: pre-assembly, merging different samples, removal of redundancy and scaffolding. We documented every detail of these steps and how these steps influenced assembly performance to gain insight into transcriptome assembly from short reads. After optimization, the assembled transcripts were comparable to Sanger-derived ESTs in terms of both continuity and accuracy. We also provided considerable new wheat transcript data to the community. Conclusions It is feasible to assemble the hexaploid wheat transcriptome from short reads. Special attention should be paid to dealing with multiple samples to balance the spectrum of expression levels and redundancy. To obtain an accurate overview of RNA profiling, removal of redundancy may be crucial in de novo assembly.

  4. GenBank blastx search result: AK105069 [KOME

    Lifescience Database Archive (English)

    Full Text Available AK105069 001-045-C01 AB169286.1 Macaca fascicularis testis cDNA, clone: QtsA-18648, similar to human mortali...ty factor 4 like 2 (MORF4L2), mRNA, RefSeq: NM_012286.1.|PRI PRI 3e-20 +3 ...

  5. Repeatability coefficient and genetic parameters in elephant grass

    Directory of Open Access Journals (Sweden)

    Marcelo Cavalcante

    2012-04-01

    Full Text Available The objective of this work was to determine the repeatability coefficients of morphophysiological traits in elephant grass (Pennisetum spp.) genotypes, from data obtained over six evaluation cycles. The minimum number of measurements and the genetic parameters were estimated. A randomized block design in a split-plot arrangement was used, with four N levels (control, 30, 60 and 90 kg ha⁻¹ per cut) and 16 Pennisetum genotypes (11 interspecific hybrids and five cultivars). The cycles consisted of evaluations in 2010 (21 April, 19 July and 28 September) and 2011 (6 January, 7 April and 3 August). The repeatability coefficients were of medium to high magnitude for all variables, which indicates regularity among the repeated measurements. For forage mass, plant height, leaf length and width, stem diameter, chlorosis and leaf area index, three evaluation cycles are sufficient to reach an R² of 90% by principal component analysis. For internode length, a minimum of seven evaluations is needed to predict the true value of the genotypes. The genetic parameters for forage mass, leaf length and width, stem diameter and leaf chlorosis are of high magnitude, which favors the selection of superior Pennisetum genotypes.
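
    The link between a repeatability coefficient r and the number of evaluation cycles needed for a target coefficient of determination R² is commonly written as n = R²(1 - r) / ((1 - R²) r), with the inverse R² = n r / (1 + (n - 1) r). The abstract does not state which formula was used, so the sketch below is an assumption, and r = 0.75 is a hypothetical value chosen only to show the arithmetic.

```python
def measurements_needed(r, target_r2):
    """Number of repeated measurements needed to reach target_r2 given repeatability r."""
    return target_r2 * (1 - r) / ((1 - target_r2) * r)


def determination(r, n):
    """Coefficient of determination obtained from n measurements with repeatability r."""
    return n * r / (1 + (n - 1) * r)


# Hypothetical repeatability of 0.75 and a 90% target.
n_cycles = measurements_needed(0.75, 0.90)
r2_after_three = determination(0.75, 3)
```

    Under these assumed values, about three cycles give R² near 0.90; a lower repeatability raises the required number of cycles quickly.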

  6. Teleport Generation 3 (Teleport Gen 3)

    Science.gov (United States)

    2016-03-01

    for high- throughput multi-band and multimedia connectivity from deployed locations to DISN and DoD Information Network (DoDIN) information sources and...2016 Major Automated Information System Annual Report Teleport Generation 3 (Teleport Gen 3) Defense Acquisition Management Information Retrieval...Program Information 4 Responsible Office 4 References 4 Program Description 5 Business Case 6 Program Status 8 Schedule 9

  7. Divergence among soybean genotypes grown in irrigated lowland

    Directory of Open Access Journals (Sweden)

    Elonha Rodrigues dos Santos

    2011-12-01

    Full Text Available Genetic divergence is one of the most important parameters evaluated by plant breeders in the initial phase of a breeding program. This work therefore aimed to evaluate, using multivariate techniques, the genetic divergence among 48 soybean genotypes grown in irrigated lowland in the State of Tocantins, in order to identify the most promising combinations for producing superior recombinants, both those intended for oil and meal production and those of the special group intended for human consumption. The experiment was conducted in the municipality of Formoso do Araguaia, TO, under irrigated lowland cultivation in the 2010 off-season. The experimental design was randomized blocks with four replicates. Variability was found among the genotypes tested. The results of the Tocher, UPGMA and canonical variables clustering methods agreed with one another and detected four distinct groups. The following hybridizations are promising for producing soybean grain intended for oil and meal: M-Soy 8766, M-Soy 9144, A 7002 and M-Soy 9056 with Amaralina; crosses of M-Soy 8766, M-Soy 9144 and Amaralina with BRSMG 790A, BRS 257, BRS 216 and BRS 213 are indicated for obtaining special soybean genotypes for human food.

  8. ORF Sequence: Ca19AnnotatedDec2004aaSeq [GENIUS II[Archive

    Lifescience Database Archive (English)

    Full Text Available Ca19AnnotatedDec2004aaSeq orf19.710 >orf19.710; Contig19-10065; complement(47186.....>47710); LSC2*; succinate-CoA ligase beta subunit; truncated protein | overlap LGFDDNASFRQEEVFSWRDPTQEDPQEAE

  9. Testing competing hypotheses about single trial fMRI

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Purushotham, Archana; Kim, Seong-Ge

    2002-01-01

    We use a Bayesian framework to compute probabilities of competing hypotheses about functional activation based on single trial fMRI measurements. Within the framework we obtain a complete probabilistic picture of competing hypotheses, hence control of both type I and type II errors....
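
    The general idea of assigning probabilities to competing hypotheses can be sketched with Bayes' rule: each posterior is proportional to prior times likelihood, normalized over all hypotheses. The hypothesis names, log-likelihood values and priors below are hypothetical, and the sketch does not reproduce the models used in the paper.

```python
import math


def posterior_probabilities(log_likelihoods, priors):
    """Posterior over hypotheses from log-likelihoods and prior probabilities."""
    log_post = {h: log_likelihoods[h] + math.log(priors[h]) for h in log_likelihoods}
    shift = max(log_post.values())  # subtract the maximum for numerical stability
    unnormalized = {h: math.exp(v - shift) for h, v in log_post.items()}
    total = sum(unnormalized.values())
    return {h: u / total for h, u in unnormalized.items()}


# Hypothetical log-likelihoods of one trial's data under two hypotheses, with equal priors.
posterior = posterior_probabilities(
    {"activation": -10.0, "no_activation": -12.0},
    {"activation": 0.5, "no_activation": 0.5},
)
```

    Because every hypothesis receives an explicit probability, a decision threshold can be chosen with both false positives and false negatives in view.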

  10. Ancestry prediction in Singapore population samples using the Illumina ForenSeq kit.

    Science.gov (United States)

    Ramani, Anantharaman; Wong, Yongxun; Tan, Si Zhen; Shue, Bing Hong; Syn, Christopher

    2017-11-01

    The ability to predict bio-geographic ancestry can be valuable to generate investigative leads towards solving crimes. Ancestry informative marker (AIM) sets include large numbers of SNPs to predict an ancestral population. Massively parallel sequencing has enabled forensic laboratories to genotype a large number of such markers in a single assay. Illumina's ForenSeq DNA Signature Kit includes the ancestry informative SNPs reported by Kidd et al. In this study, the ancestry prediction capabilities of the ForenSeq kit through sequencing on the MiSeq FGx were evaluated in 1030 unrelated Singapore population samples of Chinese, Malay and Indian origin. A total of 59 ancestry SNPs and phenotypic SNPs with AIM properties were selected. The bio-geographic ancestry of the 1030 samples, as predicted by Illumina's ForenSeq Universal Analysis Software (UAS), was determined. 712 of the genotyped samples were used as a training sample set for the generation of an ancestry prediction model using STRUCTURE and Snipper. The performance of the prediction model was tested by both methods with the remaining 318 samples. Ancestry prediction in UAS was able to correctly classify the Singapore Chinese as part of the East Asian cluster, while Indians clustered with Ad-mixed Americans and Malays clustered in-between these two reference populations. Principal component analyses showed that the 59 SNPs were only able to account for 26% of the variation between the Singapore sub-populations. Their discriminatory potential was also found to be lower (G ST =0.085) than that reported in ALFRED (F ST =0.357). The Snipper algorithm was able to correctly predict bio-geographic ancestry in 91% of Chinese and Indian, and 88% of Malay individuals, while the success rates for the STRUCTURE algorithm were 94% in Chinese, 80% in Malay, and 91% in Indian individuals. Both these algorithms were able to provide admixture proportions when present. 
Ancestry prediction accuracy (in terms of likelihood ratio

  11. Clinicopathological features and genetic profile in colorectal carcinoma

    Directory of Open Access Journals (Sweden)

    Florencia Perazzo

    2013-10-01

    Full Text Available Colorectal cancer is the third most frequent cancer in men and the second most frequent in women, with a worldwide incidence of approximately 1.2 million new cases per year. Our primary objective was to study the relationship between clinical-histological characteristics in individuals with colorectal cancer and the mutational status of codons 12 and 13 of the KRAS gene (7 validated mutations), in order to find a histopathological marker for mutated tumors. The secondary objective was to determine how many patients had additional mutations in codons 15 and 61 of the KRAS gene and codon 600 of the BRAF gene that could modify the tumor phenotype. Sixty individuals with colorectal cancer were selected (30 wild-type and 30 with validated mutations in codons 12 and 13 of the KRAS gene). Exons 2 and 3 of the KRAS gene and exon 15 of the BRAF gene were amplified and sequenced. The information collected was examined by descriptive, univariate and/or multivariate analysis, as appropriate. In conclusion, no relationship was found between the clinical-histological characteristics of tumors from individuals diagnosed with colorectal cancer and the mutational status of codons 12 and 13 of the KRAS gene. We did not find a histopathological marker for mutated tumors. In patients with advanced colorectal adenocarcinomas and wild-type KRAS, it is worth considering the study of codon 600 of the BRAF gene.

  12. Transgenic Foods: Genetically Modified Organisms (GMOs)

    OpenAIRE

    Martín López, Jimena

    2016-01-01

    Transgenic foods are those derived from a genetically modified organism. The introduction of this type of product into our diet is a controversial topic, since in many cases the effects this modification may have on humans are not known precisely. Throughout this work, the history of the appearance of these organisms is explained, made possible by genetic engineering procedures in which fragments of their DNA are modified...

  13. Rcount: simple and flexible RNA-Seq read counting.

    Science.gov (United States)

    Schmid, Marc W; Grossniklaus, Ueli

    2015-02-01

    Analysis of differential gene expression by RNA sequencing (RNA-Seq) is frequently done using feature counts, i.e. the number of reads mapping to a gene. However, commonly used count algorithms (e.g. HTSeq) do not address the problem of reads aligning with multiple locations in the genome (multireads) or reads aligning with positions where two or more genes overlap (ambiguous reads). Rcount specifically addresses these issues. Furthermore, Rcount allows the user to assign priorities to certain feature types (e.g. higher priority for protein-coding genes compared to rRNA-coding genes) or to add flanking regions. Rcount provides a fast and easy-to-use graphical user interface requiring no command line or programming skills. It is implemented in C++ using the SeqAn (www.seqan.de) and the Qt libraries (qt-project.org). Source code and 64 bit binaries for (Ubuntu) Linux, Windows (7) and MacOSX are released under the GPLv3 license and are freely available on github.com/MWSchmid/Rcount. marcschmid@gmx.ch Test data, genome annotation files, useful Python and R scripts and a step-by-step user guide (including run-time and memory usage tests) are available on github.com/MWSchmid/Rcount. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Mining RNA-seq data for infections and contaminations.

    Directory of Open Access Journals (Sweden)

    Thomas Bonfert

    Full Text Available RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore alternative sources of expression other than the host cell and are little equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g. due to missing genome sequences of species/strains) can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potentials of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights in infection processes and tumorigenesis can be obtained.

  15. Scientific 'Laws', 'Hypotheses' and 'Theories'

    Indian Academy of Sciences (India)

    verified, the hypothesis changes from the status of a 'mere' hypothesis, and ... a pre-existing law and the body of facts upon which that law is based. Hypotheses .... implicit belief that order objectively exists in nature, and that scientific laws ...

  16. Selecting between-sample RNA-Seq normalization methods from the perspective of their assumptions.

    Science.gov (United States)

    Evans, Ciaran; Hardin, Johanna; Stoebel, Daniel M

    2017-02-27

    RNA-Seq is a widely used method for studying the behavior of genes under different biological conditions. An essential step in an RNA-Seq study is normalization, in which raw data are adjusted to account for factors that prevent direct comparison of expression measures. Errors in normalization can have a significant impact on downstream analysis, such as inflated false positives in differential expression analysis. An underemphasized feature of normalization is the assumptions on which the methods rely and how the validity of these assumptions can have a substantial impact on the performance of the methods. In this article, we explain how assumptions provide the link between raw RNA-Seq read counts and meaningful measures of gene expression. We examine normalization methods from the perspective of their assumptions, as an understanding of methodological assumptions is necessary for choosing methods appropriate for the data at hand. Furthermore, we discuss why normalization methods perform poorly when their assumptions are violated and how this causes problems in subsequent analysis. To analyze a biological experiment, researchers must select a normalization method with assumptions that are met and that produces a meaningful measure of expression for the given experiment. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
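
    How a violated assumption distorts normalization can be shown with two widely used scaling approaches. The abstract names no specific method, so the choice of total-count scaling and median-of-ratios scaling here is an assumption, and the counts are made up. Total-count scaling assumes every sample contains the same total amount of RNA; median-of-ratios assumes most genes are not differentially expressed.

```python
import math
from statistics import median


def total_count_factors(samples):
    """Scale factors proportional to each sample's library size."""
    totals = [sum(counts) for counts in samples]
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]


def median_of_ratios_factors(samples):
    """Per-sample median of count / per-gene geometric mean (genes with zeros skipped)."""
    n_genes = len(samples[0])
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in samples)]
    log_geo_mean = {
        g: sum(math.log(s[g]) for s in samples) / len(samples) for g in usable
    }
    return [
        math.exp(median(math.log(s[g]) - log_geo_mean[g] for g in usable))
        for s in samples
    ]


# Hypothetical counts: three genes unchanged, one gene strongly induced in sample B.
sample_a = [10, 20, 30, 40]
sample_b = [10, 20, 30, 1000]
tc = total_count_factors([sample_a, sample_b])
mor = median_of_ratios_factors([sample_a, sample_b])
```

    In this toy case total-count scaling would shrink all of sample B's counts, making the three unchanged genes look down-regulated, whereas median-of-ratios leaves both samples unscaled because its assumption holds for most genes.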

  17. Growth of free-range type chicken genotypes

    Directory of Open Access Journals (Sweden)

    R. C. Veloso

    2015-10-01

    Full Text Available ABSTRACT The objective of this work was to compare growth patterns by fitting the respective growth curves with nonlinear models, and to study the development of carcass cuts relative to carcass weight in different genotypes of free-range type chickens. A total of 840 one-day-old male chicks were used, distributed in a completely randomized design, from the following genotypes of the Redbro line: Caboclo, Carijó, Colorpak, Gigante Negro, Pesadão Vermelho, Pescoço Pelado and Tricolor. The birds were housed in 28 pens, with 30 birds per pen, in a masonry shed with access to a 45 m² paddock, with four replicates. Individual body weight was measured at hatching and at 14, 28, 42, 56, 70 and 84 days of age. To determine the body weight growth curves, the collected data were evaluated using the nonlinear models Brody, Gompertz, Logistic, Richards and von Bertalanffy. PROC NLIN of SAS was used, with the iterative Gauss-Newton method. The criteria used to choose the best-fitting growth curve model were the coefficient of determination, the asymptotic standard deviation, the mean absolute deviation of the residuals and the asymptotic index. The analyses to obtain the allometric coefficients were performed using PROC GLM of SAS for the genotypes Carijó, Colorpak, Pesadão Vermelho, Pescoço Pelado and Tricolor. The weights of the carcass, breast, drumsticks, thighs, legs and wings of birds slaughtered at 85 days of age were evaluated. Only the equations proposed by Gompertz, von Bertalanffy and Logistic reached convergence, and the model proposed by von Bertalanffy was the most suitable for describing the growth of the free-range chicken genotypes. All cuts evaluated showed late growth relative to carcass weight in free-range type chicken genotypes.
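
    Four of the nonlinear growth models named in this record can be written in one common three-parameter form, where A is the asymptotic weight, B an integration constant and k the maturation rate. Parameterizations differ between authors and the abstract does not give the one used, so the forms below are an assumption; the parameter values are hypothetical, and the Richards model (which needs an extra shape parameter) is omitted.

```python
import math


def brody(t, A, B, k):
    return A * (1 - B * math.exp(-k * t))


def gompertz(t, A, B, k):
    return A * math.exp(-B * math.exp(-k * t))


def logistic(t, A, B, k):
    return A / (1 + B * math.exp(-k * t))


def von_bertalanffy(t, A, B, k):
    return A * (1 - B * math.exp(-k * t)) ** 3


# Hypothetical parameters: asymptotic weight 3000 g, B = 0.8, k = 0.03 per day.
A, B, k = 3000.0, 0.8, 0.03
ages = [0, 14, 28, 42, 56, 70, 84]  # weighing ages in days, as in the abstract
predicted = [von_bertalanffy(t, A, B, k) for t in ages]
```

    All four curves rise toward the same asymptote A; they differ in where the inflection point falls, which is why fit criteria are needed to choose among them.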

  18. Scientific 'Laws', 'Hypotheses' and 'Theories'

    Indian Academy of Sciences (India)

    theories, defines a hypothesis as "any supposition which we may ... about the origin of the solar system are also hypotheses of this type. They are about the birth of the planets, an event, which has happened, in the past history of our Universe.

  19. Bacterial diversity of soil under eucalyptus assessed by 16S rDNA sequencing analysis

    Directory of Open Access Journals (Sweden)

    Érico Leandro da Silveira

    2006-10-01

    cloned using pGEM-T and sequenced to determine bacterial diversity. A total of 134 clones from the NFA soil and 116 clones from the EAA soil were analyzed. The sequences were compared with those deposited in GenBank. Phylogenetic analyses revealed differences between the soil types and high diversity in both communities. Greater bacterial diversity was found in the Eucalyptus spp. arboretum soil than in the soil of the native forest area.

  20. Genetic transformation in forest species.

    Directory of Open Access Journals (Sweden)

    Claudia Studart-Guimarães

    2010-08-01

    Genetic transformation, which comprises the controlled introduction of exogenous genes into the genome of a plant cell and the subsequent regeneration of the transgenic plant, has contributed to plant breeding programs by producing genotypes with new traits of interest. The breeding of forest species is limited by characteristics intrinsic to these species, such as the height of the individuals and the long life cycle. Genetic transformation is therefore an alternative for obtaining forest species with desirable traits in a shorter period of time. Transgenic plants with resistance to certain pests, better wood quality, greater biomass production and herbicide tolerance, among other traits of interest, have already been obtained for various economically important forest species such as poplar, eucalyptus and pines in general. This paper shows the importance of genetic transformation, combined with other biotechnological techniques, in the breeding of forest species, the most commonly used transformation techniques, and the traits that have already been introduced into these species by transformation.

  1. Report of families with neurofibromatosis and other genetic diseases

    OpenAIRE

    Orraca Castillo, Miladys; Licourt Otero, Deysi; Sánchez Álvarez de La Campa, Ana Isabel

    2011-01-01

    Neurofibromatosis type 1 is a genetic disease that primarily affects cell development and growth in the nervous system. Clinically it is characterized by café-au-lait macules, neurofibromas, freckling in regions not exposed to the sun, Lisch nodules, bone lesions and optic glioma. This paper describes two families in which some individuals have this disease and other members of the same family show a different genetic disease. The coexist...

  2. Survey of the Suldalslågen river. Production potential for anadromous salmonids

    OpenAIRE

    Foldvik, Anders; Pettersen, Oskar

    2017-01-01

    Foldvik, A. & Pettersen, O. 2017. Inventering av Suldalslågen. Produksjonspotensial for sjøvandrende laksefisk. - NINA Kortrapport 75, 18 pp. The regulation of Suldalslågen for hydropower production has had negative effects on salmonid habitat, including sedimentation and overgrowth of the substrate. Attempts have been made to counteract these processes with a series of flushing floods of over 200 m³/s in the autumn. On commission from Statkraft, NINA surveyed rearing and spawning conditions for salmon in Su...

  3. TrayGen: Arranging objects for exhibition and packaging

    KAUST Repository

    Yang, Yongliang; Huang, Qixing

    2013-01-01

    We present a framework, called TrayGen, to generate tray designs for the exhibition and packaging of a collection of objects. Based on principles from shape perception and visual merchandising, we abstract a number of design guidelines on how

  4. A 48-plex autosomal SNP GenPlex™ assay for human individualization and relationship testing

    DEFF Research Database (Denmark)

    Tomas Mas, Carmen; Børsting, Claus; Morling, Niels

    2012-01-01

    SNPs are being increasingly used by forensic laboratories. Different platforms have been developed for SNP typing. We describe the GenPlex™ HID system protocol, a new SNP-typing platform developed by Applied Biosystems where 48 of the 52 SNPforID SNPs and amelogenin are included. The GenPlex™ HID...

  5. Combining laser microdissection and RNA-seq to chart the transcriptional landscape of fungal development

    Science.gov (United States)

    2012-01-01

    Background During sexual development, filamentous ascomycetes form complex, three-dimensional fruiting bodies for the protection and dispersal of sexual spores. Fruiting bodies contain a number of cell types not found in vegetative mycelium, and these morphological differences are thought to be mediated by changes in gene expression. However, little is known about the spatial distribution of gene expression in fungal development. Here, we used laser microdissection (LM) and RNA-seq to determine gene expression patterns in young fruiting bodies (protoperithecia) and non-reproductive mycelia of the ascomycete Sordaria macrospora. Results Quantitative analysis showed major differences in the gene expression patterns between protoperithecia and total mycelium. Among the genes strongly up-regulated in protoperithecia were the pheromone precursor genes ppg1 and ppg2. The up-regulation was confirmed by fluorescence microscopy of egfp expression under the control of ppg1 regulatory sequences. RNA-seq analysis of protoperithecia from the sterile mutant pro1 showed that many genes that are differentially regulated in these structures are under the genetic control of transcription factor PRO1. Conclusions We have generated transcriptional profiles of young fungal sexual structures using a combination of LM and RNA-seq. This allowed a high spatial resolution and sensitivity, and yielded a detailed picture of gene expression during development. Our data revealed significant differences in gene expression between protoperithecia and non-reproductive mycelia, and showed that the transcription factor PRO1 is involved in the regulation of many genes expressed specifically in sexual structures. The LM/RNA-seq approach will also be relevant to other eukaryotic systems in which multicellular development is investigated. PMID:23016559

  6. Combining laser microdissection and RNA-seq to chart the transcriptional landscape of fungal development

    Directory of Open Access Journals (Sweden)

    Teichert Ines

    2012-09-01

    Background During sexual development, filamentous ascomycetes form complex, three-dimensional fruiting bodies for the protection and dispersal of sexual spores. Fruiting bodies contain a number of cell types not found in vegetative mycelium, and these morphological differences are thought to be mediated by changes in gene expression. However, little is known about the spatial distribution of gene expression in fungal development. Here, we used laser microdissection (LM) and RNA-seq to determine gene expression patterns in young fruiting bodies (protoperithecia) and non-reproductive mycelia of the ascomycete Sordaria macrospora. Results Quantitative analysis showed major differences in the gene expression patterns between protoperithecia and total mycelium. Among the genes strongly up-regulated in protoperithecia were the pheromone precursor genes ppg1 and ppg2. The up-regulation was confirmed by fluorescence microscopy of egfp expression under the control of ppg1 regulatory sequences. RNA-seq analysis of protoperithecia from the sterile mutant pro1 showed that many genes that are differentially regulated in these structures are under the genetic control of transcription factor PRO1. Conclusions We have generated transcriptional profiles of young fungal sexual structures using a combination of LM and RNA-seq. This allowed a high spatial resolution and sensitivity, and yielded a detailed picture of gene expression during development. Our data revealed significant differences in gene expression between protoperithecia and non-reproductive mycelia, and showed that the transcription factor PRO1 is involved in the regulation of many genes expressed specifically in sexual structures. The LM/RNA-seq approach will also be relevant to other eukaryotic systems in which multicellular development is investigated.

  7. Validity and reliability of the Turkish version of the Manchester-Oxford Foot Questionnaire for hallux valgus deformity evaluation.

    Science.gov (United States)

    Talu, Burcu; Bayramlar, Kezban; Bek, Nilgün; Yakut, Yavuz

    2016-01-01

    The aim of this study was to evaluate the reliability and validity of the Turkish version of the Manchester-Oxford Foot Questionnaire (MOXFQ) in patients affected by hallux valgus in order to assess the accuracy of this cross-cultural adaptation. Thirty female volunteers aged between 18 and 55 years were included in the study. Subjects with hallux valgus were asked to complete the MOXFQ and the Short-Form 36 Health Survey (SF-36). After receiving permission from the author, the MOXFQ was translated into Turkish twice and then back translated to English, after which its compatibility was evaluated. The Turkish version of the MOXFQ was applied twice, 1-3 days apart, to the study subjects. Internal consistency and test-retest reliability were assessed using Cronbach's alpha and intraclass correlation coefficient (ICC), respectively. Construct validity was assessed with the use of Spearman's rank correlation coefficient, using a priori hypothesized correlations with SF-36 domains. Subjects achieved similar scores at the first and second administration of the questionnaire (validity was supported by the presence of all the hypothesized correlations, with SF-36 within its physical parameters. The Turkish version of the MOXFQ is a valid and reliable tool for evaluating foot pain and functional status in patients affected by hallux valgus.

  8. Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq.

    Science.gov (United States)

    Hu, Ming; Zhu, Yu; Taylor, Jeremy M G; Liu, Jun S; Qin, Zhaohui S

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, gene expression levels of all transcripts including novel ones can be quantified digitally. Although extremely promising, the massive amounts of data generated by RNA-Seq, substantial biases and uncertainty in short read alignment pose challenges for data analysis. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expressions, ineffective. In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Since the proposed model is capable of incorporating base-specific variation as well as between-base dependence that affect read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level. POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html. Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu. Supplementary data are available at Bioinformatics online.
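
    To illustrate why base-specific variation matters when quantifying expression, the following is a minimal sketch and not the POME model itself. It assumes a much simpler fixed-effects setting in which the read count at base j is Poisson with mean theta * b_j, where the multipliers b_j are treated as known; the function name, the counts and the multipliers are all hypothetical. Under that assumption the maximum-likelihood estimate of theta is the total count divided by the total multiplier.

```python
def poisson_expression_mle(counts, base_effects):
    """MLE of theta when counts[j] ~ Poisson(theta * base_effects[j]).

    Assumption: base_effects are known positive multipliers. A mixed-effects
    model would instead estimate them and model between-base dependence.
    """
    if len(counts) != len(base_effects):
        raise ValueError("counts and base_effects must have the same length")
    total_effect = sum(base_effects)
    if total_effect <= 0:
        raise ValueError("base_effects must sum to a positive value")
    # Setting d/dtheta of sum_j [y_j * log(theta * b_j) - theta * b_j] to zero
    # gives theta = sum(y) / sum(b).
    return sum(counts) / total_effect


# Hypothetical base-level read counts and base-specific multipliers.
counts = [12, 30, 9, 21]
base_effects = [0.8, 2.0, 0.6, 1.4]

theta_hat = poisson_expression_mle(counts, base_effects)
naive_mean = sum(counts) / len(counts)  # plain averaging ignores base effects
print(theta_hat, naive_mean)
```

    When the multipliers do not average to one over the transcript, the plain average and the weighted estimate differ, which is the kind of bias the abstract attributes to averaging-based approaches.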

  9. Transcriptome Analysis of the Thymus in Short-Term Calorie-Restricted Mice Using RNA-seq

    Directory of Open Access Journals (Sweden)

    Zehra Omeroğlu Ulu

    2018-01-01

    Calorie restriction (CR), which is a factor that extends lifespan and an important player in immune response, is an effective protective method against cancer development. The thymus, which plays a critical role in the development of the immune system, reacts quickly to nutritional deficiency. In this study, RNA-seq-based transcriptome sequencing was performed on thymus tissues of MMTV-TGF-α mice subjected to ad libitum (AL), chronic calorie restriction (CCR), and intermittent calorie restriction (ICR) diets. Three cDNA libraries were sequenced using Illumina HiSeq™ 4000 to produce 100-base paired-end reads. On average, 105 million clean reads were mapped and in total 6091 significantly differentially expressed genes (DEGs) were identified (p<0.05). These DEGs were clustered into Gene Ontology (GO) categories. The expression pattern revealed by RNA-seq was validated by quantitative real-time PCR (qPCR) analysis of four important genes: leptin, ghrelin, Igf1, and adiponectin. RNA-seq data have been deposited in the NCBI Gene Expression Omnibus (GEO) database (GSE95371). We report the use of RNA sequencing to find DEGs that are affected by different feeding regimes in the thymus.

  10. Transcriptome Analysis of the Thymus in Short-Term Calorie-Restricted Mice Using RNA-seq

    Science.gov (United States)

    Omeroğlu Ulu, Zehra; Ulu, Salih; Dogan, Soner; Guvenc Tuna, Bilge

    2018-01-01

    Calorie restriction (CR), which is a factor that extends lifespan and an important player in immune response, is an effective protective method against cancer development. The thymus, which plays a critical role in the development of the immune system, reacts quickly to nutritional deficiency. In this study, RNA-seq-based transcriptome sequencing was performed on thymus tissues of MMTV-TGF-α mice subjected to ad libitum (AL), chronic calorie restriction (CCR), and intermittent calorie restriction (ICR) diets. Three cDNA libraries were sequenced using Illumina HiSeq™ 4000 to produce 100-base paired-end reads. On average, 105 million clean reads were mapped and in total 6091 significantly differentially expressed genes (DEGs) were identified (p < 0.05). These DEGs were clustered into Gene Ontology (GO) categories. The expression pattern revealed by RNA-seq was validated by quantitative real-time PCR (qPCR) analysis of four important genes: leptin, ghrelin, Igf1, and adiponectin. RNA-seq data have been deposited in the NCBI Gene Expression Omnibus (GEO) database (GSE95371). We report the use of RNA sequencing to find DEGs that are affected by different feeding regimes in the thymus. PMID:29511668

  11. TidGen Power System Commercialization Project

    Energy Technology Data Exchange (ETDEWEB)

    Sauer, Christopher R. [President & CEO]; McEntee, Jarlath [VP Engineering & CTO]

    2013-12-30

    ORPC Maine, LLC, a wholly-owned subsidiary of Ocean Renewable Power Company, LLC (collectively ORPC), submits this Final Technical Report for the TidGen® Power System Commercialization Project (Project), partially funded by the U.S. Department of Energy (DE-EE0003647). The Project was built and operated in compliance with the Federal Energy Regulatory Commission (FERC) pilot project license (P-12711) and other permits and approvals needed for the Project. This report documents the methodologies, activities and results of the various phases of the Project, including design, engineering, procurement, assembly, installation, operation, licensing, environmental monitoring, retrieval, maintenance and repair. The Project represents a significant achievement for the renewable energy portfolio of the U.S. in general, and for the U.S. marine hydrokinetic (MHK) industry in particular. The stated Project goal was to advance, demonstrate and accelerate deployment and commercialization of ORPC’s tidal-current based hydrokinetic power generation system, including the energy extraction and conversion technology, associated power electronics, and interconnection equipment capable of reliably delivering electricity to the domestic power grid. ORPC achieved this goal by designing, building and operating the TidGen® Power System in 2012 and becoming the first federally licensed hydrokinetic tidal energy project to deliver electricity to a power grid under a power purchase agreement in North America. Located in Cobscook Bay between Eastport and Lubec, Maine, the TidGen® Power System was connected to the Bangor Hydro Electric utility grid at an on-shore station in North Lubec on September 13, 2012. ORPC obtained a FERC pilot project license for the Project on February 12, 2012 and the first Maine Department of Environmental Protection General Permit issued for a tidal energy project on January 31, 2012. In addition, ORPC entered into a 20-year agreement with Bangor Hydro Electric

  12. Scientific 'Laws', 'Hypotheses' and 'Theories'

    Indian Academy of Sciences (India)

    Scientific 'Laws', 'Hypotheses' and 'Theories' - How are They Related? J R Lakshmana Rao. General Article, Resonance – Journal of Science Education, Volume 3, Issue 12, December 1998, pp 55-61.

  13. Quantitative ChIP-Seq Normalization Reveals Global Modulation of the Epigenome

    Directory of Open Access Journals (Sweden)

    David A. Orlando

    2014-11-01

    Epigenomic profiling by chromatin immunoprecipitation coupled with massively parallel DNA sequencing (ChIP-seq) is a prevailing methodology used to investigate chromatin-based regulation in biological systems such as human disease, but the lack of an empirical methodology to enable normalization among experiments has limited the precision and usefulness of this technique. Here, we describe a method called ChIP with reference exogenous genome (ChIP-Rx) that allows one to perform genome-wide quantitative comparisons of histone modification status across cell populations using defined quantities of a reference epigenome. ChIP-Rx enables the discovery and quantification of dynamic epigenomic profiles across mammalian cells that would otherwise remain hidden using traditional normalization methods. We demonstrate the utility of this method for measuring epigenomic changes following chemical perturbations and show how reference normalization of ChIP-seq experiments enables the discovery of disease-relevant changes in histone modification occupancy.
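
    The idea of normalizing to a spiked-in reference can be sketched as follows. The abstract does not give the formula, so this example assumes that each sample's signal is scaled by the inverse of the number of reads (in millions) aligning to the exogenous reference genome; the function names and read counts are hypothetical.

```python
def reference_scale_factor(reference_reads):
    """Scale factor assumed to be 1 / (reference-aligned reads in millions)."""
    if reference_reads <= 0:
        raise ValueError("reference_reads must be positive")
    return 1_000_000 / reference_reads


def normalize_to_reference(bin_counts, reference_reads):
    """Convert raw per-bin read counts to reference-adjusted values."""
    alpha = reference_scale_factor(reference_reads)
    return [count * alpha for count in bin_counts]


# Two hypothetical samples with identical raw counts in two genomic bins
# but different numbers of reads mapping to the spiked-in reference genome.
sample_a = normalize_to_reference([100, 40], reference_reads=2_000_000)
sample_b = normalize_to_reference([100, 40], reference_reads=4_000_000)
print(sample_a, sample_b)
```

    Under this assumption, a sample in which the reference takes up a larger share of the reads ends up with a lower adjusted signal, a difference that scaling by total read depth alone would not show.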

  14. Polymorphisms of the ob gene in Holstein cattle in the Comarca Lagunera, Mexico

    OpenAIRE

    Sarai S. Mendoza-Retana; Miguel A. Gallegos-Robles; Uriel González-Salas; José L. García-Hernández; Manuel Fortis-Hernández; Cirilo Vázquez-Vázquez; Héctor I. Trejo-Escareño

    2017-01-01

    The Comarca Lagunera is the most important dairy basin in Mexico. Various techniques are currently being used that allow an animal to be evaluated genetically at an early age, making it possible to select future breeding stock with desirable traits. Among the genes related to milk production is the Ob gene, also called the leptin gene, which acts on the central nervous system and peripheral tissues, playing a very important role ...

  15. Genetic evaluation of Asturian salmonids as a natural resource

    OpenAIRE

    Abad García, David

    2012-01-01

    This project analyzes the genetic structure of the stocks used for restocking brown trout in the Principality of Asturias, which come from two different fish farms, in order to establish whether the restocking fish comply with current regulations on the release of non-native individuals into the natural environment, which is currently prohibited. The gene for the enzyme lactate dehydrogenase, LDH-C, is used as a genetic marker, which makes it possible to differentiate the populations...

  16. Incidence of thrips in onion genotypes

    Directory of Open Access Journals (Sweden)

    Paulo Antonio de Souza Gonçalves

    2017-05-01

    The objectives of this study were to evaluate the incidence of thrips in onion genotypes and to assess its correlation with chlorophyll content, leaf architecture and color, and yield. The experiment was conducted at Epagri, Ituporanga Experimental Station, SC, Brazil, in the 2015 growing season. A total of 48 commercial or developmental genotypes were evaluated, twelve hybrids and 36 open-pollinated. Thrips incidence was similar in most genotypes. The exceptions were the early hybrids Roxa 10039 and 10160, which had lower incidence scores than RDW Luthy and Conesul. A more open leaf architecture combined with a light green color favored a lower incidence of thrips. The open-pollinated cultivars originating from the Epagri breeding program (Superprecoce-Agroecológica, Bola Precoce-Agroecológica, Juporanga-Agroecológica, Valessul, Bola Suprema and Crioula Alto Vale) were the most productive.

  17. Involvement of the Ventrolateral Prefrontal Cortex in Learning Others' Bad Reputations and Indelible Distrust.

    Science.gov (United States)

    Suzuki, Atsunobu; Ito, Yuichi; Kiyama, Sachiko; Kunimi, Mitsunobu; Ohira, Hideki; Kawaguchi, Jun; Tanabe, Hiroki C; Nakai, Toshiharu

    2016-01-01

    A bad reputation can persistently affect judgments of an individual even when it turns out to be invalid and ought to be disregarded. Such indelible distrust may reflect that the negative evaluation elicited by a bad reputation transfers to a person. Consequently, the person him/herself may come to activate this negative evaluation irrespective of the accuracy of the reputation. If this theoretical model is correct, an evaluation-related brain region will be activated when witnessing a person whose bad reputation one has learned about, regardless of whether the reputation is deemed valid or not. Here, we tested this neural hypothesis with functional magnetic resonance imaging (fMRI). Participants memorized faces paired with either a good or a bad reputation. Next, they viewed the faces alone and inferred whether each person was likely to cooperate, first while retrieving the reputations, and then while trying to disregard them as false. A region of the left ventrolateral prefrontal cortex (vlPFC), which may be involved in negative evaluation, was activated by faces previously paired with bad reputations, irrespective of whether participants attempted to retrieve or disregard these reputations. Furthermore, participants showing greater activity of the left ventrolateral prefrontal region in response to the faces with bad reputations were more likely to infer that these individuals would not cooperate. Thus, once associated with a bad reputation, a person may elicit evaluation-related brain responses on their own, thereby evoking distrust independently of their reputation.

  18. Insights into bacterioplankton community structure from Sundarbans mangrove ecoregion using Sanger and Illumina MiSeq sequencing approaches: A comparative analysis

    Directory of Open Access Journals (Sweden)

    Anwesha Ghosh

    2017-03-01

    Next generation sequencing using platforms such as Illumina MiSeq provides a deeper insight into the structure and function of bacterioplankton communities in coastal ecosystems compared to traditional molecular techniques such as clone library approach which incorporates Sanger sequencing. In this study, structure of bacterioplankton communities was investigated from two stations of Sundarbans mangrove ecoregion using both Sanger and Illumina MiSeq sequencing approaches. The Illumina MiSeq data is available under the BioProject ID PRJNA35180 and Sanger sequencing data under accession numbers KX014101-KX014140 (Stn1) and KX014372-KX014410 (Stn3). Proteobacteria-, Firmicutes- and Bacteroidetes-like sequences retrieved from both approaches appeared to be abundant in the studied ecosystem. The Illumina MiSeq data (2.1 GB) provided a deeper insight into the structure of bacterioplankton communities and revealed the presence of bacterial phyla such as Actinobacteria, Cyanobacteria, Tenericutes, Verrucomicrobia which were not recovered based on Sanger sequencing. A comparative analysis of bacterioplankton communities from both stations highlighted the presence of genera that appear in both stations and genera that occur exclusively in either station. However, both the Sanger sequencing and Illumina MiSeq data were coherent at broader taxonomic levels. Pseudomonas, Devosia, Hyphomonas and Erythrobacter-like sequences were the abundant bacterial genera found in the studied ecosystem. Both the sequencing methods showed broad coherence although as expected the Illumina MiSeq data helped identify rarer bacterioplankton groups and also showed the presence of unassigned OTUs indicating possible presence of novel bacterioplankton from the studied mangrove ecosystem.

  19. Prevalence of artifacts in abdominal magnetic resonance imaging using GRASE sequence: a comparison with TSE sequences

    Directory of Open Access Journals (Sweden)

    Viviane Vieira Francisco

    2005-09-01

    OBJECTIVE: To determine the overall frequency of artifacts in the gradient and spin echo (GRASE) sequence, by type and grade of artifact, in abdominal magnetic resonance imaging examinations; and to compare GRASE with two TSE sequences previously selected as those with the best signal-to-noise ratio and lowest incidence of artifacts. MATERIALS AND METHODS: A prospective, self-paired study was carried out in 86 patients undergoing magnetic resonance imaging of the upper abdomen, acquiring the GRASE sequence with respiratory triggering and fat suppression and six T2-weighted TSE sequences. Of the six TSE sequences, those with the best signal-to-noise ratio and fewest artifacts were selected beforehand; these were the ones acquired with fat suppression and respiratory triggering, one with a body coil (sequence 1) and the other with a synergy coil (sequence 2). The images were analyzed by two observers in consensus for the presence, grade and type of artifact. The data were then analyzed statistically using the Friedman test and the chi-square test. RESULTS: The absolute frequency of artifacts in the sequences used was 65.02%. The most frequent artifacts in the three sequences studied were respiratory (30%) and pulsation (33%) artifacts. Only 3% of cases showed any type of artifact that hindered image analysis. The frequencies of artifacts in the individual sequences were: GRASE, 67.2%; TSE sequence 1, 62.2%; TSE sequence 2, 65.5%. There was no statistically significant difference in the frequency of artifacts between the GRASE and TSE sequences (p = 0.845; NS). CONCLUSION: T2-weighted GRASE and TSE sequences with respiratory triggering and fat suppression, regardless of the coil used, frequently show artifacts, but with ...

  20. Testing hypotheses for differences between linear regression lines

    Science.gov (United States)

    Stanley J. Zarnoch

    2009-01-01

    Five hypotheses are identified for testing differences between simple linear regression lines. The distinctions between these hypotheses are based on a priori assumptions and illustrated with full and reduced models. The contrast approach is presented as an easy and complete method for testing for overall differences between the regressions and for making pairwise...
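
    The full-versus-reduced-model comparison mentioned above can be sketched for one case. This example assumes the hypothesis of interest is that two groups share a single regression line (equal intercepts and equal slopes): the full model fits a separate line per group, the reduced model fits one pooled line, and an F statistic compares their residual sums of squares. The function names and data are hypothetical, and the other hypotheses and the contrast approach are not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, sse


def coincidence_f_statistic(xa, ya, xb, yb):
    """F statistic for H0: both groups follow one common line.

    Full model: separate intercept and slope per group (4 parameters).
    Reduced model: one pooled line (2 parameters).
    """
    sse_full = fit_line(xa, ya)[2] + fit_line(xb, yb)[2]
    sse_reduced = fit_line(xa + xb, ya + yb)[2]
    df_num = 2                       # parameters dropped by the reduced model
    df_den = len(xa) + len(xb) - 4   # residual degrees of freedom, full model
    f_stat = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_stat, df_num, df_den


# Hypothetical data: the two groups clearly follow different lines.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_group_a = [2.1, 3.9, 6.2, 7.8, 10.1]
y_group_b = [1.0, 1.6, 1.9, 2.4, 3.1]

f_stat, df_num, df_den = coincidence_f_statistic(x, y_group_a, x, y_group_b)
print(f_stat, df_num, df_den)
```

    The statistic would be compared with an F distribution on (df_num, df_den) degrees of freedom; a large value suggests the two lines are not the same.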

  1. Characterization of onion genotypes using RAPD molecular markers

    Directory of Open Access Journals (Sweden)

    Gerson Henrique Wamser

    Genetic divergence was evaluated among fifteen onion genotypes grown in Santa Catarina using RAPD molecular markers. Eleven oligonucleotide primers from the Operon Technologies series were used and produced 35 markers, of which 28 were polymorphic. The amplification products were visualized on a 1.4% agarose gel stained with ethidium bromide. A similarity matrix based on the Jaccard coefficient was built from the molecular data. A dendrogram was generated with the UPGMA clustering method to better visualize genetic similarity. Three groups were formed using a similarity coefficient of 0.6 as the cutoff. The first group comprised the genotypes Super Superprecoce and Gauchinha. The second group comprised twelve genotypes. Within this group, the genotypes Bella Vista and Bella Dura had the highest similarity coefficient, around 0.89. Bela Vista and Superprecoce, and Catarina and the hybrid Bella Vista, had a similarity coefficient of 0.88 between the pairs. The third group contained only the genotype Crioula Roxa, which had the lowest similarity coefficient (0.31). In view of these results, crosses between genotypes of the first and second groups, and of these with the genotype Crioula Roxa, may be preferable because they show greater divergence from one another. The RAPD technique proved effective for the molecular characterization of the onion genotypes, showing that there is variability among the genotypes studied.
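
    As a small illustration of the similarity step, the sketch below computes Jaccard coefficients from presence/absence (1/0) marker profiles and lists the pairs that reach a cutoff. The genotype labels and band patterns are made up for the example and are not the study's data; the UPGMA clustering step is not shown.

```python
from itertools import combinations


def jaccard(profile_a, profile_b):
    """Jaccard coefficient: shared bands / bands present in either profile."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


def pairwise_similarity(profiles):
    """Return {(name1, name2): jaccard} for every unordered pair."""
    return {
        (n1, n2): jaccard(profiles[n1], profiles[n2])
        for n1, n2 in combinations(sorted(profiles), 2)
    }


# Hypothetical RAPD band patterns (1 = band present, 0 = band absent).
profiles = {
    "G1": [1, 1, 0, 1, 0],
    "G2": [1, 0, 0, 1, 1],
    "G3": [0, 0, 1, 0, 1],
}

similarity = pairwise_similarity(profiles)
cutoff = 0.5  # illustrative cutoff for this toy data
close_pairs = [pair for pair, s in similarity.items() if s >= cutoff]
print(similarity, close_pairs)
```

    A dendrogram would then be built from the corresponding distances (1 minus similarity) with an average-linkage method such as UPGMA.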

  2. Relating genes to function: identifying enriched transcription factors using the ENCODE ChIP-Seq significance tool.

    Science.gov (United States)

    Auerbach, Raymond K; Chen, Bin; Butte, Atul J

    2013-08-01

    Biological analysis has shifted from identifying genes and transcripts to mapping these genes and transcripts to biological functions. The ENCODE Project has generated hundreds of ChIP-Seq experiments spanning multiple transcription factors and cell lines for public use, but tools for a biomedical scientist to analyze these data are either non-existent or tailored to narrow biological questions. We present the ENCODE ChIP-Seq Significance Tool, a flexible web application leveraging public ENCODE data to identify enriched transcription factors in a gene or transcript list for comparative analyses. The ENCODE ChIP-Seq Significance Tool is written in JavaScript on the client side and has been tested on Google Chrome, Apple Safari and Mozilla Firefox browsers. Server-side scripts are written in PHP and leverage R and a MySQL database. The tool is available at http://encodeqt.stanford.edu. Contact: abutte@stanford.edu. Supplementary material is available at Bioinformatics online.

  3. Overview of materials R and D for fusion and Gen-4

    Energy Technology Data Exchange (ETDEWEB)

    Kohyama, A. [Kyoto Univ., Institute of Advanced Energy (Japan); Tavassoli, F.; Carre, F.; Billot, P. [CEA Saclay, 91 - Gif sur Yvette (France); Zinide, S. [Oak Ridge National Laboratory, Materials Science and Technology Div., AK TN (United States)

    2007-07-01

    In view of the growing need for energy, the risk of exhaustion of fossil fuels and the problem of global warming, nuclear energy is receiving added attention as a realistic and viable advanced solution. International collaborations on Generation IV (Gen-IV) fission reactors and on ITER and DEMO fusion reactors are developing. This is particularly the case in the sector of materials, which hold the key to the success of these systems. The international community has recognized and planned its materials R and D work for Fusion and Gen-IV reactors with the following considerations: 1- The time allotted to materials R and D is short and may not allow development of totally new materials. 2- The activities required to cover existing materials variations and service conditions necessary for reactor design are very time consuming. 3- The work to be done must build upon the existing knowledge of materials and avoid duplications. Although ITER for fusion and the Generation IV International Forum (GIF) for Gen-IV are important international collaborative programs, they are insufficient to meet all the national energy policies of the participating countries. This paper provides an overview of the materials R and D carried out for fusion and Gen-IV reactors at international and national levels. Materials programs discussed include both cross-cutting and reactor specific actions, where major tasks can be defined as: + Cross-cutting materials tasks: - materials for high temperature service; - materials with neutron damage tolerance; - materials behavior analysis and modeling; - high temperature design methodology. + Reactor specific materials tasks: - very high temperature alloys; - carbon, high temperature ceramics and their composites; - materials compatibilities. Starting with a brief introduction of materials R and D strategies, ITER and Broader Approach (BA), overall activities for fusion and GIF for Gen-IV will be reviewed. Domestic

  4. Gridded precipitation dataset for the Rhine basin made with the genRE interpolation method

    NARCIS (Netherlands)

    Osnabrugge, van B.; Uijlenhoet, R.

    2017-01-01

    A high-resolution (1.2 x 1.2 km) gridded precipitation dataset with an hourly time step that covers the whole Rhine basin for the period 1997-2015, made from gauge data with the genRE interpolation scheme. See "genRE: A method to extend gridded precipitation climatology datasets in near real-time for

  5. GenSVM: a generalized multiclass support vector machine

    NARCIS (Netherlands)

    G.J.J. van den Burg (Gertjan); P.J.F. Groenen (Patrick)

    2016-01-01

    Traditional extensions of the binary support vector machine (SVM) to multiclass problems are either heuristics or require solving a large dual optimization problem. Here, a generalized multiclass SVM is proposed called GenSVM. In this method classification boundaries for a K-class

  6. The RNASeq-er API-a gateway to systematically updated analysis of public RNA-seq data.

    Science.gov (United States)

    Petryszak, Robert; Fonseca, Nuno A; Füllgrabe, Anja; Huerta, Laura; Keays, Maria; Tang, Y Amy; Brazma, Alvis

    2017-07-15

    The exponential growth of publicly available RNA-sequencing (RNA-Seq) data poses an increasing challenge to researchers wishing to discover, analyse and store such data, particularly those based in institutions with limited computational resources. EMBL-EBI is in an ideal position to address these challenges and to allow the scientific community easy access to not just raw, but also processed RNA-Seq data. We present a Web service to access the results of a systematically and continually updated standardized alignment as well as gene and exon expression quantification of all public bulk (and in the near future also single-cell) RNA-Seq runs in 264 species in European Nucleotide Archive, using Representational State Transfer. The RNASeq-er API (Application Programming Interface) enables ontology-powered search for and retrieval of CRAM, bigwig and bedGraph files, gene and exon expression quantification matrices (Fragments Per Kilobase Of Exon Per Million Fragments Mapped, Transcripts Per Million, raw counts) as well as sample attributes annotated with ontology terms. To date over 270 00 RNA-Seq runs in nearly 10 000 studies (1PB of raw FASTQ data) in 264 species in ENA have been processed and made available via the API. The RNASeq-er API can be accessed at http://www.ebi.ac.uk/fg/rnaseq/api . The commands used to analyse the data are available in supplementary materials and at https://github.com/nunofonseca/irap/wiki/iRAP-single-library . Contact: rnaseq@ebi.ac.uk; rpetry@ebi.ac.uk. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
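
    The quantification units named in this abstract (raw counts, FPKM, TPM) are related by simple normalisations. The following is a minimal sketch of the standard textbook definitions, not the pipeline behind the RNASeq-er API; the gene names, counts and lengths are hypothetical, and a single feature length per gene is assumed.

```python
from typing import Dict


def fpkm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Fragments Per Kilobase of exon per Million fragments mapped.

    FPKM_g = count_g * 1e9 / (length_g * total_fragments)
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    return {g: counts[g] * 1e9 / (lengths_bp[g] * total) for g in counts}


def tpm(counts: Dict[str, float], lengths_bp: Dict[str, float]) -> Dict[str, float]:
    """Transcripts Per Million.

    Length-normalise each count first, then scale so the values sum to 1e6.
    """
    rates = {g: counts[g] / lengths_bp[g] for g in counts}
    denom = sum(rates.values())
    if denom == 0:
        raise ValueError("no mapped fragments")
    return {g: r * 1e6 / denom for g, r in rates.items()}


# Hypothetical toy data: raw fragment counts and feature lengths in base pairs.
example_counts = {"geneA": 100, "geneB": 300, "geneC": 600}
example_lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}

if __name__ == "__main__":
    print(fpkm(example_counts, example_lengths))
    print(tpm(example_counts, example_lengths))
```

    Unlike FPKM, TPM values always sum to one million within a sample, which makes proportions easier to compare across samples.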

  7. Genetic analysis of the Peruvian yellow fever virus

    Directory of Open Access Journals (Sweden)

    Carlos Yábar V

    2002-01-01

    Objective: To determine the genetic variants of Peruvian isolates of the yellow fever (YF) virus. Materials and methods: The carboxy-terminal region of the envelope (E) gene of five YF isolates obtained from patients from Ayacucho 1978 (PER1), Junín 1995 (PER2), Cerro de Pasco (PER3), Cusco (1998) and San Martín (1999) was amplified by PCR, sequenced, and analyzed with DNA software programs. Results: The nucleotide sequence similarity index among the five isolates ranged from 94.3% to 99.3%, while the amino acid sequence similarity ranged from 97.6% to 99.7%. Phylogenetic analysis showed a genetic distance between 0.40 and 6.50 based on the nucleotide sequence, and a range of 0.30 to 4.29 based on the amino acid sequence. However, the sequences corresponding to the glycosylation sites and the humoral recognition epitopes were conserved among the five isolates, with the exception of some reference isolates reported by other authors. Conclusions: Peruvian YF viruses form a phylogenetic group distinct from other South American YF viruses, based on genetic analysis of the E gene.
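
    As an illustration of what a pairwise sequence similarity index can mean, the sketch below computes percent identity between two already-aligned sequences. It is an assumption for illustration only: the abstract does not state which software or similarity measure was used, and the sequences shown are made up.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Both inputs must come from the same alignment, so they have equal length.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip alignment columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites (uncorrected distance)."""
    return 1.0 - percent_identity(a, b) / 100.0


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real yellow fever virus sequences.
    print(percent_identity("ATGCATGC", "ATGCATGA"))
    print(p_distance("ATGCATGC", "ATGCATGA"))
```

    The same function applies to aligned amino acid sequences, since it only compares characters column by column.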

  8. Next-Gen3: Sequencing, Modeling, and Advanced Biofuels - Final Technical Report

    Energy Technology Data Exchange (ETDEWEB)

    Zengler, Karsten [Univ. of California, San Diego, CA (United States). Dept. of Pediatrics]; Palsson, Bernhard [Univ. of California, San Diego, CA (United States). Dept. of Bioengineering]; Lewis, Nathan [Univ. of California, San Diego, CA (United States). Dept. of Pediatrics]

    2017-12-27

    Successful, scalable implementation of biofuels is dependent on the efficient and near complete utilization of diverse biomass sources. One approach is to utilize the large recalcitrant biomass fraction (or any organic waste stream) through the thermochemical conversion of organic compounds to syngas, a mixture of carbon monoxide (CO), carbon dioxide (CO2), and hydrogen (H2), which can subsequently be metabolized by acetogenic microorganisms to produce next-gen biofuels. The goal of this proposal was to advance the development of the acetogen Clostridium ljungdahlii as a chassis organism for next-gen biofuel production from cheap, renewable sources and to detail the interconnectivity of metabolism, energy conservation, and regulation of acetogens using next-gen sequencing and next-gen modeling. To achieve this goal we determined how carbon and energy utilization is optimized through differential translational efficiency in C. ljungdahlii. Furthermore, we reconstructed a next-generation model of all major cellular processes, such as macromolecular synthesis and transcriptional regulation, and deployed this model to predict proteome allocation, overflow metabolism, and metal requirements in this model acetogen. In addition we explored the evolutionary significance of tRNA operon structure using the next-gen model and determined the optimal operon structure for bioproduction. Our study substantially enhanced the knowledge base for chemolithoautotrophs and their potential for advanced biofuel production. It provides next-generation modeling capability, offers innovative tools for genome-scale engineering, and provides novel methods to utilize next-generation models for the design of tunable systems that produce commodity chemicals from inexpensive sources.

  9. External conference - Université de Genève: Numerical modeling of climate extremes: Projections for Europe and Switzerland through 2100 - French version only

    CERN Document Server

    2006-01-01

    Université de Genève Ecole de physique 24 quai Ernest Ansermet 1211 Genève 4 Tel: + 41 22 379 63 83 (secretariat) Tel: + 41 22 379 62 56 (reception) Fax: + 41 22 379 69 22 Monday 15 January 2007, 5 p.m. - Auditoire Stueckelberg Numerical modeling of climate extremes: Projections for Europe and Switzerland through 2100 Prof. Martin Beniston / Chair of Climatology, Université de Genève The many climate-related disasters (the 2003 heat wave in Europe; the 2005 floods in Switzerland; drought in Australia; hurricanes such as Katrina, etc.) give the impression that the climate disasters affecting many parts of the world are proof of global warming. That remains to be seen... Nevertheless, climate change is one of the major concerns of the early 21st century, at least for scientists if not for the political world. For if the magnitude, and above all the speed, of the change are as great as suggested...

  10. Genome-scale data suggest reclassifications in the Leisingera-Phaeobacter cluster including proposals for Sedimentitalea gen. nov. and Pseudophaeobacter gen. nov.

    Directory of Open Access Journals (Sweden)

    Sven Breider

    2014-08-01

    Earlier phylogenetic analyses of the marine Rhodobacteraceae (class Alphaproteobacteria) genera Leisingera and Phaeobacter indicated that neither genus might be monophyletic. We here used phylogenetic reconstruction from genome-scale data, MALDI-TOF mass-spectrometry analysis and a re-assessment of the phenotypic data from the literature to settle this matter, aiming at a reclassification of the two genera. Neither Phaeobacter nor Leisingera formed a clade in any of the phylogenetic analyses conducted. Rather, smaller monophyletic assemblages emerged, which were phenotypically more homogeneous, too. We thus propose the reclassification of Leisingera nanhaiensis as the type species of a new genus as Sedimentitalea nanhaiensis gen. nov., comb. nov., the reclassification of Phaeobacter arcticus and Phaeobacter leonis as Pseudophaeobacter arcticus gen. nov., comb. nov. and Pseudophaeobacter leonis comb. nov., and the reclassification of Phaeobacter aquaemixtae, Phaeobacter caeruleus and Phaeobacter daeponensis as Leisingera aquaemixtae comb. nov., Leisingera caerulea comb. nov. and Leisingera daeponensis comb. nov. The genera Phaeobacter and Leisingera are accordingly emended.

  11. Key Factors for the Linkage Strategy between R and D and Commercialization for Gen-IV

    International Nuclear Information System (INIS)

    Lee, Kyoungmi; Hong, Jung Suk

    2013-01-01

    The Fukushima nuclear disaster has led to efforts to enhance the safety and cost-effectiveness of future technology, so that advanced countries such as the United States and France have become interested in a next generation nuclear power plant, Gen-IV (Generation-IV Reactor). Considering the various characteristics of nuclear R and D, more elaborate strategies are needed for the effective development of the next generation of nuclear technology. In this study, we suggest 5 key factors for the successful commercialization of Gen-IV by analyzing the distinct characteristics of nuclear R and D for Gen-IV and the CSFs (Critical Success Factors) of several cases in this field, and by conducting an FGI (Focus Group Interview). From these results, we identify some important points for further Gen-IV strategy. That is, the following five key factors for improving the linkage between R and D and commercialization of Gen-IV should be considered: the participation of nuclear power plant operators from the beginning, the establishment of a consistent and comprehensive plan/roadmap/detailed strategy, technology development based on global energy issues and international cooperation, stable and clear funding plans for long-term projects, and the cooperation of the relevant ministries. The Gen-IV system is getting a positive response in that it is accompanied by long-term R and D plans in Korea. We think that the Gen-IV standard would lead the next generation of the nuclear industry if a proper strategy for cooperation between the private sector and regulators is in place from the beginning. Moreover, we expect that this study will facilitate its development process from R and D to commercialization.

  12. Key Factors for the Linkage Strategy between R and D and Commercialization for Gen-IV

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Kyoungmi; Hong, Jung Suk [Korean Institute of S and T Evaluation and Planning, Seoul (Korea, Republic of)

    2013-05-15

    The Fukushima nuclear disaster has led to efforts to enhance the safety and cost-effectiveness of future technology, so that advanced countries such as the United States and France have become interested in a next generation nuclear power plant, Gen-IV (Generation-IV Reactor). Considering the various characteristics of nuclear R and D, more elaborate strategies are needed for the effective development of the next generation of nuclear technology. In this study, we suggest 5 key factors for the successful commercialization of Gen-IV by analyzing the distinct characteristics of nuclear R and D for Gen-IV and the CSFs (Critical Success Factors) of several cases in this field, and by conducting an FGI (Focus Group Interview). From these results, we identify some important points for further Gen-IV strategy. That is, the following five key factors for improving the linkage between R and D and commercialization of Gen-IV should be considered: the participation of nuclear power plant operators from the beginning, the establishment of a consistent and comprehensive plan/roadmap/detailed strategy, technology development based on global energy issues and international cooperation, stable and clear funding plans for long-term projects, and the cooperation of the relevant ministries. The Gen-IV system is getting a positive response in that it is accompanied by long-term R and D plans in Korea. We think that the Gen-IV standard would lead the next generation of the nuclear industry if a proper strategy for cooperation between the private sector and regulators is in place from the beginning. Moreover, we expect that this study will facilitate its development process from R and D to commercialization.

  13. Targeted NextGen Capabilities for 2025

    Science.gov (United States)

    2011-11-01

    increased arrival capacity to single runways by reducing longitudinal wake separation standards for Instrument Flight Rules (IFR) operations under certain... The examples cited are not intended to cover every aircraft and every flight. In some instances, the available capabilities for 2025 will not be

  14. PowerGen plc report and accounts 1995

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-12-31

    Detailed financial results are presented for the United Kingdom power generation company PowerGen for the year ended 2 April 1995. A review is given of operating and financial performance. Significant reductions in operating costs and improvements in productivity have been achieved. Diversity of fuels and plant portfolio has been enhanced by building 3000 MW of gas fired CCGT plant. Investment in coal import facilities has increased access to international coal markets. Environmental performance has improved with further reductions in SO2, NOx and CO2 emissions. Overseas projects include construction of a 990 MW CCGT power station in Portugal and a contract to build the 1200 MW Paiton 2 coal-fired power station in Indonesia. Investment in lignite mining and power generation assets of MIBRAG in Germany is contributing to profits. PowerGen's five year contract for coal supply was assigned to RJB Mining (UK) Ltd. during the year. Stocks of coal fell during the year and further reductions are expected during 1995/96.

  15. An ultra-short screening version of the Recalled Parental Rearing Behavior questionnaire (FEE-US) and its factor structure in a representative German sample

    Directory of Open Access Journals (Sweden)

    Petrowski Katja

    2012-11-01

    Background: The Recalled Parental Rearing Behavior questionnaire (FEE) [1,2] assesses perceived parental rearing behavior separately for each parent. An ultra-short screening version (FEE-US) with the same three scales each for the mother and the father is reported and factor-analytically validated. Methods: N = 4,640 subjects aged 14 to 92 (M = 48.4 years) were selected by the random-route sampling method. The ultra-short questionnaire version was derived from the long version through item and factor analyses. In a confirmatory factor analysis framework, the hypothesized three-factorial structure was fitted to the empirical data and tested for measurement invariance, differential item functioning, item discriminability, and convergent and discriminant factorial validity. Effects of gender or age were assessed using MANOVAs. Results: The a-priori hypothesized model resulted in mostly adequate overall fit. Neither gender nor age group yielded considerable effects on the factor structure, but had small effects on means of raw score sums. Factorial validities could be confirmed. Scale sums are well-suited to rank respondents along the respective latent dimension. Conclusion: The structure of the long version with the factors Rejection & Punishment, Emotional Warmth, and Control & Overprotection could be replicated for both father and mother items in the ultra-short screening version using confirmatory factor analyses. These results indicate that the ultra-short screening version is a time-saving and promising screening instrument for research settings and in individual counseling. However, the shortened scales do not necessarily represent the full spectrum covered by the full-scale dimensions.

  16. GenExp: an interactive web-based genomic DAS client with client-side data rendering.

    Directory of Open Access Journals (Sweden)

    Bernat Gel Moreno

    BACKGROUND: The Distributed Annotation System (DAS) offers a standard protocol for sharing and integrating annotations on biological sequences. There are more than 1000 DAS sources available and the number is steadily increasing. Clients are an essential part of the DAS system and integrate data from several independent sources in order to create a useful representation to the user. While web-based DAS clients exist, most of them do not have direct interaction capabilities such as dragging and zooming with the mouse. RESULTS: Here we present GenExp, a web-based and fully interactive visual DAS client. GenExp is a genome oriented DAS client capable of creating informative representations of genomic data zooming out from base level to complete chromosomes. It proposes a novel approach to genomic data rendering and uses the latest HTML5 web technologies to create the data representation inside the client browser. Thanks to client-side rendering most position changes do not need a network request to the server and so responses to zooming and panning are almost immediate. In GenExp it is possible to explore the genome intuitively moving it with the mouse just like geographical map applications. Additionally, in GenExp it is possible to have more than one data viewer at the same time and to save the current state of the application to revisit it later on. CONCLUSIONS: GenExp is a new interactive web-based client for DAS and addresses some of the shortcomings of the existing clients. It uses client-side data rendering techniques resulting in easier genome browsing and exploration. GenExp is open source under the GPL license and it is freely available at http://gralggen.lsi.upc.edu/recerca/genexp.

  17. A comparative study of techniques for differential expression analysis on RNA-Seq data.

    Directory of Open Access Journals (Sweden)

    Zong Hong Zhang

    Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes between groups. Many software packages have been developed for the identification of differentially expressed genes (DEGs) between treatment groups based on RNA-Seq data. However, there is a lack of consensus on how to approach an optimal study design and choice of suitable software for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. A number of important parameters of RNA-Seq technology were taken into consideration, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results relative to sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in terms of the ability to uncover true positives. Overall, DESeq or taking the intersection of DEGs from two or more tools is recommended if the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis at the expense of potentially introducing more false positives.
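
    The benchmarking idea in this abstract (score each tool's DEG calls against a reference set, and optionally keep only genes called by two or more tools) can be sketched with plain set operations. This is an illustrative sketch, not the authors' evaluation code; the gene identifiers and call sets below are hypothetical and do not reproduce the study's results.

```python
from collections import Counter
from typing import Dict, Set


def benchmark(called: Set[str], truth: Set[str]) -> Dict[str, float]:
    """Compare one tool's DEG calls with a reference set (e.g. qRT-PCR-validated genes)."""
    tp = len(called & truth)   # true positives: called and in the reference
    fp = len(called - truth)   # false positives: called but not in the reference
    fn = len(truth - called)   # false negatives: in the reference but missed
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def consensus(calls: Dict[str, Set[str]], min_tools: int = 2) -> Set[str]:
    """Genes called as DEGs by at least `min_tools` of the tools."""
    votes = Counter(gene for genes in calls.values() for gene in genes)
    return {gene for gene, n in votes.items() if n >= min_tools}


# Hypothetical reference set and per-tool calls, for illustration only.
truth_set = {"g1", "g2", "g3", "g4"}
tool_calls = {
    "edgeR": {"g1", "g2", "g3", "g5", "g6"},
    "DESeq": {"g1", "g2", "g5"},
    "Cuffdiff2": {"g1", "g3", "g7"},
}

if __name__ == "__main__":
    for tool, genes in tool_calls.items():
        print(tool, benchmark(genes, truth_set))
    print("consensus", benchmark(consensus(tool_calls), truth_set))
```

    Raising `min_tools` trades sensitivity for precision, which mirrors the abstract's recommendation to intersect tools when false positives are the main concern.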

  18. Human-Automation Cooperation for Separation Assurance in Future NextGen Environments

    Science.gov (United States)

    Mercer, Joey; Homola, Jeffrey; Cabrall, Christopher; Martin, Lynne; Morey, Susan; Gomez, Ashley; Prevot, Thomas

    2014-01-01

    A 2012 Human-In-The-Loop air traffic control simulation investigated a gradual paradigm-shift in the allocation of functions between operators and automation. Air traffic controllers staffed five adjacent high-altitude en route sectors, and during the course of a two-week experiment, worked traffic under different function-allocation approaches aligned with four increasingly mature NextGen operational environments. These NextGen time-frames ranged from near current-day operations to nearly fully-automated control, in which the ground systems automation was responsible for detecting conflicts, issuing strategic and tactical resolutions, and alerting the controller to exceptional circumstances. Results indicate that overall performance was best in the most automated NextGen environment. Safe operations were achieved in this environment for twice today's peak airspace capacity, while being rated by the controllers as highly acceptable. However, results show that sector operations were not always safe; separation violations did in fact occur. This paper will describe in detail the simulation conducted, as well as discuss important results and their implications.

  19. Description of Sharon gen. nov. for the Chilean species Asaphes amoenus Philippi, 1861 (Coleoptera: Elateridae)

    Directory of Open Access Journals (Sweden)

    Elizabeth T. Arias-Bohart

    2015-10-01

    Sharon gen. nov. is here described to include Asaphes? amoenus Philippi, 1861 comb. nov. from Chile. A redescription of the species is based on the female holotype and material from different geographic locations. Candèze (1891) placed Asaphes amoenus and Parasaphes elegans in the suprageneric group Asaphites. We discuss differences between Sharon gen. nov. and Hemicrepidius Germar, 1839, where Asaphes amoenus was later placed by Blackwelder (1944). Based on morphological characters, Sharon gen. nov. appears to be related to Parasaphes Candèze, 1881, Wynarka Calder, 1986, and Tasmanelater Calder, 1996, all from Australia, suggesting Gondwanan relationships.

  20. Activity of the renin-angiotensin system in relation to its genetic polymorphisms

    OpenAIRE

    Morcillo Hidalgo, Luis

    2015-01-01

    This study of young, healthy, non-hypertensive subjects has two main objectives. The first is to analyze the relationship between polymorphisms of the renin-angiotensin system genes (M235T of the angiotensinogen gene, the insertion/deletion polymorphism of the ACE gene, and A1166C of the AT1 angiotensin II receptor gene) and plasma levels of angiotensin I, angiotensin II and angiotensin-(1-7), all active peptide substances of the system E...