multiple gene sets: Topics by WorldWideScience.org

Sample records for multiple gene sets

Three gene expression vector sets for concurrently expressing multiple genes in Saccharomyces cerevisiae.

Science.gov (United States)

Ishii, Jun; Kondo, Takashi; Makino, Harumi; Ogura, Akira; Matsuda, Fumio; Kondo, Akihiko

2014-05-01

Yeast has the potential to be used in bulk-scale fermentative production of fuels and chemicals due to its tolerance for low pH and robustness for autolysis. However, expression of multiple external genes in one host yeast strain is considerably labor-intensive due to the lack of polycistronic transcription. To promote the metabolic engineering of yeast, we generated systematic and convenient genetic engineering tools to express multiple genes in Saccharomyces cerevisiae. We constructed a series of multi-copy and integration vector sets for concurrently expressing two or three genes in S. cerevisiae by embedding three classical promoters. The comparative expression capabilities of the constructed vectors were monitored with green fluorescent protein, and the concurrent expression of genes was monitored with three different fluorescent proteins. Our multiple gene expression tool will be helpful to the advanced construction of genetically engineered yeast strains in a variety of research fields other than metabolic engineering. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

Science.gov (United States)

Khan, Aziz; Mathelier, Anthony

2017-05-31

A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
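The pairwise comparison of multiple sets described in this abstract can be illustrated with a minimal Python sketch. It does not use Intervene's own code or command-line interface. The function name `pairwise_jaccard` and the example gene lists are hypothetical, and the Jaccard index is only one of several possible overlap measures.

```python
from itertools import combinations


def pairwise_jaccard(named_sets):
    """Return {(name_a, name_b): Jaccard index} for every pair of sets."""
    result = {}
    for (name_a, set_a), (name_b, set_b) in combinations(named_sets.items(), 2):
        union = set_a | set_b
        # Define the overlap of two empty sets as 0.0 to avoid dividing by zero.
        result[(name_a, name_b)] = len(set_a & set_b) / len(union) if union else 0.0
    return result


# Hypothetical gene lists, for illustration only.
gene_lists = {
    "exp1": {"TP53", "BRCA1", "MYC"},
    "exp2": {"TP53", "MYC", "EGFR"},
    "exp3": {"KRAS"},
}
overlaps = pairwise_jaccard(gene_lists)
```

A matrix of such pairwise values is the kind of input that could be drawn as a clustered heat map.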
APPRIS 2017: principal isoforms for multiple gene sets

Science.gov (United States)

Rodriguez-Rivas, Juan; Di Domenico, Tomás; Vázquez, Jesús; Valencia, Alfonso

2018-01-01

The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the ‘principal’ isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein coding genes and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants. PMID:29069475
MAGMA: generalized gene-set analysis of GWAS data.

Science.gov (United States)

de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

2015-04-01

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
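The idea of a gene-set analysis built on a regression structure can be sketched in a deliberately simplified form. The Python example below is not MAGMA's implementation: it regresses hypothetical gene-level Z-scores on a 0/1 set-membership indicator with ordinary least squares and ignores everything else a real analysis would model (for example, correlation between genes). The function name `competitive_slope` and the data are assumptions made for illustration.

```python
def competitive_slope(gene_z, gene_set):
    """OLS slope of gene-level Z-scores on a 0/1 set-membership indicator."""
    genes = sorted(gene_z)
    if not genes:
        raise ValueError("no genes supplied")
    x = [1.0 if g in gene_set else 0.0 for g in genes]
    y = [gene_z[g] for g in genes]
    n = len(genes)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("gene set must contain some, but not all, of the genes")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical gene-level Z-scores, for illustration only.
z_scores = {"g1": 2.0, "g2": 3.0, "g3": 0.0, "g4": 1.0}
slope = competitive_slope(z_scores, {"g1", "g2"})
```

With a single binary predictor the slope equals the difference between the mean Z-score of genes inside the set and the mean of genes outside it. Replacing the indicator with a continuous gene property would give the corresponding regression on that property.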
Discovery of cancer common and specific driver gene sets

Science.gov (United States)

2017-01-01

Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295
GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

Directory of Open Access Journals (Sweden)

Chris Cheadle

2007-01-01

Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. The assignment of functional information to these complex patterns remains a challenging task in effectively interpreting data and correlating results from across experiments, projects and laboratories. Methods which allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to data mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA) as a useful method for the rapid testing of group-wise up- or downregulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets related by known biological function or as designated solely by the end-user against large numbers of datasets simultaneously. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.
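Testing several gene sets against several datasets yields a matrix with one row per gene set and one column per dataset. The Python sketch below shows that layout under stated assumptions: the abstract does not specify the test statistic, so a simple z-score of the gene-set mean against the whole distribution of changes is used as a stand-in, and the names `gene_set_z` and `score_matrix` and all data are hypothetical.

```python
import math
import statistics


def gene_set_z(changes, gene_set):
    """Z-score of a gene set's mean change against all genes in one dataset."""
    all_values = list(changes.values())
    members = [changes[g] for g in gene_set if g in changes]
    if not members:
        return float("nan")  # no gene of the set was measured in this dataset
    mu = statistics.fmean(all_values)
    sigma = statistics.pstdev(all_values)
    if sigma == 0:
        return 0.0
    return (statistics.fmean(members) - mu) * math.sqrt(len(members)) / sigma


def score_matrix(datasets, gene_sets):
    """Nested dict: rows are gene sets, columns are datasets."""
    return {
        set_name: {ds: gene_set_z(changes, genes) for ds, changes in datasets.items()}
        for set_name, genes in gene_sets.items()
    }


# Hypothetical expression changes (e.g. log ratios), for illustration only.
datasets = {"d1": {"A": 1.0, "B": 1.0, "C": -1.0, "D": -1.0}}
gene_sets = {"up": {"A", "B"}, "down": {"C", "D"}}
matrix = score_matrix(datasets, gene_sets)
```

Positive entries indicate group-wise upregulation of a set in a dataset and negative entries indicate downregulation.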
Time-Course Gene Set Analysis for Longitudinal Gene Expression Data.

Directory of Open Access Journals (Sweden)

Boris P Hejblum

2015-06-01

Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates if the time patterns of gene sets significantly differ according to these conditions. The interest of the method is illustrated by its application to two real life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets finally found to be linked with the influenza vaccine too, although they were found to be associated with the pneumococcal vaccine only in previous analyses. In our simulation study TcGSA exhibits good statistical properties, and an increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available for the community through an R package.
Gene set analysis of purine and pyrimidine antimetabolites cancer therapies.

Science.gov (United States)

Fridley, Brooke L; Batzler, Anthony; Li, Liang; Li, Fang; Matimba, Alice; Jenkins, Gregory D; Ji, Yuan; Wang, Liewei; Weinshilboum, Richard M

2011-11-01

Responses to therapies, either with regard to toxicities or efficacy, are expected to involve complex relationships of gene products within the same molecular pathway or functional gene set. Therefore, pathways or gene sets, as opposed to single genes, may better reflect the true underlying biology and may be more appropriate units for analysis of pharmacogenomic studies. Application of such methods to pharmacogenomic studies may enable the detection of more subtle effects of multiple genes in the same pathway that may be missed by assessing each gene individually. A gene set analysis of 3821 gene sets is presented assessing the association between basal messenger RNA expression and drug cytotoxicity using ethnically defined human lymphoblastoid cell lines for two classes of drugs: pyrimidines [gemcitabine (dFdC) and arabinoside] and purines [6-thioguanine and 6-mercaptopurine]. The gene set nucleoside-diphosphatase activity was found to be significantly associated with both dFdC and arabinoside, whereas gene set γ-aminobutyric acid catabolic process was associated with dFdC and 6-thioguanine. These gene sets were significantly associated with the phenotype even after adjusting for multiple testing. In addition, five associated gene sets were found in common between the pyrimidines and two gene sets for the purines (3',5'-cyclic-AMP phosphodiesterase activity and γ-aminobutyric acid catabolic process) with a P value of less than 0.0001. Functional validation was attempted with four genes each in gene sets for thiopurine and pyrimidine antimetabolites. All four genes selected from the pyrimidine gene sets (PSME3, CANT1, ENTPD6, ADRM1) were validated, but only one (PDE4D) was validated for the thiopurine gene sets. In summary, results from the gene set analysis of pyrimidine and purine therapies, used often in the treatment of various cancers, provide novel insight into the relationship between genomic variation and drug response.
Gene set analysis using variance component tests.

Science.gov (United States)

Huang, Yen-Tsung; Lin, Xihong

2013-06-28

Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

Science.gov (United States)

Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

2017-05-01

Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.
Novel gene sets improve set-level classification of prokaryotic gene expression data.

Science.gov (United States)

Holec, Matěj; Kuželka, Ondřej; Železný, Filip

2015-10-28

Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable the learning of more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
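The conversion from gene-level to set-level feature vectors can be sketched as follows. This Python example is a minimal illustration under an assumption: each set-level feature is the mean expression of the set's measured genes, which is one common aggregation and not necessarily the one used by the authors. The function name `to_set_level` and the sample data are hypothetical.

```python
from statistics import fmean


def to_set_level(sample, gene_sets):
    """Map a {gene: expression} sample to a {set_name: mean expression} vector."""
    features = {}
    for set_name, genes in gene_sets.items():
        values = [sample[g] for g in genes if g in sample]
        # Sets with no measured genes get 0.0 so every sample has the same features.
        features[set_name] = fmean(values) if values else 0.0
    return features


# Hypothetical sample and gene sets, for illustration only.
sample = {"g1": 2.0, "g2": 4.0, "g3": 6.0}
gene_sets = {"regulon_A": ["g1", "g2"], "regulon_B": ["g3", "g4"]}
set_features = to_set_level(sample, gene_sets)
```

The resulting vectors have one entry per gene set and can be passed to any standard classifier in place of the gene-level vectors.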
The Molecular Signatures Database (MSigDB) hallmark gene set collection.

Science.gov (United States)

Liberzon, Arthur; Birger, Chet; Thorvaldsdóttir, Helga; Ghandi, Mahmoud; Mesirov, Jill P; Tamayo, Pablo

2015-12-23

The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.
Mining gene expression data of multiple sclerosis.

Directory of Open Access Journals (Sweden)

Pi Guo

Microarray produces a large amount of gene expression data, containing various biological implications. The challenge is to detect a panel of discriminative genes associated with disease. This study proposed a robust classification model for gene selection using gene expression data, and performed an analysis to identify disease-related genes using multiple sclerosis as an example. Gene expression profiles based on the transcriptome of peripheral blood mononuclear cells from a total of 44 samples from 26 multiple sclerosis patients and 18 individuals with other neurological diseases (control) were analyzed. Feature selection algorithms including Support Vector Machine based on Recursive Feature Elimination, Receiver Operating Characteristic Curve, and Boruta algorithms were jointly performed to select candidate genes associated with multiple sclerosis. Multiple classification models categorized samples into two different groups based on the identified genes. Models' performance was evaluated using cross-validation methods, and an optimal classifier for gene selection was determined. An overlapping feature set was identified consisting of 8 genes that were differentially expressed between the two phenotype groups. The genes were significantly associated with the pathways of apoptosis and cytokine-cytokine receptor interaction. TNFSF10 was significantly associated with multiple sclerosis. A Support Vector Machine model was established based on the featured genes and gave a practical accuracy of ∼86%. This binary classification model also outperformed the other models in terms of Sensitivity, Specificity and F1 score. The combined analytical framework integrating feature ranking algorithms and Support Vector Machine model could be used for selecting genes for other diseases.
Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

Science.gov (United States)

Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

2012-01-01

Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

Science.gov (United States)

Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

2015-01-01

Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.
Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

Science.gov (United States)

2018-01-01

One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets were published, these are usually in very poor agreement with each other, and very few of the genes proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in The Cancer Genome Atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes. PMID:29470520
Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

Science.gov (United States)

Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

2017-08-01

Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10⁻¹⁰), MGC57346 (p value=6.92×10⁻⁷), BLK (p value=1.01×10⁻⁶), XKR6 (p value=1.11×10⁻⁶), C17ORF69 (p value=1.12×10⁻⁶) and KIAA1267 (p value=4.00×10⁻⁶). Gene set enrichment analysis detected a significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for the genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.
Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior.

Science.gov (United States)

Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J

2016-08-01

In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. Gene-based and gene-set
Functional Multiple-Set Canonical Correlation Analysis

Science.gov (United States)

Hwang, Heungsun; Jung, Kwanghee; Takane, Yoshio; Woodward, Todd S.

2012-01-01

We propose functional multiple-set canonical correlation analysis for exploring associations among multiple sets of functions. The proposed method includes functional canonical correlation analysis as a special case when only two sets of functions are considered. As in classical multiple-set canonical correlation analysis, computationally, the…
Association of Protein Translation and Extracellular Matrix Gene Sets with Breast Cancer Metastasis: Findings Uncovered on Analysis of Multiple Publicly Available Datasets Using Individual Patient Data Approach.

Directory of Open Access Journals (Sweden)

Nilotpal Chowdhury

Full Text Available. Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single series studies, and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. The aim of the present study was to find gene sets associated with distant metastasis free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate - adjusted for expression of Cell cycle related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations. This methodology seems to yield new and

Association of Protein Translation and Extracellular Matrix Gene Sets with Breast Cancer Metastasis: Findings Uncovered on Analysis of Multiple Publicly Available Datasets Using Individual Patient Data Approach.

Science.gov (United States)

Chowdhury, Nilotpal; Sapru, Shantanu

2015-01-01

Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single series studies, and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. The aim of the present study was to find gene sets associated with distant metastasis free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate - adjusted for expression of Cell cycle related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations. This methodology seems to yield new and interesting
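The ranking-then-enrichment step described in this abstract can be illustrated with a minimal sketch. It assumes an unweighted, Kolmogorov-Smirnov-style running sum over genes ranked by a per-gene coefficient; the function names, gene labels and coefficient values are hypothetical, and this is not the authors' pipeline or the reference GSEA implementation (which also weights hits and assesses significance by permutation).

```python
def rank_by_coefficient(coefs):
    """Return gene names ordered from the largest to the smallest coefficient."""
    return sorted(coefs, key=coefs.get, reverse=True)


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (signed maximum deviation from zero).

    Walking down the ranked list, the running sum rises by 1/n_hit at each
    gene-set member and falls by 1/n_miss at each non-member.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must be a proper, non-empty subset of the ranked genes")
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical per-gene coefficients (illustrative values only).
coefs = {"A": 2.0, "B": 1.5, "C": 0.1, "D": -0.5, "E": -1.2, "F": -2.0}
ranked = rank_by_coefficient(coefs)
top_score = enrichment_score(ranked, {"A", "B"})      # set concentrated at the top
bottom_score = enrichment_score(ranked, {"E", "F"})   # set concentrated at the bottom
```

A positive score indicates that the set clusters among the highest-ranked genes and a negative score that it clusters among the lowest-ranked ones.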
A Meta-Analysis of Multiple Matched Copy Number and Transcriptomics Data Sets for Inferring Gene Regulatory Relationships

Science.gov (United States)

Newton, Richard; Wernisch, Lorenz

2014-01-01

Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention is often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acted regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments. PMID:25148247
Gene set analysis of the EADGENE chicken data-set

DEFF Research Database (Denmark)

Skarman, Axel; Jiang, Li; Hornshøj, Henrik

2009-01-01

Abstract Background: Gene set analysis is considered to be a way of improving our biological interpretation of the observed expression patterns. This paper describes different methods applied to analyse expression data from a chicken DNA microarray dataset. Results: Applying different gene set...... analyses to the chicken expression data led to different ranking of the Gene Ontology terms tested. A method for prediction of possible annotations was applied. Conclusion: Biological interpretation based on gene set analyses dependent on the statistical method used. Methods for predicting the possible...
Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder

DEFF Research Database (Denmark)

Naaijen, Jill; Bralten, Janita; Poelmans, Geert

2017-01-01

Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance...... within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analysis for ADHD symptom severity, divided into inattention and hyperactivity/impulsivity symptoms...... is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants...
Constellation Map: Downstream visualization and interpretation of gene set enrichment results [version 1; referees: 2 approved]

Directory of Open Access Journals (Sweden)

Yan Tan

2015-06-01

Full Text Available. Summary: Gene set enrichment analysis (GSEA) approaches are widely used to identify coordinately regulated genes associated with phenotypes of interest. Here, we present Constellation Map, a tool to visualize and interpret the results when enrichment analyses yield a long list of significantly enriched gene sets. Constellation Map identifies commonalities that explain the enrichment of multiple top-scoring gene sets and maps the relationships between them. Constellation Map can help investigators take full advantage of GSEA and facilitates the biological interpretation of enrichment results. Availability: Constellation Map is freely available as a GenePattern module at http://www.genepattern.org.
Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder: association to overlapping traits in ADHD and autism.

Science.gov (United States)

Naaijen, J; Bralten, J; Poelmans, G; Glennon, J C; Franke, B; Buitelaar, J K

2017-01-10

Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analyses for ADHD symptom severity (divided into inattention and hyperactivity/impulsivity symptoms), autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome-wide association levels. The GABA gene set showed nominally significant association with inhibition (P=0.04), but this did not survive correction for multiple comparisons. None of the single-gene or single-variant associations was significant on its own. By analyzing multiple genetic variants within candidate gene sets together, we were able to find genetic associations supporting the involvement of excitatory and inhibitory neurotransmitter systems in ADHD and ASD symptom severity in ADHD.
Principles for the organization of gene-sets.

Science.gov (United States)

Li, Wentian; Freudenberg, Jan; Oswald, Michaela

2015-12-01

A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology in obtaining the gene-sets or a balanced number of gene-sets under a header). Here we aim at providing a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared label connection denotes shared membership in a group. Some extensions of the framework are also addressed such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.
Integrative set enrichment testing for multiple omics platforms

Directory of Open Access Journals (Sweden)

Poisson Laila M

2011-11-01

Full Text Available. Abstract Background: Enrichment testing assesses the overall evidence of differential expression behavior of the elements within a defined set. When we have measured many molecular aspects, e.g. gene expression, metabolites, proteins, it is desirable to assess their differential tendencies jointly across platforms using an integrated set enrichment test. In this work we explore the properties of several methods for performing a combined enrichment test using gene expression and metabolomics as the motivating platforms. Results: Using two simulation models we explored the properties of several enrichment methods including two novel methods: the logistic regression 2-degree of freedom Wald test and the 2-dimensional permutation p-value for the sum-of-squared statistics test. In relation to their univariate counterparts we find that the joint tests can improve our ability to detect results that are marginal univariately. We also find that joint tests improve the ranking of associated pathways compared to their univariate counterparts. However, there is a risk of Type I error inflation with some methods and self-contained methods lose specificity when the sets are not representative of underlying association. Conclusions: In this work we show that consideration of data from multiple platforms, in conjunction with summarization via a priori pathway information, leads to increased power in detection of genomic associations with phenotypes.
Effect of the absolute statistic on gene-sampling gene-set analysis methods.

Science.gov (United States)

Nam, Dougu

2017-06-01

Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
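The mechanics of a gene-sampling test with and without the absolute gene statistic can be sketched as follows. This is only an illustration under stated assumptions: the set score is taken to be the mean per-gene statistic, the null distribution comes from random gene sets of the same size, and the function name, gene labels and statistic values are all hypothetical. It does not reproduce the paper's methods or its false-positive rate comparison.

```python
import random


def gene_sampling_pvalue(stats, gene_set, n_samples=2000, use_absolute=True, seed=0):
    """One-sided gene-sampling p-value for the mean gene statistic of a set.

    stats maps gene name -> per-gene statistic (e.g. a t-statistic).
    The null is built by drawing random gene sets of the same size.
    """
    rng = random.Random(seed)
    score = abs if use_absolute else (lambda x: x)
    genes = list(stats)
    members = [g for g in genes if g in gene_set]
    k = len(members)
    if k == 0:
        raise ValueError("gene set has no genes in common with the statistics table")
    observed = sum(score(stats[g]) for g in members) / k
    exceed = 0
    for _ in range(n_samples):
        sample = rng.sample(genes, k)
        if sum(score(stats[g]) for g in sample) / k >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_samples + 1)


# Hypothetical data: 94 background genes with small statistics and a set of
# six genes changing strongly in both directions.
stats = {f"bg{i}": ((i % 7) - 3) * 0.1 for i in range(94)}
for i in range(3):
    stats[f"up{i}"] = 3.0
    stats[f"down{i}"] = -3.0
gene_set = {f"up{i}" for i in range(3)} | {f"down{i}" for i in range(3)}

p_abs = gene_sampling_pvalue(stats, gene_set, use_absolute=True)
p_signed = gene_sampling_pvalue(stats, gene_set, use_absolute=False)
```

In this toy case the signed mean of the set is zero, so only the absolute statistic registers the bidirectional change.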
GeneTopics - interpretation of gene sets via literature-driven topic models

Science.gov (United States)

2013-01-01

Background: Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, don't capture relevant biological themes or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set. Methods: Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic allowing us to eliminate non-specific topics. As a result we obtain a set of literature topics in which each topic is associated with a subset of the input genes providing directly interpretable keywords and corresponding documents for literature research. Results: We validate our method based on labelled gene sets from the KEGG metabolic pathway collection and the genetic association database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets, (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by GWA studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner. Conclusions: Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly
A search engine to identify pathway genes from expression data on multiple organisms

Directory of Open Access Journals (Sweden)

Zambon Alexander C

2007-05-01

Full Text Available. Abstract Background: The completion of several genome projects showed that most genes have not yet been characterized, especially in multicellular organisms. Although most genes have unknown functions, a large collection of data is available describing their transcriptional activities under many different experimental conditions. In many cases, the coregulation of a set of genes across a set of conditions can be used to infer roles for genes of unknown function. Results: We developed a search engine, the Multiple-Species Gene Recommender (MSGR), which scans gene expression datasets from multiple organisms to identify genes that participate in a genetic pathway. The MSGR takes a query consisting of a list of genes that function together in a genetic pathway from one of six organisms: Homo sapiens, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana, and Helicobacter pylori. Using a probabilistic method to merge searches, the MSGR identifies genes that are significantly coregulated with the query genes in one or more of those organisms. The MSGR achieves its highest accuracy for many human pathways when searches are combined across species. We describe specific examples in which new genes were identified to be involved in a neuromuscular signaling pathway and a cell-adhesion pathway. Conclusion: The search engine can scan large collections of gene expression data for new genes that are significantly coregulated with a pathway of interest. By integrating searches across organisms, the MSGR can identify pathway members whose coregulation is either ancient or newly evolved.
Integrative analysis of survival-associated gene sets in breast cancer.

Science.gov (United States)

Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao

2015-03-12

Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We inputted our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used
NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

Directory of Open Access Journals (Sweden)

Joeri Ruyssinck

Full Text Available. One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
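The rankwise averaging mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the NIMEFI implementation: it assumes each method returns an importance score for every candidate regulator-target link, ranks the links within each method (1 = most important, ties broken by input order), and averages the ranks. The method names, links and scores are hypothetical.

```python
def ranks(scores):
    """Map each candidate link to its rank within one method (1 = most important)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {link: position + 1 for position, link in enumerate(ordered)}


def rank_average(score_tables):
    """Average the per-method ranks of every link; lower means stronger consensus."""
    all_ranks = [ranks(table) for table in score_tables]
    links = all_ranks[0].keys()
    return {link: sum(r[link] for r in all_ranks) / len(all_ranks) for link in links}


# Hypothetical importance scores from two methods for three candidate links.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.8}

consensus = rank_average([method_a, method_b])
predicted_order = sorted(consensus, key=consensus.get)
```

Averaging ranks rather than raw scores sidesteps the fact that importance values from different methods are not on a common scale.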
Multiple Wheel Throwing: And Chess Sets.

Science.gov (United States)

Sapiro, Maurice

1978-01-01

A chess set project is suggested to teach multiple throwing, the creation on a potter's wheel of several pieces of similar configuration. Processes and finished sets are illustrated with photographs. (SJL)
On multiple blocking sets in Galois planes

NARCIS (Netherlands)

Blokhuis, A.; Lovász, L.; Storme, L.; Szönyi, T.

2007-01-01

This article continues the study of multiple blocking sets in PG(2, q). In [A. Blokhuis, L. Storme, T. Szonyi, Lacunary polynomials, multiple blocking sets and Baer subplanes. J. London Math. Soc. (2) 60 (1999), 321–332. MR1724814 (2000j:05025) Zbl 0940.51007], using lacunary polynomials, it was
Evaluation of endogenous control genes for gene expression studies across multiple tissues and in the specific sets of fat- and muscle-type samples of the pig.

Science.gov (United States)

Gu, Y R; Li, M Z; Zhang, K; Chen, L; Jiang, A A; Wang, J Y; Li, X W

2011-08-01

To normalize a set of quantitative real-time PCR (q-PCR) data, it is essential to determine an optimal number/set of housekeeping genes, as the abundance of housekeeping genes can vary across tissues or cells during different developmental stages, or even under certain environmental conditions. In this study, of the 20 commonly used endogenous control genes, 13, 18 and 17 genes exhibited credible stability in 56 different tissues, 10 types of adipose tissue and five types of muscle tissue, respectively. Our analysis clearly showed that three optimal housekeeping genes are adequate for an accurate normalization, which correlated well with the theoretical optimal number (r ≥ 0.94). In terms of economical and experimental feasibility, we recommend the use of the three most stable housekeeping genes for calculating the normalization factor. Based on our results, the three most stable housekeeping genes in all analysed samples (TOP2B, HSPCB and YWHAZ) are recommended for accurate normalization of q-PCR data. We also suggest that two different sets of housekeeping genes are appropriate for 10 types of adipose tissue (the HSPCB, ALDOA and GAPDH genes) and five types of muscle tissue (the TOP2B, HSPCB and YWHAZ genes), respectively. Our report will serve as a valuable reference for other studies aimed at measuring tissue-specific mRNA abundance in porcine samples. © 2011 Blackwell Verlag GmbH.
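How a normalization factor might be computed from three housekeeping genes can be sketched as below. The abstract does not state the formula; this sketch assumes the commonly used geometric mean of the reference genes' relative quantities, and the function names and numeric quantities are hypothetical.

```python
import math


def normalization_factor(ref_quantities):
    """Geometric mean of the relative quantities of the reference genes (assumed formula)."""
    if not ref_quantities or any(q <= 0 for q in ref_quantities):
        raise ValueError("reference quantities must be positive and non-empty")
    return math.exp(sum(math.log(q) for q in ref_quantities) / len(ref_quantities))


def normalize(target_quantity, ref_quantities):
    """Divide a target gene's relative quantity by the sample's normalization factor."""
    return target_quantity / normalization_factor(ref_quantities)


# Hypothetical relative quantities in one sample for the three reference genes
# named in the abstract (TOP2B, HSPCB, YWHAZ); the values are illustrative only.
reference = {"TOP2B": 1.0, "HSPCB": 2.0, "YWHAZ": 4.0}
factor = normalization_factor(list(reference.values()))
normalized_target = normalize(6.0, list(reference.values()))
```

A geometric rather than arithmetic mean is usually preferred here because it is less sensitive to one reference gene with an outlying abundance.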
Zebrafish Expression Ontology of Gene Sets (ZEOGS): A Tool to Analyze Enrichment of Zebrafish Anatomical Terms in Large Gene Sets

Science.gov (United States)

Marsico, Annalisa

2013-01-01

Abstract The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene
Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

Science.gov (United States)

Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

2013-09-01

The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression
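Deciding whether an anatomical term is overrepresented in an input gene set can be illustrated with a one-sided hypergeometric test. The abstract does not say which statistic ZEOGS uses, so the choice of test is an assumption, and the function name and counts are hypothetical.

```python
from math import comb


def overrepresentation_pvalue(n_background, n_term, n_input, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric distribution.

    n_background: genes with any annotated expression pattern
    n_term:       background genes annotated with the anatomical term
    n_input:      genes in the input set
    n_overlap:    input genes annotated with the term
    """
    upper = min(n_term, n_input)
    total = comb(n_background, n_input)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_input - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 3 annotated with the term,
# and an input set of 3 genes that all carry the term.
p_all_three = overrepresentation_pvalue(10, 3, 3, 3)
p_at_least_zero = overrepresentation_pvalue(10, 3, 3, 0)
```

In practice one p-value is computed per anatomical term, so a multiple-testing correction would be needed before calling a term enriched.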
MAGMA: Generalized Gene-Set Analysis of GWAS Data

NARCIS (Netherlands)

de Leeuw, C.A.; Mooij, J.M.; Heskes, T.; Posthuma, D.

2015-01-01

By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical
Risk score modeling of multiple gene to gene interactions using aggregated-multifactor dimensionality reduction

Directory of Open Access Journals (Sweden)

Dai Hongying

2013-01-01

Full Text Available. Abstract Background: Multifactor Dimensionality Reduction (MDR) has been widely applied to detect gene-gene (GxG) interactions associated with complex diseases. Existing MDR methods summarize disease risk by a dichotomous predisposing model (high-risk/low-risk) from one optimal GxG interaction, which does not take the accumulated effects from multiple GxG interactions into account. Results: We propose an Aggregated-Multifactor Dimensionality Reduction (A-MDR) method that exhaustively searches for and detects significant GxG interactions to generate an epistasis enriched gene network. An aggregated epistasis enriched risk score, which takes into account multiple GxG interactions simultaneously, replaces the dichotomous predisposing risk variable and provides higher resolution in the quantification of disease susceptibility. We evaluate this new A-MDR approach in a broad range of simulations. Also, we present the results of an application of the A-MDR method to a data set derived from Juvenile Idiopathic Arthritis patients treated with methotrexate (MTX) that revealed several GxG interactions in the folate pathway that were associated with treatment response. The epistasis enriched risk score that pooled information from 82 significant GxG interactions distinguished MTX responders from non-responders with 82% accuracy. Conclusions: The proposed A-MDR is innovative in the MDR framework to investigate aggregated effects among GxG interactions. New measures (pOR, pRR and pChi) are proposed to detect multiple GxG interactions.
Gene set analysis for GWAS

DEFF Research Database (Denmark)

Debrabant, Birgit; Soerensen, Mette

2014-01-01

Abstract We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. Especially, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the co...
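As an illustration of the kind of statistic this abstract refers to, the sketch below computes a weighted Kolmogorov-Smirnov-style running sum over a ranked gene list, in which genes with larger scores contribute larger steps. This is the generic GSEA-style weighting, used here as an assumption; it is not necessarily the exact modification the authors study. The function name `weighted_ks_statistic` and its parameters are hypothetical.

```python
def weighted_ks_statistic(ranked_genes, scores, gene_set, weight=1.0):
    """Signed maximum deviation of a weighted KS-style running sum.

    ranked_genes: gene identifiers ordered from most to least significant.
    scores:       gene-level scores in the same order (e.g. -log10 p-values).
    gene_set:     set of gene identifiers to test.
    weight:       0 gives the classical KS step; larger values let highly
                  significant genes dominate (illustrative assumption).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up on a set member (scaled by its weight), down otherwise.
        running += w / total_hit if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


genes = ["g1", "g2", "g3", "g4"]
scores = [4.0, 3.0, 2.0, 1.0]
unweighted = weighted_ks_statistic(genes, scores, {"g1", "g4"}, weight=0.0)
weighted = weighted_ks_statistic(genes, scores, {"g1", "g4"}, weight=1.0)
```

In this toy input the set contains the top-ranked and the bottom-ranked gene; raising `weight` increases the statistic because the top-ranked member then accounts for most of the upward step. Significance would still have to be assessed against an appropriate null distribution, which this sketch does not attempt.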
Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data

Directory of Open Access Journals (Sweden)

Tintle Nathan L

2012-08-01

Full Text Available Abstract Background Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.
Studying the Complex Expression Dependences between Sets of Coexpressed Genes

Directory of Open Access Journals (Sweden)

Mario Huerta

2014-01-01

Full Text Available Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless, there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow the study of the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start by detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments.
Fractional populations in multiple gene inheritance.

Science.gov (United States)

Chung, Myung-Hoon; Kim, Chul Koo; Nahm, Kyun

2003-01-22

With complete knowledge of the human genome sequence, one of the most interesting tasks remaining is to understand the functions of individual genes and how they communicate. Using the information about genes (locus, allele, mutation rate, fitness, etc.), we attempt to explain population demographic data. This population evolution study could complement and enhance biologists' understanding about genes. We present a general approach to study population genetics in complex situations. In the present approach, multiple allele inheritance, multiple loci inheritance, natural selection and mutations are allowed simultaneously in order to consider a more realistic situation. A simulation program is presented so that readers can readily carry out studies with their own parameters. It is shown that the multiplicity of the loci greatly affects the demographic results of fractional population ratios. Furthermore, the study indicates that some high infant mortality rates due to congenital anomalies can be attributed to multiple loci inheritance. The simulation program can be downloaded from http://won.hongik.ac.kr/~mhchung/index_files/yapop.htm. In order to run this program, one needs Visual Studio.NET platform, which can be downloaded from http://msdn.microsoft.com/netframework/downloads/default.asp.
IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

Science.gov (United States)

Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

2016-01-01

Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy to group the samples according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets to assess the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering with different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.
On multiple level-set regularization methods for inverse problems

International Nuclear Information System (INIS)

DeCezaro, A; Leitão, A; Tai, X-C

2009-01-01

We analyze a multiple level-set method for solving inverse problems with piecewise constant solutions. This method corresponds to an iterated Tikhonov method for a particular Tikhonov functional G_α based on TV–H^1 penalization. We define generalized minimizers for our Tikhonov functional and establish an existence result. Moreover, we prove convergence and stability results of the proposed Tikhonov method. A multiple level-set algorithm is derived from the first-order optimality conditions for the Tikhonov functional G_α, similarly as the iterated Tikhonov method. The proposed multiple level-set method is tested on an inverse potential problem. Numerical experiments show that the method is able to recover multiple objects as well as multiple contrast levels.
A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction.

Science.gov (United States)

Marceau, Rachel; Lu, Wenbin; Holloway, Shannon; Sale, Michèle M; Worrall, Bradford B; Williams, Stephen R; Hsu, Fang-Chi; Tzeng, Jung-Ying

2015-09-01

Kernel machine (KM) models are a powerful tool for exploring associations between sets of genetic variants and complex traits. Although most KM methods use a single kernel function to assess the marginal effect of a variable set, KM analyses involving multiple kernels have become increasingly popular. Multikernel analysis allows researchers to study more complex problems, such as assessing gene-gene or gene-environment interactions, incorporating variance-component based methods for population substructure into rare-variant association testing, and assessing the conditional effects of a variable set adjusting for other variable sets. The KM framework is robust, powerful, and provides efficient dimension reduction for multifactor analyses, but requires the estimation of high dimensional nuisance parameters. Traditional estimation techniques, including regularization and the "expectation-maximization (EM)" algorithm, have a large computational cost and are not scalable to large sample sizes needed for rare variant analysis. Therefore, under the context of gene-environment interaction, we propose a computationally efficient and statistically rigorous "fastKM" algorithm for multikernel analysis that is based on a low-rank approximation to the nuisance effect kernel matrices. Our algorithm is applicable to various trait types (e.g., continuous, binary, and survival traits) and can be implemented using any existing single-kernel analysis software. Through extensive simulation studies, we show that our algorithm has similar performance to an EM-based KM approach for quantitative traits while running much faster. We also apply our method to the Vitamin Intervention for Stroke Prevention (VISP) clinical trial, examining gene-by-vitamin effects on recurrent stroke risk and gene-by-age effects on change in homocysteine level. © 2015 WILEY PERIODICALS, INC.
A pipeline to determine RT-QPCR control genes for evolutionary studies: application to primate gene expression across multiple tissues.

Directory of Open Access Journals (Sweden)

Olivier Fedrigo

Full Text Available Because many species-specific phenotypic differences are assumed to be caused by differential regulation of gene expression, many recent investigations have focused on measuring transcript abundance. Despite the availability of high-throughput platforms, quantitative real-time polymerase chain reaction (RT-QPCR) is often the method of choice because of its low cost and wider dynamic range. However, the accuracy of this technique heavily relies on the use of multiple valid control genes for normalization. We created a pipeline for choosing genes potentially useful as RT-QPCR control genes for measuring expression between human and chimpanzee samples across multiple tissues, using published microarrays and a measure of tissue-specificity. We identified 13 genes from the pipeline and from commonly used control genes: ACTB, USP49, ARGHGEF2, GSK3A, TBP, SDHA, EIF2B2, GPDH, YWHAZ, HPTR1, RPL13A, HMBS, and EEF2. We then tested these candidate genes and validated their expression stability across species. We established the rank order of the most preferable set of genes for single and combined tissues. Our results suggest that for at least three tissues (cerebral cortex, liver, and skeletal muscle), EIF2B2, EEF2, HMBS, and SDHA are useful genes for normalizing human and chimpanzee expression using RT-QPCR. Interestingly, other commonly used control genes, including TBP, GAPDH, and, especially, ACTB do not perform as well. This pipeline could be easily adapted to other species for which expression data exist, providing taxonomically appropriate control genes for comparisons of gene expression among species.
Novel method to load multiple genes onto a mammalian artificial chromosome.

Directory of Open Access Journals (Sweden)

Anna Tóth

Full Text Available Mammalian artificial chromosomes are natural chromosome-based vectors that may carry a vast amount of genetic material in terms of both size and number. They are reasonably stable and segregate well in both mitosis and meiosis. A platform artificial chromosome expression system (ACEs) was earlier described with multiple loading sites for a modified lambda-integrase enzyme. It has been shown that this ACEs is suitable for high-level industrial protein production and the treatment of a mouse model for a devastating human disorder, Krabbe's disease. ACEs-treated mutant mice carrying a therapeutic gene lived more than four times longer than untreated counterparts. This novel gene therapy method is called combined mammalian artificial chromosome-stem cell therapy. At present, this method suffers from the limitation that a new selection marker gene should be present for each therapeutic gene loaded onto the ACEs. Complex diseases require the cooperative action of several genes for treatment, but only a limited number of selection marker genes are available and there is also a risk of serious side-effects caused by the unwanted expression of these marker genes in mammalian cells, organs and organisms. We describe here a novel method to load multiple genes onto the ACEs by using only two selectable marker genes. These markers may be removed from the ACEs before therapeutic application. This novel technology could revolutionize gene therapeutic applications targeting the treatment of complex disorders and cancers. It could also speed up cell therapy by allowing researchers to engineer a chromosome with a predetermined set of genetic factors to differentiate adult stem cells, embryonic stem cells and induced pluripotent stem (iPS) cells into cell types of therapeutic value. It is also a suitable tool for the investigation of complex biochemical pathways in basic science by producing an ACEs with several genes from a signal transduction pathway of interest.
Novel method to load multiple genes onto a mammalian artificial chromosome.

Science.gov (United States)

Tóth, Anna; Fodor, Katalin; Praznovszky, Tünde; Tubak, Vilmos; Udvardy, Andor; Hadlaczky, Gyula; Katona, Robert L

2014-01-01

Mammalian artificial chromosomes are natural chromosome-based vectors that may carry a vast amount of genetic material in terms of both size and number. They are reasonably stable and segregate well in both mitosis and meiosis. A platform artificial chromosome expression system (ACEs) was earlier described with multiple loading sites for a modified lambda-integrase enzyme. It has been shown that this ACEs is suitable for high-level industrial protein production and the treatment of a mouse model for a devastating human disorder, Krabbe's disease. ACEs-treated mutant mice carrying a therapeutic gene lived more than four times longer than untreated counterparts. This novel gene therapy method is called combined mammalian artificial chromosome-stem cell therapy. At present, this method suffers from the limitation that a new selection marker gene should be present for each therapeutic gene loaded onto the ACEs. Complex diseases require the cooperative action of several genes for treatment, but only a limited number of selection marker genes are available and there is also a risk of serious side-effects caused by the unwanted expression of these marker genes in mammalian cells, organs and organisms. We describe here a novel method to load multiple genes onto the ACEs by using only two selectable marker genes. These markers may be removed from the ACEs before therapeutic application. This novel technology could revolutionize gene therapeutic applications targeting the treatment of complex disorders and cancers. It could also speed up cell therapy by allowing researchers to engineer a chromosome with a predetermined set of genetic factors to differentiate adult stem cells, embryonic stem cells and induced pluripotent stem (iPS) cells into cell types of therapeutic value. It is also a suitable tool for the investigation of complex biochemical pathways in basic science by producing an ACEs with several genes from a signal transduction pathway of interest.
Multiple and variable NHEJ-like genes are involved in resistance to DNA damage in Streptomyces ambofaciens

Directory of Open Access Journals (Sweden)

Grégory Hoff

2016-11-01

Full Text Available Non-homologous end-joining (NHEJ) is a double strand break (DSB) repair pathway which does not require any homologous template and can ligate two DNA ends together. The basic bacterial NHEJ machinery involves two partners: the Ku protein, a DNA end binding protein for DSB recognition, and the multifunctional LigD protein, composed of a ligase, a nuclease and a polymerase domain, for end processing and ligation of the broken ends. In silico analyses performed in the 38 sequenced genomes of Streptomyces species revealed the existence of a large panel of NHEJ-like genes. Indeed, ku genes or ligD domain homologues are scattered throughout the genome in multiple copies and can be divided into two categories: the core NHEJ gene set, made up of conserved loci, and the variable NHEJ gene set, made up of NHEJ-like genes present in only a part of the species. In Streptomyces ambofaciens ATCC 23877, not only the deletion of core genes but also that of variable genes led to an increased sensitivity to DNA damage induced by electron beam irradiation. Multiple mutants of ku, ligase or polymerase encoding genes showed an aggravated phenotype compared to single mutants. Biochemical assays revealed the ability of Ku-like proteins to protect and to stimulate ligation of DNA ends. RT-qPCR and GFP fusion experiments suggested that ku-like genes show a growth phase dependent expression profile consistent with their involvement in DNA repair during spore formation and/or germination.
Multiple variables data sets visualization in ROOT

International Nuclear Information System (INIS)

Couet, O

2008-01-01

The ROOT graphical framework provides support for many different functions including basic graphics, high-level visualization techniques, output to files, 3D viewing etc. They use well-known world standards to render graphics on screen, to produce high-quality output files, and to generate images for Web publishing. Many techniques allow visualization of all the basic ROOT data types, but the graphical framework was still a bit weak in the visualization of multiple variables data sets. This paper presents the latest developments in the ROOT framework to visualize multiple variables (>4) data sets.
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

Science.gov (United States)

2013-01-01

Background Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the
Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods

Science.gov (United States)

Väremo, Leif; Nielsen, Jens; Nookaew, Intawat

2013-01-01

Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Further on, we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes. As a consequence of this, we suggest using a consensus scoring approach, based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation. PMID:23444143
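The consensus scoring idea mentioned in this abstract can be illustrated with a simple rank aggregation over the outputs of several GSA runs. The sketch below is written in Python rather than R and does not use the Piano package or reproduce its scoring; the function name `consensus_ranks`, the input layout, and the choice of mean rank as the consensus score are all assumptions made for illustration.

```python
def consensus_ranks(pvalues_by_method):
    """Average the per-method ranks of each gene set (lower is better).

    pvalues_by_method: dict mapping a method name to a dict of
                       {gene_set_name: p_value}. Only gene sets reported
                       by every method are scored.
    Returns a list of (gene_set_name, mean_rank), best first.
    """
    shared = set.intersection(*(set(p) for p in pvalues_by_method.values()))
    totals = {name: 0.0 for name in shared}
    for pvals in pvalues_by_method.values():
        # Rank 1 is the smallest p-value; ties are broken by name here,
        # which a real analysis might replace with averaged ranks.
        ordered = sorted(shared, key=lambda name: (pvals[name], name))
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    n_methods = len(pvalues_by_method)
    return sorted(((name, total / n_methods) for name, total in totals.items()),
                  key=lambda item: (item[1], item[0]))


runs = {
    "method_a": {"x": 0.01, "y": 0.20, "z": 0.50},
    "method_b": {"x": 0.03, "y": 0.02, "z": 0.90},
}
consensus = consensus_ranks(runs)
```

Here gene sets `x` and `y` each rank first under one method and second under the other, so they share a mean rank, while `z` is last under both. Keeping separate consensus lists per directionality class, as the abstract describes, would amount to calling the function once per class.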
Bayesian models and meta analysis for multiple tissue gene expression data following corticosteroid administration

Directory of Open Access Journals (Sweden)

Kelemen Arpad

2008-08-01

Full Text Available Abstract Background This paper addresses key biological problems and statistical issues in the analysis of large gene expression data sets that describe systemic temporal response cascades to therapeutic doses in multiple tissues such as liver, skeletal muscle, and kidney from the same animals. Affymetrix time course gene expression data U34A are obtained from three different tissues including kidney, liver and muscle. Our goal is not only to find the concordance of genes across tissues and to identify the common differentially expressed genes over time, but also to examine the reproducibility of the findings by integrating the results from multiple tissues through meta-analysis, in order to gain a significant increase in the power of detecting differentially expressed genes over time and to find the differences among the three tissues in their response to the drug. Results and conclusion A Bayesian categorical model for estimating the proportion of the 'call' is used for pre-screening genes. A hierarchical Bayesian mixture model is further developed for the identification of differentially expressed genes across time and dynamic clusters. The deviance information criterion is applied to determine the number of components for model comparison and selection. The Bayesian mixture model produces the gene-specific posterior probability of differential/non-differential expression and the 95% credible interval, which is the basis for our further Bayesian meta-inference. Meta-analysis is performed in order to identify commonly expressed genes from multiple tissues that may serve as ideal targets for novel treatment strategies and to integrate the results across separate studies. We have found the commonly expressed genes in the three tissues. However, the up/down/no regulation of these common genes differs at different time points. Moreover, the most differentially expressed genes were found in the liver, then in the kidney, and then in muscle.
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

Directory of Open Access Journals (Sweden)

Hettne Kristina M

2013-01-01

Full Text Available Abstract Background Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values Results Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.
EasyCloneMulti: A Set of Vectors for Simultaneous and Multiple Genomic Integrations in Saccharomyces cerevisiae

DEFF Research Database (Denmark)

Maury, Jerome; Germann, Susanne Manuela; Jacobsen, Simo Abdessamad

2016-01-01

Saccharomyces cerevisiae is widely used in the biotechnology industry for production of ethanol, recombinant proteins, food ingredients and other chemicals. In order to generate highly producing and stable strains, genome integration of genes encoding metabolic pathway enzymes is the preferred...... of integrative vectors, EasyCloneMulti, that enables multiple and simultaneous integration of genes in S. cerevisiae. By creating vector backbones that combine consensus sequences that aim at targeting subsets of Ty sequences and a quickly degrading selective marker, integrations at multiple genomic loci...... and a range of expression levels were obtained, as assessed with the green fluorescent protein (GFP) reporter system. The EasyCloneMulti vector set was applied to balance the expression of the rate-controlling step in the β-alanine pathway for biosynthesis of 3-hydroxypropionic acid (3HP). The best 3HP...
Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

Science.gov (United States)

Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

2017-07-12

The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the identified upregulated genes here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. Like in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes. Copyright © 2017 the authors.
EasyClone: method for iterative chromosomal integration of multiple genes in Saccharomyces cerevisiae

DEFF Research Database (Denmark)

Jensen, Niels Bjerg; Strucko, Tomas; Kildegaard, Kanchana Rueksomtawin

2014-01-01

of multiple genes with an option of recycling selection markers. The vectors combine the advantage of efficient uracil excision reaction-based cloning and Cre-LoxP-mediated marker recycling system. The episomal and integrative vector sets were tested by inserting genes encoding cyan, yellow, and red...... fluorescent proteins into separate vectors and analyzing for co-expression of proteins by flow cytometry. Cells expressing genes encoding for the three fluorescent proteins from three integrations exhibited a much higher level of simultaneous expression than cells producing fluorescent proteins encoded...... on episomal plasmids, where correspondingly 95% and 6% of the cells were within a fluorescence interval of Log10 mean ± 15% for all three colors. We demonstrate that selective markers can be simultaneously removed using Cre-mediated recombination and all the integrated heterologous genes remain...

Comparative study on gene set and pathway topology-based enrichment methods.

Science.gov (United States)

Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

2015-10-22

Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. Both
All Set! Evidence of Simultaneous Attentional Control Settings for Multiple Target Colors

Science.gov (United States)

Irons, Jessica L.; Folk, Charles L.; Remington, Roger W.

2012-01-01

Although models of visual search have often assumed that attention can only be set for a single feature or property at a time, recent studies have suggested that it may be possible to maintain more than one attentional control setting. The aim of the present study was to investigate whether spatial attention could be guided by multiple attentional…
LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

Science.gov (United States)

Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

2016-01-11

Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
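The Fisher-style baseline that this abstract contrasts LEGO with can be sketched as a hypergeometric tail probability. The snippet below is only an illustration of that unweighted baseline, not of LEGO itself; the function name, its arguments and the toy numbers are assumptions made for the example.

```python
from math import comb

def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided over-representation P-value, P(X >= n_overlap).

    Every gene counts equally here: the gene set is treated as a plain
    list, which is exactly the simplification the abstract criticises.
    """
    max_k = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total

# Hypothetical toy numbers: 20 genes in the universe, a gene set of 5,
# and 5 selected genes that all fall inside the set.
p_all_hit = ora_p_value(20, 5, 5, 5)
# With an overlap threshold of 0 the tail covers every outcome.
p_trivial = ora_p_value(20, 5, 5, 0)
```

A network-weighted method would replace the equal counting of genes with per-gene weights; the abstract does not give those formulas, so they are not reproduced here.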
A Bayesian Hierarchical Model for Relating Multiple SNPs within Multiple Genes to Disease Risk

Directory of Open Access Journals (Sweden)

Lewei Duan

2013-01-01

A variety of methods have been proposed for studying the association of multiple genes thought to be involved in a common pathway for a particular disease. Here, we present an extension of a Bayesian hierarchical modeling strategy that allows for multiple SNPs within each gene, with external prior information at either the SNP or gene level. The model involves variable selection at the SNP level through latent indicator variables and Bayesian shrinkage at the gene level towards a prior mean vector and covariance matrix that depend on external information. The entire model is fitted using Markov chain Monte Carlo methods. Simulation studies show that the approach is capable of recovering many of the truly causal SNPs and genes, depending upon their frequency and size of their effects. The method is applied to data on 504 SNPs in 38 candidate genes involved in DNA damage response in the WECARE study of second breast cancers in relation to radiotherapy exposure.
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

NARCIS (Netherlands)

Hettne, K.M.; Boorsma, A.; Dartel, D.A. van; Goeman, J.J.; Jong, E. de; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

2013-01-01

BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set
Simultaneous inference of phenotype-associated genes and relevant tissues from GWAS data via Bayesian integration of multiple tissue-specific gene networks.

Science.gov (United States)

Wu, Mengmeng; Lin, Zhixiang; Ma, Shining; Chen, Ting; Jiang, Rui; Wong, Wing Hung

2017-12-01

Although genome-wide association studies (GWAS) have successfully identified thousands of genomic loci associated with hundreds of complex traits in the past decade, the debate about such problems as missing heritability and weak interpretability has called for effective computational methods to facilitate the advanced analysis of the vast volume of existing and anticipated genetic data. Towards this goal, gene-level integrative GWAS analysis, which assumes that genes associated with a phenotype tend to be enriched in biological gene sets or gene networks, has recently attracted much attention, due to such advantages as straightforward interpretation, a lower multiple testing burden, and robustness across studies. However, existing methods in this category usually exploit non-tissue-specific gene networks and thus lack the ability to utilize informative tissue-specific characteristics. To overcome this limitation, we proposed a Bayesian approach called SIGNET (Simultaneously Inference of GeNEs and Tissues) to integrate GWAS data and multiple tissue-specific gene networks for the simultaneous inference of phenotype-associated genes and relevant tissues. Through extensive simulation studies, we showed the effectiveness of our method in finding both associated genes and relevant tissues for a phenotype. In applications to real GWAS data of 14 complex phenotypes, we demonstrated the power of our method in both deciphering the genetic basis of a phenotype and discovering biological insights into it. With this understanding, we expect to see SIGNET as a valuable tool for integrative GWAS analysis, thereby boosting the prevention, diagnosis, and treatment of human inherited diseases and eventually facilitating precision medicine.
Gene set analysis: limitations in popular existing methods and proposed improvements.

Science.gov (United States)

Mishra, Pashupati; Törönen, Petri; Leino, Yrjö; Holm, Liisa

2014-10-01

Gene set analysis is the analysis of a set of genes that collectively contribute to a biological process. Most popular gene set analysis methods are based on an empirical P-value that requires a large number of permutations. Despite numerous gene set analysis methods developed in the past decade, the most popular methods still suffer from serious limitations. We present a gene set analysis method (mGSZ) based on Gene Set Z-scoring function (GSZ) and asymptotic P-values. Asymptotic P-value calculation requires fewer permutations, and thus speeds up the gene set analysis process. We compare the GSZ-scoring function with seven popular gene set scoring functions and show that GSZ stands out as the best scoring function. In addition, we show improved performance of the GSA method when the max-mean statistic is replaced by the GSZ scoring function. We demonstrate the importance of both gene and sample permutations by showing the consequences in the absence of one or the other. A comparison of asymptotic and empirical methods of P-value estimation demonstrates a clear advantage of asymptotic P-value over empirical P-value. We show that mGSZ outperforms the state-of-the-art methods based on two different evaluations. We compared mGSZ results with permutation and rotation tests and show that rotation does not improve our asymptotic P-values. We also propose well-known asymptotic distribution models for three of the compared methods. mGSZ is available as an R package from cran.r-project.org. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
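Why an empirical P-value needs many permutations can be seen in a small sketch. This is not the mGSZ method: the set score is a plain mean, the null comes from random gene sets of the same size (gene permutation only), and all names and toy data are assumptions for illustration.

```python
import random

def empirical_p_value(observed, null_scores):
    """Empirical P-value with the +1 correction, so it is never exactly 0."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)

def mean_score(scores, members):
    """Toy gene set score: the mean of the per-gene scores in the set."""
    return sum(scores[g] for g in members) / len(members)

def permutation_null(scores, set_size, n_perm, seed=0):
    """Null distribution built from random gene sets of the same size."""
    rng = random.Random(seed)
    genes = list(scores)
    return [mean_score(scores, rng.sample(genes, set_size))
            for _ in range(n_perm)]

# Hypothetical per-gene scores for 50 genes.
scores = {f"g{i}": float(i) for i in range(50)}
null = permutation_null(scores, set_size=5, n_perm=200)
p = empirical_p_value(mean_score(scores, ["g45", "g46", "g47", "g48", "g49"]), null)
```

With `n_perm` permutations the smallest attainable value is `1 / (n_perm + 1)`, so resolving very small P-values empirically requires correspondingly many permutations. An asymptotic approach instead fits a distribution to the null scores, which is the motivation stated in the abstract.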
Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

NARCIS (Netherlands)

J. Windhorst (Judith); V. Mileva-Seitz (Viara); R.C.A. Rippe (Ralph C.A.); H.W. Tiemeier (Henning); V.W.V. Jaddoe (Vincent); F.C. Verhulst (Frank); M.H. van IJzendoorn (Rien); M.J. Bakermans-Kranenburg (Marian)

2016-01-01

Background: In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and
Set Partitions and the Multiplication Principle

Science.gov (United States)

Lockwood, Elise; Caughman, John S., IV

2016-01-01

To further understand student thinking in the context of combinatorial enumeration, we examine student work on a problem involving set partitions. In this context, we note some key features of the multiplication principle that were often not attended to by students. We also share a productive way of thinking that emerged for several students who…
Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil.

Directory of Open Access Journals (Sweden)

Adina Howe

Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the "core" set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved in the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. In soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.
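The idea of a "core" shared by replicate metagenomes can be sketched as a set intersection. The abstract speaks of sequences that are "consistently prevalent", which may involve abundance criteria not described here, so strict presence in every replicate is an assumption of this example; the sequence names are invented.

```python
def core_set(replicates):
    """Return the items present in every replicate (set intersection)."""
    sets = [set(r) for r in replicates]
    return set.intersection(*sets) if sets else set()

# Hypothetical sequence identifiers in four replicate metagenomes.
replicates = [
    {"seqA", "seqB", "seqC"},
    {"seqA", "seqB", "seqD"},
    {"seqA", "seqB", "seqE"},
    {"seqB", "seqA"},
]
core = core_set(replicates)

# Fraction reported in the abstract: 843 core sequences out of 226,887.
core_percent = 100 * 843 / 226887
```

The last line reproduces the arithmetic behind the abstract's figure: 843 of 226,887 is about 0.37%, which rounds to the quoted 0.4%.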
Prediction potential of candidate biomarker sets identified and validated on gene expression data from multiple datasets

Directory of Open Access Journals (Sweden)

Karacali Bilge

2007-10-01

Background: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles from publicly available microarray datasets of cancer (breast, lymphoma and renal) samples linked to clinical information with an iterative machine learning algorithm. ROC curves were used to assess the prediction error of each profile for classification. We compared the prediction error of profiles correlated with molecular phenotype against profiles correlated with relapse-free status. Prediction error of profiles identified with supervised univariate feature selection algorithms was compared to profiles selected randomly from (a) all genes on the microarray platform and (b) a list of known disease-related genes (a priori selection). We also determined the relevance of expression profiles on test arrays from independent datasets, measured on either the same or different microarray platforms. Results: Highly discriminative expression profiles were produced on both simulated gene expression data and expression data from breast cancer and lymphoma datasets on the basis of ER and BCL-6 expression, respectively. Use of relapse-free status to identify profiles for prognosis prediction resulted in poorly discriminative decision rules. Supervised feature selection resulted in more accurate classifications than random or a priori selection; however, the difference in prediction error decreased as the number of features increased. These results held when decision rules were applied across datasets to samples profiled on the same microarray platform. Conclusion: Our results show that many gene sets predict molecular phenotypes accurately. Given this, expression profiles identified using different training datasets should be expected to show little agreement. In addition, we demonstrate the difficulty in predicting relapse directly from microarray data using supervised machine
Variable precision rough set for multiple decision attribute analysis

Institute of Scientific and Technical Information of China (English)

Lai; Kin; Keung

2008-01-01

A variable precision rough set (VPRS) model is used to solve the multi-attribute decision analysis (MADA) problem with multiple conflicting decision attributes and multiple condition attributes. By introducing confidence measures and a β-reduct, the VPRS model can rationally solve the conflicting decision analysis problem with multiple decision attributes and multiple condition attributes. For illustration, a medical diagnosis example is utilized to show the feasibility of the VPRS model in solving the MADA...
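A central construct in variable precision rough sets is the β-lower approximation, which relaxes the classical lower approximation by admitting equivalence classes that are only mostly contained in the target set. The sketch below assumes the inclusion-degree formulation with β in (0.5, 1]; the abstract does not state the paper's exact definitions, and the function name and toy data are hypothetical.

```python
from collections import defaultdict

def beta_lower_approximation(objects, condition, target, beta):
    """Objects whose equivalence class lies in `target` to degree >= beta.

    `condition` maps an object to its condition-attribute value(s);
    objects with equal values fall into the same equivalence class.
    """
    classes = defaultdict(list)
    for obj in objects:
        classes[condition(obj)].append(obj)
    approx = []
    for members in classes.values():
        inclusion = sum(1 for m in members if m in target) / len(members)
        if inclusion >= beta:
            approx.extend(members)
    return approx

# Hypothetical toy data: 7 patients, one condition attribute, one decision.
symptom = {0: "a", 1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "c"}
diseased = {0, 1, 2, 4, 6}

relaxed = beta_lower_approximation(range(7), symptom.get, diseased, beta=0.7)
classical = beta_lower_approximation(range(7), symptom.get, diseased, beta=1.0)
```

With β = 1.0 only fully consistent classes survive, which recovers the classical rough set lower approximation; lowering β tolerates a bounded share of conflicting cases.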
Annotating gene sets by mining large literature collections with protein networks.

Science.gov (United States)

Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

2018-01-01

Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.
Genome-Wide Gene Set Analysis for Identification of Pathways Associated with Alcohol Dependence

Science.gov (United States)

Biernacka, Joanna M.; Geske, Jennifer; Jenkins, Gregory D.; Colby, Colin; Rider, David N.; Karpyak, Victor M.; Choi, Doo-Sup; Fridley, Brooke L.

2013-01-01

It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the “Synthesis and Degradation of Ketone Bodies” pathway. Our results also support the potential involvement of the “Neuroactive Ligand Receptor Interaction” pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence. PMID:22717047
A gene pathway analysis highlights the role of cellular adhesion molecules in multiple sclerosis susceptibility

DEFF Research Database (Denmark)

Damotte, V; Guillot-Noel, L; Patsopoulos, N A

2014-01-01

... in interaction with other genes as a group. Pathway analysis is an alternative way to highlight such group of genes. Using SNP association P-values from eight multiple sclerosis (MS) GWAS data sets, we performed a candidate pathway analysis for MS susceptibility by considering genes interacting in the cell adhesion molecule (CAMs) biological pathway using Cytoscape software. This network is a strong candidate, as it is involved in the crossing of the blood-brain barrier by the T cells, an early event in MS pathophysiology, and is used as an efficient therapeutic target. We drew up a list of 76 genes belonging to the CAM network. We highlighted 64 networks enriched with CAM genes with low P-values. Filtering by a percentage of CAM genes up to 50% and rejecting enriched signals mainly driven by transcription factors, we highlighted five networks associated with MS susceptibility. One of them, constituted...
Phylogenetics and evolution of Trx SET genes in fully sequenced land plants.

Science.gov (United States)

Zhu, Xinyu; Chen, Caoyi; Wang, Baohua

2012-04-01

Plant Trx SET proteins are involved in H3K4 methylation and play a key role in plant floral development. Genes encoding Trx SET proteins constitute a multigene family in which the copy number varies among plant species and functional divergence appears to have occurred repeatedly. To investigate the evolutionary history of the Trx SET gene family, we made a comprehensive evolutionary analysis of this gene family from 13 major representatives of green plants. We identified a novel cluster (here named the cpTrx clade) comprising the previously resolved III-1, III-2, and III-4 orthologous groups. Our analysis showed that plant Trx proteins possessed a variety of domain organizations and gene structures among paralogs. Additional domains such as PHD, PWWP, and FYR were integrated early into the primordial SET-PostSET domain organization of the cpTrx clade. We suggest that the PostSET domain was lost in some members of the III-4 orthologous group during the evolution of land plants. At least four classes of gene structures had been formed at the early evolutionary stage of land plants. Three intronless orphan Trx SET genes from Physcomitrella patens (moss) were identified, and supposedly, their parental genes have been eliminated from the genome. The structural differences among evolutionary groups of plant Trx SET genes with different functions were described, contributing to the design of further experimental studies.
Ranking metrics in gene set enrichment analysis: do they matter?

Science.gov (United States)

Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna

2017-05-12

There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner
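One of the ranking metrics named in this abstract, the absolute signal-to-noise ratio, can be sketched in a few lines. The plain form used here, the difference of group means divided by the sum of group standard deviations, is the commonly cited definition; any variance floor or other adjustment used by a particular implementation is not included, and the gene names and expression values are invented.

```python
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """Difference of means over the sum of sample standard deviations.

    Raises ZeroDivisionError if both groups have zero spread.
    """
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))

def rank_genes(expr_a, expr_b):
    """Order genes by absolute signal-to-noise ratio, strongest first."""
    scores = {g: abs(signal_to_noise(expr_a[g], expr_b[g])) for g in expr_a}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical expression values for three genes in two conditions.
expr_a = {"g1": [5.0, 6.0, 7.0], "g2": [1.0, 2.0, 3.0], "g3": [2.0, 2.5, 3.0]}
expr_b = {"g1": [1.0, 2.0, 3.0], "g2": [1.5, 2.5, 3.5], "g3": [6.0, 6.5, 7.0]}

ranking = rank_genes(expr_a, expr_b)
```

Taking the absolute value ranks strongly down-regulated genes alongside strongly up-regulated ones, so the ranked list reflects the size of the change rather than its direction.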
Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

Science.gov (United States)

Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

2011-08-01

The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. Copyright © 2011 Elsevier Inc. All rights reserved.
Aberrant gene promoter methylation associated with sporadic multiple colorectal cancer.

Directory of Open Access Journals (Sweden)

Victoria Gonzalo

BACKGROUND: Colorectal cancer (CRC) multiplicity has been mainly related to polyposis and non-polyposis hereditary syndromes. In sporadic CRC, aberrant gene promoter methylation has been shown to play a key role in carcinogenesis, although little is known about its involvement in multiplicity. To assess the effect of methylation in tumor multiplicity in sporadic CRC, hypermethylation of key tumor suppressor genes was evaluated in patients with both multiple and solitary tumors, as a proof-of-concept of an underlying epigenetic defect. METHODOLOGY/PRINCIPAL FINDINGS: We examined a total of 47 synchronous/metachronous primary CRC from 41 patients, and 41 gender, age (5-year intervals) and tumor location-paired patients with solitary tumors. Exclusion criteria were polyposis syndromes, Lynch syndrome and inflammatory bowel disease. DNA methylation at the promoter region of the MGMT, CDKN2A, SFRP1, TMEFF2, HS3ST2 (3OST2), RASSF1A and GATA4 genes was evaluated by quantitative methylation specific PCR in both tumor and corresponding normal appearing colorectal mucosa samples. Overall, patients with multiple lesions exhibited a higher degree of methylation in tumor samples than those with solitary tumors regarding all evaluated genes. After adjusting for age and gender, binomial logistic regression analysis identified methylation of MGMT2 (OR, 1.48; 95% CI, 1.10 to 1.97; p = 0.008) and RASSF1A (OR, 2.04; 95% CI, 1.01 to 4.13; p = 0.047) as variables independently associated with tumor multiplicity, being the risk related to methylation of any of these two genes 4.57 (95% CI, 1.53 to 13.61; p = 0.006). Moreover, in six patients in whom both tumors were available, we found a correlation in the methylation levels of MGMT2 (r = 0.64, p = 0.17), SFRP1 (r = 0.83, p = 0.06), HPP1 (r = 0.64, p = 0.17), 3OST2 (r = 0.83, p = 0.06) and GATA4 (r = 0.6, p = 0.24). Methylation in normal appearing colorectal mucosa from patients with multiple and solitary CRC showed no relevant

Investigating the effect of paralogs on microarray gene-set analysis

LENUS (Irish Health Repository)

Faure, Andre J

2011-01-24

Background: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. Results: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. Conclusions: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
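The reduction described in this abstract, trimming a gene list until no pair exceeds a similarity threshold, can be illustrated with a simple greedy filter. How Indygene actually chooses which paralog to keep is not stated in the abstract, so the first-come rule below is an assumption, as are the function name, the similarity table and the gene names.

```python
def remove_paralogs(genes, similarity, threshold):
    """Greedily keep genes so that no kept pair exceeds `threshold`.

    `similarity` maps frozenset({gene1, gene2}) to a similarity score;
    pairs missing from the table are treated as unrelated (0.0).
    """
    kept = []
    for gene in genes:
        if all(similarity.get(frozenset((gene, k)), 0.0) <= threshold
               for k in kept):
            kept.append(gene)
    return kept

# Hypothetical pairwise sequence similarities.
similarity = {
    frozenset(("A1", "A2")): 0.90,
    frozenset(("A1", "A3")): 0.85,
    frozenset(("A2", "A3")): 0.95,
    frozenset(("A1", "B1")): 0.10,
}
reduced = remove_paralogs(["A1", "A2", "B1", "A3"], similarity, threshold=0.8)
```

Because the filter is greedy, the outcome depends on input order: the first member of each paralog group encountered is the one retained. Running the gene-set analysis on both the original and the reduced list is the comparison the abstract describes.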
FunGeneNet: a web tool to estimate enrichment of functional interactions in experimental gene sets.

Science.gov (United States)

Tiys, Evgeny S; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A

2018-02-09

Estimation of functional connectivity in gene sets derived from genome-wide or other biological experiments is one of the essential tasks of bioinformatics. A promising approach for solving this problem is to compare gene networks built using experimental gene sets with random networks. One of the resources that make such an analysis possible is CrossTalkZ, which uses the FunCoup database. However, existing methods, including CrossTalkZ, do not take into account individual types of interactions, such as protein/protein interactions, expression regulation, transport regulation, catalytic reactions, etc., but rather work with generalized types characterizing the existence of any connection between network members. We developed the online tool FunGeneNet, which utilizes the ANDSystem and STRING to reconstruct gene networks using experimental gene sets and to estimate their difference from random networks. To compare the reconstructed networks with random ones, the node permutation algorithm implemented in CrossTalkZ was taken as a basis. To study the FunGeneNet applicability, the functional connectivity analysis of networks constructed for gene sets involved in the Gene Ontology biological processes was conducted. We showed that the method sensitivity exceeds 0.8 at a specificity of 0.95. We found that the significance level of the difference between gene networks of biological processes and random networks is determined by the type of connections considered between objects. At the same time, the highest reliability is achieved for the generalized form of connections that takes into account all the individual types of connections. By taking examples of the thyroid cancer networks and the apoptosis network, it is demonstrated that key participants in these processes are involved in the interactions of those types by which these networks differ from random ones. FunGeneNet is a web tool aimed at proving the functionality of networks in a wide range of sizes of
Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

Science.gov (United States)

Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

2015-11-01

Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
Horizontal acquisition of multiple mitochondrial genes from a parasitic plant followed by gene conversion with host mitochondrial genes

Science.gov (United States)

2010-01-01

Background: Horizontal gene transfer (HGT) is relatively common in plant mitochondrial genomes but the mechanisms, extent and consequences of transfer remain largely unknown. Previous results indicate that parasitic plants are often involved as either transfer donors or recipients, suggesting that direct contact between parasite and host facilitates genetic transfer among plants. Results: In order to uncover the mechanistic details of plant-to-plant HGT, the extent and evolutionary fate of transfer was investigated between two groups: the parasitic genus Cuscuta and a small clade of Plantago species. A broad polymerase chain reaction (PCR) survey of mitochondrial genes revealed that at least three genes (atp1, atp6 and matR) were recently transferred from Cuscuta to Plantago. Quantitative PCR assays show that these three genes have a mitochondrial location in the one species line of Plantago examined. Patterns of sequence evolution suggest that these foreign genes degraded into pseudogenes shortly after transfer and reverse transcription (RT)-PCR analyses demonstrate that none are detectably transcribed. Three cases of gene conversion were detected between native and foreign copies of the atp1 gene. The identical phylogenetic distribution of the three foreign genes within Plantago and the retention of cytidines at ancestral positions of RNA editing indicate that these genes were probably acquired via a single, DNA-mediated transfer event. However, samplings of multiple individuals from two of the three species in the recipient Plantago clade revealed complex and perplexing phylogenetic discrepancies and patterns of sequence divergence for all three of the foreign genes. Conclusions: This study reports the best evidence to date that multiple mitochondrial genes can be transferred via a single HGT event and that transfer occurred via a strictly DNA-level intermediate. The discovery of gene conversion between co-resident foreign and native mitochondrial copies suggests
Horizontal acquisition of multiple mitochondrial genes from a parasitic plant followed by gene conversion with host mitochondrial genes

Directory of Open Access Journals (Sweden)

Hao Weilong

2010-12-01

Full Text Available Abstract Background Horizontal gene transfer (HGT) is relatively common in plant mitochondrial genomes but the mechanisms, extent and consequences of transfer remain largely unknown. Previous results indicate that parasitic plants are often involved as either transfer donors or recipients, suggesting that direct contact between parasite and host facilitates genetic transfer among plants. Results In order to uncover the mechanistic details of plant-to-plant HGT, the extent and evolutionary fate of transfer was investigated between two groups: the parasitic genus Cuscuta and a small clade of Plantago species. A broad polymerase chain reaction (PCR) survey of mitochondrial genes revealed that at least three genes (atp1, atp6 and matR) were recently transferred from Cuscuta to Plantago. Quantitative PCR assays show that these three genes have a mitochondrial location in the one species line of Plantago examined. Patterns of sequence evolution suggest that these foreign genes degraded into pseudogenes shortly after transfer and reverse transcription (RT)-PCR analyses demonstrate that none are detectably transcribed. Three cases of gene conversion were detected between native and foreign copies of the atp1 gene. The identical phylogenetic distribution of the three foreign genes within Plantago and the retention of cytidines at ancestral positions of RNA editing indicate that these genes were probably acquired via a single, DNA-mediated transfer event. However, samplings of multiple individuals from two of the three species in the recipient Plantago clade revealed complex and perplexing phylogenetic discrepancies and patterns of sequence divergence for all three of the foreign genes. Conclusions This study reports the best evidence to date that multiple mitochondrial genes can be transferred via a single HGT event and that transfer occurred via a strictly DNA-level intermediate. The discovery of gene conversion between co-resident foreign and native
Uniform approximation is more appropriate for Wilcoxon Rank-Sum Test in gene set analysis.

Directory of Open Access Journals (Sweden)

Zhide Fang

Full Text Available Gene set analysis is widely used to facilitate biological interpretations in the analyses of differential expression from high throughput profiling data. The Wilcoxon Rank-Sum (WRS) test is one of the commonly used methods in gene set enrichment analysis. It compares the ranks of genes in a gene set against those of genes outside the gene set. This method is easy to implement and it eliminates the dichotomization of genes into significant and non-significant in a competitive hypothesis testing. Due to the large number of genes being examined, it is impractical to calculate the exact null distribution for the WRS test. Therefore, the normal distribution is commonly used as an approximation. However, as we demonstrate in this paper, the normal approximation is problematic when a gene set with a relatively small number of genes is tested against the large number of genes in the complementary set. In this situation, a uniform approximation is substantially more powerful, more accurate, and less intensive in computation. We demonstrate the advantage of the uniform approximations in Gene Ontology (GO) term analysis using simulations and real data sets.
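As a rough illustration of the test this abstract describes, the sketch below computes the rank sum of the genes inside a set and the conventional normal approximation to its null distribution. It is a minimal sketch under stated assumptions: the function names `midranks` and `wrs_normal` are hypothetical, no tie correction is applied to the variance, and the uniform approximation proposed by the authors is not reproduced because the abstract gives no formula for it.

```python
import math

def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def wrs_normal(scores, in_set):
    """Rank sum W of the gene set, its z-score, and a two-sided p-value.

    scores: one statistic per gene; in_set: True where the gene is in the set.
    Uses the standard normal approximation (assumption: no tie correction).
    """
    ranks = midranks(scores)
    n1 = sum(1 for m in in_set if m)          # genes inside the set
    n2 = len(scores) - n1                     # genes in the complementary set
    w = sum(r for r, m in zip(ranks, in_set) if m)
    mean = n1 * (n1 + n2 + 1) / 2
    var = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal tail
    return w, z, p
```

With n1 much smaller than n2, which is the typical gene set situation, this is the approximation the abstract reports as problematic.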
Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

NARCIS (Netherlands)

K.M. Hettne (Kristina); J. Boorsma (Jeffrey); D.A.M. van Dartel (Dorien A M); J.J. Goeman (Jelle); E.C. de Jong (Esther); A.H. Piersma (Aldert); R.H. Stierum (Rob); J. Kleinjans (Jos); J.A. Kors (Jan)

2013-01-01

Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with
Multiple Suboptimal Solutions for Prediction Rules in Gene Expression Data

Directory of Open Access Journals (Sweden)

Osamu Komori

2013-01-01

Full Text Available This paper discusses mathematical and statistical aspects of analysis methods applied to microarray gene expression data. We focus on pattern recognition to extract informative features embedded in the data for prediction of phenotypes. It has been pointed out that severe difficulties arise from the imbalance between the number of observed genes and the number of observed subjects. We reanalyze published microarray gene expression data and detect many other gene sets with almost the same performance. We conclude that, at the current stage, it is not possible to extract only the informative genes with high performance from among all observed genes. We investigate why this difficulty persists even though analysis methods and learning algorithms are actively being proposed in statistical machine learning. We focus on the mutual coherence, or the absolute value of the Pearson correlations between two genes, and describe the distributions of the correlation for the selected set of genes and the total set. We show that the problem of finding informative genes in high dimensional data is ill-posed and that the difficulty is closely related to the mutual coherence.
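The quantity this abstract calls mutual coherence can be illustrated with a short sketch. It assumes the common definition, the largest absolute Pearson correlation over all pairs of distinct genes, which may differ in detail from the authors' usage; the function names and the dictionary layout (gene name mapped to expression values across subjects) are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mutual_coherence(genes):
    """Largest |r| over all pairs of distinct genes.

    genes: dict mapping gene name -> list of expression values,
    one value per subject (assumed layout for this sketch).
    """
    return max(abs(pearson(genes[a], genes[b]))
               for a, b in combinations(genes, 2))
```

When this value is close to 1, at least two genes carry nearly interchangeable information, which is one way to see why many different gene sets can reach almost the same predictive performance.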
Identification of a robust gene signature that predicts breast cancer outcome in independent data sets

International Nuclear Information System (INIS)

Korkola, James E; Waldman, Frederic M; Blaveri, Ekaterina; DeVries, Sandy; Moore, Dan H II; Hwang, E Shelley; Chen, Yunn-Yi; Estep, Anne LH; Chew, Karen L; Jensen, Ronald H

2007-01-01

Breast cancer is a heterogeneous disease, presenting with a wide range of histologic, clinical, and genetic features. Microarray technology has shown promise in predicting outcome in these patients. We profiled 162 breast tumors using expression microarrays to stratify tumors based on gene expression. A subset of 55 tumors with extensive follow-up was used to identify gene sets that predicted outcome. The predictive gene set was further tested in previously published data sets. We used different statistical methods to identify three gene sets associated with disease free survival. A fourth gene set, consisting of 21 genes in common to all three sets, also had the ability to predict patient outcome. To validate the predictive utility of this derived gene set, it was tested in two published data sets from other groups. This gene set resulted in significant separation of patients on the basis of survival in these data sets, correctly predicting outcome in 62–65% of patients. By comparing outcome prediction within subgroups based on ER status, grade, and nodal status, we found that our gene set was most effective in predicting outcome in ER positive and node negative tumors. This robust gene selection with extensive validation has identified a predictive gene set that may have clinical utility for outcome prediction in breast cancer patients
A Bayesian variable selection procedure for ranking overlapping gene sets

DEFF Research Database (Denmark)

Skarman, Axel; Mahdi Shariati, Mohammad; Janss, Luc

2012-01-01

Background Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Different methods to prioritize gene sets, such as the genes in a given molecular pathway, have been de...
Delimiting Coalescence Genes (C-Genes) in Phylogenomic Data Sets.

Science.gov (United States)

Springer, Mark S; Gatesy, John

2018-02-26

Summary coalescence methods have emerged as a popular alternative for inferring species trees with large genomic datasets, because these methods explicitly account for incomplete lineage sorting. However, statistical consistency of summary coalescence methods is not guaranteed unless several model assumptions are true, including the critical assumption that recombination occurs freely among but not within coalescence genes (c-genes), which are the fundamental units of analysis for these methods. Each c-gene has a single branching history, and large sets of these independent gene histories should be the input for genome-scale coalescence estimates of phylogeny. By contrast, numerous studies have reported the results of coalescence analyses in which complete protein-coding sequences are treated as c-genes even though exons for these loci can span more than a megabase of DNA. Empirical estimates of recombination breakpoints suggest that c-genes may be much shorter, especially when large clades with many species are the focus of analysis. Although this idea has been challenged recently in the literature, the inverse relationship between c-gene size and increased taxon sampling in a dataset, the 'recombination ratchet', is a fundamental property of c-genes. For taxonomic groups characterized by genes with long intron sequences, complete protein-coding sequences are likely not valid c-genes and are inappropriate units of analysis for summary coalescence methods unless they occur in recombination deserts that are devoid of incomplete lineage sorting (ILS). Finally, it has been argued that coalescence methods are robust when the no-recombination-within-loci assumption is violated, but recombination must matter at some scale because ILS, a by-product of recombination, is the raison d'etre for coalescence methods. That is, extensive recombination is required to yield the large number of independently segregating c-genes used to infer a species tree. If coalescent methods are powerful
An Independent Filter for Gene Set Testing Based on Spectral Enrichment

NARCIS (Netherlands)

Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

2015-01-01

Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in
Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

Directory of Open Access Journals (Sweden)

Andrew Williams

2015-12-01

Full Text Available Background: The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets. Results: Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies were significant with respect to the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in studies where toxicity induced by CNTs and CB was investigated, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles. Conclusion: The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several
Model-based gene set analysis for Bioconductor.

Science.gov (United States)

Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

2011-07-01

Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. peter.robinson@charite.de; julien.gagneur@embl.de.
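For context, the single-category procedure that this abstract contrasts with MGSA can be sketched as a one-sided Fisher's exact test, that is, an upper hypergeometric tail. The sketch below is not the mgsa package and does not reproduce the MGSA model; the function name and argument names are hypothetical.

```python
from math import comb

def enrichment_pvalue(population, in_category, study_size, overlap):
    """One-sided Fisher's exact test for a single category.

    Returns P(X >= overlap), where X counts category members among
    study_size genes drawn without replacement from a population that
    contains in_category category members.
    """
    total = comb(population, study_size)
    upper = min(in_category, study_size)
    tail = sum(
        comb(in_category, i) * comb(population - in_category, study_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

Because each category is tested in isolation, overlapping categories that share the same study genes will all tend to receive small p-values, which is the redundancy the abstract refers to.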
Gene set analysis for interpreting genetic studies

DEFF Research Database (Denmark)

Pers, Tune H

2016-01-01

Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...
Gene prediction using the Self-Organizing Map: automatic generation of multiple gene models.

Science.gov (United States)

Mahony, Shaun; McInerney, James O; Smith, Terry J; Golden, Aaron

2004-03-05

Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. While its raw accuracy rate can be less than other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
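Relative synonymous codon usage (RSCU), the indicator named in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal sketch of that calculation using the standard genetic code; it is not RescueNet's implementation, and the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(seq):
    """RSCU per codon for an in-frame coding sequence.

    Amino acids absent from the sequence, and stop codons, are skipped.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    result = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:
            continue
        for c in codons:
            # observed / (total / number of synonymous codons)
            result[c] = counts[c] * len(codons) / total
    return result
```

A value of 1 means a codon is used as often as expected under equal usage; a vector of such values per sequence window is the kind of input a Self-Organizing Map could cluster into multiple gene models.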
Ventilator-associated pneumonia caused by carbapenem-resistant Enterobacteriaceae carrying multiple metallo-beta-lactamase genes

Directory of Open Access Journals (Sweden)

Dwivedi Mayank

2009-07-01

Full Text Available Context: Ventilator-associated pneumonia (VAP) is a leading nosocomial infection in the intensive care unit (ICU). Members of Enterobacteriaceae are the most common causative agents and carbapenems are the most commonly used antibiotics. Metallo-beta-lactamase (MBL) production leading to treatment failure may go unnoticed by routine disc diffusion susceptibility testing. Moreover, there is not much information on association of MBL-producing Enterobacteriaceae with ICU-acquired VAP. Therefore, a study was undertaken to find out the association of MBL-producing Enterobacteriaceae with VAP. Settings: This study was conducted in a large tertiary care hospital of North India with an eight-bed critical care unit. Materials and Methods: The respiratory samples (bronchoalveolar lavage, protected brush catheter specimens and endotracheal or transtracheal aspirates) obtained from VAP patients (during January 2005-December 2006) were processed, isolated bacteria identified and their antibiotic susceptibilities tested as per standard protocols. The isolates of Enterobacteriaceae resistant to carbapenem were subjected to phenotypic and genotypic tests for the detection of MBLs. Results: Twelve of 64 isolates of Enterobacteriaceae were detected as MBL producers, bla IMP being the most prevalent gene. Additionally, in three strains, simultaneous coexistence of multiple MBL genes was detected. Conclusion: The coexistence of multiple MBL genes in Enterobacteriaceae is an alarming situation. As MBL genes are associated with integrons that can be embedded in transposons, which in turn can be accommodated on plasmids thereby resulting in a highly mobile genetic apparatus, the further spread of these genes in different pathogens is likely to occur.
Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.

Directory of Open Access Journals (Sweden)

Clive H Glover

2006-11-01

Full Text Available Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.
Pediatric Multiple Sclerosis: Genes, Environment, and a Comprehensive Therapeutic Approach.

Science.gov (United States)

Cappa, Ryan; Theroux, Liana; Brenton, J Nicholas

2017-10-01

Pediatric multiple sclerosis is an increasingly recognized and studied disorder that accounts for 3% to 10% of all patients with multiple sclerosis. The risk for pediatric multiple sclerosis is thought to reflect a complex interplay between environmental and genetic risk factors. Environmental exposures, including sunlight (ultraviolet radiation, vitamin D levels), infections (Epstein-Barr virus), passive smoking, and obesity, have been identified as potential risk factors in youth. Genetic predisposition contributes to the risk of multiple sclerosis, and the major histocompatibility complex on chromosome 6 makes the single largest contribution to susceptibility to multiple sclerosis. With the use of large-scale genome-wide association studies, other non-major histocompatibility complex alleles have been identified as independent risk factors for the disease. The bridge between environment and genes likely lies in the study of epigenetic processes, which are environmentally-influenced mechanisms through which gene expression may be modified. This article will review these topics to provide a framework for discussion of a comprehensive approach to counseling and ultimately treating the pediatric patient with multiple sclerosis. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome-wide survey and developmental expression mapping of zebrafish SET domain-containing genes.

Directory of Open Access Journals (Sweden)

Xiao-Jian Sun

Full Text Available SET domain-containing proteins represent an evolutionarily conserved family of epigenetic regulators, which are responsible for most histone lysine methylation. Since some of these genes have been revealed to be essential for embryonic development, we propose that the zebrafish, a vertebrate model organism possessing many advantages for developmental studies, can be utilized to study the biological functions of these genes and the related epigenetic mechanisms during early development. To this end, we have performed a genome-wide survey of zebrafish SET domain genes. A total of 58 genes have been identified. Although gene duplication events give rise to several lineage-specific paralogs, clear reciprocal orthologous relationships reveal high conservation between zebrafish and human SET domain genes. These data were further subjected to an evolutionary analysis ranging from yeast to human, leading to the identification of putative clusters of orthologous groups (COGs) of this gene family. By means of a whole-mount mRNA in situ hybridization strategy, we have also carried out a developmental expression mapping of these genes. A group of maternal SET domain genes, which are implicated in the programming of histone modification states in early development, have been identified and predicted to be responsible for all known sites of SET domain-mediated histone methylation. Furthermore, some genes show specific expression patterns in certain tissues at certain stages, suggesting the involvement of epigenetic mechanisms in the development of these systems. These results provide a global view of zebrafish SET domain histone methyltransferases in evolutionary and developmental dimensions and pave the way for using zebrafish to systematically study the roles of these genes during development.

Screening of point mutations by multiple SSCP analysis in the dystrophin gene

Energy Technology Data Exchange (ETDEWEB)

Lasa, A.; Baiget, M.; Gallano, P. [Hospital Sant Pau, Barcelona (Spain)]

1994-09-01

Duchenne muscular dystrophy (DMD) is a lethal, X-linked neuromuscular disorder. The population frequency of DMD is one in approximately 3500 boys, of whom one third are thought to be new mutants. The DMD gene is the largest known to date, spanning over 2.3 Mb in band Xp21.2; 79 exons are transcribed into a 14 kb mRNA coding for a protein of 427 kD, which has been named dystrophin. It has been shown that about 65% of affected boys have a gene deletion with a wide variation in localization and size. The remaining affected individuals who have no detectable deletions or duplications would probably carry more subtle mutations that are difficult to detect. These mutations occur in several different exons and seem to be unique to single patients. Their identification represents a formidable goal because of the large size and complexity of the dystrophin gene. SSCP is a very efficient method for the detection of point mutations if the parameters that affect the separation of the strands are optimized for a particular DNA fragment. The multiple SSCP allows the simultaneous study of several exons, and implies the use of different conditions because no single set of conditions will be optimal for all fragments. Seventy-eight DMD patients with no deletion or duplication in the dystrophin gene were selected for the multiple SSCP analysis. Genomic DNA from these patients was amplified using the primers described for the diagnosis procedure (muscle promoter and exons 3, 8, 12, 16, 17, 19, 32, 45, 48 and 51). We have observed different mobility shifts in bands corresponding to exons 8, 12, 43 and 51. In exons 17 and 45, altered electrophoretic patterns were found in different samples identifying polymorphisms already described.
GENIE: a software package for gene-gene interaction analysis in genetic association studies using multiple GPU or CPU cores

Directory of Open Access Journals (Sweden)

Wang Kai

2011-05-01

Full Text Available Abstract Background Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Findings Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: (1) the interaction of SNPs within it in parallel, and (2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
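The partitioning scheme this abstract describes can be sketched as follows: SNPs are split into non-overlapping fragments, and the pairwise work is enumerated as within-fragment tasks plus between-fragment tasks. This is an illustrative sketch only, not GENIE's code. The names `partition` and `pair_tasks` are hypothetical, and pairing each fragment only with later fragments is an assumption made here so that every SNP pair is visited exactly once.

```python
from itertools import combinations, product

def partition(snps, size):
    """Split a list of SNP identifiers into non-overlapping fragments."""
    return [snps[i:i + size] for i in range(0, len(snps), size)]

def pair_tasks(fragments):
    """Yield (label, pairs) work units; each unit could run on its own core.

    ("within", i, i): all SNP pairs inside fragment i.
    ("between", i, j): all pairs between fragment i and a later fragment j.
    """
    for i, frag in enumerate(fragments):
        yield ("within", i, i), list(combinations(frag, 2))
        for j in range(i + 1, len(fragments)):
            yield ("between", i, j), list(product(frag, fragments[j]))

# Usage: seven SNPs in fragments of three give 6 work units covering 21 pairs.
fragments = partition(list(range(7)), 3)
tasks = list(pair_tasks(fragments))
```

Because the units share no state, they can be dispatched independently to CPU or GPU cores; the interaction statistic computed per pair is outside the scope of this sketch.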
Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity.

Directory of Open Access Journals (Sweden)

Nadja Knoll

Full Text Available There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50th and 95th percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50th percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.
Simple and Efficient Targeting of Multiple Genes Through CRISPR-Cas9 in Physcomitrella patens

Directory of Open Access Journals (Sweden)

Mauricio Lopez-Obando

2016-11-01

Full Text Available Powerful genome editing technologies are needed for efficient gene function analysis. The CRISPR-Cas9 system has been adapted as an efficient gene-knock-out technology in a variety of species. However, in a number of situations, knocking out or modifying a single gene is not sufficient; this is particularly true for genes belonging to a common family, or for genes showing redundant functions. Like many plants, the model organism Physcomitrella patens has experienced multiple events of polyploidization during evolution that has resulted in a number of families of duplicated genes. Here, we report a robust CRISPR-Cas9 system, based on the codelivery of a CAS9 expressing cassette, multiple sgRNA vectors, and a cassette for transient transformation selection, for gene knock-out in multiple gene families. We demonstrate that CRISPR-Cas9-mediated targeting of five different genes allows the selection of a quintuple mutant, and all possible subcombinations of mutants, in one experiment, with no mutations detected in potential off-target sequences. Furthermore, we confirmed the observation that the presence of repeats in the vicinity of the cutting region favors deletion due to the alternative end joining pathway, for which induced frameshift mutations can be potentially predicted. Because the number of multiple gene families in Physcomitrella is substantial, this tool opens new perspectives to study the role of expanded gene families in the colonization of land by plants.
Maximizing the Lifetime of Wireless Sensor Networks Using Multiple Sets of Rendezvous

Directory of Open Access Journals (Sweden)

Bo Li

2015-01-01

Full Text Available In wireless sensor networks (WSNs), there is a “crowded center effect” where the energy of nodes located near a data sink drains much faster than other nodes resulting in a short network lifetime. To mitigate the “crowded center effect,” rendezvous points (RPs) are used to gather data from other nodes. In order to prolong the lifetime of WSN further, we propose using multiple sets of RPs in turn to average the energy consumption of the RPs. The problem is how to select the multiple sets of RPs and how long to use each set of RPs. An optimal algorithm and a heuristic algorithm are proposed to address this problem. The optimal algorithm is highly complex and only suitable for small scale WSN. The performance of the proposed algorithms is evaluated through simulations. The simulation results indicate that the heuristic algorithm approaches the optimal one and that using multiple RP sets can significantly prolong network lifetime.
Optimal structural inference of signaling pathways from unordered and overlapping gene sets.

Science.gov (United States)

Acharya, Lipi R; Judeh, Thair; Wang, Guangdi; Zhu, Dongxiao

2012-02-15

A plethora of bioinformatics analysis has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and broaden the scope from focusing only on pairwise interactions to the more general cascading events in the inference of signaling pathway structures. We propose a gene set based simulated annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each cascade represents a chain of molecular interactions from the cell surface to the nucleus. Gene sets in our context refer to discrete sets of genes participating in signal cascades, the basic building blocks of a signaling pathway, with no prior information about gene orderings in the cascades. From a compendium of gene sets related to a pathway, SA aims to search for signal cascades that characterize the optimal signaling pathway structure. In the search process, the extent of overlap among signal cascades is used to measure the optimality of a structure. Throughout, we treat gene sets as random samples from a first-order Markov chain model. We evaluated the performance of SA in three case studies. In the first study conducted on 83 KEGG pathways, SA demonstrated a significantly better performance than Bayesian network methods. Since both SA and Bayesian network methods accommodate discrete data, use a 'search and score' network learning strategy and output a directed network, they can be compared in terms of performance and computational time. In the second study, we compared SA and
Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

International Nuclear Information System (INIS)

Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

2010-01-01

Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring
An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms.

Science.gov (United States)

Hua, Hong-Li; Zhang, Fa-Zhan; Labena, Abraham Alemayehu; Dong, Chuan; Jin, Yan-Ting; Guo, Feng-Biao

Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria with characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Suitable features were used to construct models for making predictions in distantly related bacteria. The accuracy of predictions was evaluated by the agreement between the predictions and the known essential genes of the target species. A highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. A recently released independent dataset from Synechococcus elongatus was obtained to further assess the performance of our model. The AUC of these predictions was 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results comparable to, or even better than, those of integrated features. The work also indicates that a machine learning-based method can assign more effective weight coefficients than an empirical formula based on biological knowledge.
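The AUC values quoted in this abstract can be illustrated with a minimal sketch. It assumes the standard rank-based (Mann-Whitney) definition of AUC, that is, the fraction of positive/negative pairs that the classifier ranks correctly. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study, whose actual features and models are not described here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pair-counting formula.

    labels: sequence of 0/1 values (1 = essential gene; hypothetical labels)
    scores: sequence of classifier scores, higher = more likely essential
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; a tie counts one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up toy data, only to show the call.
    toy_labels = [1, 1, 0, 1, 0, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
    print(roc_auc(toy_labels, toy_scores))
```

In a tenfold cross-validation setting, a function like this would be applied to the held-out fold of each split; the pair-counting form is quadratic in the number of genes and is meant for clarity rather than speed.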
Multiple blocking sets in PG(n,q), n>=3

DEFF Research Database (Denmark)

Barat, Janos

2004-01-01

This article discusses minimal s-fold blocking sets B in PG(n, q), $q = p^h$, p prime, q > 661, n ≥ 3, of size $|B| > sq + c_p q^{2/3} - (s-1)(s-2)/2$ ($s > \min(c_p q^{1/6}, q^{1/4}/2)$). It is shown that these s-fold blocking sets contain the disjoint union of a collection of s lines and/or Baer subplanes....... To obtain these results, we extend results of Blokhuis–Storme–Szönyi on s-fold blocking sets in PG(2, q) to s-fold blocking sets having points to which a multiplicity is given. Then the results in PG(n, q), n ≥ 3, are obtained using projection arguments. The results of this article also improve results...
Upregulation of Immunoglobulin-related Genes in Cortical Sections from Multiple Sclerosis Patients

NARCIS (Netherlands)

Torkildsen, O.; Stansberg, C.; Angelskar, S.M.; Kooi, E.J.; Geurts, J.J.G.; van der Valk, P.; Myhr, K.M.; Steen, V.M.; Bo, L.

2010-01-01

Multiple sclerosis (MS) is a demyelinating disease of the central nervous system (CNS). Microarray-based global gene expression profiling is a promising method, used to study potential genes involved in the pathogenesis of the disease. In the present study, we have examined global gene expression in
GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

Science.gov (United States)

Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

2016-03-01

Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including GeneCards®, the human gene database; MalaCards, the human diseases database; and PathCards, the biological pathways database. Expression-based analysis in GeneAnalytics relies on LifeMap Discovery®, the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling an advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analysis and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics
Gene expression analysis of interferon-beta treatment in multiple sclerosis

DEFF Research Database (Denmark)

Sellebjerg, F.; Datta, P.; Larsen, J.

2008-01-01

Treatment with interferon-beta (IFN-beta) induces the expression of hundreds of genes in blood mononuclear cells, and the expression of several genes has been proposed as a marker of the effect of treatment with IFN-beta. However, to date no molecules have been identified that are stably induced by treatment with IFN-beta. We use DNA microarrays to study gene expression in 10 multiple sclerosis (MS) patients who began de novo treatment with IFN-beta. After the first injection of IFN-beta, the expression of 74 out of 3428 genes changed at least two-fold and statistically significantly (after Bonferroni...
Multiple genetic interaction experiments provide complementary information useful for gene function prediction.

Directory of Open Access Journals (Sweden)

Magali Michaut

Genetic interactions help map biological processes and their functional relationships. A genetic interaction is defined as a deviation from the expected phenotype when combining multiple genetic mutations. In Saccharomyces cerevisiae, most genetic interactions are measured under a single phenotype - growth rate in standard laboratory conditions. Recently genetic interactions have been collected under different phenotypic readouts and experimental conditions. How different are these networks and what can we learn from their differences? We conducted a systematic analysis of quantitative genetic interaction networks in yeast performed under different experimental conditions. We find that networks obtained using different phenotypic readouts, in different conditions and from different laboratories overlap less than expected and provide significant unique information. To exploit this information, we develop a novel method to combine individual genetic interaction data sets and show that the resulting network improves gene function prediction performance, demonstrating that individual networks provide complementary information. Our results support the notion that using diverse phenotypic readouts and experimental conditions will substantially increase the amount of gene function information produced by genetic interaction screens.
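The definition of a genetic interaction given in this abstract (a deviation from the expected double-mutant phenotype) can be made concrete with a short sketch. It assumes the commonly used multiplicative null model, in which the expected double-mutant fitness is the product of the two single-mutant fitnesses. The abstract does not say which null model the authors used, so the model, the function name `interaction_score`, and the example values are all illustrative assumptions.

```python
def interaction_score(fitness_a, fitness_b, fitness_ab):
    """Genetic interaction score under an assumed multiplicative null model.

    fitness_a, fitness_b: single-mutant fitness relative to wild type (1.0)
    fitness_ab: observed double-mutant fitness relative to wild type
    Returns observed minus expected fitness: negative values indicate an
    aggravating (synthetic sick/lethal) interaction, positive values an
    alleviating one, and values near zero no interaction.
    """
    expected = fitness_a * fitness_b  # assumption: multiplicative model
    return fitness_ab - expected


if __name__ == "__main__":
    # Made-up growth-rate values, only to show the sign convention.
    print(interaction_score(0.8, 0.5, 0.4))  # close to zero: no interaction
    print(interaction_score(0.8, 0.5, 0.1))  # negative: aggravating
    print(interaction_score(0.8, 0.5, 0.7))  # positive: alleviating
```

Scores computed from a different phenotypic readout or condition would form a separate network, which is the kind of data set the abstract describes comparing and combining.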
ADAGE signature analysis: differential expression analysis with data-defined gene sets.

Science.gov (United States)

Tan, Jie; Huyck, Matthew; Hu, Dongbo; Zelaya, René A; Hogan, Deborah A; Greene, Casey S

2017-11-22

Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data. Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server ( http://adage.greenelab.com ) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr. We designed
Can survival prediction be improved by merging gene expression data sets?

Directory of Open Access Journals (Sweden)

Haleh Yasrebi

BACKGROUND: High-throughput gene expression profiling technologies, which generate a wealth of data, are increasingly used for characterization of tumor biopsies for clinical trials. By applying machine learning algorithms to such clinically documented data sets, one hopes to improve tumor diagnosis, prognosis, as well as prediction of treatment response. However, the limited number of patients enrolled in a single trial study limits the power of machine learning approaches due to over-fitting. One could partially overcome this limitation by merging data from different studies. Nevertheless, such data sets differ from each other with regard to technical biases, patient selection criteria and follow-up treatment. It is therefore not clear at all whether the advantage of increased sample size outweighs the disadvantage of higher heterogeneity of merged data sets. Here, we present a systematic study to answer this question specifically for breast cancer data sets. We use survival prediction based on Cox regression as an assay to measure the added value of merged data sets. RESULTS: Using time-dependent Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and hazard ratio as performance measures, we see overall no significant improvement or deterioration of survival prediction with merged data sets as compared to individual data sets. This apparently was due to the fact that a few genes with strong prognostic power were not available on all microarray platforms and thus were not retained in the merged data sets. Surprisingly, we found that the overall best performance was achieved with a single-gene predictor consisting of CYB5D1. CONCLUSIONS: Merging did not deteriorate performance on average despite (a) the diversity of microarray platforms used, (b) the heterogeneity of patient cohorts, (c) the heterogeneity of breast cancer disease, (d) substantial variation of time to death or relapse, (e) the reduced number of genes in the merged data
Genes with a spike expression are clustered in chromosome (sub)bands and spike (sub)bands have a powerful prognostic value in patients with multiple myeloma

Science.gov (United States)

Kassambara, Alboukadel; Hose, Dirk; Moreaux, Jérôme; Walker, Brian A.; Protopopov, Alexei; Reme, Thierry; Pellestor, Franck; Pantesco, Véronique; Jauch, Anna; Morgan, Gareth; Goldschmidt, Hartmut; Klein, Bernard

2012-01-01

Background: Genetic abnormalities are common in patients with multiple myeloma, and may deregulate gene products involved in tumor survival, proliferation, metabolism and drug resistance. In particular, translocations may result in a high expression of targeted genes (termed spike expression) in tumor cells. We identified spike genes in multiple myeloma cells of patients with newly-diagnosed myeloma and investigated their prognostic value. Design and Methods: Genes with a spike expression in multiple myeloma cells were picked up using box plot probe set signal distribution and two selection filters. Results: In a cohort of 206 newly diagnosed patients with multiple myeloma, 2587 genes/expressed sequence tags with a spike expression were identified. Some spike genes were associated with some transcription factors such as MAF or MMSET and with known recurrent translocations as expected. Spike genes were not associated with increased DNA copy number and for a majority of them, involved unknown mechanisms. Of spiked genes, 36.7% clustered significantly in 149 out of 862 documented chromosome (sub)bands, of which 53 had prognostic value (35 bad, 18 good). Their prognostic value was summarized with a spike band score that delineated 23.8% of patients with a poor median overall survival (27.4 months versus not reached, P ...). The spike band score was independent of other gene expression profiling-based risk scores, t(4;14), or del17p in an independent validation cohort of 345 patients. Conclusions: We present a new approach to identify spike genes and their relationship to patients’ survival. PMID:22102711
Circadian Enhancers Coordinate Multiple Phases of Rhythmic Gene Transcription In Vivo

Science.gov (United States)

Fang, Bin; Everett, Logan J.; Jager, Jennifer; Briggs, Erika; Armour, Sean M.; Feng, Dan; Roy, Ankur; Gerhart-Hines, Zachary; Sun, Zheng; Lazar, Mitchell A.

2014-01-01

SUMMARY Mammalian transcriptomes display complex circadian rhythms with multiple phases of gene expression that cannot be accounted for by current models of the molecular clock. We have determined the underlying mechanisms by measuring nascent RNA transcription around the clock in mouse liver. Unbiased examination of eRNAs that cluster in specific circadian phases identified functional enhancers driven by distinct transcription factors (TFs). We further identify on a global scale the components of the TF cistromes that function to orchestrate circadian gene expression. Integrated genomic analyses also revealed novel mechanisms by which a single circadian factor controls opposing transcriptional phases. These findings shed new light on the diversity and specificity of TF function in the generation of multiple phases of circadian gene transcription in a mammalian organ. PMID:25416951
A non-inheritable maternal Cas9-based multiple-gene editing system in mice

OpenAIRE

Takayuki Sakurai; Akiko Kamiyoshi; Hisaka Kawate; Chie Mori; Satoshi Watanabe; Megumu Tanaka; Ryuichi Uetake; Masahiro Sato; Takayuki Shindo

2016-01-01

The CRISPR/Cas9 system is capable of editing multiple genes through one-step zygote injection. The preexisting method is largely based on the co-injection of Cas9 DNA (or mRNA) and guide RNAs (gRNAs); however, it is unclear how many genes can be simultaneously edited by this method, and a reliable means to generate transgenic (Tg) animals with multiple gene editing has yet to be developed. Here, we employed non-inheritable maternal Cas9 (maCas9) protein derived from Tg mice with systemic Cas9...
Synaptic genes are extensively downregulated across multiple brain regions in normal human aging and Alzheimer’s disease

Science.gov (United States)

Berchtold, Nicole C.; Coleman, Paul D.; Cribbs, David H.; Rogers, Joseph; Gillen, Daniel L.; Cotman, Carl W.

2014-01-01

Synapses are essential for transmitting, processing, and storing information, all of which decline in aging and Alzheimer’s disease (AD). Because synapse loss only partially accounts for the cognitive declines seen in aging and AD, we hypothesized that existing synapses might undergo molecular changes that reduce their functional capacity. Microarrays were used to evaluate expression profiles of 340 synaptic genes in aging (20–99 years) and AD across 4 brain regions from 81 cases. The analysis revealed an unexpectedly large number of significant expression changes in synapse-related genes in aging, with many undergoing progressive downregulation across aging and AD. Functional classification of the genes showing altered expression revealed that multiple aspects of synaptic function are affected, notably synaptic vesicle trafficking and release, neurotransmitter receptors and receptor trafficking, postsynaptic density scaffolding, cell adhesion regulating synaptic stability, and neuromodulatory systems. The widespread declines in synaptic gene expression in normal aging suggest that the function of existing synapses might be impaired, and that a common set of synaptic genes is vulnerable to change in aging and AD. PMID:23273601
CAsubtype: An R Package to Identify Gene Sets Predictive of Cancer Subtypes and Clinical Outcomes.

Science.gov (United States)

Kong, Hualei; Tong, Pan; Zhao, Xiaodong; Sun, Jielin; Li, Hua

2018-03-01

In the past decade, molecular classification of cancer has gained high popularity owing to its high predictive power on clinical outcomes as compared with traditional methods commonly used in clinical practice. In particular, using gene expression profiles, recent studies have successfully identified a number of gene sets for the delineation of cancer subtypes that are associated with distinct prognosis. However, identification of such gene sets remains a laborious task due to the lack of tools with flexibility, integration and ease of use. To reduce the burden, we have developed an R package, CAsubtype, to efficiently identify gene sets predictive of cancer subtypes and clinical outcomes. By integrating more than 13,000 annotated gene sets, CAsubtype provides a comprehensive repertoire of candidates for new cancer subtype identification. For easy data access, CAsubtype further includes the gene expression and clinical data of more than 2000 cancer patients from TCGA. CAsubtype first employs principal component analysis to identify gene sets (from user-provided or package-integrated ones) with robust principal components representing significantly large variation between cancer samples. Based on these principal components, CAsubtype visualizes the sample distribution in low-dimensional space for better understanding of the distinction between samples and classifies samples into subgroups with prevalent clustering algorithms. Finally, CAsubtype performs survival analysis to compare the clinical outcomes between the identified subgroups, assessing their clinical value as potentially novel cancer subtypes. In conclusion, CAsubtype is a flexible and well-integrated tool in the R environment to identify gene sets for cancer subtype identification and clinical outcome prediction. Its simple R commands and comprehensive data sets enable efficient examination of the clinical value of any given gene set, thus facilitating hypothesis generating and testing in biological and

Managing multiple projects: a literature review of setting priorities and a pilot survey of healthcare researchers in an academic setting.

Science.gov (United States)

Hopkins, Robert Borden; Campbell, Kaitryn; O'Reilly, Daria; Tarride, Jean-Eric; Bowen, Jim; Blackhouse, Gord; Goerre, Ron

2007-05-16

To summarize and then assess with a pilot study the use of published best practice recommendations for priority setting during management of multiple healthcare research projects, in a resource-constrained environment. Medical, economic, business, and operations literature was reviewed to summarize and develop a survey to assess best practices for managing multiple projects. Fifteen senior healthcare research project managers, directors, and faculty at an urban academic institution were surveyed to determine most commonly used priority rules, ranking of rules, characteristics of their projects, and availability of resources. Survey results were compared to literature recommendations to determine use of best practices. Seven priority-setting rules were identified for managing multiple projects. Recommendations on assigning priorities by project characteristics are presented. In the pilot study, a large majority of survey respondents follow best practice recommendations identified in the research literature. However, priority rules such as Most Total Successors (MTS) and Resource Scheduling Method (RSM) were used "very often" by half of the respondents when better performing priority rules were available. Through experience, project managers learn to manage multiple projects under resource constraints. Best practice literature can assist project managers in priority setting by recommending the most appropriate priority given resource constraints and project characteristics. There is room for improvement in managing multiple projects.
Global map of physical interactions among differentially expressed genes in multiple sclerosis relapses and remissions.

Science.gov (United States)

Tuller, Tamir; Atar, Shimshi; Ruppin, Eytan; Gurevich, Michael; Achiron, Anat

2011-09-15

Multiple sclerosis (MS) is a central nervous system autoimmune inflammatory T-cell-mediated disease with a relapsing-remitting course in the majority of patients. In this study, we performed a high-resolution systems biology analysis of gene expression and physical interactions in MS relapse and remission. To this end, we integrated 164 large-scale measurements of gene expression in peripheral blood mononuclear cells of MS patients in relapse or remission and healthy subjects, with large-scale information about the physical interactions between these genes obtained from public databases. These data were analyzed with a variety of computational methods. We find that there is a clear and significant global network-level signal that is related to the changes in gene expression of MS patients in comparison to healthy subjects. However, despite the clear differences in the clinical symptoms of MS patients in relapse versus remission, the network-level signal is weaker when comparing patients in these two stages of the disease. This result suggests that most of the genes have relatively similar expression levels in the two stages of the disease. In accordance with previous studies, we found that the pathways related to regulation of cell death, chemotaxis and inflammatory response are differentially expressed in the disease in comparison to healthy subjects, while pathways related to cell adhesion, cell migration and cell-cell signaling are activated in relapse in comparison to remission. However, the current study includes a detailed report of the exact set of genes involved in these pathways and the interactions between them. For example, we found that the genes TP53 and IL1 are 'network hubs' that interact with many of the differentially expressed genes in MS patients versus healthy subjects, and the epidermal growth factor receptor is a 'network hub' in the case of MS patients with relapse versus remission. The statistical approaches employed in this study enabled us
Repression of Middle Sporulation Genes in Saccharomyces cerevisiae by the Sum1-Rfm1-Hst1 Complex Is Maintained by Set1 and H3K4 Methylation

Science.gov (United States)

Jaiswal, Deepika; Jezek, Meagan; Quijote, Jeremiah; Lum, Joanna; Choi, Grace; Kulkarni, Rushmie; Park, DoHwan; Green, Erin M.

2017-01-01

The conserved yeast histone methyltransferase Set1 targets H3 lysine 4 (H3K4) for mono, di, and trimethylation and is linked to active transcription due to the euchromatic distribution of these methyl marks and the recruitment of Set1 during transcription. However, loss of Set1 results in increased expression of multiple classes of genes, including genes adjacent to telomeres and middle sporulation genes, which are repressed under normal growth conditions because they function in meiotic progression and spore formation. The mechanisms underlying Set1-mediated gene repression are varied, and still unclear in some cases, although repression has been linked to both direct and indirect action of Set1, associated with noncoding transcription, and is often dependent on the H3K4me2 mark. We show that Set1, and particularly the H3K4me2 mark, are implicated in repression of a subset of middle sporulation genes during vegetative growth. In the absence of Set1, there is loss of the DNA-binding transcriptional regulator Sum1 and the associated histone deacetylase Hst1 from chromatin in a locus-specific manner. This is linked to increased H4K5ac at these loci and aberrant middle gene expression. These data indicate that, in addition to DNA sequence, histone modification status also contributes to proper localization of Sum1. Our results also show that the role for Set1 in middle gene expression control diverges as cells receive signals to undergo meiosis. Overall, this work dissects an unexplored role for Set1 in gene-specific repression, and provides important insights into a new mechanism associated with the control of gene expression linked to meiotic differentiation. PMID:29066473
Suitability of public use secondary data sets to study multiple activities.

Science.gov (United States)

Putnam, Michelle; Morrow-Howell, Nancy; Inoue, Megumi; Greenfield, Jennifer C; Chen, Huajuan; Lee, YungSoo

2014-10-01

The aims of this study were to inventory activity items within and across U.S. public use data sets, to identify gaps in represented activity domains and challenges in interpreting domains, and to assess the potential for studying multiple activity engagement among older adults using existing data. We engaged in content analysis of activity measures of 5 U.S. public use data sets with nationally representative samples of older adults. Data sets included the Health & Retirement Survey (HRS), Americans' Changing Lives Survey (ACL), Midlife in the United States Survey (MIDUS), the National Health Interview Survey (NHIS), and the Panel Study of Income Dynamics survey (PSID). Two waves of each data set were analyzed. We identified 13 distinct activity domains across the 5 data sets, with substantial differences in representation of those domains among the data sets, and variance in the number and type of activity measures included in each. Our findings indicate that although it is possible to study multiple activity engagement within existing data sets, fuller sets of activity measures need to be developed in order to evaluate the portfolio of activities older adults engage in and the relationship of these portfolios to health and wellness outcomes. Importantly, clearer conceptual models of activity broadly conceived are required to guide this work. © The Author 2013. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
A Convenient Cas9-based Conditional Knockout Strategy for Simultaneously Targeting Multiple Genes in Mouse.

Science.gov (United States)

Chen, Jiang; Du, Yinan; He, Xueyan; Huang, Xingxu; Shi, Yun S

2017-03-31

The most powerful way to probe protein function is to characterize the consequence of its deletion. Compared to conventional gene knockout (KO), conditional knockout (cKO) provides an advanced gene targeting strategy with which gene deletion can be performed in a spatially and temporally restricted manner. However, for most species that are amphiploid, the widely used Cre-flox cKO system requires the targeted loci in both alleles to be loxP-flanked, which in practice requires time- and labor-consuming breeding. This burden is especially significant when dealing with multiple genes. The CRISPR/Cas9 genome modulation system has the advantage of being able to target multiple sites simultaneously. Here we propose a strategy that could achieve conditional KO of multiple genes in mouse with Cre recombinase-dependent Cas9 expression. By transgenic construction of loxP-stop-loxP (LSL) controlled Cas9 (LSL-Cas9) together with sgRNAs targeting EGFP, we showed that the fluorescent molecule could be eliminated in a Cre-dependent manner. We further verified the efficacy of this novel strategy for targeting multiple sites by deleting c-Maf and MafB simultaneously and specifically in macrophages. Compared to the traditional Cre-flox cKO strategy, this sgRNAs-LSL-Cas9 cKO system is simpler and faster, and would make conditional manipulation of multiple genes feasible.
Association of circadian rhythm genes ARNTL/BMAL1 and CLOCK with multiple sclerosis.

Directory of Open Access Journals (Sweden)

Polona Lavtar

Prevalence of multiple sclerosis varies with geographic latitude. We hypothesized that this fact might be partially associated with the influence of latitude on circadian rhythm and consequently that genetic variability of key circadian rhythm regulators, ARNTL and CLOCK genes, might contribute to the risk for multiple sclerosis. Our aim was to analyse selected polymorphisms of ARNTL and CLOCK, and their association with multiple sclerosis. A total of 900 Caucasian patients and 1024 healthy controls were compared for genetic signature at 8 SNPs, 4 for each gene. We found a statistically significant difference in genotype distributions between patients and controls (ARNTL rs3789327, P = 7.5·10⁻⁵; CLOCK rs6811520, P = 0.02). The ARNTL rs3789327 CC genotype was associated with higher risk for multiple sclerosis at an OR of 1.67 (95% CI 1.35-2.07, P = 0.0001) and the CLOCK rs6811520 genotype CC at an OR of 1.40 (95% CI 1.13-1.73, P = 0.002). The results of this study suggest that genetic variability in the ARNTL and CLOCK genes might be associated with risk for multiple sclerosis.
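Odds ratios with 95% confidence intervals, as reported in this abstract, are often derived from a 2×2 table of genotype carriers versus non-carriers among cases and controls. The sketch below assumes the standard Wald interval on the log odds ratio. The abstract does not give the underlying genotype counts or state the exact method (a regression model may have been used instead), so the function name `odds_ratio_ci` and the counts in the example are hypothetical.

```python
import math


def odds_ratio_ci(case_with, case_without, ctrl_with, ctrl_without, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    case_with / case_without: cases carrying / not carrying the genotype
    ctrl_with / ctrl_without: controls carrying / not carrying the genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    cells = (case_with, case_without, ctrl_with, ctrl_without)
    if min(cells) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_with * ctrl_without) / (case_without * ctrl_with)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not the study's data.
    print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1, as both intervals in the abstract do, corresponds to an association that is significant at roughly the 5% level under this approximation.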
A new fast method for inferring multiple consensus trees using k-medoids.

Science.gov (United States)

Tahiri, Nadia; Willems, Matthieu; Makarenkov, Vladimir

2018-04-05

Gene trees carry important information about specific evolutionary patterns which characterize the evolution of the corresponding gene families. However, a reliable species consensus tree cannot be inferred from a multiple sequence alignment of a single gene family or from the concatenation of alignments corresponding to gene families having different evolutionary histories. These evolutionary histories can be quite different due to horizontal transfer events or to ancient gene duplications which cause the emergence of paralogs within a genome. Many methods have been proposed to infer a single consensus tree from a collection of gene trees. Still, the application of these tree merging methods can lead to the loss of specific evolutionary patterns which characterize some gene families or some groups of gene families. Thus, the problem of inferring multiple consensus trees from a given set of gene trees becomes relevant. We describe a new fast method for inferring multiple consensus trees from a given set of phylogenetic trees (i.e. additive trees or X-trees) defined on the same set of species (i.e. objects or taxa). The traditional consensus approach yields a single consensus tree. We use the popular k-medoids partitioning algorithm to divide a given set of trees into several clusters of trees. We propose novel versions of the well-known Silhouette and Caliński-Harabasz cluster validity indices that are adapted for tree clustering with k-medoids. The efficiency of the new method was assessed using both synthetic and real data, such as a well-known phylogenetic dataset consisting of 47 gene trees inferred for 14 archaeal organisms. The method described here allows inference of multiple consensus trees from a given set of gene trees. It can be used to identify groups of gene trees having similar intragroup and different intergroup evolutionary histories. The main advantage of our method is that it is much faster than the existing tree clustering approaches, while
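The clustering step described above can be illustrated with a minimal sketch of k-medoids on a precomputed distance matrix. This is not the authors' implementation: it assumes pairwise tree distances (hypothetically Robinson-Foulds distances) are already available, uses a simple alternating assign/update scheme with a deterministic start, and leaves out the adapted Silhouette and Caliński-Harabasz indices that the abstract uses to choose the number of clusters. The function name and toy data are illustrative only.

```python
def k_medoids(dist, k, max_iter=100):
    """Partition items 0..n-1 into k clusters from a symmetric distance matrix.

    Simple alternating k-medoids. In the tree setting, dist[i][j] would hold
    a tree-to-tree distance (an assumption here, e.g. Robinson-Foulds).
    """
    n = len(dist)
    medoids = list(range(k))  # deterministic start: the first k items
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance to its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can happen when two medoids coincide
                new_medoids.append(m)
                continue
            best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return sorted(medoids), labels


# Toy stand-in for tree distances: six items forming two obvious groups.
positions = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in positions] for a in positions]
medoids, labels = k_medoids(toy_dist, k=2)
print(medoids, labels)  # [1, 4] [1, 1, 1, 4, 4, 4]
```

In the multiple-consensus setting, one consensus tree would then be built per cluster; that step is outside this sketch.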
A level set method for multiple sclerosis lesion segmentation.

Science.gov (United States)

Zhao, Yue; Guo, Shuxu; Luo, Min; Shi, Xue; Bilello, Michel; Zhang, Shaoxiang; Li, Chunming

2018-06-01

In this paper, we present a level set method for multiple sclerosis (MS) lesion segmentation from FLAIR images in the presence of intensity inhomogeneities. We use a three-phase level set formulation of segmentation and bias field estimation to segment MS lesions and normal tissue region (including GM and WM) and CSF and the background from FLAIR images. To save computational load, we derive a two-phase formulation from the original multi-phase level set formulation to segment the MS lesions and normal tissue regions. The derived method inherits the desirable ability to precisely locate object boundaries of the original level set method, which simultaneously performs segmentation and estimation of the bias field to deal with intensity inhomogeneity. Experimental results demonstrate the advantages of our method over other state-of-the-art methods in terms of segmentation accuracy. Copyright © 2017 Elsevier Inc. All rights reserved.
Genetic evaluation with major genes and polygenic inheritance when some animals are not genotyped using gene content multiple-trait BLUP.

Science.gov (United States)

Legarra, Andrés; Vitezica, Zulma G

2015-11-17

In pedigreed populations with a major gene segregating for a quantitative trait, it is not clear how to use pedigree, genotype and phenotype information when some individuals are not genotyped. We propose to consider gene content at the major gene as a second trait correlated to the quantitative trait, in a gene content multiple-trait best linear unbiased prediction (GCMTBLUP) method. The genetic covariance between the trait and gene content at the major gene is a function of the substitution effect of the gene. This genetic covariance can be written in a multiple-trait form that accommodates any pattern of missing values for either genotype or phenotype data. Effects of major gene alleles and the genetic covariance between genotype at the major gene and the phenotype can be estimated using standard EM-REML or Gibbs sampling. Prediction of breeding values with genotypes at the major gene can use multiple-trait BLUP software. Major genes with more than two alleles can be considered by including negative covariances between gene contents at each different allele. We simulated two scenarios: a selected and an unselected trait with heritabilities of 0.05 and 0.5, respectively. In both cases, the major gene explained half the genetic variation. Competing methods used imputed gene contents derived by the method of Gengler et al. or by iterative peeling. Imputed gene contents, in contrast to GCMTBLUP, do not consider information on the quantitative trait for genotype prediction. GCMTBLUP gave unbiased estimates of the gene effect, in contrast to the other methods, with less bias and better or equal accuracy of prediction. GCMTBLUP improved estimation of genotypes in non-genotyped individuals, in particular if these individuals had own phenotype records and the trait had a high heritability. Ignoring the major gene in genetic evaluation led to serious biases and decreased prediction accuracy. GCMTBLUP is the best linear predictor of additive genetic merit including
Level-set simulations of buoyancy-driven motion of single and multiple bubbles

International Nuclear Information System (INIS)

Balcázar, Néstor; Lehmkuhl, Oriol; Jofre, Lluís; Oliva, Assensi

2015-01-01

Highlights: • A conservative level-set method is validated and verified. • An extensive study of buoyancy-driven motion of single bubbles is performed. • The interaction of two spherical and ellipsoidal bubbles is studied. • The interaction of multiple bubbles is simulated in a vertical channel. Abstract: This paper presents a numerical study of buoyancy-driven motion of single and multiple bubbles by means of the conservative level-set method. First, an extensive study of the hydrodynamics of single bubbles rising in a quiescent liquid is performed, including their shape, terminal velocity, drag coefficients and wake patterns. These results are validated against experimental and numerical data well established in the scientific literature. Then, a further study on the interaction of two spherical and ellipsoidal bubbles is performed for different orientation angles. Finally, the interaction of multiple bubbles is explored in a periodic vertical channel. The results show that the conservative level-set approach can be used for accurate modelling of bubble dynamics. Moreover, it is demonstrated that the present method is numerically stable for a wide range of Morton and Reynolds numbers.
Gene set-based module discovery in the breast cancer transcriptome

Directory of Open Access Journals (Sweden)

Zhang Michael Q

2009-02-01

Abstract Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about the regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression module in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.
IGEMS: The Consortium on Interplay of Genes and Environment Across Multiple Studies

DEFF Research Database (Denmark)

Pedersen, Nancy L; Christensen, Kaare; Dahl, Anna K

2013-01-01

The Interplay of Genes and Environment across Multiple Studies (IGEMS) group is a consortium of eight longitudinal twin studies established to explore the nature of social context effects and gene-environment interplay in late-life functioning. The resulting analysis of the combined data from ove...
Action of multiple intra-QTL genes concerted around a co-localized transcription factor underpins a large effect QTL

Science.gov (United States)

Dixit, Shalabh; Kumar Biswal, Akshaya; Min, Aye; Henry, Amelia; Oane, Rowena H.; Raorane, Manish L.; Longkumer, Toshisangba; Pabuayon, Isaiah M.; Mutte, Sumanth K.; Vardarajan, Adithi R.; Miro, Berta; Govindan, Ganesan; Albano-Enriquez, Blesilda; Pueffeld, Mandy; Sreenivasulu, Nese; Slamet-Loedin, Inez; Sundarvelpandian, Kalaipandian; Tsai, Yuan-Ching; Raghuvanshi, Saurabh; Hsing, Yue-Ie C.; Kumar, Arvind; Kohli, Ajay

2015-01-01

Sub-QTLs and multiple intra-QTL genes are hypothesized to underpin large-effect QTLs. Known QTLs over gene families, biosynthetic pathways or certain traits represent functional gene-clusters of genes of the same gene ontology (GO). Gene-clusters containing genes of different GO have not been elaborated, except in silico as coexpressed genes within QTLs. Here we demonstrate the requirement of multiple intra-QTL genes for the full impact of QTL qDTY12.1 on rice yield under drought. Multiple lines of evidence are presented for the need of the transcription factor ‘no apical meristem’ (OsNAM12.1) and its co-localized target genes of separate GO categories for qDTY12.1 function, raising a regulon-like model of genetic architecture. The molecular underpinnings of qDTY12.1 support its effectiveness in further improving a drought tolerant genotype and its validity in multiple genotypes/ecosystems/environments. Resolving the combinatorial value of OsNAM12.1 with individual intra-QTL genes notwithstanding, identification and analyses of qDTY12.1 has fast-tracked rice improvement towards food security. PMID:26507552
A common variant within the HNF1B gene is associated with overall survival of multiple myeloma patients

DEFF Research Database (Denmark)

Ríos-Tamayo, Rafael; Lupiañez, Carmen Belén; Campa, Daniele

2016-01-01

Diabetogenic single nucleotide polymorphisms (SNPs) have recently been associated with multiple myeloma (MM) risk but their impact on overall survival (OS) of MM patients has not been analysed yet. In order to investigate the impact of 58 GWAS-identified variants for type 2 diabetes (T2D) on OS of patients with MM, we analysed genotyping data of 936 MM patients collected by the International Multiple Myeloma rESEarch (IMMENSE) consortium and an independent set of 700 MM patients recruited by the University Clinic of Heidelberg. A meta-analysis of the Cox regression results of the two sets showed that rs7501939 located in the HNF1B gene negatively impacted OS (HRRec = 1.44, 95% CI = 1.18-1.76, P = 0.0001). The meta-analysis also showed a noteworthy gender-specific association of the SLC30A8 rs13266634 SNP with OS. The presence of each additional copy of the minor allele at rs13266634 was associated...
Function of One Regular Separable Relation Set Decided for the Minimal Covering in Multiple Valued Logic

Directory of Open Access Journals (Sweden)

Liu Yu Zhen

2016-01-01

Multiple-valued logic is an important branch of computer science and technology. It covers the theory of multiple-valued logic, multiple-valued circuits and systems, and their applications. In the theory of multiple-valued logic, one primary and important problem is the completeness of function sets, which can be solved by determining all the precomplete sets (also called maximal closed sets) of K-valued function sets, denoted PK*; another is the decision problem for Sheffer functions, which can be completely solved by finding all the minimal coverings of the precomplete sets. In the function structure theory of multiple-valued logic, the decision on Sheffer functions plays an important role. It involves the structure and decision of full and partial multiple-valued logic, and it is closely related to the decision on completeness of function sets, which can be made by determining the minimal covering of full and partial multiple-valued logic. Using the completeness theory of partial multiple-valued logic, we prove that the function of one regular separable relation is not a minimal covering of PK* under the condition m = 2, σ = e.
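The link between precomplete sets and Sheffer functions is easiest to see in the classical two-valued case, which is much simpler than the partial K-valued setting of the paper. By Post's theorem, two-valued logic has exactly five precomplete classes (0-preserving, 1-preserving, self-dual, monotone, affine), and a single function is a Sheffer function exactly when it lies in none of them. The sketch below checks this by brute force; all function names are illustrative, and it makes no claim about the paper's regular separable relations.

```python
from itertools import product


def truth_table(f, n):
    """Map every n-bit input tuple to f's output (0 or 1)."""
    return {x: f(*x) for x in product((0, 1), repeat=n)}


def preserves_zero(t, n):
    return t[(0,) * n] == 0


def preserves_one(t, n):
    return t[(1,) * n] == 1


def is_self_dual(t, n):
    # f(not x) == not f(x) for every input x.
    return all(t[x] == 1 - t[tuple(1 - b for b in x)] for x in t)


def is_monotone(t, n):
    # x <= y componentwise implies f(x) <= f(y).
    return all(t[x] <= t[y]
               for x in t for y in t
               if all(a <= b for a, b in zip(x, y)))


def is_affine(t, n):
    # f(x) = c0 XOR c1*x1 XOR ... XOR cn*xn for some 0/1 coefficients.
    c0 = t[(0,) * n]
    coeffs = [t[tuple(1 if j == i else 0 for j in range(n))] ^ c0
              for i in range(n)]
    return all(t[x] == c0 ^ (sum(c * b for c, b in zip(coeffs, x)) % 2)
               for x in t)


def is_sheffer(f, n):
    """True if f alone is complete, i.e. f avoids all five precomplete classes."""
    t = truth_table(f, n)
    classes = (preserves_zero, preserves_one, is_self_dual, is_monotone, is_affine)
    return not any(in_class(t, n) for in_class in classes)


nand = lambda a, b: 1 - (a & b)
print(is_sheffer(nand, 2))                # True
print(is_sheffer(lambda a, b: a & b, 2))  # False
```

For K > 2 the list of precomplete classes is far longer, which is why identifying a minimal covering of them matters for the Sheffer decision problem.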
The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis

DEFF Research Database (Denmark)

Debrabant, Birgit

2017-01-01

MOTIVATION: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis. This is a major handicap to the interpretation of results obtained from a gene set analysis. RESULTS: This work presents a hierarchical statistical model based on the notion of dependence measures, which overcomes this problem. The two levels of the model naturally reflect the modular structure of many gene set analysis methods. We apply the model to show that the popular GSEA method, which recently has been claimed to test the self-contained null hypothesis, actually tests the competitive null if the weight parameter is zero. However, for this result to hold strictly, the choice of the dependence measures...
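To make the role of the weight parameter concrete, here is a minimal sketch of the GSEA running-sum enrichment score. It follows the commonly described form of the statistic (steps up at set members, weighted by |score|^p, and steps down elsewhere) and does not implement the hierarchical model of the paper or any permutation-based significance test. The function name, argument names and toy data are illustrative assumptions.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """GSEA-style running-sum enrichment score.

    ranked_genes: genes ordered from most to least associated with the trait.
    scores: per-gene association scores, in the same order.
    weight: exponent p; with weight=0 every hit takes the same step size,
            giving a Kolmogorov-Smirnov-type statistic.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** weight / hit_norm  # step up at a set member
        else:
            running -= 1.0 / n_miss                 # step down otherwise
        if abs(running) > abs(best):
            best = running                          # keep the largest deviation
    return best


genes = ["a", "b", "c", "d", "e", "f"]
assoc = [3, 2, 1, -1, -2, -3]
es_unweighted = enrichment_score(genes, assoc, {"a", "d"}, weight=0)
es_weighted = enrichment_score(genes, assoc, {"a", "d"}, weight=1)
print(es_unweighted, es_weighted)  # 0.5 0.75
```

With weight 0 the score depends only on where the set members fall in the ranking, not on how large their scores are, which is the setting the abstract singles out.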
Fast generation of multiple resolution instances of raster data sets

NARCIS (Netherlands)

Arge, L.; Haverkort, H.J.; Tsirogiannis, C.P.

2012-01-01

In many GIS applications it is important to study the characteristics of a raster data set at multiple resolutions. Often this is done by generating several coarser resolution rasters from a fine resolution raster. In this paper we describe efficient algorithms for different variants of this
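As a point of reference for the task described, the following is a naive in-memory sketch that builds successively coarser rasters by averaging non-overlapping cell blocks. It is not the algorithm of the paper, whose contribution is efficiency; the block-averaging rule, the factor of 2 and the function names are all assumptions made for illustration.

```python
def coarsen(raster, factor=2):
    """Average non-overlapping factor x factor blocks of a 2-D list of numbers.

    Assumes both raster dimensions are divisible by `factor`.
    """
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [raster[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def pyramid(raster, factor=2):
    """Return the raster followed by coarser versions, finest first.

    Stops as soon as the current level can no longer be divided evenly.
    """
    levels = [raster]
    while (len(levels[-1]) % factor == 0
           and len(levels[-1][0]) % factor == 0):
        levels.append(coarsen(levels[-1], factor))
    return levels


fine = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
levels = pyramid(fine)
print(levels[1])  # [[3.5, 5.5], [11.5, 13.5]]
print(levels[2])  # [[8.5]]
```

Each level here is computed from the previous one, which is only valid for aggregates such as the mean over equal-sized blocks; other variants of the problem may need the fine raster directly.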
Development of the Multiple Gene Knockout System with One-Step PCR in Thermoacidophilic Crenarchaeon Sulfolobus acidocaldarius

Directory of Open Access Journals (Sweden)

Shoji Suzuki

2017-01-01

Multiple gene knockout systems developed in the thermoacidophilic crenarchaeon Sulfolobus acidocaldarius are powerful genetic tools. However, plasmid construction typically requires several steps. Alternatively, PCR tailing for high-throughput gene disruption was also developed in S. acidocaldarius, but repeated gene knockout based on PCR tailing has been limited due to lack of a genetic marker system. In this study, we demonstrated efficient homologous recombination frequency (2.8 × 10⁴ ± 6.9 × 10³ colonies/μg DNA) by optimizing the transformation conditions. This optimized protocol allowed us to develop reliable gene knockout via double crossover using short homologous arms and to establish the multiple gene knockout system with one-step PCR (MONSTER). In the MONSTER, a multiple gene knockout cassette was simply and rapidly constructed by one-step PCR without plasmid construction, and the PCR product can be immediately used for target gene deletion. As an example of the applications of this strategy, we successfully made a DNA photolyase (phr)- and arginine decarboxylase (argD)-deficient strain of S. acidocaldarius. In addition, an agmatine selection system consisting of an agmatine-auxotrophic strain and argD marker was also established. The MONSTER provides an alternative strategy that enables the very simple construction of multiple gene knockout cassettes for genetic studies in S. acidocaldarius.
The ALMT Gene Family Performs Multiple Functions in Plants

Directory of Open Access Journals (Sweden)

Jie Liu

2018-02-01

The aluminium activated malate transporter (ALMT) gene family is named after the first member of the family identified in wheat (Triticum aestivum L.). The product of this gene controls resistance to aluminium (Al) toxicity. ALMT genes encode transmembrane proteins that function as anion channels and perform multiple functions involving the transport of organic anions (e.g., carboxylates) and inorganic anions in cells. They share a PF11744 domain and are classified in the Fusaric acid resistance protein-like superfamily, CL0307. The proteins typically have five to seven transmembrane regions in the N-terminal half and a long hydrophilic C-terminal tail, but predictions of secondary structure vary. Although the family is widespread in plants, relatively little information is available on the roles performed by its other members. In this review, we summarize the functions of ALMT gene families, including Al resistance, stomatal function, mineral nutrition, microbe interactions, fruit acidity, light response and seed development.
Multiple Gene-Environment Interactions on the Angiogenesis Gene-Pathway Impact Rectal Cancer Risk and Survival

Directory of Open Access Journals (Sweden)

Noha Sharafeldin

2017-09-01

Characterization of gene-environment interactions (GEIs) in cancer is limited. We aimed at identifying GEIs in rectal cancer focusing on a relevant biologic process involving the angiogenesis pathway and relevant environmental exposures: cigarette smoking, alcohol consumption, and animal protein intake. We analyzed data from 747 rectal cancer cases and 956 controls from the Diet, Activity and Lifestyle as a Risk Factor for Rectal Cancer study. We applied a 3-step analysis approach: first, we searched for interactions among single nucleotide polymorphisms on the pathway genes; second, we searched for interactions among the genes, both steps using Logic regression; third, we examined the GEIs significant at the 5% level using logistic regression for cancer risk and Cox proportional hazards models for survival. A permutation-based test was used for multiple testing adjustment. We identified 8 significant GEIs associated with risk among 6 genes adjusting for multiple testing: TNF (OR = 1.85, 95% CI: 1.10, 3.11), TLR4 (OR = 2.34, 95% CI: 1.38, 3.98), and EGR2 (OR = 2.23, 95% CI: 1.04, 4.78) with smoking; IGF1R (OR = 1.69, 95% CI: 1.04, 2.72), TLR4 (OR = 2.10, 95% CI: 1.22, 3.60) and EGR2 (OR = 2.12, 95% CI: 1.01, 4.46) with alcohol; and PDGFB (OR = 1.75, 95% CI: 1.04, 2.92) and MMP1 (OR = 2.44, 95% CI: 1.24, 4.81) with protein. Five GEIs were associated with survival at the 5% significance level but not after multiple testing adjustment: CXCR1 (HR = 2.06, 95% CI: 1.13, 3.75) with smoking; and KDR (HR = 4.36, 95% CI: 1.62, 11.73), TLR2 (HR = 9.06, 95% CI: 1.14, 72.11), EGR2 (HR = 2.45, 95% CI: 1.42, 4.22), and EGFR (HR = 6.33, 95% CI: 1.95, 20.54) with protein. GEIs between angiogenesis genes and smoking, alcohol, and animal protein impact rectal cancer risk. Our results support the importance of considering the biologic hypothesis to characterize GEIs associated with cancer outcomes.

Simultaneous gene finding in multiple genomes.

Science.gov (United States)

König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

2016-11-15

As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or, if not, where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP-hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/. Contact: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Using OWL reasoning to support the generation of novel gene sets for enrichment analysis.

Science.gov (United States)

Osumi-Sutherland, David J; Ponta, Enrico; Courtot, Melanie; Parkinson, Helen; Badi, Laura

2018-02-14

The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to group annotated genes into sets whose products share function or location. These gene sets are widely used to interpret the results of genomics experiments by assessing which sets are significantly over- or under-represented in results lists. F Hoffmann-La Roche Ltd. has developed a bespoke, manually maintained controlled vocabulary (RCV) for use in over-representation analysis. Many terms in this vocabulary group GO terms in novel ways that cannot easily be derived using the graph structure of the GO. For example, some RCV terms group GO terms by the cell, chemical or tissue type they refer to. Recent improvements in the content and formal structure of the GO make it possible to use logical queries in Web Ontology Language (OWL) to automatically map these cross-cutting classifications to sets of GO terms. We used this approach to automate mapping between RCV and GO, largely replacing the increasingly unsustainable manual mapping process. We then tested the utility of the resulting groupings for over-representation analysis. We successfully mapped 85% of RCV terms to logical OWL definitions and showed that these could be used to recapitulate and extend manual mappings between RCV terms and the sets of GO terms subsumed by them. We also show that gene sets derived from the resulting GO term sets can be used to detect the signatures of cell and tissue types in whole genome expression data. The rich formal structure of the GO makes it possible to use reasoning to dynamically generate novel, biologically relevant groupings of GO terms. GO term groupings generated with this approach can be used in over-representation analysis to detect
Synergistic interactions between Drosophila orthologues of genes spanned by de novo human CNVs support multiple-hit models of autism.

Science.gov (United States)

Grice, Stuart J; Liu, Ji-Long; Webber, Caleb

2015-03-01

Autism spectrum disorders (ASDs) are highly heritable and characterised by deficits in social interaction and communication, as well as restricted and repetitive behaviours. Although a number of highly penetrant ASD gene variants have been identified, there is growing evidence to support a causal role for combinatorial effects arising from the contributions of multiple loci. By examining synaptic and circadian neurological phenotypes resulting from the dosage variants of unique human:fly orthologues in Drosophila, we observe numerous synergistic interactions between pairs of informatically-identified candidate genes whose orthologues are jointly affected by large de novo copy number variants (CNVs). These CNVs were found in the genomes of individuals with autism, including a patient carrying a 22q11.2 deletion. We first demonstrate that dosage alterations of the unique Drosophila orthologues of candidate genes from de novo CNVs that harbour only a single candidate gene display neurological defects similar to those previously reported in Drosophila models of ASD-associated variants. We then considered pairwise dosage changes within the set of orthologues of candidate genes that were affected by the same single human de novo CNV. For three of four CNVs with complete orthologous relationships, we observed significant synergistic effects following the simultaneous dosage change of gene pairs drawn from a single CNV. The phenotypic variation observed at the Drosophila synapse that results from these interacting genetic variants supports a concordant phenotypic outcome across all interacting gene pairs following the direction of human gene copy number change. We observe both specificity and transitivity between interactors, both within and between CNV candidate gene sets, supporting shared and distinct genetic aetiologies. We then show that different interactions affect divergent synaptic processes, demonstrating distinct molecular aetiologies. Our study illustrates
Stable carbon isotope fractionation of chlorinated ethenes by a microbial consortium containing multiple dechlorinating genes.

Science.gov (United States)

Liu, Na; Ding, Longzhen; Li, Haijun; Zhang, Pengpeng; Zheng, Jixing; Weng, Chih-Huang

2018-08-01

The study aimed to determine the possible contribution of specific growth conditions and community structures to variable carbon enrichment factor (Ɛ-carbon) values for the degradation of chlorinated ethenes (CEs) by a bacterial consortium with multiple dechlorinating genes. Ɛ-carbon values for trichloroethylene, cis-1,2-dichloroethylene, and vinyl chloride were -7.24‰ ± 0.59‰, -14.6‰ ± 1.71‰, and -21.1‰ ± 1.14‰, respectively, during their degradation by a microbial consortium containing multiple dechlorinating genes including tceA and vcrA. The Ɛ-carbon values of all CEs were not greatly affected by changes in growth conditions and community structures, which directly or indirectly affected reductive dechlorination of CEs by this consortium. Stability analysis provided evidence that the presence of multiple dechlorinating genes within a microbial consortium had little effect on carbon isotope fractionation, as long as the genes have definite, non-overlapping functions. Copyright © 2018 Elsevier Ltd. All rights reserved.
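For orientation, enrichment factors of this kind are conventionally obtained from the Rayleigh model of isotope fractionation. The abstract does not state which form was fitted, so the equation below is the standard textbook relation given as an assumption, with δ¹³C values in per mil and f the fraction of the chlorinated ethene remaining at time t.

```latex
\ln\!\left(\frac{\delta^{13}\mathrm{C}_t + 1000}{\delta^{13}\mathrm{C}_0 + 1000}\right)
  = \frac{\varepsilon}{1000}\,\ln f
```

In this form ε is the slope of a linear regression of the left-hand side on ln f, multiplied by 1000; a more negative ε means a stronger enrichment of ¹³C in the remaining substrate as degradation proceeds.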
A multicolor panel of TALE-KRAB based transcriptional repressor vectors enabling knockdown of multiple gene targets.

Science.gov (United States)

Zhang, Zhonghui; Wu, Elise; Qian, Zhijian; Wu, Wen-Shu

2014-12-05

Stable and efficient knockdown of multiple gene targets is highly desirable for dissection of molecular pathways. Because it allows sequence-specific DNA binding, transcription activator-like effector (TALE) offers a new genetic perturbation technique that allows for gene-specific repression. Here, we constructed a multicolor lentiviral TALE-Kruppel-associated box (KRAB) expression vector platform that enables knockdown of multiple gene targets. This platform is fully compatible with the Golden Gate TALEN and TAL Effector Kit 2.0, a widely used and efficient method for TALE assembly. We showed that this multicolor TALE-KRAB vector system, when combined with bone marrow transplantation, could quickly knock down c-Kit and PU.1 genes in hematopoietic stem and progenitor cells of recipient mice. Furthermore, our data demonstrated that this platform simultaneously knocked down both c-Kit and PU.1 genes in the same primary cell populations. Together, our results suggest that this multicolor TALE-KRAB vector platform is a promising and versatile tool for knockdown of multiple gene targets and could greatly facilitate dissection of molecular pathways.
Structured association analysis leads to insight into Saccharomyces cerevisiae gene regulation by finding multiple contributing eQTL hotspots associated with functional gene modules.

Science.gov (United States)

Curtis, Ross E; Kim, Seyoung; Woolford, John L; Xu, Wenjie; Xing, Eric P

2013-03-21

Association analysis using genome-wide expression quantitative trait locus (eQTL) data investigates the effect that genetic variation has on cellular pathways and leads to the discovery of candidate regulators. Traditional analysis of eQTL data via pairwise statistical significance tests or linear regression does not leverage the availability of the structural information of the transcriptome, such as presence of gene networks that reveal correlation and potentially regulatory relationships among the study genes. We employ a new eQTL mapping algorithm, GFlasso, which we have previously developed for sparse structured regression, to reanalyze a genome-wide yeast dataset. GFlasso fully takes into account the dependencies among expression traits to suppress false positives and to enhance the signal/noise ratio. Thus, GFlasso leverages the gene-interaction network to discover the pleiotropic effects of genetic loci that perturb the expression level of multiple (rather than individual) genes, which enables us to gain more power in detecting previously neglected signals that are marginally weak but pleiotropically significant. While eQTL hotspots in yeast have been reported previously as genomic regions controlling multiple genes, our analysis reveals additional novel eQTL hotspots and, more interestingly, uncovers groups of multiple contributing eQTL hotspots that affect the expression level of functional gene modules. To our knowledge, our study is the first to report this type of gene regulation stemming from multiple eQTL hotspots. Additionally, we report the results from in-depth bioinformatics analysis for three groups of these eQTL hotspots: ribosome biogenesis, telomere silencing, and retrotransposon biology. We suggest candidate regulators for the functional gene modules that map to each group of hotspots. Not only do we find that many of these candidate regulators contain mutations in the promoter and coding regions of the genes, in the case of the Ribi group
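The abstract does not write out the GFlasso objective. As a sketch only, the commonly cited form of the graph-guided fused lasso is given below, under the assumption that this is the variant used: X holds the genotypes, y_k the k-th expression trait, β_jk the effect of marker j on trait k, E the edge set of the gene network, r_ml the correlation between traits m and l, and f a weighting function such as f(r) = |r|. The symbols are illustrative and not taken from the abstract.

```latex
\min_{B}\; \sum_{k} \lVert y_k - X\beta_k \rVert_2^2
  \;+\; \lambda \sum_{k}\sum_{j} \lvert \beta_{jk} \rvert
  \;+\; \gamma \sum_{(m,l)\in E} f(r_{ml}) \sum_{j}
        \bigl\lvert \beta_{jm} - \operatorname{sign}(r_{ml})\,\beta_{jl} \bigr\rvert
```

The second term gives the usual lasso sparsity, while the third pulls the effects of a marker on two linked traits towards each other, which is how a locus with marginally weak effects on many co-regulated genes can still be detected.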
Rapid genome reshaping by multiple-gene loss after whole-genome duplication in teleost fish suggested by mathematical modeling

Science.gov (United States)

Sato, Yukuto; Tsukamoto, Katsumi; Nishida, Mutsumi

2015-01-01

Whole-genome duplication (WGD) is believed to be a significant source of major evolutionary innovation. Redundant genes resulting from WGD are thought to be lost or acquire new functions. However, the rates of gene loss and thus temporal process of genome reshaping after WGD remain unclear. The WGD shared by all teleost fish, one-half of all jawed vertebrates, was more recent than the two ancient WGDs that occurred before the origin of jawed vertebrates, and thus lends itself to analysis of gene loss and genome reshaping. Using a newly developed orthology identification pipeline, we inferred the post–teleost-specific WGD evolutionary histories of 6,892 protein-coding genes from nine phylogenetically representative teleost genomes on a time-calibrated tree. We found that rapid gene loss did occur in the first 60 My, with a loss of more than 70–80% of duplicated genes, and produced similar genomic gene arrangements within teleosts in that relatively short time. Mathematical modeling suggests that rapid gene loss occurred mainly by events involving simultaneous loss of multiple genes. We found that the subsequent 250 My were characterized by slow and steady loss of individual genes. Our pipeline also identified about 1,100 shared single-copy genes that are inferred to have become singletons before the divergence of clupeocephalan teleosts. Therefore, our comparative genome analysis suggests that rapid gene loss just after the WGD reshaped teleost genomes before the major divergence, and provides a useful set of marker genes for future phylogenetic analysis. PMID:26578810
A cross-study gene set enrichment analysis identifies critical pathways in endometriosis

Directory of Open Access Journals (Sweden)

Bai Chunyan

2009-09-01

Full Text Available Abstract Background Endometriosis is an enigmatic disease. Gene expression profiling of endometriosis has been used in several studies, but few studies went further to classify subtypes of endometriosis based on expression patterns and to identify possible pathways involved in endometriosis. Some of the observed pathways are more inconsistent between the studies, and these candidate pathways presumably only represent a fraction of the pathways involved in endometriosis. Methods We applied a standardised microarray preprocessing and gene set enrichment analysis to six independent studies, and demonstrated increased concordance between these gene datasets. Results We find 16 up-regulated and 19 down-regulated pathways common in ovarian endometriosis data sets, 22 up-regulated and one down-regulated pathway common in peritoneal endometriosis data sets. Among them, 12 up-regulated and 1 down-regulated were found consistent between ovarian and peritoneal endometriosis. The main canonical pathways identified are related to immunological and inflammatory disease. Early secretory phase has the most over-represented pathways in the three uterine cycle phases. There are no overlapping significant pathways between the dataset from human endometrial endothelial cells and the datasets from ovarian endometriosis which used whole tissues. Conclusion The study of complex diseases through pathway analysis is able to highlight genes weakly connected to the phenotype which may be difficult to detect by using classical univariate statistics. By standardised microarray preprocessing and GSEA, we have increased the concordance in identifying many biological mechanisms involved in endometriosis. The identified gene pathways will shed light on the understanding of endometriosis and promote the development of novel therapies.
Multiple gcd-closed sets and determinants of matrices associated with arithmetic functions

Directory of Open Access Journals (Sweden)

Hong Siao

2016-01-01

Full Text Available Let $f$ be an arithmetic function and $S = \{x_1, \ldots, x_n\}$ be a set of $n$ distinct positive integers. By $(f(x_i, x_j))$ (resp. $(f[x_i, x_j])$) we denote the $n \times n$ matrix having $f$ evaluated at the greatest common divisor $(x_i, x_j)$ (resp. the least common multiple $[x_i, x_j]$) of $x_i$ and $x_j$ as its $(i, j)$-entry. The set $S$ is said to be gcd-closed if $(x_i, x_j) \in S$ for $1 \le i, j \le n$. In this paper, we give formulas for the determinants of the matrices $(f(x_i, x_j))$ and $(f[x_i, x_j])$ if $S$ consists of multiple coprime gcd-closed sets (i.e., $S$ equals the union of $S_1, \ldots, S_k$ with $k \ge 1$ being an integer and $S_1, \ldots, S_k$ being gcd-closed sets such that $(\mathrm{lcm}(S_i), \mathrm{lcm}(S_j)) = 1$ for all $1 \le i \ne j \le k$). This extends the Bourque-Ligh, Hong’s and the Hong-Loewy formulas obtained in 1993, 2002 and 2011, respectively. It also generalizes the famous Smith’s determinant.
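As a small numerical illustration of the objects in this abstract, the sketch below checks the classical Smith determinant, det(gcd(i, j)) for 1 ≤ i, j ≤ n equals φ(1)φ(2)…φ(n), and tests whether a set is gcd-closed. It does not implement the paper's new formulas for unions of coprime gcd-closed sets; the function names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import gcd


def determinant(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def euler_phi(k):
    """Euler's totient by direct counting (fine for small k)."""
    return sum(1 for a in range(1, k + 1) if gcd(a, k) == 1)


def is_gcd_closed(values):
    """True if gcd(x, y) belongs to the set for every pair x, y in it."""
    s = set(values)
    return all(gcd(x, y) in s for x in s for y in s)


def smith_check(n):
    """Return (det of the n x n gcd matrix, product of phi(1..n))."""
    gcd_matrix = [[gcd(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]
    expected = 1
    for k in range(1, n + 1):
        expected *= euler_phi(k)
    return determinant(gcd_matrix), expected
```

For example, `smith_check(6)` compares the determinant of the 6 × 6 gcd matrix with φ(1)…φ(6), and `is_gcd_closed([2, 3])` is false because gcd(2, 3) = 1 is missing from the set.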
GSHR, a Web-Based Platform Provides Gene Set-Level Analyses of Hormone Responses in Arabidopsis

Directory of Open Access Journals (Sweden)

Xiaojuan Ran

2018-01-01

Full Text Available Phytohormones regulate diverse aspects of plant growth and environmental responses. Recent high-throughput technologies have promoted a more comprehensive profiling of genes regulated by different hormones. However, these omics data generally result in large gene lists that make it challenging to interpret the data and extract insights into biological significance. With the rapid accumulation of these large-scale experiments, especially the transcriptomic data available in public databases, a means of using this information to explore the transcriptional networks is needed. Different platforms have different architectures and designs, and even similar studies using the same platform may obtain data with large variances because of the highly dynamic and flexible effects of plant hormones; this makes it difficult to make comparisons across different studies and platforms. Here, we present a web server providing gene set-level analyses of Arabidopsis thaliana hormone responses. GSHR collected 333 RNA-seq and 1,205 microarray datasets from the Gene Expression Omnibus, characterizing transcriptomic changes in Arabidopsis in response to phytohormones including abscisic acid, auxin, brassinosteroids, cytokinins, ethylene, gibberellins, jasmonic acid, salicylic acid, and strigolactones. These data were further processed and organized into 1,368 gene sets regulated by different hormones or hormone-related factors. By comparing input gene lists to these gene sets, GSHR helped to identify gene sets from the input gene list regulated by different phytohormones or related factors. Together, GSHR links prior information regarding transcriptomic changes induced by hormones and related factors to newly generated data and facilitates cross-study and cross-platform comparisons; this helps facilitate the mining of biologically significant information from large-scale datasets. The GSHR is freely available at http://bioinfo.sibs.ac.cn/GSHR/.
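The abstract does not say which statistic GSHR uses when it compares an input gene list with its stored gene sets. A common choice for this kind of comparison is a hypergeometric over-representation test, sketched below under that assumption. The function names, the toy gene identifiers, and the universe size are all hypothetical and are not part of GSHR.

```python
from math import comb


def hypergeom_sf(overlap, universe, set_size, list_size):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a universe in which set_size genes belong to the gene set."""
    upper = min(set_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def rank_gene_sets(input_genes, gene_sets, universe_size):
    """Return (set name, overlap size, p-value) tuples, smallest p-value first.

    Assumes every gene in the input list and in the gene sets belongs to the
    same background universe of universe_size genes.
    """
    query = set(input_genes)
    results = []
    for name, members in gene_sets.items():
        members = set(members)
        overlap = len(query & members)
        p_value = hypergeom_sf(overlap, universe_size, len(members), len(query))
        results.append((name, overlap, p_value))
    return sorted(results, key=lambda item: item[2])


# Toy data: placeholder gene identifiers, not real Arabidopsis loci.
toy_sets = {"auxin_up": {"A", "B", "C"}, "ethylene_up": {"X", "Y", "Z"}}
ranked = rank_gene_sets({"A", "B", "C", "Q"}, toy_sets, universe_size=100)
```

In practice the resulting p-values would also need a multiple-testing correction, since one test is run per gene set.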
Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth.

Science.gov (United States)

Chaillou, Thomas; Jackson, Janna R; England, Jonathan H; Kirby, Tyler J; Richards-White, Jena; Esser, Karyn A; Dupont-Versteegden, Esther E; McCarthy, John J

2015-01-01

The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. Copyright © 2015 the American Physiological Society.
Nitrogen Cycle Evaluation (NiCE) Chip for the Simultaneous Analysis of Multiple N-Cycle Associated Genes.

Science.gov (United States)

Oshiki, Mamoru; Segawa, Takahiro; Ishii, Satoshi

2018-02-02

Various microorganisms play key roles in the Nitrogen (N) cycle. Quantitative PCR (qPCR) and PCR-amplicon sequencing of the N cycle functional genes allow us to analyze the abundance and diversity of microbes responsible for the N transforming reactions in various environmental samples. However, analysis of multiple target genes can be cumbersome and expensive. PCR-independent analysis, such as metagenomics and metatranscriptomics, is useful but expensive, especially when we analyze multiple samples and try to detect N cycle functional genes present at relatively low abundance. Here, we present the application of microfluidic qPCR chip technology to simultaneously quantify and prepare amplicon sequence libraries for multiple N cycle functional genes as well as taxon-specific 16S rRNA gene markers for many samples. This approach, named the N cycle evaluation (NiCE) chip, was evaluated by using DNA from pure and artificially mixed bacterial cultures and by comparing the results with those obtained by conventional qPCR and amplicon sequencing methods. Quantitative results obtained by the NiCE chip were comparable to those obtained by conventional qPCR. In addition, the NiCE chip was successfully applied to examine abundance and diversity of N cycle functional genes in wastewater samples. Although non-specific amplification was detected on the NiCE chip, this could be overcome by optimizing the primer sequences in the future. As the NiCE chip can provide a high-throughput format to quantify and prepare sequence libraries for multiple N cycle functional genes, this tool should advance our ability to explore N cycling in various samples. Importance. We report a novel approach, namely the Nitrogen Cycle Evaluation (NiCE) chip, using microfluidic qPCR chip technology. By sequencing the amplicons recovered from the NiCE chip, we can assess diversities of the N cycle functional genes. The NiCE chip technology is applicable to analyze the temporal dynamics of the N cycle gene
A viral microRNA down-regulates multiple cell cycle genes through mRNA 5'UTRs.

Directory of Open Access Journals (Sweden)

Finn Grey

2010-06-01

Full Text Available Global gene expression data combined with bioinformatic analysis provides strong evidence that mammalian miRNAs mediate repression of gene expression primarily through binding sites within the 3' untranslated region (UTR). Using RNA induced silencing complex immunoprecipitation (RISC-IP) techniques we have identified multiple cellular targets for a human cytomegalovirus (HCMV) miRNA, miR-US25-1. Strikingly, this miRNA binds target sites primarily within 5'UTRs, mediating significant reduction in gene expression. Intriguingly, many of the genes targeted by miR-US25-1 are associated with cell cycle control, including cyclin E2, BRCC3, EID1, MAPRE2, and CD147, suggesting that miR-US25-1 is targeting genes within a related pathway. Deletion of miR-US25-1 from HCMV results in overexpression of cyclin E2 in the context of viral infection. Our studies demonstrate that a viral miRNA mediates translational repression of multiple cellular genes by targeting mRNA 5'UTRs.
A BAC-bacterial recombination method to generate physically linked multiple gene reporter DNA constructs

Directory of Open Access Journals (Sweden)

Gong Shiaochin

2009-03-01

Full Text Available Abstract Background Reporter gene mice are valuable animal models for biological research providing a gene expression readout that can contribute to cellular characterization within the context of a developmental process. With the advancement of bacterial recombination techniques to engineer reporter gene constructs from BAC genomic clones and the generation of optically distinguishable fluorescent protein reporter genes, there is an unprecedented capability to engineer more informative transgenic reporter mouse models relative to what has been traditionally available. Results We demonstrate here our first effort on the development of a three stage bacterial recombination strategy to physically link multiple genes together with their respective fluorescent protein (FP) reporters in one DNA fragment. This strategy uses bacterial recombination techniques to: (1) subclone genes of interest into BAC linking vectors, (2) insert desired reporter genes into respective genes and (3) link different gene-reporters together. As proof of concept, we have generated a single DNA fragment containing the genes Trap, Dmp1, and Ibsp driving the expression of ECFP, mCherry, and Topaz FP reporter genes, respectively. Using this DNA construct, we have successfully generated transgenic reporter mice that retain two to three gene readouts. Conclusion The three stage methodology to link multiple genes with their respective fluorescent protein reporter works with reasonable efficiency. Moreover, gene linkage allows for their common chromosomal integration into a single locus. However, the testing of this multi-reporter DNA construct by transgenesis does suggest that the linkage of two different genes together, despite their large size, can still create a positional effect. We believe that gene choice, genomic DNA fragment size and the presence of endogenous insulator elements are critical variables.
A BAC-bacterial recombination method to generate physically linked multiple gene reporter DNA constructs.

Science.gov (United States)

Maye, Peter; Stover, Mary Louise; Liu, Yaling; Rowe, David W; Gong, Shiaochin; Lichtler, Alexander C

2009-03-13

Reporter gene mice are valuable animal models for biological research providing a gene expression readout that can contribute to cellular characterization within the context of a developmental process. With the advancement of bacterial recombination techniques to engineer reporter gene constructs from BAC genomic clones and the generation of optically distinguishable fluorescent protein reporter genes, there is an unprecedented capability to engineer more informative transgenic reporter mouse models relative to what has been traditionally available. We demonstrate here our first effort on the development of a three stage bacterial recombination strategy to physically link multiple genes together with their respective fluorescent protein (FP) reporters in one DNA fragment. This strategy uses bacterial recombination techniques to: (1) subclone genes of interest into BAC linking vectors, (2) insert desired reporter genes into respective genes and (3) link different gene-reporters together. As proof of concept, we have generated a single DNA fragment containing the genes Trap, Dmp1, and Ibsp driving the expression of ECFP, mCherry, and Topaz FP reporter genes, respectively. Using this DNA construct, we have successfully generated transgenic reporter mice that retain two to three gene readouts. The three stage methodology to link multiple genes with their respective fluorescent protein reporter works with reasonable efficiency. Moreover, gene linkage allows for their common chromosomal integration into a single locus. However, the testing of this multi-reporter DNA construct by transgenesis does suggest that the linkage of two different genes together, despite their large size, can still create a positional effect. We believe that gene choice, genomic DNA fragment size and the presence of endogenous insulator elements are critical variables.
DNMT1 is associated with cell cycle and DNA replication gene sets in diffuse large B-cell lymphoma.

Science.gov (United States)

Loo, Suet Kee; Ab Hamid, Suzina Sheikh; Musa, Mustaffa; Wong, Kah Keng

2018-01-01

Dysregulation of DNA (cytosine-5)-methyltransferase 1 (DNMT1) is associated with the pathogenesis of various types of cancer. It has been previously shown that DNMT1 is frequently expressed in diffuse large B-cell lymphoma (DLBCL), however its functions remain to be elucidated in the disease. In this study, we gene expression profiled (GEP) shRNA targeting DNMT1(shDNMT1)-treated germinal center B-cell-like DLBCL (GCB-DLBCL)-derived cell line (i.e. HT) compared with non-silencing shRNA (control shRNA)-treated HT cells. Independent gene set enrichment analysis (GSEA) performed using GEPs of shRNA-treated HT cells and primary GCB-DLBCL cases derived from two publicly-available datasets (i.e. GSE10846 and GSE31312) produced three separate lists of enriched gene sets for each gene sets collection from Molecular Signatures Database (MSigDB). Subsequent Venn analysis identified 268, 145 and six consensus gene sets from analyzing gene sets in C2 collection (curated gene sets), C5 sub-collection [gene sets from gene ontology (GO) biological process ontology] and Hallmark collection, respectively to be enriched in positive correlation with DNMT1 expression profiles in shRNA-treated HT cells, GSE10846 and GSE31312 datasets [false discovery rate (FDR) 0.8) with DNMT1 expression and significantly downregulated (log fold-change <-1.35; p<0.05) following DNMT1 silencing in HT cells. These results suggest the involvement of DNMT1 in the activation of cell cycle and DNA replication in DLBCL cells. Copyright © 2017 Elsevier GmbH. All rights reserved.
Entropy and Multifractality for the Myeloma Multiple TET 2 Gene

Directory of Open Access Journals (Sweden)

Carlo Cattani

2012-01-01

Full Text Available The nucleotide and amino-acid distributions are studied for two variants of mRNA of the gene that codes for a protein which is involved in multiple myeloid. Some patches and symmetries are singled out, thus showing some distinctions between the two variants. Fractal dimensions and entropy are discussed as well.
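The abstract does not state which entropy measure the authors use. As one plausible reading, the sketch below computes the Shannon entropy of the symbol distribution of a sequence, both globally and over sliding windows, which is a simple way to compare two mRNA variants. The function names and the example strings are illustrative assumptions, not data from the paper.

```python
from collections import Counter
from math import log2


def shannon_entropy(sequence):
    """Shannon entropy (in bits) of the symbol frequencies in a sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def windowed_entropy(sequence, window):
    """Entropy of every contiguous window of the given length."""
    return [
        shannon_entropy(sequence[start:start + window])
        for start in range(len(sequence) - window + 1)
    ]


# Toy strings only; a real analysis would load the two mRNA variants.
uniform = shannon_entropy("ACGU")          # all four nucleotides equally frequent
profile = windowed_entropy("AAAACGUACGU", 4)
```

A sequence using all four nucleotides equally often reaches the maximum of 2 bits, while a homopolymer run has entropy 0, so dips in the windowed profile flag low-complexity patches.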
Multiple genes encode the major surface glycoprotein of Pneumocystis carinii

DEFF Research Database (Denmark)

Kovacs, J A; Powell, F; Edman, J C

1993-01-01

hydrophobic region at the carboxyl terminus. The presence of multiple related msg genes encoding the major surface glycoprotein of P. carinii suggests that antigenic variation is a possible mechanism for evading host defenses. Further characterization of this family of genes should allow the development......The major surface antigen of Pneumocystis carinii, a life-threatening opportunistic pathogen in human immunodeficiency virus-infected patients, is an abundant glycoprotein that functions in host-organism interactions. A monoclonal antibody to this antigen is protective in animals, and thus...... blot studies using chromosomal or restricted DNA, the major surface glycoproteins are the products of a multicopy family of genes. The predicted protein has an M(r) of approximately 123,000, is relatively rich in cysteine residues (5.5%) that are very strongly conserved, and contains a well conserved...
Coping and Sexual Harassment: How Victims Cope across Multiple Settings.

Science.gov (United States)

Scarduzio, Jennifer A; Sheff, Sarah E; Smith, Mathew

2018-02-01

The ways sexual harassment occurs both online and in face-to-face settings have become more complicated. Sexual harassment that occurs in cyberspace or online sexual harassment adds complexity to the experiences of victims, current research understandings, and the legal dimensions of this phenomenon. Social networking sites (SNS) are a type of social media that offer unique opportunities to users and sometimes the communication that occurs on SNS can cross the line from flirtation into online sexual harassment. Victims of sexual harassment employ communicative strategies such as coping to make sense of their experiences of sexual harassment. The current study qualitatively examined problem-focused, active emotion-focused, and passive emotion-focused coping strategies employed by sexual harassment victims across multiple settings. We conducted 26 in-depth interviews with victims who had experienced sexual harassment across multiple settings (e.g., face-to-face and SNS). The findings present 16 types of coping strategies: five problem-focused, five active emotion-focused, and six passive emotion-focused. The victims used an average of three types of coping strategies during their experiences. Theoretical implications extend research on passive emotion-focused coping strategies by discussing powerlessness and how victims blame other victims. Furthermore, theoretically the findings reveal that coping is a complex, cyclical process and that victims shift among types of coping strategies over the course of their experience. Practical implications are offered for victims and for SNS sites.
Adaptive Horizontal Gene Transfers between Multiple Cheese-Associated Fungi.

Science.gov (United States)

Ropars, Jeanne; Rodríguez de la Vega, Ricardo C; López-Villavicencio, Manuela; Gouzy, Jérôme; Sallet, Erika; Dumas, Émilie; Lacoste, Sandrine; Debuchy, Robert; Dupont, Joëlle; Branca, Antoine; Giraud, Tatiana

2015-10-05

Domestication is an excellent model for studies of adaptation because it involves recent and strong selection on a few, identified traits [1-5]. Few studies have focused on the domestication of fungi, with notable exceptions [6-11], despite their importance to bioindustry [12] and to a general understanding of adaptation in eukaryotes [5]. Penicillium fungi are ubiquitous molds among which two distantly related species have been independently selected for cheese making: P. roqueforti for blue cheeses like Roquefort and P. camemberti for soft cheeses like Camembert. The selected traits include morphology, aromatic profile, lipolytic and proteolytic activities, and ability to grow at low temperatures, in a matrix containing bacterial and fungal competitors [13-15]. By comparing the genomes of ten Penicillium species, we show that adaptation to cheese was associated with multiple recent horizontal transfers of large genomic regions carrying crucial metabolic genes. We identified seven horizontally transferred regions (HTRs) spanning more than 10 kb each, flanked by specific transposable elements, and displaying nearly 100% identity between distant Penicillium species. Two HTRs carried genes with functions involved in the utilization of cheese nutrients or competition and were found nearly identical in multiple strains and species of cheese-associated Penicillium fungi, indicating recent selective sweeps; they were experimentally associated with faster growth and greater competitiveness on cheese and contained genes highly expressed in the early stage of cheese maturation. These findings have industrial and food safety implications and improve our understanding of the processes of adaptation to rapid environmental changes. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

AnovArray: a set of SAS macros for the analysis of variance of gene expression data

Directory of Open Access Journals (Sweden)

Renard Jean-Paul

2005-06-01

Full Text Available Abstract Background Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the ANOVA model is the possibility to evaluate multiple sources of variation in an experiment. Results AnovArray is a package implementing ANOVA for gene expression data using SAS® statistical software. The originality of the package is (1) to quantify the different sources of variation on all genes together, (2) to provide a quality control of the model, and (3) to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. Conclusion AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS® statistical software.
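AnovArray itself is a set of SAS macros, and its models are not reproduced here. To make the underlying idea concrete, the sketch below computes a plain one-way ANOVA F statistic for each gene across experimental groups, in Python rather than SAS. The function names and the toy expression values are hypothetical, and the gene-level variance models and multiple-comparison correction mentioned in the abstract are deliberately left out.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    groups: list of lists, one list of expression values per condition.
    Assumes at least two groups and non-zero within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by the condition (between groups).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation (within groups).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def f_per_gene(expression):
    """Apply the test gene by gene; expression maps gene name -> list of groups."""
    return {gene: one_way_anova_f(groups) for gene, groups in expression.items()}


# Toy data with made-up gene names and values.
toy_expression = {
    "gene_a": [[1, 2, 3], [2, 3, 4], [5, 6, 7]],
    "gene_b": [[1, 2, 3], [1, 2, 3], [1, 2, 4]],
}
f_values = f_per_gene(toy_expression)
```

Each F value would then be converted to a p-value with the F distribution on (k - 1, n - k) degrees of freedom and corrected for the number of genes tested.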
A polymorphism in the HLA-DPB1 gene is associated with susceptibility to multiple sclerosis.

Directory of Open Access Journals (Sweden)

Judith Field

2010-10-01

Full Text Available We conducted an association study across the human leukocyte antigen (HLA) complex to identify loci associated with multiple sclerosis (MS). Comparing 1927 SNPs in 1618 MS cases and 3413 controls of European ancestry, we identified seven SNPs that were independently associated with MS conditional on the others (each $P \le 4 \times 10^{-6}$). All associations were significant in an independent replication cohort of 2212 cases and 2251 controls ($P \le 0.001$) and were highly significant in the combined dataset ($P \le 6 \times 10^{-8}$). The associated SNPs included proxies for HLA-DRB1*15:01 and HLA-DRB1*03:01, and SNPs in moderate linkage disequilibrium (LD) with HLA-A*02:01, HLA-DRB1*04:01 and HLA-DRB1*13:03. We also found a strong association with rs9277535 in the class II gene HLA-DPB1 (discovery set $P = 9 \times 10^{-9}$, replication set $P = 7 \times 10^{-4}$, combined $P = 2 \times 10^{-10}$). HLA-DPB1 is located centromeric of the more commonly typed class II genes HLA-DRB1, -DQA1 and -DQB1. It is separated from these genes by a recombination hotspot, and the association is not affected by conditioning on genotypes at DRB1, DQA1 and DQB1. Hence rs9277535 represents an independent MS-susceptibility locus of genome-wide significance. It is correlated with the HLA-DPB1*03:01 allele, which has been implicated previously in MS in smaller studies. Further genotyping in large datasets is required to confirm and resolve this association.
The Data Set on the Multiple Abilities

DEFF Research Database (Denmark)

Klynge, Alice Heegaard

2008-01-01

This paper presents a data set on multiple abilities. The abilities cover the Literacy and Math Ability, the Creative and Innovative Ability, the Learning Ability, the Communication Ability, the Social Competency, the Self-Management Ability, the Environmental Awareness, the Civic Competency......, the Intercultural Awareness, and the Health Awareness. The data stems from a unique cross-sectional survey carried out for the adult population in Denmark. Several dimensions and many questions pinpoint and measure every ability. The dimensions cover areas such as the individual behavior at work, the individual...... behavior in leisure, the motivation for using an ability, the working conditions for using an ability, and the educational conditions for using an ability. The paper defines every ability and describes the dimensions and the questions underlying the abilities. It reports the categories of answers...
C/EBPβ Mediates Growth Hormone-Regulated Expression of Multiple Target Genes

Science.gov (United States)

Cui, Tracy X.; Lin, Grace; LaPensee, Christopher R.; Calinescu, Anda-Alexandra; Rathore, Maanjot; Streeter, Cale; Piwien-Pilipuk, Graciela; Lanning, Nathan; Jin, Hui; Carter-Su, Christin; Qin, Zhaohui S.

2011-01-01

Regulation of c-Fos transcription by GH is mediated by CCAAT/enhancer binding protein β (C/EBPβ). This study examines the role of C/EBPβ in mediating GH activation of other early response genes, including Cyr61, Btg2, Socs3, Zfp36, and Socs1. C/EBPβ depletion using short hairpin RNA impaired responsiveness of these genes to GH, as seen for c-Fos. Rescue with wild-type C/EBPβ led to GH-dependent recruitment of the coactivator p300 to the c-Fos promoter. In contrast, rescue with C/EBPβ mutated at the ERK phosphorylation site at T188 failed to induce GH-dependent recruitment of p300, indicating that ERK-mediated phosphorylation of C/EBPβ at T188 is required for GH-induced recruitment of p300 to c-Fos. GH also induced the occupancy of phosphorylated C/EBPβ and p300 on Cyr61, Btg2, and Socs3 at predicted C/EBP-cAMP response element-binding protein motifs in their promoters. Consistent with a role for ERKs in GH-induced expression of these genes, treatment with U0126 to block ERK phosphorylation inhibited their GH-induced expression. In contrast, GH-dependent expression of Zfp36 and Socs1 was not inhibited by U0126. Thus, induction of multiple early response genes by GH in 3T3-F442A cells is mediated by C/EBPβ. A subset of these genes is regulated similarly to c-Fos, through a mechanism involving GH-stimulated ERK 1/2 activation, phosphorylation of C/EBPβ, and recruitment of p300. Overall, these studies suggest that C/EBPβ, like the signal transducer and activator of transcription proteins, regulates multiple genes in response to GH. PMID:21292824
Visual Comparison of Multiple Gene Expression Datasets in a Genomic Context

Directory of Open Access Journals (Sweden)

Borowski Krzysztof

2008-06-01

Full Text Available The need for novel methods of visualizing microarray data is growing. New perspectives are beneficial to finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights on expression patterns, by allowing them to analyze the expression values in relation to the gene locations as well as to compare expression profiles of related genomes or of different experiments for the same genome.
Meta-Analysis of Multiple Sclerosis Microarray Data Reveals Dysregulation in RNA Splicing Regulatory Genes

Directory of Open Access Journals (Sweden)

Elvezia Maria Paraboschi

2015-09-01

Full Text Available Abnormalities in RNA metabolism and alternative splicing (AS) are emerging as important players in complex disease phenotypes. In particular, accumulating evidence suggests the existence of pathogenic links between multiple sclerosis (MS) and altered AS, including functional studies showing that an imbalance in alternatively-spliced isoforms may contribute to disease etiology. Here, we tested whether the altered expression of AS-related genes represents a MS-specific signature. A comprehensive comparative analysis of gene expression profiles of publicly-available microarray datasets (190 MS cases, 182 controls), followed by gene-ontology enrichment analysis, highlighted a significant enrichment for differentially-expressed genes involved in RNA metabolism/AS. In detail, a total of 17 genes were found to be differentially expressed in MS in multiple datasets, with CELF1 being dysregulated in five out of seven studies. We confirmed CELF1 downregulation in MS (p = 0.0015) by real-time RT-PCRs on RNA extracted from blood cells of 30 cases and 30 controls. As a proof of concept, we experimentally verified the imbalance in alternatively-spliced isoforms in MS of the NFAT5 gene, a putative CELF1 target. In conclusion, for the first time we provide evidence of a consistent dysregulation of splicing-related genes in MS and we discuss its possible implications in modulating specific AS events in MS susceptibility genes.
Multiple independent insertions of 5S rRNA genes in the spliced-leader gene family of trypanosome species.

Science.gov (United States)

Beauparlant, Marc A; Drouin, Guy

2014-02-01

Analyses of the 5S rRNA genes found in the spliced-leader (SL) gene repeat units of numerous trypanosome species suggest that such linkages were not inherited from a common ancestor, but were the result of independent 5S rRNA gene insertions. In trypanosomes, 5S rRNA genes are found either in the tandemly repeated units coding for SL genes or in independent tandemly repeated units. Given that trypanosome species where 5S rRNA genes are within the tandemly repeated units coding for SL genes are phylogenetically related, one might hypothesize that this arrangement is the result of an ancestral insertion of 5S rRNA genes into the tandemly repeated SL gene family of trypanosomes. Here, we use the types of 5S rRNA genes found associated with SL genes, the flanking regions of the inserted 5S rRNA genes and the position of these insertions to show that most of the 5S rRNA genes found within SL gene repeat units of trypanosome species were not acquired from a common ancestor but are the results of independent insertions. These multiple 5S rRNA gene insertion events in trypanosomes are likely the result of frequent founder events in different hosts and/or geographical locations in species having short generation times.
Identification of a set of genes showing regionally enriched expression in the mouse brain

Directory of Open Access Journals (Sweden)

Marra Marco A

2008-07-01

Full Text Available Abstract Background The Pleiades Promoter Project aims to improve gene therapy by designing human mini-promoters ( Results We have utilized LongSAGE to identify regionally enriched transcripts in the adult mouse brain. As supplemental strategies, we also performed a meta-analysis of published literature and inspected the Allen Brain Atlas in situ hybridization data. From a set of approximately 30,000 mouse genes, 237 were identified as showing specific or enriched expression in 30 target regions of the mouse brain. GO term over-representation among these genes revealed co-involvement in various aspects of central nervous system development and physiology. Conclusion Using a multi-faceted expression validation approach, we have identified mouse genes whose human orthologs are good candidates for design of mini-promoters. These mouse genes represent molecular markers in several discrete brain regions/cell-types, which could potentially provide a mechanistic explanation of unique functions performed by each region. This set of markers may also serve as a resource for further studies of gene regulatory elements influencing brain expression.
The SET1 Complex Selects Actively Transcribed Target Genes via Multivalent Interaction with CpG Island Chromatin

Directory of Open Access Journals (Sweden)

David A. Brown

2017-09-01

Full Text Available Chromatin modifications and the promoter-associated epigenome are important for the regulation of gene expression. However, the mechanisms by which chromatin-modifying complexes are targeted to the appropriate gene promoters in vertebrates and how they influence gene expression have remained poorly defined. Here, using a combination of live-cell imaging and functional genomics, we discover that the vertebrate SET1 complex is targeted to actively transcribed gene promoters through CFP1, which engages in a form of multivalent chromatin reading that involves recognition of non-methylated DNA and histone H3 lysine 4 trimethylation (H3K4me3). CFP1 defines SET1 complex occupancy on chromatin, and its multivalent interactions are required for the SET1 complex to place H3K4me3. In the absence of CFP1, gene expression is perturbed, suggesting that normal targeting and function of the SET1 complex are central to creating an appropriately functioning vertebrate promoter-associated epigenome.
The SH2D2A gene and susceptibility to multiple sclerosis

DEFF Research Database (Denmark)

Lorentzen, A.R.; Smestad, C.; Lie, B.A.

2008-01-01

We previously reported an association between the SH2D2A gene encoding TSAd and multiple sclerosis (MS). Here a total of 2128 Nordic MS patients and 2004 controls were genotyped for the SH2D2A promoter GA repeat polymorphism and rs926103 encoding a serine to asparagine substitution at amino acid...... that the SH2D2A gene may contribute to susceptibility to MS. Publication date: 2008/7/15...
Tracking difference in gene expression in a time-course experiment using gene set enrichment analysis.

Directory of Open Access Journals (Sweden)

Pui Shan Wong

Full Text Available Fistulifera sp. strain JPCC DA0580 is a newly sequenced pennate diatom that is capable of simultaneously growing and accumulating lipids. This is a unique trait, not found in other related microalgae so far. It is able to accumulate between 40 and 60% of its cell weight in lipids, making it a strong candidate for the production of biofuel. To investigate this characteristic, we used RNA-Seq data gathered at four different times while Fistulifera sp. strain JPCC DA0580 was grown in oil accumulating and non-oil accumulating conditions. We then adapted gene set enrichment analysis (GSEA) to investigate the relationship between the difference in gene expression of 7,822 genes and metabolic functions in our data. We utilized information in the KEGG pathway database to create the gene sets and changed GSEA to use re-sampling so that data from the different time points could be included in the analysis. Our GSEA method identified photosynthesis, lipid synthesis and amino acid synthesis related pathways as processes that play a significant role in oil production and growth in Fistulifera sp. strain JPCC DA0580. In addition to GSEA, we visualized the results by creating a network of compounds and reactions, and plotted the expression data on top of the network. This made existing graph algorithms available to us, which we then used to calculate a path that metabolizes glucose into triacylglycerol (TAG) in the smallest number of steps. By visualizing the data this way, we observed a separate up-regulation of genes at different times instead of a concerted response. We also identified two metabolic paths that used fewer reactions than the one shown in KEGG and showed that the reactions were up-regulated during the experiment. The combination of analysis and visualization methods successfully analyzed time-course data, identified important metabolic pathways and provided new hypotheses for further research.
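The idea of finding a path from glucose to TAG in the smallest number of steps can be illustrated with a breadth-first search over a compound network. The following is only a minimal sketch: the toy network, the compound names, and the function `shortest_path` are hypothetical illustrations and are not taken from the paper or from KEGG, and the abstract does not say which graph algorithm the authors used.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Return the path with the fewest steps from start to goal, or None."""
    parents = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical toy network: each edge stands for one reaction step.
toy_network = {
    "glucose": ["G6P"],
    "G6P": ["pyruvate", "F6P"],
    "F6P": ["pyruvate"],
    "pyruvate": ["acetyl-CoA"],
    "acetyl-CoA": ["fatty acid"],
    "fatty acid": ["TAG"],
}

path = shortest_path(toy_network, "glucose", "TAG")
print(" -> ".join(path))
```

Breadth-first search is sufficient here because every step is treated as having equal cost; weighting reactions by expression level would call for a weighted shortest-path method instead.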
Fast generation of multiple resolution instances of raster data sets

DEFF Research Database (Denmark)

Arge, Lars; Haverkort, Herman; Tsirogiannis, Constantinos

2012-01-01

In many GIS applications it is important to study the characteristics of a raster data set at multiple resolutions. Often this is done by generating several coarser resolution rasters from a fine resolution raster. In this paper we describe efficient algorithms for different variants of this problem...... in the main memory of the computer. We also provide two algorithms that solve this problem in external memory, that is, when the input raster is larger than the main memory. The first external algorithm is very easy to implement and requires O(sort(N)) data block transfers from/to the external memory....... For this variant we describe an algorithm that runs in O(U log N) time in internal memory, where U is the size of the output. We show how this algorithm can be adapted to perform efficiently in the external memory using O(sort(U)) data transfers from the disk. We have also implemented two of the presented algorithms...
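To make the task concrete, the sketch below generates successively coarser rasters from a fine one by averaging non-overlapping 2 x 2 blocks. This is a naive in-memory illustration, not one of the algorithms from the paper: the aggregation function (mean), the factor of 2, and the names `coarsen` and `pyramid` are all assumptions, since the truncated abstract does not specify them.

```python
def coarsen(raster, factor):
    """Average non-overlapping factor x factor blocks of a 2D list of numbers."""
    rows, cols = len(raster), len(raster[0])
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Edge blocks may be smaller when the size is not a multiple of factor.
            block = [raster[i][j]
                     for i in range(r, min(r + factor, rows))
                     for j in range(c, min(c + factor, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def pyramid(raster):
    """Return the list of rasters from the finest level down to a single cell."""
    levels = [raster]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(coarsen(levels[-1], 2))
    return levels

fine = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16]]
levels = pyramid(fine)
for level in levels:
    print(level)
```

Each level is computed from the previous one, so the total work is proportional to the size of the finest raster; handling rasters larger than main memory, which is the focus of the paper, is outside the scope of this sketch.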
Cross-species multiple environmental stress responses: An integrated approach to identify candidate genes for multiple stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and related model species.

Directory of Open Access Journals (Sweden)

Adugna Abdi Woldesemayat

Full Text Available Crop response to the changing climate and unpredictable effects of global warming with adverse conditions such as drought stress has brought concerns about food security to the fore; crop yield loss is a major cause of concern in this regard. Identification of genes with multiple responses across environmental stresses is the genetic foundation that leads to crop adaptation to environmental perturbations. In this paper, we introduce an integrated approach to assess candidate genes for multiple stress responses across species. The approach combines ontology based semantic data integration with expression profiling, comparative genomics, phylogenomics, functional gene enrichment and gene enrichment network analysis to identify genes associated with plant stress phenotypes. Five different ontologies, viz., Gene Ontology (GO), Trait Ontology (TO), Plant Ontology (PO), Growth Ontology (GRO) and Environment Ontology (EO), were used to semantically integrate drought related information. Target genes linked to Quantitative Trait Loci (QTLs) controlling yield and stress tolerance in sorghum (Sorghum bicolor (L.) Moench) and closely related species were identified. Based on the enriched GO terms of the biological processes, 1116 sorghum genes with potential responses to 5 different stresses, such as drought (18%), salt (32%), cold (20%), heat (8%) and oxidative stress (25%), were identified to be over-expressed. Out of 169 sorghum drought responsive QTLs associated genes that were identified based on expression datasets, 56% were shown to have multiple stress responses. On the other hand, out of 168 additional genes that have been evaluated for orthologous pairs, 90% were conserved across species for drought tolerance. Over 50% of identified maize and rice genes were responsive to drought and salt stresses and were co-located within multifunctional QTLs. Among the total identified multi-stress responsive genes, 272 targets were shown to be co-localized within QTLs
GOBO: gene expression-based outcome for breast cancer online.

Directory of Open Access Journals (Sweden)

Markus Ringnér

Full Text Available Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.
Extracting Gene Networks for Low-Dose Radiation Using Graph Theoretical Algorithms

Energy Technology Data Exchange (ETDEWEB)

Voy, Brynn H [ORNL; Scharff, Jon [University of Tennessee, Knoxville (UTK); Perkins, Andy [University of Tennessee, Knoxville (UTK); Saxton, Arnold [University of Tennessee, Knoxville (UTK); Borate, Bhavesh [University of Tennessee, Knoxville (UTK); Chesler, Elissa J [ORNL; Branstetter, Lisa R [ORNL; Langston, Michael A [University of Tennessee, Knoxville (UTK)

2006-01-01

Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., "guilt-by-association"). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of clique to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
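The two steps described above, thresholding a correlation matrix into a graph and then computing cliques, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: the abstract does not name a clique algorithm, so the basic Bron-Kerbosch procedure is swapped in here; the use of the absolute correlation, the cutoff of 0.75, the gene labels, and the correlation values are all hypothetical.

```python
def threshold_graph(genes, corr, cutoff):
    """Connect two genes when |correlation| >= cutoff (assumed criterion)."""
    adj = {g: set() for g in genes}
    for i, a in enumerate(genes):
        for j in range(i + 1, len(genes)):
            if abs(corr[i][j]) >= cutoff:
                b = genes[j]
                adj[a].add(b)
                adj[b].add(a)
    return adj

def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices.
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)

# Hypothetical toy data: symmetric correlation matrix for four genes.
genes = ["A", "B", "C", "D"]
corr = [
    [1.00, 0.90, 0.85, 0.10],
    [0.90, 1.00, 0.80, 0.20],
    [0.85, 0.80, 1.00, 0.90],
    [0.10, 0.20, 0.90, 1.00],
]
graph = threshold_graph(genes, corr, cutoff=0.75)
cliques = maximal_cliques(graph)
print(cliques)
```

In this toy example gene C belongs to two cliques, which illustrates the point made in the abstract that cliques, unlike clusters, need not be disjoint.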
Extracting gene networks for low-dose radiation using graph theoretical algorithms.

Directory of Open Access Journals (Sweden)

Brynn H Voy

2006-07-01

Full Text Available Genes with common functions often exhibit correlated expression levels, which can be used to identify sets of interacting genes from microarray data. Microarrays typically measure expression across genomic space, creating a massive matrix of co-expression that must be mined to extract only the most relevant gene interactions. We describe a graph theoretical approach to extracting co-expressed sets of genes, based on the computation of cliques. Unlike the results of traditional clustering algorithms, cliques are not disjoint and allow genes to be assigned to multiple sets of interacting partners, consistent with biological reality. A graph is created by thresholding the correlation matrix to include only the correlations most likely to signify functional relationships. Cliques computed from the graph correspond to sets of genes for which significant edges are present between all members of the set, representing potential members of common or interacting pathways. Clique membership can be used to infer function about poorly annotated genes, based on the known functions of better-annotated genes with which they share clique membership (i.e., "guilt-by-association"). We illustrate our method by applying it to microarray data collected from the spleens of mice exposed to low-dose ionizing radiation. Differential analysis is used to identify sets of genes whose interactions are impacted by radiation exposure. The correlation graph is also queried independently of clique to extract edges that are impacted by radiation. We present several examples of multiple gene interactions that are altered by radiation exposure and thus represent potential molecular pathways that mediate the radiation response.
Solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators.

Science.gov (United States)

Zhao, Jing; Zong, Haili

2018-01-01

In this paper, we propose parallel and cyclic iterative algorithms for solving the multiple-set split equality common fixed-point problem of firmly quasi-nonexpansive operators. We also combine the cyclic and parallel iterative methods and propose two mixed iterative algorithms. None of our algorithms needs any prior information about the operator norms. Under mild assumptions, we prove weak convergence of the proposed iterative sequences in Hilbert spaces. As applications, we obtain several iterative algorithms to solve the multiple-set split equality problem.
Simulation of multi-photon emission isotopes using time-resolved SimSET multiple photon history generator

Energy Technology Data Exchange (ETDEWEB)

Chiang, Chih-Chieh; Lin, Hsin-Hon; Lin, Chang-Shiun; Chuang, Keh-Shih [Department of Biomedical Engineering and Environmental Sciences, National Tsing-Hua University, Hsinchu, Taiwan (China); Jan, Meei-Ling [Health Physics Division, Institute of Nuclear Energy Research, Atomic Energy Council, Taoyuan, Taiwan (China)

2015-07-01

Multiple-photon emitters, such as In-111 or Se-75, have enormous potential in the field of nuclear medicine imaging. For example, Se-75 can be used to investigate bile acid malabsorption and measure bile acid pool loss. The simulation system for emission tomography (SimSET) is a Monte Carlo simulation (MCS) code well known in nuclear medicine for its high computational efficiency. However, the current SimSET cannot simulate these isotopes because it does not model complex decay schemes or the time-dependent decay process. To extend SimSET to these multi-photon emission isotopes, a time-resolved multiple photon history generator based on the SimSET code was developed in the present study. To build the time-resolved SimSET (trSimSET) with the radionuclide decay process, the new MCS model introduces new features, including decay time information and photon time-of-flight information. The half-lives of energy states were tabulated from the Evaluated Nuclear Structure Data File (ENSDF) database. The MCS results indicate that the overall percent difference is less than 8.5% for all simulation trials as compared to GATE. To sum up, we demonstrated that the time-resolved SimSET multiple photon history generator can achieve accuracy comparable to GATE while keeping better computational efficiency. The new MCS code is very useful for studying the multi-photon imaging of novel isotopes that requires the simulation of lifetime and time-of-flight measurements. (authors)
The SET1 Complex Selects Actively Transcribed Target Genes via Multivalent Interaction with CpG Island Chromatin.

Science.gov (United States)

Brown, David A; Di Cerbo, Vincenzo; Feldmann, Angelika; Ahn, Jaewoo; Ito, Shinsuke; Blackledge, Neil P; Nakayama, Manabu; McClellan, Michael; Dimitrova, Emilia; Turberfield, Anne H; Long, Hannah K; King, Hamish W; Kriaucionis, Skirmantas; Schermelleh, Lothar; Kutateladze, Tatiana G; Koseki, Haruhiko; Klose, Robert J

2017-09-05

Chromatin modifications and the promoter-associated epigenome are important for the regulation of gene expression. However, the mechanisms by which chromatin-modifying complexes are targeted to the appropriate gene promoters in vertebrates and how they influence gene expression have remained poorly defined. Here, using a combination of live-cell imaging and functional genomics, we discover that the vertebrate SET1 complex is targeted to actively transcribed gene promoters through CFP1, which engages in a form of multivalent chromatin reading that involves recognition of non-methylated DNA and histone H3 lysine 4 trimethylation (H3K4me3). CFP1 defines SET1 complex occupancy on chromatin, and its multivalent interactions are required for the SET1 complex to place H3K4me3. In the absence of CFP1, gene expression is perturbed, suggesting that normal targeting and function of the SET1 complex are central to creating an appropriately functioning vertebrate promoter-associated epigenome. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
The Effects of Noncontingent Access to Single-versus Multiple-Stimulus Sets on Self-Injurious Behavior.

Science.gov (United States)

DeLeon, Iser G.; Anders, Bonita M.; Rodriguez-Catter, Vanessa; Neidert, Pamela L.

2000-01-01

The automatically reinforced self-injury of a girl (age 11) with autism was treated by providing noncontingent access to a single set of preferred toys during 30-minute sessions. Rotating toy sets after 10 minutes or providing access to multiple toy sets resulted in reductions that lasted the entire 30 minutes. (Contains four references.)…

Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

Directory of Open Access Journals (Sweden)

Lei Zhang

2016-01-01

Full Text Available Among non-small cell lung cancer (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered as distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to a NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is a feature selection algorithm, indeed. Additionally, we applied SAMGSR to AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. Few overlaps between these two resulting gene signatures illustrate that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.
Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

Science.gov (United States)

Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

2013-08-01

Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association
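The general idea of controlling redundancy with a user-defined overlap threshold can be sketched as follows. This is not ReCiPa's actual procedure, which the abstract does not describe: the overlap measure (shared genes divided by the size of the smaller set), the greedy pairwise merging, the naming of merged pathways, and the functions `overlap` and `merge_redundant` are all assumptions made for illustration, and every pathway is assumed to be non-empty.

```python
def overlap(a, b):
    """Fraction of the smaller gene set that is shared with the other set."""
    return len(a & b) / min(len(a), len(b))

def merge_redundant(pathways, threshold):
    """Greedily merge pathway pairs whose overlap meets the threshold."""
    merged = {name: set(genes) for name, genes in pathways.items()}
    changed = True
    while changed:
        changed = False
        names = sorted(merged)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if overlap(merged[a], merged[b]) >= threshold:
                    # Replace the pair by its union under a combined name.
                    merged[a + "+" + b] = merged.pop(a) | merged.pop(b)
                    changed = True
                    break
            if changed:
                break
    return merged

# Hypothetical toy pathways.
pathways = {
    "P1": {"g1", "g2", "g3", "g4"},
    "P2": {"g2", "g3", "g4", "g5"},
    "P3": {"g6", "g7"},
}
result = merge_redundant(pathways, threshold=0.7)
for name in sorted(result):
    print(name, sorted(result[name]))
```

With a threshold of 0.7, P1 and P2 (overlap 0.75) collapse into one gene set while P3 is left alone, so an enrichment analysis run on the result would report the shared mechanism once rather than twice.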
Gene expression profiling reveals multiple toxicity endpoints induced by hepatotoxicants

Energy Technology Data Exchange (ETDEWEB)

Huang Qihong; Jin Xidong; Gaillard, Elias T.; Knight, Brian L.; Pack, Franklin D.; Stoltz, James H.; Jayadev, Supriya; Blanchard, Kerry T

2004-05-18

Microarray technology continues to gain increased acceptance in the drug development process, particularly at the stage of toxicology and safety assessment. In the current study, microarrays were used to investigate gene expression changes associated with hepatotoxicity, the most commonly reported clinical liability with pharmaceutical agents. Acetaminophen, methotrexate, methapyrilene, furan and phenytoin were used as benchmark compounds capable of inducing specific but different types of hepatotoxicity. The goal of the work was to define gene expression profiles capable of distinguishing the different subtypes of hepatotoxicity. Sprague-Dawley rats were orally dosed with acetaminophen (single dose, 4500 mg/kg for 6, 24 and 72 h), methotrexate (1 mg/kg per day for 1, 7 and 14 days), methapyrilene (100 mg/kg per day for 3 and 7 days), furan (40 mg/kg per day for 1, 3, 7 and 14 days) or phenytoin (300 mg/kg per day for 14 days). Hepatic gene expression was assessed using toxicology-specific gene arrays containing 684 target genes or expressed sequence tags (ESTs). Principal component analysis (PCA) of gene expression data was able to provide a clear distinction of each compound, suggesting that gene expression data can be used to discern different hepatotoxic agents and toxicity endpoints. Gene expression data were applied to the multiplicity-adjusted permutation test and significantly changed genes were categorized and correlated to hepatotoxic endpoints. Repression of enzymes involved in lipid oxidation (acyl-CoA dehydrogenase, medium chain, enoyl CoA hydratase, very long-chain acyl-CoA synthetase) was associated with microvesicular lipidosis. Likewise, subsets of genes associated with hepatocellular necrosis, inflammation, hepatitis, bile duct hyperplasia and fibrosis have been identified. The current study illustrates that expression profiling can be used to: (1) distinguish different hepatotoxic endpoints; (2) predict the development of toxic endpoints; and
Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection

Science.gov (United States)

Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd

2015-02-01

Microarray technology involves placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has enabled novel discoveries since its development and has attracted increasing attention among researchers. The widespread use of microarray technology is largely due to its ability to analyze thousands of genes simultaneously, in a massively parallel manner, in one experiment. Hence, it provides valuable knowledge on gene interaction and function. A microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T² statistic is not positive definite and becomes singular, so it cannot be inverted. In this research, Hotelling's T² statistic is combined with a shrinkage approach as an alternative way to estimate the covariance matrix when detecting significant gene sets. The shrinkage covariance matrix overcomes the singularity problem by converting an unbiased estimator of the covariance matrix into an improved biased one. A robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increase its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
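Why shrinkage makes the statistic computable when there are more genes than samples can be shown with a small pure-Python sketch. Everything specific here is an assumption, since the abstract gives no formulas: the shrinkage target (the diagonal of the sample covariance), the fixed intensity `lam`, the one-sample form of the statistic testing a zero mean, and all function names are illustrative. The robust trimmed mean used by the authors is omitted, and the ordinary mean is used instead.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sample_cov(rows):
    """Unbiased sample covariance matrix (singular when samples <= genes)."""
    n, m = len(rows), mean_vector(rows)
    p = len(m)
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def shrink(cov, lam):
    """lam * diag(S) + (1 - lam) * S: off-diagonals are scaled toward zero."""
    p = len(cov)
    return [[cov[i][j] if i == j else (1 - lam) * cov[i][j]
             for j in range(p)] for i in range(p)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def one_sample_t2(rows, lam):
    """n * mean' * inverse(shrunken covariance) * mean."""
    n, m = len(rows), mean_vector(rows)
    w = solve(shrink(sample_cov(rows), lam), m)
    return n * sum(mi * wi for mi, wi in zip(m, w))

# Hypothetical data: 3 samples (rows) and 4 genes (columns).
data = [[1.0, 2.0, 0.5, 1.5],
        [2.0, 1.0, 1.5, 0.5],
        [3.0, 3.0, 1.0, 2.5]]
t2_half = one_sample_t2(data, lam=0.5)
t2_diag = one_sample_t2(data, lam=1.0)
print(t2_half, t2_diag)
```

With three samples and four genes the raw covariance matrix has rank at most two, so it has no inverse; any `lam` greater than zero yields a positive definite matrix as long as every gene has non-zero variance.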
The development and application of a multiple gene co-silencing system using endogenous URA3 as a reporter gene in Ganoderma lucidum.

Directory of Open Access Journals (Sweden)

Dashuai Mu

Full Text Available Ganoderma lucidum is one of the most important medicinal mushrooms; however, molecular genetics research on this species has been limited due to a lack of reliable reverse genetic tools. In this study, the endogenous orotidine 5'-monophosphate decarboxylase gene (URA3) was cloned as a silencing reporter, and four gene-silencing methods using hairpin, sense, antisense, and dual promoter constructs were introduced into G. lucidum through a simple electroporation procedure. A comparison and evaluation of silencing efficiency demonstrated that all four methods suppressed the expression of URA3 to different degrees. Our data unequivocally indicate that the dual promoter silencing vector yields the highest rate of URA3 silencing compared with other vectors (up to 81.9%). To highlight the advantages of the dual promoter system, we constructed a co-silencing system based on the dual promoter method and succeeded in co-silencing URA3 and laccase in G. lucidum. The reductions in the mRNA levels of the two genes were correlated. Thus, the screening efficiency for RNAi knockdown of multiple genes may be improved by the co-silencing of an endogenous reporter gene. The molecular tools developed in this study should facilitate the isolation of genes and the characterization of the functions of multiple genes in this pharmaceutically important species, and these tools should be highly useful for the study of other basidiomycetes.
Development of a multiple-gene-loading method by combining multi-integration system-equipped mouse artificial chromosome vector and CRISPR-Cas9.

Directory of Open Access Journals (Sweden)

Kazuhisa Honma

Full Text Available Mouse artificial chromosome (MAC) vectors have several advantages as gene delivery vectors, such as stable and independent maintenance in host cells without integration, transferability from donor cells to recipient cells via microcell-mediated chromosome transfer (MMCT), and the potential for loading a megabase-sized DNA fragment. Previously, a MAC containing a multi-integrase platform (MI-MAC) was developed to facilitate the transfer of multiple genes into desired cells. Although the MI system can theoretically hold five gene-loading vectors (GLVs), there are a limited number of drugs available for the selection of multiple-GLV integration. To overcome this issue, we attempted to knock out and reuse drug resistance genes (DRGs) using the CRISPR-Cas9 system. In this study, we developed new methods for multiple-GLV integration. As a proof of concept, we introduced five GLVs in the MI-MAC by these methods, in which each GLV contained a gene encoding a fluorescent or luminescent protein (EGFP, mCherry, BFP, Eluc, and Cluc). Genes of interest (GOI) on the MI-MAC were expressed stably and functionally without silencing in the host cells. Furthermore, the MI-MAC carrying five GLVs was transferred to other cells by MMCT, and the resultant recipient cells exhibited all five fluorescence/luminescence signals. Thus, the MI-MAC was successfully used as a multiple-GLV integration vector using the CRISPR-Cas9 system. The MI-MAC employing these methods may resolve bottlenecks in developing multiple-gene humanized models, multiple-gene monitoring models, disease models, reprogramming, and inducible gene expression systems.
Molecular evolution of the Paramyxoviridae and Rhabdoviridae multiple-protein-encoding P gene.

Science.gov (United States)

Jordan, I K; Sutter, B A; McClure, M A

2000-01-01

Presented here is an analysis of the molecular evolutionary dynamics of the P gene among 76 representative sequences of the Paramyxoviridae and Rhabdoviridae RNA virus families. In a number of Paramyxoviridae taxa, as well as in vesicular stomatitis viruses of the Rhabdoviridae, the P gene encodes multiple proteins from a single genomic RNA sequence. These products include the phosphoprotein (P), as well as the C and V proteins. The complexity of the P gene makes it an intriguing locus to study from an evolutionary perspective. Amino acid sequence alignments of the proteins encoded at the P and N loci were used in independent phylogenetic reconstructions of the Paramyxoviridae and Rhabdoviridae families. P-gene-coding capacities were mapped onto the Paramyxoviridae phylogeny, and the most parsimonious path of multiple-coding-capacity evolution was determined. Levels of amino acid variation for Paramyxoviridae and Rhabdoviridae P-gene-encoded products were also analyzed. Proteins encoded in overlapping reading frames from the same nucleotides have different levels of amino acid variation. The nucleotide architecture that underlies the amino acid variation was determined in order to evaluate the role of selection in the evolution of the P gene overlapping reading frames. In every case, the evolution of one of the proteins encoded in the overlapping reading frames has been constrained by negative selection while the other has evolved more rapidly. The integrity of the overlapping reading frame that represents a derived state is generally maintained at the expense of the ancestral reading frame encoded by the same nucleotides. The evolution of such multicoding sequences is likely a response by RNA viruses to selective pressure to maximize genomic information content while maintaining small genome size. The ability to evolve such a complex genomic strategy is intimately related to the dynamics of the viral quasispecies, which allow enhanced exploration of the adaptive
Positive selection of Plasmodium falciparum parasites with multiple var2csa-type PfEMP1 genes during the course of infection in pregnant women

DEFF Research Database (Denmark)

Sander, Adam F; Salanti, Ali; Lavstsen, Thomas

2011-01-01

multiple genes coding for different VAR2CSA proteins, and parasites with >1 var2csa gene appear to be more common in pregnant women with placental malaria than in nonpregnant individuals. We present evidence that, in pregnant women, parasites containing multiple var2csa-type genes possess a selective...... advantage over parasites with a single var2csa gene. Accumulation of parasites with multiple copies of the var2csa gene during the course of pregnancy was also correlated with the development of antibodies involved in blocking VAR2CSA adhesion. The data suggest that multiplicity of var2csa-type genes...
Novel Approach to Tourism Analysis with Multiple Outcome Capability Using Rough Set Theory

Directory of Open Access Journals (Sweden)

Chun-Che Huang

2016-12-01

Exploring the relationship between tourists' characteristics and their decision-making outcomes is critical to keeping a tourism business competitive. In investigations of tourism development, most existing studies lack a systematic approach to analyzing qualitative data. Although the traditional Rough Set (RS) based approach is an excellent classification method in qualitative modeling, it cannot deal with the case of multiple outcomes, which is a common situation in tourism. Consequently, the Multiple Outcome Reduct Generation (MORG) and Multiple Outcome Rule Extraction (MORE) approaches based on RS are proposed to handle multiple outcomes. This study proposes a ranking-based approach to induce meaningful reducts and ensure the strength and robustness of decision rules, which helps decision makers understand tourists' characteristics in a tourism case.
Multiple episodes of convergence in genes of the dim light vision pathway in bats.

Directory of Open Access Journals (Sweden)

Yong-Yi Shen

The molecular basis of the evolution of phenotypic characters is very complex and is poorly understood, with few examples documenting the roles of multiple genes. Considering that a single gene cannot fully explain the convergence of phenotypic characters, we chose to study the convergent evolution of rod vision in two divergent bats from a network perspective. The Old World fruit bats (Pteropodidae) are non-echolocating and have binocular vision, whereas the sheath-tailed bats (Emballonuridae) are echolocating and have monocular vision; however, they both have relatively large eyes and rely more on rod vision to find food and navigate in the night. We found that the genes CRX, which plays an essential role in the differentiation of photoreceptor cells, SAG, which is involved in the desensitization of the photoactivated transduction cascade, and the photoreceptor gene RH, which is directly responsible for the perception of dim light, have undergone parallel sequence evolution in two divergent lineages of bats with larger eyes (Pteropodidae and Emballonuroidea). The multiple convergent events in the network of genes essential for rod vision are a rare phenomenon that illustrates the importance of investigating pathways and networks in the evolution of the molecular basis of phenotypic convergence.
Dynamic Response Genes in CD4+ T Cells Reveal a Network of Interactive Proteins that Classifies Disease Activity in Multiple Sclerosis

Directory of Open Access Journals (Sweden)

Sandra Hellberg

2016-09-01

Multiple sclerosis (MS) is a chronic inflammatory disease of the CNS and has a varying disease course as well as variable response to treatment. Biomarkers may therefore aid personalized treatment. We tested whether in vitro activation of MS patient-derived CD4+ T cells could reveal potential biomarkers. The dynamic gene expression response to activation was dysregulated in patient-derived CD4+ T cells. By integrating our findings with genome-wide association studies, we constructed a highly connected MS gene module, disclosing cell activation and chemotaxis as central components. Changes in several module genes were associated with differences in protein levels, which were measurable in cerebrospinal fluid and were used to classify patients from control individuals. In addition, these measurements could predict disease activity after 2 years and distinguish low and high responders to treatment in two additional, independent cohorts. While further validation is needed in larger cohorts prior to clinical implementation, we have uncovered a set of potentially promising biomarkers.
A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets

Science.gov (United States)

Carrig, Madeline M.; Manrique-Vallier, Daniel; Ranby, Krista W.; Reiter, Jerome P.; Hoyle, Rick H.

2015-01-01

Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches. PMID:26257437
Detailed assessment of gene activation levels by multiple hypoxia-responsive elements under various hypoxic conditions.

Science.gov (United States)

Takeuchi, Yasuto; Inubushi, Masayuki; Jin, Yong-Nan; Murai, Chika; Tsuji, Atsushi B; Hata, Hironobu; Kitagawa, Yoshimasa; Saga, Tsuneo

2014-12-01

HIF-1/HRE pathway is a promising target for the imaging and the treatment of intractable malignancy (HIF-1; hypoxia-inducible factor 1, HRE; hypoxia-responsive element). The purposes of our study are: (1) to assess the gene activation levels resulting from various numbers of HREs under various hypoxic conditions, (2) to evaluate the bidirectional activity of multiple HREs, and (3) to confirm whether multiple HREs can induce gene expression in vivo. Human colon carcinoma HCT116 cells were transiently transfected by the constructs containing a firefly luciferase reporter gene and various numbers (2, 4, 6, 8, 10, and 12) of HREs (nHRE+, nHRE-). The relative luciferase activities were measured under various durations of hypoxia (6, 12, 18, and 24 h), O2 concentrations (1, 2, 4, 8, and 16 %), and various concentrations of deferoxamine mesylate (20, 40, 80, 160, and 320 µg/mL growth medium). The bidirectional gene activation levels by HREs were examined in the constructs (dual-luc-nHREs) containing firefly and Renilla luciferase reporter genes at each side of nHREs. Finally, to test whether the construct containing 12HRE and the NIS reporter gene (12HRE-NIS) can induce gene expression in vivo, SPECT imaging was performed in a mouse xenograft model. (1) gene activation levels by HREs tended to increase with increasing HRE copy number, but a saturation effect was observed in constructs with more than 6 or 8 copies of an HRE, (2) gene activation levels by HREs increased remarkably during 6-12 h of hypoxia, but not beyond 12 h, (3) gene activation levels by HREs decreased with increasing O2 concentrations, but could be detected even under mild hypoxia at 16 % O2, (4) the bidirectionally proportional activity of the HRE was confirmed regardless of the hypoxic severity, and (5) NIS expression driven by 12 tandem copies of an HRE in response to hypoxia could be visualized on in vivo SPECT imaging. The results of this study will help in the understanding and assessment of
Responsive Neurostimulation System (RNS) in setting of cranioplasty and history of multiple craniotomies

Directory of Open Access Journals (Sweden)

Jason Ledesma

2016-09-01

Conclusion: The case illustrates a possible limitation of SEEG placement, particularly in patients with a history of cranioplasty and multiple prior craniotomies. We also describe the first placement of an RNS generator and system in the setting of prior cranioplasty.
The Effects of Multiple Sets of Squats and Jump Squats on Mechanical Variables.

Science.gov (United States)

Rossetti, Michael L; Munford, Shawn N; Snyder, Brandon W; Davis, Shala E; Moir, Gavin L

2017-07-28

The mechanical responses to two non-ballistic squat and two ballistic jump squat protocols performed over multiple sets were investigated. One protocol from each of the two non-ballistic and ballistic conditions incorporated a pause between the eccentric and concentric phases of the movements in order to determine the influence of the coupling time on the mechanical variables and post-activation potentiation (PAP). Eleven men (age: 21.9 ± 1.8 years; height: 1.79 ± 0.05 m; mass: 87.0 ± 7.4 kg) attended four sessions where they performed multiple sets of squats and jump squats with a load equivalent to 30% 1-repetition maximum under one of the following conditions: 1) 3 × 4 repetitions of non-ballistic squats (30N-B); 2) 3 × 4 repetitions of non-ballistic squats with a 3-second pause between the eccentric and concentric phases of each repetition (30PN-B); 3) 3 × 4 repetitions of ballistic jump squats (30B); 4) 3 × 4 repetitions of ballistic jump squats with a 3-second pause between the eccentric and concentric phases of each repetition (30PB). Force plates were used to calculate variables including average vertical velocity, average vertical force (GRF), and average power output (PO). Vertical velocities during the ballistic conditions were significantly greater than those attained during the non-ballistic conditions (mean differences: 0.21 - 0.25 m/s, p0.05). Ballistic jump squats may be an effective exercise for developing PO given the high velocities and forces generated in these exercises. Furthermore, the completion of multiple sets of jump squats may induce PAP to enhance PO. The coupling times between the eccentric and concentric phases of the jump squats should be short in order to maximize the GRF and PO across the sets.
A Hox Gene, Antennapedia, Regulates Expression of Multiple Major Silk Protein Genes in the Silkworm Bombyx mori.

Science.gov (United States)

Tsubota, Takuya; Tomita, Shuichiro; Uchino, Keiro; Kimoto, Mai; Takiya, Shigeharu; Kajiwara, Hideyuki; Yamazaki, Toshimasa; Sezutsu, Hideki

2016-03-25

Hox genes play a pivotal role in the determination of anteroposterior axis specificity during bilaterian animal development. They do so by acting as a master control and regulating the expression of genes important for development. Recently, however, we showed that Hox genes can also function in terminally differentiated tissue of the lepidopteran Bombyx mori. In this species, Antennapedia (Antp) regulates expression of sericin-1, a major silk protein gene, in the silk gland. Here, we investigated whether Antp can regulate expression of multiple genes in this tissue. By means of proteomic, RT-PCR, and in situ hybridization analyses, we demonstrate that misexpression of Antp in the posterior silk gland induced ectopic expression of major silk protein genes such as sericin-3, fhxh4, and fhxh5. These genes are normally expressed specifically in the middle silk gland, as is Antp. Therefore, the evidence strongly suggests that Antp activates these silk protein genes in the middle silk gland. The putative sericin-1 activator complex (middle silk gland-intermolt-specific complex) can bind to the upstream regions of these genes, suggesting that Antp directly activates their expression. We also found that the pattern of gene expression was well conserved between B. mori and the wild species Bombyx mandarina, indicating that the gene regulation mechanism identified here is an evolutionarily conserved mechanism and not an artifact of the domestication of B. mori. We suggest that Hox genes have a role as a master control in terminally differentiated tissues, possibly acting as a primary regulator for a range of physiological processes. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Cancer Outlier Analysis Based on Mixture Modeling of Gene Expression Data

Directory of Open Access Journals (Sweden)

Keita Mori

2013-01-01

Molecular heterogeneity of cancer, partially caused by various chromosomal aberrations or gene mutations, can yield substantial heterogeneity in gene expression profile in cancer samples. To detect cancer-related genes which are active only in a subset of cancer samples or cancer outliers, several methods have been proposed in the context of multiple testing. Such cancer outlier analyses will generally suffer from a serious lack of power, compared with the standard multiple testing setting where common activation of genes across all cancer samples is supposed. In this paper, we consider information sharing across genes and cancer samples, via a parametric normal mixture modeling of gene expression levels of cancer samples across genes after a standardization using the reference, normal sample data. A gene-based statistic for gene selection is developed on the basis of a posterior probability of cancer outlier for each cancer sample. Some efficiency improvement by using our method was demonstrated, even under settings with misspecified, heavy-tailed t-distributions. An application to a real dataset from hematologic malignancies is provided.
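The idea of a per-sample posterior probability of being a cancer outlier can be illustrated with a minimal sketch. It assumes a two-component normal mixture on expression values standardized against the normal reference samples. The function names and the fixed mixture parameters (`pi_out`, `mu_out`, `sigma_out`) are hypothetical placeholders chosen for illustration; the abstract does not give the authors' parameterization, and in practice such parameters would be estimated from the data rather than fixed.

```python
from math import exp, pi, sqrt
from statistics import mean, stdev


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def outlier_posteriors(cancer, normal, pi_out=0.1, mu_out=3.0, sigma_out=1.0):
    """Posterior probability that each cancer sample is an outlier for one gene.

    cancer, normal: expression values of a single gene in cancer and
    normal (reference) samples. The mixture parameters are assumed
    values for illustration only.
    """
    # Standardize cancer values using the normal reference samples.
    m, s = mean(normal), stdev(normal)
    z = [(x - m) / s for x in cancer]
    posteriors = []
    for v in z:
        outlier = pi_out * normal_pdf(v, mu_out, sigma_out)
        background = (1 - pi_out) * normal_pdf(v, 0.0, 1.0)
        posteriors.append(outlier / (outlier + background))
    return posteriors
```

A gene-level statistic could then be built from these posteriors, for example by summing them over cancer samples, although the specific statistic used in the study is not stated in the abstract.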
Cre/lox-based multiple markerless gene disruption in the genome of the extreme thermophile Thermus thermophilus.

Science.gov (United States)

Togawa, Yoichiro; Nunoshiba, Tatsuo; Hiratsu, Keiichiro

2018-02-01

Markerless gene-disruption technology is particularly useful for effective genetic analyses of Thermus thermophilus (T. thermophilus), which has a limited number of selectable markers. In an attempt to develop a novel system for the markerless disruption of genes in T. thermophilus, we applied a Cre/lox system to construct a triple gene disruptant. To achieve this, we constructed two genetic tools, a loxP-htk-loxP cassette and cre-expressing plasmid, pSH-Cre, for gene disruption and removal of the selectable marker by Cre-mediated recombination. We found that the Cre/lox system was compatible with the proliferation of the T. thermophilus HB27 strain at the lowest growth temperature (50 °C), and thus succeeded in establishing a triple gene disruptant, the (∆TTC1454::loxP, ∆TTC1535KpnI::loxP, ∆TTC1576::loxP) strain, without leaving behind a selectable marker. During the process of the sequential disruption of multiple genes, we observed the undesired deletion and inversion of the chromosomal region between multiple loxP sites that were induced by Cre-mediated recombination. Therefore, we examined the effects of a lox66-htk-lox71 cassette by exploiting the mutant lox sites, lox66 and lox71, instead of native loxP sites. We successfully constructed a (∆TTC1535::lox72, ∆TTC1537::lox72) double gene disruptant without inducing the undesired deletion of the 0.7-kbp region between the two directly oriented lox72 sites created by the Cre-mediated recombination of the lox66-htk-lox71 cassette. This is the first demonstration of a Cre/lox system being applicable to extreme thermophiles in a genetic manipulation. Our results indicate that this system is a powerful tool for multiple markerless gene disruption in T. thermophilus.
Transcriptome-wide selection of a reliable set of reference genes for gene expression studies in potato cyst nematodes (Globodera spp.).

Science.gov (United States)

Sabeh, Michael; Duceppe, Marc-Olivier; St-Arnaud, Marc; Mimee, Benjamin

2018-01-01

Relative gene expression analyses by qRT-PCR (quantitative reverse transcription PCR) require an internal control to normalize the expression data of genes of interest and eliminate the unwanted variation introduced by sample preparation. A perfect reference gene should have a constant expression level under all the experimental conditions. However, the same few housekeeping genes selected from the literature or successfully used in previous unrelated experiments are often routinely used in new conditions without proper validation of their stability across treatments. The advent of RNA-Seq and the availability of public datasets for numerous organisms are opening the way to finding better reference genes for expression studies. Globodera rostochiensis is a plant-parasitic nematode that is particularly yield-limiting for potato. The aim of our study was to identify a reliable set of reference genes to study G. rostochiensis gene expression. Gene expression levels from an RNA-Seq database were used to identify putative reference genes and were validated with qRT-PCR analysis. Three genes, GR, PMP-3, and aaRS, were found to be very stable within the experimental conditions of this study and are proposed as reference genes for future work.
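Screening an expression dataset for stably expressed candidate reference genes can be sketched as follows. This is an illustration only: it assumes the coefficient of variation (CV) across conditions as the stability measure, which the abstract does not specify, and the function name and input layout are hypothetical.

```python
from statistics import mean, pstdev


def rank_by_stability(expression):
    """Rank genes from most to least stable by coefficient of variation.

    expression: dict mapping gene name -> list of normalized expression
    values, one per experimental condition. Genes with a non-positive
    mean are skipped because the CV is undefined for them.
    """
    cv = {}
    for gene, values in expression.items():
        mu = mean(values)
        if mu > 0:
            cv[gene] = pstdev(values) / mu
    # Lowest CV first: the most constant expression across conditions.
    return sorted(cv, key=cv.get)
```

Candidates ranked this way would still need experimental validation, as the study did with qRT-PCR.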
Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes

Directory of Open Access Journals (Sweden)

Reverter Antonio

2008-09-01

Background: The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. Methods: We adopted an a priori approach making as few assumptions as possible to analyse the interplay among gene-gene interactions with tissue specificity and its subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. Results: Amongst the myriad of complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. Conclusion: We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing

Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

Directory of Open Access Journals (Sweden)

Calvo-Dmgz D.

2012-12-01

DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking diagnosis of diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated over three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but it is able to provide a biologically interpretable explanation of how it classifies new samples.
Robust multi-tissue gene panel for cancer detection

Directory of Open Access Journals (Sweden)

Talantov Dmitri

2010-06-01

Background: We have identified a set of genes whose relative mRNA expression levels in various solid tumors can be used to robustly distinguish cancer from matching normal tissue. Our current feature set consists of 113 gene probes for 104 unique genes, originally identified as differentially expressed in solid primary tumors in microarray data on Affymetrix HG-U133A platform in five tissue types: breast, colon, lung, prostate and ovary. For each dataset, we first identified a set of genes significantly differentially expressed in tumor vs. normal tissue at p-value = 0.05 using an experimentally derived error model. Our common cancer gene panel is the intersection of these sets of significantly dysregulated genes and can distinguish tumors from normal tissue on all these five tissue types. Methods: Frozen tumor specimens were obtained from two commercial vendors, Clinomics (Pittsfield, MA) and Asterand (Detroit, MI). Biotinylated targets were prepared using published methods (Affymetrix, CA) and hybridized to Affymetrix U133A GeneChips (Affymetrix, CA). Expression values for each gene were calculated using Affymetrix GeneChip analysis software MAS 5.0. We then used a software package called Genes@Work for differential expression discovery, and SVM light linear kernel for building classification models. Results: We validate the predictability of this gene list on several publicly available data sets generated on the same platform. Of note, when analysing the lung cancer data set of Spira et al, using an SVM linear kernel classifier, our gene panel had 94.7% leave-one-out accuracy compared to 87.8% using the gene panel in the original paper. In addition, we performed high-throughput validation on the Dana Farber Cancer Institute GCOD database and several GEO datasets. Conclusions: Our result showed the potential for this panel as a robust classification tool for multiple tumor types on the Affymetrix platform, as well as other whole genome arrays
Novel multiple criteria decision making methods based on bipolar neutrosophic sets and bipolar neutrosophic graphs

OpenAIRE

Muhammad, Akram; Musavarah, Sarwar

2016-01-01

In this research study, we introduce the concept of bipolar neutrosophic graphs. We present the dominating and independent sets of bipolar neutrosophic graphs. We describe novel multiple criteria decision making methods based on bipolar neutrosophic sets and bipolar neutrosophic graphs. We also develop an algorithm for computing domination in bipolar neutrosophic graphs.
Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

Directory of Open Access Journals (Sweden)

Jihye Kim

2013-09-01

Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated to each trait (pcorr < 0.05). Pairwise comparison of the traits in terms of the semantic similarity in their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.
Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

DEFF Research Database (Denmark)

Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine

2012-01-01

the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric tests measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...
Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes

OpenAIRE

Kreiman, Gabriel

2004-01-01

Sequence information and high‐throughput methods to measure gene expression levels open the door to explore transcriptional regulation using computational tools. Combinatorial regulation and sparseness of regulatory elements throughout the genome allow organisms to control the spatial and temporal patterns of gene expression. Here we study the organization of cis‐regulatory elements in sets of co‐regulated genes. We build an algorithm to search for combinations of transcription factor binding...
Gene expression profiling for molecular classification of multiple myeloma in newly diagnosed patients

NARCIS (Netherlands)

Broyl, Annemiek; Hose, Dirk; Lokhorst, Henk; de Knegt, Yvonne; Peeters, Justine; Jauch, Anna; Bertsch, Uta; Buijs, Arjan; Stevens-Kroef, Marian; Beverloo, H. Berna; Vellenga, Edo; Zweegman, Sonja; Kersten, Marie-Josée; van der Holt, Bronno; el Jarari, Laila; Mulligan, George; Goldschmidt, Hartmut; van Duin, Mark; Sonneveld, Pieter

2010-01-01

To identify molecularly defined subgroups in multiple myeloma, gene expression profiling was performed on purified CD138(+) plasma cells of 320 newly diagnosed myeloma patients included in the Dutch-Belgian/German HOVON-65/GMMG-HD4 trial. Hierarchical clustering identified 10 subgroups; 6
Permethrin induction of multiple cytochrome P450 genes in insecticide resistant mosquitoes, Culex quinquefasciatus.

Science.gov (United States)

Gong, Youhui; Li, Ting; Zhang, Lee; Gao, Xiwu; Liu, Nannan

2013-01-01

The expression of some insect P450 genes can be induced by both exogenous and endogenous compounds, and there is evidence to suggest that multiple constitutively overexpressed P450 genes are co-responsible for the development of resistance to permethrin in resistant mosquitoes. This study characterized the permethrin induction profiles of P450 genes known to be constitutively overexpressed in resistant mosquitoes, Culex quinquefasciatus. The expression of 7 of the 19 P450 genes (CYP325K3v1, CYP4D42v2, CYP9J45, (CYP) CPIJ000926, CYP325G4, CYP4C38, CYP4H40) in the HAmCqG8 strain increased more than 2-fold after exposure to permethrin at an LC50 concentration (10 ppm) compared to their acetone-treated counterparts; no significant differences in the expression of these P450 genes in susceptible S-Lab mosquitoes were observed after permethrin treatment. Eleven of the fourteen P450 genes overexpressed in the MAmCqG6 strain (CYP9M10, CYP6Z12, CYP9J33, CYP9J43, CYP9J34, CYP306A1, CYP6Z15, CYP9J45, CYPPAL1, CYP4C52v1, CYP9J39) were also induced more than 2-fold after exposure to an LC50 (0.7 ppm) dose of permethrin. No significant induction in P450 gene expression was observed in the susceptible S-Lab mosquitoes after permethrin treatment except for CYP6Z15 and CYP9J39, suggesting that permethrin induction of these two P450 genes is common to both susceptible and resistant mosquitoes while the induction of the others is specific to insecticide resistant mosquitoes. These results demonstrate that multiple P450 genes are co-up-regulated in insecticide resistant mosquitoes through both constitutive overexpression and induction mechanisms, providing additional support for their involvement in the detoxification of insecticides and the development of insecticide resistance.
Integrative Functional Genomics for Systems Genetics in GeneWeaver.org.

Science.gov (United States)

Bubier, Jason A; Langston, Michael A; Baker, Erich J; Chesler, Elissa J

2017-01-01

The abundance of existing functional genomics studies permits an integrative approach to interpreting and resolving the results of diverse systems genetics studies. However, a major challenge lies in assembling and harmonizing heterogeneous data sets across species for facile comparison to the positional candidate genes and coexpression networks that come from systems genetic studies. GeneWeaver is an online database and suite of tools at www.geneweaver.org that allows for fast aggregation and analysis of gene set-centric data. GeneWeaver contains curated experimental data together with resource-level data such as GO annotations, MP annotations, and KEGG pathways, along with persistent stores of user entered data sets. These can be entered directly into GeneWeaver or transferred from widely used resources such as GeneNetwork.org. Data are analyzed using statistical tools and advanced graph algorithms to discover new relations, prioritize candidate genes, and generate function hypotheses. Here we use GeneWeaver to find genes common to multiple gene sets, prioritize candidate genes from a quantitative trait locus, and characterize a set of differentially expressed genes. Coupling a large multispecies repository of curated and empirical functional genomics data to fast computational tools allows for the rapid integrative analysis of heterogeneous data for interpreting and extrapolating systems genetics results.
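Finding genes common to multiple gene sets, the first task mentioned above, reduces to counting set membership. The sketch below is a generic illustration and does not use or reproduce the GeneWeaver API; the function name, the `min_sets` threshold, and the input layout are hypothetical.

```python
from collections import Counter


def genes_shared_by(gene_sets, min_sets=2):
    """Return genes that occur in at least `min_sets` of the given sets.

    gene_sets: dict mapping a set label -> iterable of gene identifiers.
    Duplicates within a single set are counted once.
    """
    counts = Counter(
        gene for genes in gene_sets.values() for gene in set(genes)
    )
    return sorted(gene for gene, n in counts.items() if n >= min_sets)
```

Setting `min_sets` to the number of sets gives the strict intersection; lower values give genes supported by several, but not all, sources.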
Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

Science.gov (United States)

Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

2014-01-01

Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
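Dynamic time warping, one ingredient of the distance measure described above, can be sketched as follows. This is the textbook DTW recurrence on two numeric time series with an absolute-difference local cost; it does not include the Gene Ontology term or the KNN imputation step, whose exact combination the abstract does not give.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j]: minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier a
                cost[i - 1][j - 1],  # one-to-one step
            )
    return cost[n][m]
```

Because the alignment may stretch either series, two expression profiles with the same shape but shifted timing get a small distance, which is why DTW suits time-course data.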
MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants.

Science.gov (United States)

Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

2014-11-01

MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11-14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14-16 Type II MADS-box genes. The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. This study is
The association of color memory and the enumeration of multiple spatially overlapping sets.

Science.gov (United States)

Poltoratski, Sonia; Xu, Yaoda

2013-07-09

Using dot displays, Halberda, Sires, and Feigenson (2006) showed that observers could simultaneously encode the numerosity of two spatially overlapping sets and the superset of all items at a glance. With the brief display and the masking used in Halberda et al., the task required observers to encode the colors of each set in order to select and enumerate all the dots in that set. As such, the observed capacity limit for set enumeration could reflect a limit in visual short-term memory (VSTM) capacity for the set color rather than a limit in set enumeration per se. Here, we largely replicated Halberda et al. and found successful enumeration of approximately two sets (the superset was not probed). We also found that only about two and a half colors could be remembered from the colored dot displays whether or not the enumeration task was performed concurrently with the color VSTM task. Because observers must remember the color of a set prior to enumerating it, the under three-item VSTM capacity for color necessarily dictates that set enumeration capacity in this paradigm could not exceed two sets. Thus, the ability to enumerate multiple spatially overlapping sets is likely limited by VSTM capacity to retain the discriminating feature of these sets. This relationship suggests that the capacity for set enumeration cannot be considered independently from the capacity for the set's defining features.
Evolving Non-Dominated Parameter Sets for Computational Models from Multiple Experiments

Science.gov (United States)

Lane, Peter C. R.; Gobet, Fernand

2013-03-01

Creating robust, reproducible and optimal computational models is a key challenge for theorists in many sciences. Psychology and cognitive science face particular challenges as large amounts of data are collected and many models are not amenable to analytical techniques for calculating parameter sets. Particular problems are to locate the full range of acceptable model parameters for a given dataset, and to confirm the consistency of model parameters across different datasets. Resolving these problems will provide a better understanding of the behaviour of computational models, and so support the development of general and robust models. In this article, we address these problems using evolutionary algorithms to develop parameters for computational models against multiple sets of experimental data; in particular, we propose the 'speciated non-dominated sorting genetic algorithm' for evolving models in several theories. We discuss the problem of developing a model of categorisation using twenty-nine sets of data and models drawn from four different theories. We find that the evolutionary algorithms generate high quality models, adapted to provide a good fit to all available data.
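The notion of a non-dominated parameter set in the abstract above can be illustrated with a short sketch. It is not the authors' speciated algorithm; it only shows the dominance filter that non-dominated sorting relies on, and the parameter-set names and error values are hypothetical.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if error vector a is no worse than b on every experiment
    and strictly better on at least one (lower error is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, errs in candidates.items()
        if not any(
            dominates(other, errs)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical fit errors of four parameter sets on two experiments.
fits = {
    "p1": (0.10, 0.40),
    "p2": (0.30, 0.20),
    "p3": (0.35, 0.45),  # worse than p1 on both experiments
    "p4": (0.10, 0.50),  # ties p1 on experiment 1, worse on experiment 2
}
front = non_dominated(fits)
print(front)
```

Here p1 and p2 trade off against each other, so both stay on the front, while p3 and p4 are each dominated by p1.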
Reference gene selection for quantitative gene expression studies during biological invasions: A test on multiple genes and tissues in a model ascidian Ciona savignyi.

Science.gov (United States)

Huang, Xuena; Gao, Yangchun; Jiang, Bei; Zhou, Zunchun; Zhan, Aibin

2016-01-15

As invasive species have successfully colonized a wide range of dramatically different local environments, they offer a good opportunity to study interactions between species and rapidly changing environments. Gene expression represents one of the primary and crucial mechanisms for rapid adaptation to local environments. Here, we aim to select reference genes for quantitative gene expression analysis based on quantitative Real-Time PCR (qRT-PCR) for a model invasive ascidian, Ciona savignyi. We analyzed the stability of ten candidate reference genes in three tissues (siphon, pharynx and intestine) under two key environmental stresses (temperature and salinity) in the marine realm based on three programs (geNorm, NormFinder and delta Ct method). Our results demonstrated only minor differences in stability rankings among the three methods. Using different single reference genes might influence data interpretation, while multiple reference genes could minimize possible errors. Therefore, reference gene combinations were recommended for different tissues: the optimal reference gene combination for siphon was RPS15 and RPL17 under temperature stress, and RPL17, UBQ and TubA under salinity treatment; for pharynx, TubB, TubA and RPL17 were the most stable genes under temperature stress, while TubB, TubA and UBQ were the best under salinity stress; for intestine, UBQ, RPS15 and RPL17 were the most reliable reference genes under both treatments. Our results suggest the necessity of selecting and testing reference genes for different tissues under varying environmental stresses. The results obtained here are expected to reveal mechanisms of gene expression-mediated invasion success using C. savignyi as a model species. Copyright © 2015 Elsevier B.V. All rights reserved.
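As a rough illustration of the delta Ct method named in the abstract above, the sketch below ranks candidate genes by the mean standard deviation of their pairwise Ct differences across samples, with lower values read as more stable. The gene names and Ct values are hypothetical, and this is a simplified reading of the comparative delta Ct approach rather than the authors' pipeline.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical Ct values for three candidate genes across four samples.
ct = {
    "geneA": [20.0, 21.0, 20.0, 21.0],
    "geneB": [22.0, 23.0, 22.0, 23.0],
    "geneC": [25.0, 24.0, 27.0, 23.0],
}


def delta_ct_stability(ct_values):
    """Mean SD of pairwise Ct differences per gene (lower = more stable)."""
    pair_sd = {}
    for g1, g2 in combinations(ct_values, 2):
        # Delta Ct between the two genes, sample by sample.
        diffs = [a - b for a, b in zip(ct_values[g1], ct_values[g2])]
        pair_sd[(g1, g2)] = stdev(diffs)
    return {
        gene: mean(sd for pair, sd in pair_sd.items() if gene in pair)
        for gene in ct_values
    }


scores = delta_ct_stability(ct)
ranking = sorted(scores, key=scores.get)  # most stable first
print(ranking)
```

In this toy data set geneA and geneB move together across samples, so their difference is constant and both score better than geneC.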
Analyzing Multiple-Probe Microarray: Estimation and Application of Gene Expression Indexes

KAUST Repository

Maadooliat, Mehdi

2012-07-26

Gene expression index estimation is an essential step in analyzing multiple probe microarray data. Various modeling methods have been proposed in this area. Among them, a popular method proposed in Li and Wong (2001) is based on a multiplicative model, which is similar to the additive model discussed in Irizarry et al. (2003a) at the logarithm scale. Along this line, Hu et al. (2006) proposed data transformation to improve expression index estimation based on an ad hoc entropy criterion and a naive grid search approach. In this work, we re-examined this problem using a new profile likelihood-based transformation estimation approach that is more statistically elegant and computationally efficient. We demonstrate the applicability of the proposed method using a benchmark Affymetrix U95A spiked-in experiment. Moreover, we introduced a new multivariate expression index and used the empirical study to show its promise in terms of improving model fitting and power of detecting differential expression over the commonly used univariate expression index. As the other important content of the work, we discussed two generally encountered practical issues in the application of gene expression indexes: normalization and the summary statistic used for detecting differential expression. Our empirical study shows somewhat different findings from the MAQC project (MAQC, 2006).
Fine tuning of RFX/DAF-19-regulated target gene expression through binding to multiple sites in Caenorhabditis elegans

OpenAIRE

Chu, Jeffery S. C.; Tarailo-Graovac, Maja; Zhang, Di; Wang, Jun; Uyar, Bora; Tu, Domena; Trinh, Joanne; Baillie, David L.; Chen, Nansheng

2011-01-01

In humans, mutations of a growing list of regulatory factor X (RFX) target genes have been associated with devastating genetic disease conditions including ciliopathies. However, mechanisms underlying RFX transcription factors (TFs)-mediated gene expression regulation, especially differential gene expression regulation, are largely unknown. In this study, we explore the functional significance of the co-existence of multiple X-box motifs in regulating differential gene expression in Caenorha...
Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

Science.gov (United States)

de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

2016-08-01

Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment wide significance (corrected pneratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.
Phase I metabolic genes and risk of lung cancer: multiple polymorphisms and mRNA expression.

Directory of Open Access Journals (Sweden)

Melissa Rotunno

2009-05-01

Polymorphisms in genes coding for enzymes that activate tobacco lung carcinogens may generate inter-individual differences in lung cancer risk. Previous studies had limited sample sizes, poor exposure characterization, and a few single nucleotide polymorphisms (SNPs) tested in candidate genes. We analyzed 25 SNPs (some previously untested) in 2101 primary lung cancer cases and 2120 population controls from the Environment And Genetics in Lung cancer Etiology (EAGLE) study from six phase I metabolic genes, including cytochrome P450s, microsomal epoxide hydrolase, and myeloperoxidase. We evaluated the main genotype effects and genotype-smoking interactions in lung cancer risk overall and in the major histology subtypes. We tested the combined effect of multiple SNPs on lung cancer risk and on gene expression. Findings were prioritized based on significance thresholds and consistency across different analyses, and accounted for multiple testing and prior knowledge. Two haplotypes in EPHX1 were significantly associated with lung cancer risk in the overall population. In addition, CYP1B1 and CYP2A6 polymorphisms were inversely associated with adenocarcinoma and squamous cell carcinoma risk, respectively. Moreover, the association between CYP1A1 rs2606345 genotype and lung cancer was significantly modified by intensity of cigarette smoking, suggesting an underlying dose-response mechanism. Finally, increasing number of variants at CYP1A1/A2 genes revealed significant protection in never smokers and risk in ever smokers. Results were supported by differential gene expression in non-tumor lung tissue samples with down-regulation of CYP1A1 in never smokers and up-regulation in smokers from CYP1A1/A2 SNPs. The significant haplotype associations emphasize that the effect of multiple SNPs may be important despite null single SNP-associations, and warrants consideration in genome-wide association studies (GWAS). Our findings emphasize the necessity of post
PCA-based bootstrap confidence interval tests for gene-disease association involving multiple SNPs

Directory of Open Access Journals (Sweden)

Xue Fuzhong

2010-01-01

Background. Genetic association study is currently the primary vehicle for identification and characterization of disease-predisposing variant(s), which usually involves multiple available single-nucleotide polymorphisms (SNPs). However, SNP-wise association tests raise concerns over multiple testing. Haplotype-based methods have the advantage of being able to account for correlations between neighbouring SNPs, yet the assumption of Hardy-Weinberg equilibrium (HWE) and a potentially large number of degrees of freedom can harm their statistical power and robustness. Approaches based on principal component analysis (PCA) are preferable in this regard, but their performance varies with the method of extracting principal components (PCs). Results. A PCA-based bootstrap confidence interval test (PCA-BCIT), which directly uses the PC scores to assess gene-disease association, was developed and evaluated for three ways of extracting PCs, i.e., cases only (CAES), controls only (COES) and cases and controls combined (CES). Extraction of PCs with COES is preferred to that with CAES and CES. The performance of the test was examined via simulations as well as analyses of data on rheumatoid arthritis and heroin addiction; the test maintained the nominal level under the null hypothesis and showed performance comparable to that of the permutation test. Conclusions. PCA-BCIT is a valid and powerful method for assessing gene-disease association involving multiple SNPs.
HIV Cell-to-Cell Spread Results in Earlier Onset of Viral Gene Expression by Multiple Infections per Cell.

Directory of Open Access Journals (Sweden)

Mikaël Boullé

2016-11-01

Cell-to-cell spread of HIV, a directed mode of viral transmission, has been observed to be more rapid than cell-free infection. However, a mechanism for earlier onset of viral gene expression in cell-to-cell spread was previously uncharacterized. Here we used time-lapse microscopy combined with automated image analysis to quantify the timing of the onset of HIV gene expression in a fluorescent reporter cell line, as well as single cell staining for infection over time in primary cells. We compared cell-to-cell spread of HIV to cell-free infection, and limited both types of transmission to a two-hour window to minimize differences due to virus transit time to the cell. The mean time to detectable onset of viral gene expression in cell-to-cell spread was accelerated by 19% in the reporter cell line and by 35% in peripheral blood mononuclear cells relative to cell-free HIV infection. Neither factors secreted by infected cells, nor contact with infected cells in the absence of transmission, detectably changed onset. We recapitulated the earlier onset by infecting with multiple cell-free viruses per cell. Surprisingly, the acceleration in onset of viral gene expression was not explained by cooperativity between infecting virions. Instead, more rapid onset was consistent with a model where the fastest expressing virus out of the infecting virus pool sets the time for infection independently of the other co-infecting viruses.
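The "fastest virus sets the time" model in the abstract above amounts to taking the minimum of several independent onset times. The sketch below simulates that idea only; the exponential delay distribution, the 24-hour mean delay and the function name are assumptions made for illustration and are not taken from the study.

```python
import random
from statistics import mean


def mean_onset(n_viruses: int, trials: int = 20000,
               mean_delay_h: float = 24.0, seed: int = 0) -> float:
    """Mean onset time when the earliest of n independent viruses
    determines when expression becomes detectable."""
    rng = random.Random(seed)
    rate = 1.0 / mean_delay_h  # assumed exponential delay per virus
    return mean(
        min(rng.expovariate(rate) for _ in range(n_viruses))
        for _ in range(trials)
    )


single = mean_onset(1)    # one infecting virus per cell
multiple = mean_onset(5)  # five co-infecting viruses per cell
print(single, multiple)
```

No interaction between viruses is modeled, yet the mean onset still shifts earlier as the number of infecting viruses grows, which is the qualitative behaviour the abstract describes.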

Multiple BiP genes of Arabidopsis thaliana are required for male gametogenesis and pollen competitiveness.

Science.gov (United States)

Maruyama, Daisuke; Sugiyama, Tomoyuki; Endo, Toshiya; Nishikawa, Shuh-Ichi

2014-04-01

Immunoglobulin-binding protein (BiP) is a molecular chaperone of the heat shock protein 70 (Hsp70) family. BiP is localized in the endoplasmic reticulum (ER) and plays key roles in protein translocation, protein folding and quality control in the ER. The genomes of flowering plants contain multiple BiP genes. Arabidopsis thaliana has three BiP genes. BIP1 and BIP2 are ubiquitously expressed. BIP3 encodes a less well conserved BiP paralog, and it is expressed only under ER stress conditions in the majority of organs. Here, we report that all BiP genes are expressed and functional in pollen and pollen tubes. Although the bip1 bip2 double mutation does not affect pollen viability, the bip1 bip2 bip3 triple mutation is lethal in pollen. This result indicates that lethality of the bip1 bip2 double mutation is rescued by BiP3 expression. A decrease in the copy number of the ubiquitously expressed BiP genes correlates well with a decrease in pollen tube growth, which leads to reduced fitness of mutant pollen during fertilization. Because an increased protein secretion activity is expected to increase the protein folding demand in the ER, the multiple BiP genes probably cooperate with each other to ensure ER homeostasis in cells with active secretion such as rapidly growing pollen tubes.
Multiple origins of interdependent endosymbiotic complexes in a genus of cicadas.

Science.gov (United States)

Łukasik, Piotr; Nazario, Katherine; Van Leuven, James T; Campbell, Matthew A; Meyer, Mariah; Michalik, Anna; Pessacq, Pablo; Simon, Chris; Veloso, Claudio; McCutcheon, John P

2018-01-09

Bacterial endosymbionts that provide nutrients to hosts often have genomes that are extremely stable in structure and gene content. In contrast, the genome of the endosymbiont Hodgkinia cicadicola has fractured into multiple distinct lineages in some species of the cicada genus Tettigades. To better understand the frequency, timing, and outcomes of Hodgkinia lineage splitting throughout this cicada genus, we sampled cicadas over three field seasons in Chile and performed genomics and microscopy on representative samples. We found that a single ancestral Hodgkinia lineage has split at least six independent times in Tettigades over the last 4 million years, resulting in complexes of between two and six distinct Hodgkinia lineages per host. Individual genomes in these symbiotic complexes differ dramatically in relative abundance, genome size, organization, and gene content. Each Hodgkinia lineage retains a small set of core genes involved in genetic information processing, but the high level of gene loss experienced by all genomes suggests that extensive sharing of gene products among symbiont cells must occur. In total, Hodgkinia complexes that consist of multiple lineages encode nearly complete sets of genes present on the ancestral single lineage and presumably perform the same functions as symbionts that have not undergone splitting. However, differences in the timing of the splits, along with dissimilar gene loss patterns on the resulting genomes, have led to very different outcomes of lineage splitting in extant cicadas.
Robust set-point regulation for ecological models with multiple management goals.

Science.gov (United States)

Guiver, Chris; Mueller, Markus; Hodgson, Dave; Townley, Stuart

2016-05-01

Population managers will often have to deal with problems of meeting multiple goals, for example, keeping at specific levels both the total population and population abundances in given stage-classes of a stratified population. In control engineering, such set-point regulation problems are commonly tackled using multi-input, multi-output proportional and integral (PI) feedback controllers. Building on our recent results for population management with single goals, we develop a PI control approach in a context of multi-objective population management. We show that robust set-point regulation is achieved by using a modified PI controller with saturation and anti-windup elements, both described in the paper, and illustrate the theory with examples. Our results apply more generally to linear control systems with positive state variables, including a class of infinite-dimensional systems, and thus have broader appeal.
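A scalar toy example can make the idea of PI set-point regulation with saturation and anti-windup more concrete. The sketch below is not the controller from the paper, which is multi-input, multi-output; it assumes a hypothetical one-stage population model x[k+1] = a*x[k] + b*u[k], hypothetical gains, and a common conditional-integration form of anti-windup in which the integrator is frozen while the input is saturated.

```python
def clamp(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(high, value))


def simulate_pi(setpoint=10.0, steps=200, a=0.8, b=0.5,
                kp=0.5, ki=0.2, u_min=0.0, u_max=5.0):
    """Regulate x[k+1] = a*x[k] + b*u[k] to the set-point with a
    saturated PI controller and conditional-integration anti-windup."""
    x = 0.0
    integral = 0.0
    inputs = []
    for _ in range(steps):
        error = setpoint - x
        u_raw = kp * error + ki * integral
        u = clamp(u_raw, u_min, u_max)  # e.g. a bounded restocking rate
        if u == u_raw:
            # Anti-windup: accumulate error only while unsaturated.
            integral += error
        x = a * x + b * u
        inputs.append(u)
    return x, inputs


final_x, inputs = simulate_pi()
print(final_x)
```

With these assumed values the input that holds the set-point is (1 - a) * setpoint / b = 4, which lies inside the saturation limits, so the state settles at the set-point while every applied input stays within bounds.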
A fast and efficient gene-network reconstruction method from multiple over-expression experiments

Directory of Open Access Journals (Sweden)

Thurner Stefan

2009-08-01

Background. Reverse engineering of gene regulatory networks presents one of the big challenges in systems biology. Gene regulatory networks are usually inferred from a set of single-gene over-expressions and/or knockout experiments. Functional relationships between genes are retrieved either from the steady state gene expressions or from respective time series. Results. We present a novel algorithm for gene network reconstruction on the basis of steady-state gene-chip data from over-expression experiments. The algorithm is based on a straightforward solution of a linear gene-dynamics equation, where experimental data is fed in as a first predictor for the solution. We compare the algorithm's performance with the NIR algorithm, both on the well known E. coli experimental data and on in-silico experiments. Conclusion. We show superiority of the proposed algorithm in the number of correctly reconstructed links and discuss computational time and robustness. The proposed algorithm is not limited by combinatorial explosion problems and can be used in principle for large networks.
An evolvable oestrogen receptor activity sensor: development of a modular system for integrating multiple genes into the yeast genome

NARCIS (Netherlands)

Fox, J.E.; Bridgham, J.T.; Bovee, T.F.H.; Thornton, J.W.

2007-01-01

To study a gene interaction network, we developed a gene-targeting strategy that allows efficient and stable genomic integration of multiple genetic constructs at distinct target loci in the yeast genome. This gene-targeting strategy uses a modular plasmid with a recyclable selectable marker and a
Hemodynamic responses during and after multiple sets of stretching exercises performed with and without the Valsalva maneuver.

Science.gov (United States)

Lima, Tainah P; Farinatti, Paulo T V; Rubini, Ercole C; Silva, Elirez B; Monteiro, Walace D

2015-05-01

This study investigated the acute hemodynamic responses to multiple sets of passive stretching exercises performed with and without the Valsalva maneuver. Fifteen healthy men aged 21 to 29 years with poor flexibility performed stretching protocols comprising 10 sets of maximal passive unilateral hip flexion, sustained for 30 seconds with equal intervals between sets. Protocols without and with the Valsalva maneuver were applied in a random counterbalanced order, separated by 48-hour intervals. Hemodynamic responses were measured by photoplethysmography pre-exercise, during the stretching sets, and post-exercise. The effects of stretching sets on systolic and diastolic blood pressure were cumulative until the fourth set in protocols performed with and without the Valsalva maneuver. The heart rate and rate pressure product increased in both protocols, but no additive effect was observed due to the number of sets. Hemodynamic responses were always higher when stretching was performed with the Valsalva maneuver, causing an additional elevation in the rate pressure product. Multiple sets of unilateral hip flexion stretching significantly increased blood pressure, heart rate, and rate pressure product values. A cumulative effect of the number of sets occurred only for systolic and diastolic blood pressure, at least in the initial sets of the stretching protocols. The performance of the Valsalva maneuver intensified all hemodynamic responses, which resulted in significant increases in cardiac work during stretching exercises.
MULTIPLE OBJECTS

Directory of Open Access Journals (Sweden)

A. A. Bosov

2015-04-01

Purpose. The development of complex production and management processes, information systems, computer science, applied systems theory and related fields requires improved mathematical methods and new approaches to the study of applied systems. The variety and diversity of subject systems makes it necessary to develop a model that generalizes classical sets and their extension, sets of sets. Multiple objects, unlike sets, are constructed from multiple structures and are represented by structure and content. The aim of the work is to analyze the multiple structures that generate multiple objects and to further develop operations on these objects in applied systems. Methodology. To achieve the research objectives, the structure of multiple objects is represented as a constructive triple consisting of media, signatures and axiomatics. A multiple object is determined by its structure and content, and is represented by a hybrid superposition composed of sets, multisets, ordered sets (lists) and heterogeneous sets (sequences, tuples). Findings. In this paper we study the properties and characteristics of the components of hybrid multiple objects of complex systems, propose assessments of their complexity, and show the rules of internal and external operations on objects of implementation. We introduce relations of arbitrary order over multiple objects and describe functions and mappings on objects of multiple structures. Originality. In this paper we consider the development of multiple structures that generate multiple objects. Practical value. The transition from abstract to subject-specific multiple structures requires the transformation of the system and of multiple objects. Transformation involves three successive stages: specification (binding to the domain), interpretation (multiple sites) and particularization (goals). The proposed describe systems approach based on hybrid sets
Joint Estimation of Multiple Precision Matrices with Common Structures.

Science.gov (United States)

Lee, Wonyul; Liu, Yufeng

Estimation of inverse covariance matrices, known as precision matrices, is important in various areas of statistical analysis. In this article, we consider estimation of multiple precision matrices sharing some common structures. In this setting, estimating each precision matrix separately can be suboptimal as it ignores potential common structures. This article proposes a new approach to parameterize each precision matrix as a sum of common and unique components and estimate multiple precision matrices in a constrained ℓ1 minimization framework. We establish both estimation and selection consistency of the proposed estimator in the high dimensional setting. The proposed estimator achieves a faster convergence rate for the common structure in certain cases. Our numerical examples demonstrate that our new estimator can perform better than several existing methods in terms of the entropy loss and Frobenius loss. An application to a glioblastoma cancer data set reveals some interesting gene networks across multiple cancer subtypes.
Generating and executing programs for a floating point single instruction multiple data instruction set architecture

Science.gov (United States)

Gschwind, Michael K

2013-04-16

Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
Reduced Set of Virulence Genes Allows High Accuracy Prediction of Bacterial Pathogenicity in Humans

Science.gov (United States)

Iraola, Gregorio; Vazquez, Gustavo; Spangenberg, Lucía; Naya, Hugo

2012-01-01

Although there have been great advances in understanding bacterial pathogenesis, there is still a lack of integrative information about what makes a bacterium a human pathogen. The advent of high-throughput sequencing technologies has dramatically increased the amount of completed bacterial genomes, for both known human pathogenic and non-pathogenic strains; this information is now available to investigate genetic features that determine pathogenic phenotypes in bacteria. In this work we determined presence/absence patterns of different virulence-related genes among more than finished bacterial genomes from both human pathogenic and non-pathogenic strains, belonging to different taxonomic groups (i.e: Actinobacteria, Gammaproteobacteria, Firmicutes, etc.). An accuracy of 95% using a cross-fold validation scheme with in-fold feature selection is obtained when classifying human pathogens and non-pathogens. A reduced subset of highly informative genes () is presented and applied to an external validation set. The statistical model was implemented in the BacFier v1.0 software (freely available at ), that displays not only the prediction (pathogen/non-pathogen) and an associated probability for pathogenicity, but also the presence/absence vector for the analyzed genes, so it is possible to decipher the subset of virulence genes responsible for the classification on the analyzed genome. Furthermore, we discuss the biological relevance for bacterial pathogenesis of the core set of genes, corresponding to eight functional categories, all with evident and documented association with the phenotypes of interest. Also, we analyze which functional categories of virulence genes were more distinctive for pathogenicity in each taxonomic group, which seems to be a completely new kind of information and could lead to important evolutionary conclusions. PMID:22916122
Bayesian meta-analysis of genetic association studies with different sets of markers

NARCIS (Netherlands)

Verzilli, Claudio; Shah, Tina; Casas, Juan P.; Chapman, Juliet; Sandhu, Manjinder; Debenham, Sally L.; Boekholdt, Matthijs S.; Khaw, Kay Tee; Wareham, Nicholas J.; Judson, Richard; Benjamin, Emelia J.; Kathiresan, Sekar; Larson, Martin G.; Rong, Jian; Sofat, Reecha; Humphries, Steve E.; Smeeth, Liam; Cavalleri, Gianpiero; Whittaker, John C.; Hingorani, Aroon D.

2008-01-01

Robust assessment of genetic effects on quantitative traits or complex-disease risk requires synthesis of evidence from multiple studies. Frequently, studies have genotyped partially overlapping sets of SNPs within a gene or region of interest, hampering attempts to combine all the available data.
Identification of a core set of rhizobial infection genes using data from single cell-types

Directory of Open Access Journals (Sweden)

Da-Song eChen

2015-07-01

Genome-wide expression studies on nodulation have varied in their scale from entire root systems to dissected nodules or root sections containing nodule primordia. More recently efforts have focused on developing methods for isolation of root hairs from infected plants and the application of laser-capture microdissection technology to nodules. Here we analyze two published data sets to identify a core set of infection genes that are expressed in the nodule and in root hairs during infection. Among the genes identified were those encoding phenylpropanoid biosynthesis enzymes including Chalcone-O-Methyltransferase which is required for the production of the potent Nod gene inducer 4’,4-dihydroxy-2-methoxychalcone. A promoter-GUS analysis in transgenic hairy roots for two genes encoding Chalcone-O-Methyltransferase isoforms revealed their expression in rhizobially infected root hairs and the nodule infection zone but not in the nitrogen fixation zone. We also describe a group of Rhizobially Induced Peroxidases whose expression overlaps with the production of superoxide in rhizobially infected root hairs and in nodules and roots. Finally, we identify a cohort of co-regulated transcription factors as candidate regulators of these processes.
Canonical correlation analysis for gene-based pleiotropy discovery.

Directory of Open Access Journals (Sweden)

Jose A Seoane

2014-10-01

Genome-wide association studies have identified a wealth of genetic variants involved in complex traits and multifactorial diseases. There is now considerable interest in testing variants for association with multiple phenotypes (pleiotropy) and in testing multiple variants for association with a single phenotype (gene-based association tests). Such approaches can increase statistical power by combining evidence for association over multiple phenotypes or genetic variants, respectively. Canonical Correlation Analysis (CCA) measures the correlation between two sets of multidimensional variables, and thus offers the potential to combine these two approaches. To apply CCA, we must restrict the number of attributes relative to the number of samples. Hence we consider modules of genetic variation that can comprise a gene, a pathway or another biologically relevant grouping, and/or a set of phenotypes. In order to do this, we use an attribute selection strategy based on a binary genetic algorithm. Applied to a UK-based prospective cohort study of 4286 women (the British Women's Heart and Health Study), we find improved statistical power in the detection of previously reported genetic associations, and identify a number of novel pleiotropic associations between genetic variants and phenotypes. New discoveries include gene-based association of NSF with triglyceride levels and several genes (ACSM3, ERI2, IL18RAP, IL23RAP and NRG1) with left ventricular hypertrophy phenotypes. In multiple-phenotype analyses we find association of NRG1 with left ventricular hypertrophy phenotypes, fibrinogen and urea and pleiotropic relationships of F7 and F10 with Factor VII, Factor IX and cholesterol levels.
Strong convergence of an extragradient-type algorithm for the multiple-sets split equality problem.

Science.gov (United States)

Zhao, Ying; Shi, Luoyi

2017-01-01

This paper introduces a new extragradient-type method to solve the multiple-sets split equality problem (MSSEP). Under some suitable conditions, the strong convergence of the algorithm is established in infinite-dimensional Hilbert spaces. Moreover, several numerical results are given to show the effectiveness of our algorithm.
SET: Session Layer-Assisted Efficient TCP Management Architecture for 6LoWPAN with Multiple Gateways

Directory of Open Access Journals (Sweden)

Akbar, Ali Hammad

2010-01-01

6LoWPAN (IPv6 based Low-Power Personal Area Network) is a protocol specification that facilitates communication of IPv6 packets on top of IEEE 802.15.4 so that Internet and wireless sensor networks can be inter-connected. This interconnection is especially required in commercial and enterprise applications of sensor networks where reliable and timely data transfers such as multiple code updates are needed from Internet nodes to sensor nodes. For this type of inbound traffic, which is mostly bulk, TCP as transport layer protocol is essential, resulting in an end-to-end TCP session through a default gateway. In this scenario, a single gateway tends to become the bottleneck because of non-uniform connectivity to all the sensor nodes besides being vulnerable to buffer overflow. We propose SET, a management architecture for multiple split-TCP sessions across a number of serving gateways. SET implements striping and multiple TCP session management through a shim at session layer. Through analytical modeling and ns2 simulations, we show that our proposed architecture optimizes communication for ingress bulk data transfer while providing associated load balancing services. We conclude that multiple split-TCP sessions managed in parallel across a number of gateways result in reduced latency for bulk data transfer and provide robustness against gateway failures.
A Unified Approach to Functional Principal Component Analysis and Functional Multiple-Set Canonical Correlation.

Science.gov (United States)

Choi, Ji Yeh; Hwang, Heungsun; Yamamoto, Michio; Jung, Kwanghee; Woodward, Todd S

2017-06-01

Functional principal component analysis (FPCA) and functional multiple-set canonical correlation analysis (FMCCA) are data reduction techniques for functional data that are collected in the form of smooth curves or functions over a continuum such as time or space. In FPCA, low-dimensional components are extracted from a single functional dataset such that they explain the most variance of the dataset, whereas in FMCCA, low-dimensional components are obtained from each of multiple functional datasets in such a way that the associations among the components are maximized across the different sets. In this paper, we propose a unified approach to FPCA and FMCCA. The proposed approach subsumes both techniques as special cases. Furthermore, it permits a compromise between the techniques, such that components are obtained from each set of functional data to maximize their associations across different datasets, while accounting for the variance of the data well. We propose a single optimization criterion for the proposed approach, and develop an alternating regularized least squares algorithm to minimize the criterion in combination with basis function approximations to functions. We conduct a simulation study to investigate the performance of the proposed approach based on synthetic data. We also apply the approach for the analysis of multiple-subject functional magnetic resonance imaging data to obtain low-dimensional components of blood-oxygen level-dependent signal changes of the brain over time, which are highly correlated across the subjects as well as representative of the data. The extracted components are used to identify networks of neural activity that are commonly activated across the subjects while carrying out a working memory task.
Mutations of the Birt–Hogg–Dubé gene in patients with multiple lung cysts and recurrent pneumothorax

Science.gov (United States)

Gunji, Yoko; Akiyoshi, Taeko; Sato, Teruhiko; Kurihara, Masatoshi; Tominaga, Shigeru; Takahashi, Kazuhisa; Seyama, Kuniaki

2007-01-01

Rationale: Birt–Hogg–Dubé (BHD) syndrome, a rare inherited autosomal genodermatosis first recognised in 1977, is characterised by fibrofolliculomas of the skin, an increased risk of renal tumours and multiple lung cysts with spontaneous pneumothorax. The BHD gene, a tumour suppressor gene located at chromosome 17p11.2, has recently been shown to be defective. Recent genetic studies revealed that clinical pictures of the disease may be variable and may not always present the full expression of the phenotypes. Objectives: We hypothesised that mutations of the BHD gene are responsible for patients who have multiple lung cysts of which the underlying causes have not yet been elucidated. Methods: We studied eight patients with lung cysts, without skin and renal disease; seven of these patients have a history of spontaneous pneumothorax and five have a family history of pneumothorax. The BHD gene was examined using PCR, denaturing high‐performance liquid chromatography and direct sequencing. Main results: We found that five of the eight patients had a BHD germline mutation. All mutations were unique and four of them were novel, including three different deletions or insertions detected in exons 6, 12 and 13, respectively, and one splice acceptor site mutation in intron 5 resulting in an in‐frame deletion of exon 6. Conclusions: We found that germline mutations of the BHD gene are involved in some patients with multiple lung cysts and pneumothorax. Pulmonologists should be aware that BHD syndrome can occur as an isolated phenotype with pulmonary involvement. PMID: 17496196
Identification of multiple sites suitable for insertion of foreign genes in herpes simplex virus genomes.

Science.gov (United States)

Morimoto, Tomomi; Arii, Jun; Akashi, Hiroomi; Kawaguchi, Yasushi

2009-03-01

Information on sites in HSV genomes at which foreign gene(s) can be inserted without disrupting viral genes or affecting properties of the parental virus is important for basic research on HSV and development of HSV-based vectors for human therapy. The intergenic region between HSV-1 UL3 and UL4 genes has been reported to satisfy the requirements for such an insertion site. The UL3 and UL4 genes are oriented toward the intergenic region and, therefore, insertion of a foreign gene(s) into the region between the UL3 and UL4 polyadenylation signals should not disrupt any viral genes or transcriptional units. HSV-1 and HSV-2 each have more than 10 additional regions structurally similar to the intergenic region between UL3 and UL4. In the studies reported here, it has been demonstrated that insertion of a reporter gene expression cassette into several of the HSV-1 and HSV-2 intergenic regions has no effect on viral growth in cell culture or virulence in mice, suggesting that these multiple intergenic regions may be suitable HSV sites for insertion of foreign genes.
Deep convolutional neural networks for annotating gene expression patterns in the mouse brain.

Science.gov (United States)

Zeng, Tao; Li, Rongjian; Mukkamala, Ravi; Ye, Jieping; Ji, Shuiwang

2015-05-07

Profiling gene expression in brain structures at various spatial and temporal scales is essential to understanding how genes regulate the development of brain structures. The Allen Developing Mouse Brain Atlas provides high-resolution 3-D in situ hybridization (ISH) gene expression patterns in multiple developing stages of the mouse brain. Currently, the ISH images are annotated with anatomical terms manually. In this paper, we propose a computational approach to annotate gene expression pattern images in the mouse brain at various structural levels over the course of development. We applied a deep convolutional neural network that was trained on a large set of natural images to extract features from the ISH images of the developing mouse brain. As a baseline representation, we applied invariant image feature descriptors to capture local statistics from ISH images and used the bag-of-words approach to build image-level representations. Both types of features from multiple ISH image sections of the entire brain were then combined to build 3-D, brain-wide gene expression representations. We employed regularized learning methods for discriminating gene expression patterns in different brain structures. Results show that our approach of using the convolutional model as a feature extractor achieved superior performance in annotating gene expression patterns at multiple levels of brain structures throughout four developing ages. Overall, we achieved an average AUC of 0.894 ± 0.014, as compared with 0.820 ± 0.046 yielded by the bag-of-words approach. A deep convolutional neural network model trained on natural image sets and applied to gene expression pattern annotation tasks yielded superior performance, demonstrating that its transfer learning property is applicable to such biological image sets.
Multiple coupled landscapes and non-adiabatic dynamics with applications to self-activating genes.

Science.gov (United States)

Chen, Cong; Zhang, Kun; Feng, Haidong; Sasai, Masaki; Wang, Jin

2015-11-21

Many physical, chemical and biochemical systems (e.g. electronic dynamics and gene regulatory networks) are governed by continuous stochastic processes (e.g. electron dynamics on a particular electronic energy surface and protein (gene product) synthesis) coupled with discrete processes (e.g. hopping among different electronic energy surfaces and on and off switching of genes). One can also think of the underlying dynamics as the continuous motion on a particular landscape and discrete hoppings among different landscapes. The main difference of such systems from the intra-landscape dynamics alone is the emergence of the timescale involved in transitions among different landscapes in addition to the timescale involved in a particular landscape. The adiabatic limit when inter-landscape hoppings are fast compared to continuous intra-landscape dynamics has been studied both analytically and numerically, but the analytical treatment of the non-adiabatic regime where the inter-landscape hoppings are slow or comparable to continuous intra-landscape dynamics remains challenging. In this study, we show that there exists a mathematical mapping of the dynamics on 2^N discretely coupled N continuous dimensional landscapes onto one single landscape in 2N dimensional extended continuous space. On this 2N dimensional landscape, eddy current emerges as a sign of non-equilibrium non-adiabatic dynamics and plays an important role in system evolution. Many interesting physical effects such as the enhancement of fluctuations, irreversibility, dissipation and optimal kinetics emerge due to non-adiabaticity manifested by the eddy current illustrated for an N = 1 self-activator. We further generalize our theory to the N-gene network with multiple binding sites and multiple synthesis rates for discretely coupled non-equilibrium stochastic physical and biological systems.

EBF factors drive expression of multiple classes of target genes governing neuronal development.

Science.gov (United States)

Green, Yangsook S; Vetter, Monica L

2011-04-30

Early B cell factor (EBF) family members are transcription factors known to have important roles in several aspects of vertebrate neurogenesis, including commitment, migration and differentiation. Knowledge of how EBF family members contribute to neurogenesis is limited by a lack of detailed understanding of genes that are transcriptionally regulated by these factors. We performed a microarray screen in Xenopus animal caps to search for targets of EBF transcriptional activity, and identified candidate targets with multiple roles, including transcription factors of several classes. We determined that, among the most upregulated candidate genes with expected neuronal functions, most require EBF activity for some or all of their expression, and most have overlapping expression with ebf genes. We also found that the candidate target genes that had the most strongly overlapping expression patterns with ebf genes were predicted to be direct transcriptional targets of EBF transcriptional activity. The identification of candidate targets that are transcription factor genes, including nscl-1, emx1 and aml1, improves our understanding of how EBF proteins participate in the hierarchy of transcription control during neuronal development, and suggests novel mechanisms by which EBF activity promotes migration and differentiation. Other candidate targets, including pcdh8 and kcnk5, expand our knowledge of the types of terminal differentiated neuronal functions that EBF proteins regulate.
SATB1 tethers multiple gene loci to reprogram expression profile driving breast cancer metastasis

Energy Technology Data Exchange (ETDEWEB)

Han, Hye-Jung; Kohwi, Yoshinori; Kohwi-Shigematsu, Terumi

2006-07-13

Global changes in gene expression occur during tumor progression, as indicated by expression profiling of metastatic tumors. How this occurs is poorly understood. SATB1 functions as a genome organizer by folding chromatin via tethering multiple genomic loci and recruiting chromatin remodeling enzymes to regulate chromatin structure and expression of a large number of genes. Here we show that SATB1 is expressed at high levels in aggressive breast cancer cells, and is undetectable in non-malignant breast epithelial cells. Importantly, RNAi-mediated removal of SATB1 from highly-aggressive MDA-MB-231 cells altered the expression levels of over 1200 genes, restored breast-like acinar polarity in three-dimensional cultures, and prevented the metastatic phenotype in vivo. Conversely, overexpression of SATB1 in the less-aggressive breast cancer cell line Hs578T altered the gene expression profile and increased metastasis dramatically in vivo. Thus, SATB1 is a global regulator of gene expression in breast cancer cells, directly regulating crucial metastasis-associated genes, including ERBB2 (HER2/NEU), TGF-β1, matrix metalloproteinase 3, and metastasin. The identification of SATB1 as a protein that re-programs chromatin organization and transcription profiles to promote breast cancer metastasis suggests a new model for metastasis and may provide means of therapeutic intervention.
Allen Brain Atlas-Driven Visualizations: a web-based gene expression energy visualization tool.

Science.gov (United States)

Zaldivar, Andrew; Krichmar, Jeffrey L

2014-01-01

The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.
Allen Brain Atlas-Driven Visualizations: A Web-Based Gene Expression Energy Visualization Tool

Directory of Open Access Journals (Sweden)

Andrew Zaldivar

2014-05-01

The Allen Brain Atlas-Driven Visualizations (ABADV) is a publicly accessible web-based tool created to retrieve and visualize expression energy data from the Allen Brain Atlas (ABA) across multiple genes and brain structures. Though the ABA offers their own search engine and software for researchers to view their growing collection of online public data sets, including extensive gene expression and neuroanatomical data from human and mouse brain, many of their tools limit the amount of genes and brain structures researchers can view at once. To complement their work, ABADV generates multiple pie charts, bar charts and heat maps of expression energy values for any given set of genes and brain structures. Such a suite of free and easy-to-understand visualizations allows for easy comparison of gene expression across multiple brain areas. In addition, each visualization links back to the ABA so researchers may view a summary of the experimental detail. ABADV is currently supported on modern web browsers and is compatible with expression energy data from the Allen Mouse Brain Atlas in situ hybridization data. By creating this web application, researchers can immediately obtain and survey numerous amounts of expression energy data from the ABA, which they can then use to supplement their work or perform meta-analysis. In the future, we hope to enable ABADV across multiple data resources.
Genetic Variants Contribute to Gene Expression Variability in Humans

Science.gov (United States)

Hulse, Amanda M.; Cai, James J.

2013-01-01

Expression quantitative trait loci (eQTL) studies have established convincing relationships between genetic variants and gene expression. Most of these studies focused on the mean of gene expression level, but not the variance of gene expression level (i.e., gene expression variability). In the present study, we systematically explore genome-wide association between genetic variants and gene expression variability in humans. We adapt the double generalized linear model (dglm) to simultaneously fit the means and the variances of gene expression among the three possible genotypes of a biallelic SNP. The genomic loci showing significant association between the variances of gene expression and the genotypes are termed expression variability QTL (evQTL). Using a data set of gene expression in lymphoblastoid cell lines (LCLs) derived from 210 HapMap individuals, we identify cis-acting evQTL involving 218 distinct genes, among which 8 genes, ADCY1, CTNNA2, DAAM2, FERMT2, IL6, PLOD2, SNX7, and TNFRSF11B, are cross-validated using an extra expression data set of the same LCLs. We also identify ∼300 trans-acting evQTL between >13,000 common SNPs and 500 randomly selected representative genes. We employ two distinct scenarios, emphasizing single-SNP and multiple-SNP effects on expression variability, to explain the formation of evQTL. We argue that detecting evQTL may represent a novel method for effectively screening for genetic interactions, especially when the multiple-SNP influence on expression variability is implied. The implication of our results for revealing genetic mechanisms of gene expression variability is discussed. PMID:23150607
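The distinction this abstract draws between mean effects and variance effects can be made concrete with a small summary by genotype. The sketch below is not the double generalized linear model used in the study; it only groups expression values by the three genotypes of a biallelic SNP and reports the mean and sample variance of each group. The function name and the toy values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, variance


def summarize_by_genotype(genotypes, expression):
    """Return {genotype: (mean, sample variance)} of expression values.

    genotypes are coded 0, 1, 2 (copies of the minor allele); each
    genotype group needs at least two samples for a sample variance.
    """
    groups = defaultdict(list)
    for genotype, value in zip(genotypes, expression):
        groups[genotype].append(value)
    return {g: (mean(v), variance(v)) for g, v in sorted(groups.items())}


# Toy data: the mean expression is about 10 in every genotype group,
# but the spread grows with the number of minor alleles. A mean-based
# eQTL scan would miss this SNP, while a variance-based scan would not.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expression = [10.0, 10.1, 9.9, 10.0,
              9.0, 11.0, 10.0, 10.0,
              7.0, 13.0, 10.0, 10.0]

summary = summarize_by_genotype(genotypes, expression)
for genotype, (m, v) in summary.items():
    print(genotype, round(m, 3), round(v, 3))
```

A formal test would replace this descriptive summary with a model that fits means and variances jointly, as the abstract describes.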
A Hybrid One-Way ANOVA Approach for the Robust and Efficient Estimation of Differential Gene Expression with Multiple Patterns.

Directory of Open Access Journals (Sweden)

Mohammad Manir Hossain Mollah

Identifying genes that are differentially expressed (DE) between two or more conditions with multiple patterns of expression is one of the primary objectives of gene expression data analysis. Several statistical approaches, including one-way analysis of variance (ANOVA), are used to identify DE genes. However, most of these methods provide misleading results for two or more conditions with multiple patterns of expression in the presence of outlying genes. In this paper, an attempt is made to develop a hybrid one-way ANOVA approach that unifies the robustness and efficiency of estimation using the minimum β-divergence method to overcome some problems that arise in the existing robust methods for both small- and large-sample cases with multiple patterns of expression. The proposed method relies on a β-weight function, which produces values between 0 and 1. The β-weight function with β = 0.2 is used as a measure of outlier detection. It assigns smaller weights (≥ 0) to outlying expressions and larger weights (≤ 1) to typical expressions. The distribution of the β-weights is used to calculate the cut-off point, which is compared to the observed β-weight of an expression to determine whether that gene expression is an outlier. This weight function plays a key role in unifying the robustness and efficiency of estimation in one-way ANOVA. Analyses of simulated gene expression profiles revealed that all eight methods (ANOVA, SAM, LIMMA, EBarrays, eLNN, KW, robust BetaEB and proposed) perform almost identically for m = 2 conditions in the absence of outliers. However, the robust BetaEB method and the proposed method exhibited considerably better performance than the other six methods in the presence of outliers. In this case, the BetaEB method exhibited slightly better performance than the proposed method for the small-sample cases, but the proposed method exhibited much better performance than the BetaEB method for both the small- and large
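The weighting idea described above can be sketched as follows. The abstract does not give the formula, so this example assumes a Gaussian-type weight, exp(-(β/2)·z²), where z is the standardized deviation of an expression value; this form lies between 0 and 1 and shrinks toward 0 for outliers. The median and scaled MAD used for centering and scaling, and the cut-off of 0.5, are also assumptions chosen for illustration, not the paper's estimator or its distribution-based cut-off.

```python
from math import exp
from statistics import median


def beta_weights(values, beta=0.2):
    """Assumed Gaussian-type beta-weights, one per expression value.

    Values near the robust center get weights close to 1; values far
    from it get weights close to 0.
    """
    center = median(values)
    # Scaled median absolute deviation as a robust stand-in for sigma.
    scale = 1.4826 * median([abs(v - center) for v in values])
    if scale == 0:
        raise ValueError("zero spread: weights are undefined in this sketch")
    return [exp(-0.5 * beta * ((v - center) / scale) ** 2) for v in values]


# Toy expression values for one gene in one condition; 12.0 is an outlier.
expression = [5.0, 5.1, 4.9, 5.2, 4.8, 12.0]
weights = beta_weights(expression)

CUTOFF = 0.5  # hypothetical cut-off, not derived from the weight distribution
outliers = [v for v, w in zip(expression, weights) if w < CUTOFF]
print(outliers)
```

In a weighted one-way ANOVA, such weights would then down-weight the flagged values when group means and variances are estimated.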
Negotiating jurisdiction in the workplace: a multiple-case study of nurse prescribing in hospital settings.

NARCIS (Netherlands)

Kroezen, M.; Mistiaen, P.; Dijk, L. van; Groenewegen, P.P.; Francke, A.L.

2014-01-01

This paper reports on a multiple-case study of prescribing by nurse specialists in Dutch hospital settings. Most analyses of interprofessional negotiations over professional boundaries take a macro sociological approach and ignore workplace jurisdictions. Yet boundary blurring takes place and
Genomic determinants of sporulation in Bacilli and Clostridia: towards the minimal set of sporulation-specific genes.

Science.gov (United States)

Galperin, Michael Y; Mekhedov, Sergei L; Puigbo, Pere; Smirnov, Sergey; Wolf, Yuri I; Rigden, Daniel J

2012-11-01

Three classes of low-G+C Gram-positive bacteria (Firmicutes), Bacilli, Clostridia and Negativicutes, include numerous members that are capable of producing heat-resistant endospores. Spore-forming firmicutes include many environmentally important organisms, such as insect pathogens and cellulose-degrading industrial strains, as well as human pathogens responsible for such diseases as anthrax, botulism, gas gangrene and tetanus. In the best-studied model organism Bacillus subtilis, sporulation involves over 500 genes, many of which are conserved among other bacilli and clostridia. This work aimed to define the genomic requirements for sporulation through an analysis of the presence of sporulation genes in various firmicutes, including those with smaller genomes than B. subtilis. Cultivable spore-formers were found to have genomes larger than 2300 kb and encompass over 2150 protein-coding genes of which 60 are orthologues of genes that are apparently essential for sporulation in B. subtilis. Clostridial spore-formers lack, among others, spoIIB, sda, spoVID and safA genes and have non-orthologous displacements of spoIIQ and spoIVFA, suggesting substantial differences between bacilli and clostridia in the engulfment and spore coat formation steps. Many B. subtilis sporulation genes, particularly those encoding small acid-soluble spore proteins and spore coat proteins, were found only in the family Bacillaceae, or even in a subset of Bacillus spp. Phylogenetic profiles of sporulation genes, compiled in this work, confirm the presence of a common sporulation gene core, but also illuminate the diversity of the sporulation processes within various lineages. These profiles should help further experimental studies of uncharacterized widespread sporulation genes, which would ultimately allow delineation of the minimal set(s) of sporulation-specific genes in Bacilli and Clostridia. Published 2012. This article is a U.S. Government work and is in the public domain in the USA.
Assembly and multiple gene expression of thermophilic enzymes in Escherichia coli for in vitro metabolic engineering.

Science.gov (United States)

Ninh, Pham Huynh; Honda, Kohsuke; Sakai, Takaaki; Okano, Kenji; Ohtake, Hisao

2015-01-01

In vitro reconstitution of an artificial metabolic pathway is an emerging approach for the biocatalytic production of industrial chemicals. However, several enzymes have to be separately prepared (and purified) for the construction of an in vitro metabolic pathway, thereby limiting the practical applicability of this approach. In this study, genes encoding the nine thermophilic enzymes involved in a non-ATP-forming chimeric glycolytic pathway were assembled in an artificial operon and co-expressed in a single recombinant Escherichia coli strain. Gene expression levels of the thermophilic enzymes were controlled by their sequential order in the artificial operon. The specific activities of the recombinant enzymes in the cell-free extract of the multiple-gene-expression E. coli were 5.0-1,370 times higher than those in an enzyme cocktail prepared from a mixture of single-gene-expression strains, in each of which a single one of the nine thermophilic enzymes was overproduced. Heat treatment of a crude extract of the multiple-gene-expression cells led to the denaturation of indigenous proteins and one-step preparation of an in vitro synthetic pathway comprising only a limited number of thermotolerant enzymes. Coupling this in vitro pathway with other thermophilic enzymes including the H2O-forming NADH oxidase or the malate/lactate dehydrogenase facilitated one-pot conversion of glucose to pyruvate or lactate, respectively. © 2014 Wiley Periodicals, Inc.
Multiple gene genealogies and phenotypic characters differentiate several novel species of Mycosphaerella and related anamorphs on banana.

Science.gov (United States)

Arzanlou, M; Groenewald, J Z; Fullerton, R A; Abeln, E C A; Carlier, J; Zapater, M-F; Buddenhagen, I W; Viljoen, A; Crous, P W

2008-06-01

Three species of Mycosphaerella, namely M. eumusae, M. fijiensis, and M. musicola are involved in the Sigatoka disease complex of bananas. Besides these three primary pathogens, several additional species of Mycosphaerella or their anamorphs have been described from Musa. However, very little is known about these taxa, and for the majority of these species no culture or DNA is available for study. In the present study, we collected a global set of Mycosphaerella strains from banana, and compared them by means of morphology and a multi-gene nucleotide sequence data set. The phylogeny inferred from the ITS region and the combined data set containing partial gene sequences of the actin gene, the small subunit mitochondrial ribosomal DNA and the histone H3 gene revealed a rich diversity of Mycosphaerella species on Musa. Integration of morphological and molecular data sets confirmed more than 20 species of Mycosphaerella (incl. anamorphs) to occur on banana. This study reconfirmed the previously described presence of Cercospora apii, M. citri and M. thailandica, and also identified Mycosphaerella communis, M. lateralis and Passalora loranthi on this host. Moreover, eight new species identified from Musa are described, namely Dissoconium musae, Mycosphaerella mozambica, Pseudocercospora assamensis, P. indonesiana, P. longispora, Stenella musae, S. musicola, and S. queenslandica.
SuperTRI: A new approach based on branch support analyses of multiple independent data sets for assessing reliability of phylogenetic inferences.

Science.gov (United States)

Ropiquet, Anne; Li, Blaise; Hassanin, Alexandre

2009-09-01

Supermatrix and supertree are two methods for constructing a phylogenetic tree by using multiple data sets. However, these methods are not a panacea, as conflicting signals between data sets can lead to misinterpretation of the evolutionary history of taxa. In particular, the supermatrix approach is expected to be misleading if the species-tree signal is not dominant after the combination of the data sets. Moreover, most current supertree methods suffer from two limitations: (i) they ignore or misinterpret secondary (non-dominant) phylogenetic signals of the different data sets; and (ii) the logical basis of node robustness measures is unclear. To overcome these limitations, we propose a new approach, called SuperTRI, which is based on the branch support analyses of the independent data sets, and where the reliability of the nodes is assessed using three measures: the supertree Bootstrap percentage and two other values calculated from the separate analyses: the mean branch support (mean Bootstrap percentage or mean posterior probability) and the reproducibility index. The SuperTRI approach is tested on a data matrix including seven genes for 82 taxa of the family Bovidae (Mammalia, Ruminantia), and the results are compared to those found with the supermatrix approach. The phylogenetic analyses of the supermatrix and independent data sets were done using four methods of tree reconstruction: Bayesian inference, maximum likelihood, and unweighted and weighted maximum parsimony. The results indicate, firstly, that the SuperTRI approach is less sensitive to the choice among the four phylogenetic methods, secondly, that it is more accurate for interpreting the relationships among taxa, and thirdly, that interesting conclusions on introgression and radiation can be drawn from the comparisons between SuperTRI and supermatrix analyses.
Coverage and characteristics of the Affymetrix GeneChip Human Mapping 100K SNP set.

Directory of Open Access Journals (Sweden)

2006-05-01

Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
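For reference, the familiar two-locus measure mentioned in this abstract can be computed from phased haplotypes as r² = D² / (pA(1 − pA) pB(1 − pB)), with D = pAB − pA·pB. The sketch below implements only this pairwise case, not the multilocus information-content coefficient of the study; the function name and the 0/1 allele coding are assumptions for illustration.

```python
def ld_r_squared(haplotypes):
    """Pairwise LD r^2 from phased two-locus haplotypes.

    haplotypes is a list of (allele_at_locus_1, allele_at_locus_2)
    pairs, each allele coded 0 or 1. Both loci must be polymorphic,
    otherwise the denominator is zero.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq. of allele 1 at locus 1
    p_b = sum(b for _, b in haplotypes) / n            # freq. of allele 1 at locus 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Complete LD: allele 1 at the first locus always travels with allele 1
# at the second locus.
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Linkage equilibrium: all four haplotypes are equally frequent.
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_r_squared(linked), ld_r_squared(unlinked))
```

A marker with r² near 1 to a typed SNP is effectively captured by that SNP, which is the sense in which a chip's coverage of the wider HapMap set can be summarized.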
Equivalent Gene Expression Profiles between Glatopa™ and Copaxone®.

Directory of Open Access Journals (Sweden)

Josephine S D'Alessandro

Glatopa™ is a generic glatiramer acetate recently approved for the treatment of patients with relapsing forms of multiple sclerosis. Gene expression profiling was performed as a means to evaluate equivalence of Glatopa and Copaxone®. Microarray analysis containing 39,429 unique probes across the entire genome was performed in murine glatiramer acetate-responsive Th2-polarized T cells, a test system highly relevant to the biology of glatiramer acetate. A closely related but nonequivalent glatiramoid molecule was used as a control to establish assay sensitivity. Multiple probe-level (Student's t-test) and sample-level (principal component analysis, multidimensional scaling, and hierarchical clustering) statistical analyses were utilized to look for differences in gene expression induced by the test articles. The analyses were conducted across all genes measured, as well as across a subset of genes that were shown to be modulated by Copaxone. The following observations were made across multiple statistical analyses: the expression of numerous genes was significantly changed by treatment with Copaxone when compared against media-only control; gene expression profiles induced by Copaxone and Glatopa were not significantly different; and gene expression profiles induced by Copaxone and the nonequivalent glatiramoid were significantly different, underscoring the sensitivity of the test system and the multiple analysis methods. Comparative analysis was also performed on sets of transcripts relevant to T-cell biology and antigen presentation, among others that are known to be modulated by glatiramer acetate. No statistically significant differences were observed between Copaxone and Glatopa in the expression levels (magnitude and direction) of these glatiramer acetate-regulated genes. In conclusion, multiple methods consistently supported equivalent gene expression profiles between Copaxone and Glatopa.
Pattern-set generation algorithm for the one-dimensional multiple stock sizes cutting stock problem

Science.gov (United States)

Cui, Yaodong; Cui, Yi-Ping; Zhao, Zhigang

2015-09-01

A pattern-set generation algorithm (PSG) for the one-dimensional multiple stock sizes cutting stock problem (1DMSSCSP) is presented. The solution process contains two stages. In the first stage, the PSG solves the residual problems repeatedly to generate the patterns in the pattern set, where each residual problem is solved by the column-generation approach, and each pattern is generated by solving a single large object placement problem. In the second stage, the integer linear programming model of the 1DMSSCSP is solved using a commercial solver, where only the patterns in the pattern set are considered. The computational results of benchmark instances indicate that the PSG outperforms existing heuristic algorithms and rivals the exact algorithm in solution quality.
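The abstract states that each pattern is generated by solving a single large object placement problem. One common way to read that subproblem is as a bounded knapsack: choose how many pieces of each type to cut from one stock length so that the total piece value is maximal. The sketch below is a generic dynamic program under that assumption and is not the authors' PSG. The piece values (for example, dual prices from a column-generation step), the integer lengths, and the function name `best_pattern` are all assumptions made for illustration.

```python
def best_pattern(stock_length, lengths, values, demands):
    """Bounded knapsack over integer lengths.

    Returns (total_value, counts), where counts[i] is the number of
    pieces of type i placed in the pattern (never more than demands[i]).
    """
    n_items = len(lengths)
    # best[c] = best (value, counts) using the item types seen so far at capacity c
    best = [(0.0, (0,) * n_items) for _ in range(stock_length + 1)]
    for i, (length, value, demand) in enumerate(zip(lengths, values, demands)):
        new = list(best)
        for c in range(stock_length + 1):
            for k in range(1, demand + 1):
                if k * length > c:
                    break
                prev_value, prev_counts = best[c - k * length]
                candidate = prev_value + k * value
                if candidate > new[c][0]:
                    counts = list(prev_counts)
                    counts[i] = k  # earlier states never contain item i
                    new[c] = (candidate, tuple(counts))
        best = new
    return best[stock_length]
```

For a stock of length 10 with piece lengths 3, 4 and 5, values 3, 4.5 and 5, and demands 3, 2 and 1, the sketch selects two pieces of length 3 and one of length 4. With several stock sizes, the same routine could be called once per stock length and the most valuable pattern kept.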
Multiple Origins of Mutations in the mdr1 Gene--A Putative Marker of Chloroquine Resistance in P. vivax.

Directory of Open Access Journals (Sweden)

Mette L Schousboe

2015-11-01

Chloroquine combined with primaquine has been the recommended antimalarial treatment of Plasmodium vivax malaria infections for six decades, but the efficacy of this treatment regimen is threatened by chloroquine resistance (CQR). Single nucleotide polymorphisms (SNPs) in the multidrug resistance gene, Pvmdr1, are putative determinants of CQR, but the extent of their emergence at population level remains to be explored. In this study we describe the prevalence of SNPs in the Pvmdr1 gene among samples collected in seven P. vivax endemic countries, and we looked for molecular evidence of drug selection by characterising polymorphism at microsatellite (MS) loci flanking the Pvmdr1 gene. We examined the prevalence of SNPs in the Pvmdr1 gene among 267 samples collected from Pakistan, Afghanistan, Sri Lanka, Nepal, Sudan, São Tomé and Ecuador. We measured diversity in four microsatellite (MS) markers flanking the Pvmdr1 gene to look for evidence of selection on mutant alleles. SNP polymorphism in the Pvmdr1 gene was largely confined to codons T958M, Y976F and F1076L. Only 2.4% of samples were wildtype at all three codons (TYF, n = 5), 13.3% (n = 28) of the samples were single mutant MYF, 63.0% of samples (n = 133) were double mutant MYL, and 21.3% (n = 45) were triple mutant MFL. Clear geographic differences in the prevalence of these Pvmdr1 mutation combinations were observed. Significant linkage disequilibrium (LD) between Pvmdr1 and MS alleles was found in populations sampled in Ecuador, Nepal and Sri Lanka, while significant LD between Pvmdr1 and the combined 4 MS locus haplotype was only seen in Ecuador and Sri Lanka. When combining the 5 loci, high level diversity, measured as expected heterozygosity (He), was seen in the complete sample set (He = 0.99), while He estimates for individual loci ranged from 0.00-0.93. Although Pvmdr1 haplotypes were not consistently associated with specific flanking MS alleles, there was significant differentiation between geographic
Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

KAUST Repository

Permina, Elizaveta A.; Medvedeva, Yulia; Baeck, Pia M.; Hegde, Shubhada R.; Mande, Shekhar C.; Makeev, Vsevolod J.

2013-01-01

interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set
A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

Directory of Open Access Journals (Sweden)

Olson James M

2006-04-01

Background: Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip® that uses multiple oligonucleotide probes (i.e. probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data. Results: We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma (27 metastatic and 42 non-metastatic) samples and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic
Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

Directory of Open Access Journals (Sweden)

Pugalendhi Ganesh Kumar

This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, in this study, we introduced two levels of gene selection, filtering and embedding, to select potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure is implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA showed the smallest number of 16 most relevant genes associated with cancer using a minimal number of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively

The capability set for work - correlates of sustainable employability in workers with multiple sclerosis.

Science.gov (United States)

van Gorp, D A M; van der Klink, J J L; Abma, F I; Jongen, P J; van Lieshout, I; Arnoldus, E P J; Beenakker, E A C; Bos, H M; van Eijk, J J J; Fermont, J; Frequin, S T F M; de Gans, K; Hengstman, G J D; Hupperts, R M M; Mostert, J P; Pop, P H M; Verhagen, W I M; Zemel, D; Heerings, M A P; Reneman, M F; Middelkoop, H A M; Visser, L H; van der Hiele, K

2018-06-01

The aim of this study was to examine whether work capabilities differ between workers with Multiple Sclerosis (MS) and workers from the general population. The second aim was to investigate whether the capability set was related to work and health outcomes. A total of 163 workers with MS from the MS@Work study and 163 workers from the general population were matched for gender, age, educational level and working hours. All participants completed online questionnaires on demographics, health and work functioning. The Capability Set for Work Questionnaire was used to explore whether a set of seven work values is considered valuable (A), is enabled in the work context (B), and can be achieved by the individual (C). When all three criteria are met a work value can be considered part of the individual's 'capability set'. Group differences and relationships with work and health outcomes were examined. Despite lower physical work functioning (U = 4250, p = 0.001), lower work ability (U = 10591, p = 0.006) and worse self-reported health (U = 9091, p ≤ 0.001) workers with MS had a larger capability set (U = 9649, p ≤ 0.001) than the general population. In workers with MS, a larger capability set was associated with better flexible work functioning (r = 0.30), work ability (r = 0.25), self-rated health (r = 0.25); and with less absenteeism (r = - 0.26), presenteeism (r = - 0.31), cognitive/neuropsychiatric impairment (r = - 0.35), depression (r = - 0.43), anxiety (r = - 0.31) and fatigue (r = - 0.34). Workers with MS have a larger capability set than workers from the general population. In workers with MS a larger capability set was associated with better work and health outcomes. This observational study is registered under NL43098.008.12: 'Voorspellers van arbeidsparticipatie bij mensen met relapsing-remitting Multiple Sclerose'. The study is registered at the Dutch CCMO register ( https
The multiple roles of hypothetical gene BPSS1356 in Burkholderia pseudomallei.

Directory of Open Access Journals (Sweden)

Hokchai Yam

Burkholderia pseudomallei is an opportunistic pathogen and the causative agent of melioidosis. It is able to adapt to harsh environments and can live intracellularly in its infected hosts. In this study, identification of transcriptional factors that associate with the β' subunit (RpoC) of RNA polymerase was performed. The N-terminal region of this subunit is known to trigger promoter melting when associated with a sigma factor. A pull-down assay using histidine-tagged B. pseudomallei RpoC N-terminal region as bait showed that a hypothetical protein BPSS1356 was one of the proteins bound. This hypothetical protein is conserved in all B. pseudomallei strains and present only in the Burkholderia genus. A BPSS1356 deletion mutant was generated to investigate its biological function. The mutant strain exhibited reduced biofilm formation and a lower cell density during the stationary phase of growth in LB medium. Electron microscopic analysis revealed that the ΔBPSS1356 mutant cells had a shrunken cytoplasm indicative of cell plasmolysis and a rougher surface when compared to the wild type. An RNA microarray result showed that a total of 63 genes were transcriptionally affected by the BPSS1356 deletion with fold-change values higher than 4. The expression of a group of genes encoding membrane-located transporters was concurrently down-regulated in the ΔBPSS1356 mutant. Amongst the affected genes, the putative ion transportation genes were the most severely suppressed. Deprivation of BPSS1356 also down-regulated the transcription of genes for the arginine deiminase system, glycerol metabolism, type III secretion system cluster 2, cytochrome bd oxidase and arsenic resistance. It is therefore evident that BPSS1356 plays multiple regulatory roles on many genes.
GAP1, a novel selection and counter-selection marker for multiple gene disruptions in Saccharomyces cerevisiae

DEFF Research Database (Denmark)

Regenberg, Birgitte; Hansen, J.

2000-01-01

the GAP1 gene. This is caused by recombination between two Salmonella typhimurium hisG direct repeats embracing GAP1, and will result in a sub-population of gap1 cells. Such cells are selected on a medium containing D-histidine, and may subsequently be used for a second gene disruption. Hence, multiple...... flanked by short (60 bp) stretches of the gene in question. Through homologous recombination, the cassette will integrate into the target gene, which is thus replaced by GAP1, and mutants are selected for on minimal L-citrulline medium. When propagated under non-selective conditions, some cells will lose...... gene disruptions can be made fast, cheaply and easily in a gap1 strain, with two positive selection steps for each disruption. Copyright (C) 2000 John Wiley & Sons, Ltd....
Multiple endocrine neoplasia type 2: achievements and current challenges

Directory of Open Access Journals (Sweden)

Andreas Machens

2012-01-01

Incremental advances in medical technology, such as the development of sensitive hormonal assays for routine clinical care, are the drivers of medical progress. This principle is exemplified by the creation of the concept of multiple endocrine neoplasia type 2, encompassing medullary thyroid cancer, pheochromocytoma, and primary hyperparathyroidism, which did not emerge before the early 1960s. This review sets out to highlight key achievements, such as joint biochemical and DNA-based screening of individuals at risk of developing multiple endocrine neoplasia type 2, before casting a spotlight on current challenges which include: (i) ill-defined upper limits of calcitonin assays for infants and young children, rendering it difficult to implement the biochemical part of the integrated DNA-based/biochemical concept; (ii) our increasingly mobile society in which different service providers are caring for one individual at various stages in the disease process. With familial relationships disintegrating as a result of geographic dispersion, information about the history of the origin family may become sketchy or just unavailable. This is when DNA-based gene tests come into play, confirming or excluding an individual's genetic predisposition to multiple endocrine neoplasia type 2 even before there is any biochemical or clinical evidence of the disease. However, the unrivaled molecular genetic progress in multiple endocrine neoplasia type 2 does not come without a price. Screening may uncover unknown gene sequence variants representing either harmless polymorphisms or pathogenic mutations. In this setting, functional characterization of mutant cells in vitro may generate helpful ancillary evidence with regard to the pathogenicity of gene variants in comparison with established mutations.
Selection and validation of a set of reliable reference genes for quantitative sod gene expression analysis in C. elegans

Directory of Open Access Journals (Sweden)

Vandesompele Jo

2008-01-01

Background: In the nematode Caenorhabditis elegans the conserved Ins/IGF-1 signaling pathway regulates many biological processes including life span, stress response, dauer diapause and metabolism. Detection of differentially expressed genes may contribute to a better understanding of the mechanism by which the Ins/IGF-1 signaling pathway regulates these processes. Appropriate normalization is an essential prerequisite for obtaining accurate and reproducible quantification of gene expression levels. The aim of this study was to establish a reliable set of reference genes for gene expression analysis in C. elegans. Results: Real-time quantitative PCR was used to evaluate the expression stability of 12 candidate reference genes (act-1, ama-1, cdc-42, csq-1, eif-3.C, mdh-1, gpd-2, pmp-3, tba-1, Y45F10D.4, rgs-6 and unc-16) in wild-type, three Ins/IGF-1 pathway mutants, dauers and L3 stage larvae. After geNorm analysis, cdc-42, pmp-3 and Y45F10D.4 showed the most stable expression pattern and were used to normalize the expression levels of five sod genes. Significant differences in mRNA levels were observed for sod-1 and sod-3 in daf-2 relative to wild-type animals, whereas in dauers sod-1, sod-3, sod-4 and sod-5 are differentially expressed relative to third stage larvae. Conclusion: Our findings emphasize the importance of accurate normalization using stably expressed reference genes. The methodology used in this study is generally applicable to reliably quantify gene expression levels in the nematode C. elegans using quantitative PCR.
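The geNorm analysis mentioned above ranks candidate reference genes by a stability value M. As it is usually described, M for a gene is the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples. The sketch below computes M once. It omits the stepwise exclusion of the least stable gene that the full procedure performs, and the input layout and function name are assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Compute a geNorm-style stability value M for each gene.

    expr: dict mapping gene name -> list of relative expression quantities
          (linear scale, strictly positive), with samples in the same order.
    Lower M means more stable expression relative to the other candidates.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            pairwise_variation.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_variation)
    return m_values
```

Two genes whose ratio is constant across samples contribute zero pairwise variation to each other, so a gene that fluctuates independently of the rest ends up with the highest M.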
Phylogenetic reconstruction and DNA barcoding for closely related pine moth species (Dendrolimus) in China with multiple gene markers.

Science.gov (United States)

Dai, Qing-Yan; Gao, Qiang; Wu, Chun-Sheng; Chesters, Douglas; Zhu, Chao-Dong; Zhang, Ai-Bing

2012-01-01

Unlike distinct species, closely related species offer a great challenge for phylogeny reconstruction and species identification with DNA barcoding due to their often overlapping genetic variation. We tested a sibling species group of pine moth pests in China with a standard cytochrome c oxidase subunit I (COI) gene and two alternative internal transcribed spacer (ITS) genes (ITS1 and ITS2). Five different phylogenetic/DNA barcoding analysis methods (Maximum likelihood (ML)/Neighbor-joining (NJ), "best close match" (BCM), Minimum distance (MD), and BP-based method (BP)), representing commonly used methodology (tree-based and non-tree based) in the field, were applied to both single-gene and multiple-gene analyses. Our results demonstrated clear reciprocal species monophyly for three relatively distant related species, Dendrolimus superans, D. houi, D. kikuchii, as recovered by both single and multiple genes while the phylogenetic relationship of three closely related species, D. punctatus, D. tabulaeformis, D. spectabilis, could not be resolved with the traditional tree-building methods. Additionally, we find the standard COI barcode outperforms two nuclear ITS genes, whatever the methods used. On average, the COI barcode achieved a success rate of 94.10-97.40%, while ITS1 and ITS2 obtained a success rate of 64.70-81.60%, indicating ITS genes are less suitable for species identification in this case. We propose the use of an overall success rate of species identification that takes both sequencing success and assignation success into account, since species identification success rates with multiple-gene barcoding system were generally overestimated, especially by tree-based methods, where only successfully sequenced DNA sequences were used to construct a phylogenetic tree. Non-tree based methods, such as MD, BCM, and BP approaches, presented advantages over tree-based methods by reporting the overall success rates with statistical significance. In addition, our
Phylogenetic reconstruction and DNA barcoding for closely related pine moth species (Dendrolimus) in China with multiple gene markers.

Directory of Open Access Journals (Sweden)

Qing-Yan Dai

Unlike distinct species, closely related species offer a great challenge for phylogeny reconstruction and species identification with DNA barcoding due to their often overlapping genetic variation. We tested a sibling species group of pine moth pests in China with a standard cytochrome c oxidase subunit I (COI) gene and two alternative internal transcribed spacer (ITS) genes (ITS1 and ITS2). Five different phylogenetic/DNA barcoding analysis methods (Maximum likelihood (ML)/Neighbor-joining (NJ), "best close match" (BCM), Minimum distance (MD), and BP-based method (BP)), representing commonly used methodology (tree-based and non-tree based) in the field, were applied to both single-gene and multiple-gene analyses. Our results demonstrated clear reciprocal species monophyly for three relatively distant related species, Dendrolimus superans, D. houi, D. kikuchii, as recovered by both single and multiple genes while the phylogenetic relationship of three closely related species, D. punctatus, D. tabulaeformis, D. spectabilis, could not be resolved with the traditional tree-building methods. Additionally, we find the standard COI barcode outperforms two nuclear ITS genes, whatever the methods used. On average, the COI barcode achieved a success rate of 94.10-97.40%, while ITS1 and ITS2 obtained a success rate of 64.70-81.60%, indicating ITS genes are less suitable for species identification in this case. We propose the use of an overall success rate of species identification that takes both sequencing success and assignation success into account, since species identification success rates with multiple-gene barcoding system were generally overestimated, especially by tree-based methods, where only successfully sequenced DNA sequences were used to construct a phylogenetic tree. Non-tree based methods, such as MD, BCM, and BP approaches, presented advantages over tree-based methods by reporting the overall success rates with statistical significance. In
Polyuridylylation and processing of transcripts from multiple gene minicircles in chloroplasts of the dinoflagellate Amphidinium carterae

KAUST Repository

Barbrook, Adrian C.; Dorrell, Richard G.; Burrows, Jennifer; Plenderleith, Lindsey J.; Nisbet, R. Ellen R.; Howe, Christopher J.

2012-01-01

-PCR to study transcription and transcript processing in the chloroplasts of Amphidinium carterae, a model peridinin-containing dinoflagellate. These organisms have a highly unusual chloroplast genome, with genes located on multiple small 'minicircle' elements
Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

LENUS (Irish Health Repository)

2011-10-05

Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo or containing the homeobox DNA-binding domain were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.
Positive Selection of Plasmodium falciparum Parasites With Multiple var2csa-Type PfEMP1 Genes During the Course of Infection in Pregnant Women

Science.gov (United States)

Salanti, Ali; Lavstsen, Thomas; Nielsen, Morten A.; Theander, Thor G.; Leke, Rose G. F.; Lo, Yeung Y.; Bobbili, Naveen; Arnot, David E.; Taylor, Diane W.

2011-01-01

Placental malaria infections are caused by Plasmodium falciparum–infected red blood cells sequestering in the placenta by binding to chondroitin sulfate A, mediated by VAR2CSA, a variant of the PfEMP1 family of adhesion antigens. Recent studies have shown that many P. falciparum genomes have multiple genes coding for different VAR2CSA proteins, and parasites with >1 var2csa gene appear to be more common in pregnant women with placental malaria than in nonpregnant individuals. We present evidence that, in pregnant women, parasites containing multiple var2csa-type genes possess a selective advantage over parasites with a single var2csa gene. Accumulation of parasites with multiple copies of the var2csa gene during the course of pregnancy was also correlated with the development of antibodies involved in blocking VAR2CSA adhesion. The data suggest that multiplicity of var2csa-type genes enables P. falciparum parasites to persist for a longer period of time during placental infections, probably because of their greater capacity for antigenic variation and evasion of variant-specific immune responses. PMID:21592998
EBF factors drive expression of multiple classes of target genes governing neuronal development

Directory of Open Access Journals (Sweden)

Vetter Monica L

2011-04-01

Background: Early B cell factor (EBF) family members are transcription factors known to have important roles in several aspects of vertebrate neurogenesis, including commitment, migration and differentiation. Knowledge of how EBF family members contribute to neurogenesis is limited by a lack of detailed understanding of genes that are transcriptionally regulated by these factors. Results: We performed a microarray screen in Xenopus animal caps to search for targets of EBF transcriptional activity, and identified candidate targets with multiple roles, including transcription factors of several classes. We determined that, among the most upregulated candidate genes with expected neuronal functions, most require EBF activity for some or all of their expression, and most have overlapping expression with ebf genes. We also found that the candidate target genes that had the most strongly overlapping expression patterns with ebf genes were predicted to be direct transcriptional targets of EBF transcriptional activity. Conclusions: The identification of candidate targets that are transcription factor genes, including nscl-1, emx1 and aml1, improves our understanding of how EBF proteins participate in the hierarchy of transcription control during neuronal development, and suggests novel mechanisms by which EBF activity promotes migration and differentiation. Other candidate targets, including pcdh8 and kcnk5, expand our knowledge of the types of terminal differentiated neuronal functions that EBF proteins regulate.
Computing all hybridization networks for multiple binary phylogenetic input trees.

Science.gov (United States)

Albrecht, Benjamin

2015-07-30

The computation of phylogenetic trees on the same set of species that are based on different orthologous genes can lead to incongruent trees. One possible explanation for this behavior is interspecific hybridization events recombining genes of different species. An important approach to analyze such events is the computation of hybridization networks. This work presents the first algorithm computing the hybridization number as well as a set of representative hybridization networks for multiple binary phylogenetic input trees on the same set of taxa. To improve its practical runtime, we show how this algorithm can be parallelized. Moreover, we demonstrate the efficiency of the software Hybroscale, containing an implementation of our algorithm, by comparing it to PIRNv2.0, which is so far the best available software computing the exact hybridization number for multiple binary phylogenetic trees on the same set of taxa. The algorithm is part of the software Hybroscale, which was developed specifically for the investigation of hybridization networks including their computation and visualization. Hybroscale is freely available and runs on all three major operating systems. Our simulation study indicates that our approach is on average 100 times faster than PIRNv2.0. Moreover, we show how Hybroscale improves the interpretation of the reported hybridization networks by adding certain features to its graphical representation.
Candidate gene analysis using imputed genotypes: cell cycle single-nucleotide polymorphisms and ovarian cancer risk

DEFF Research Database (Denmark)

Goode, Ellen L; Fridley, Brooke L; Vierkant, Robert A

2009-01-01

Polymorphisms in genes critical to cell cycle control are outstanding candidates for association with ovarian cancer risk; numerous genes have been interrogated by multiple research groups using differing tagging single-nucleotide polymorphism (SNP) sets. To maximize information gleaned from......, and rs3212891; CDK2 rs2069391, rs2069414, and rs17528736; and CCNE1 rs3218036. These results exemplify the utility of imputation in candidate gene studies and lend evidence to a role of cell cycle genes in ovarian cancer etiology, suggest a reduced set of SNPs to target in additional cases and controls....
A permutation-based multiple testing method for time-course microarray experiments

Directory of Open Access Journals (Sweden)

George Stephen L

2009-10-01

Background: Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or those that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness of fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data are often correlated in complicated ways. An accurate type I error control adjusting for multiple testing requires the joint null distribution of test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of computational ease and their intuitive interpretation. Results: In this paper, we propose a permutation-based multiple testing procedure based on the test statistic used by Storey et al. (2005). We also propose an efficient computation algorithm. Extensive simulations are conducted to investigate the performance of the permutation-based multiple testing procedure. The application of the proposed method is illustrated using the Caenorhabditis elegans dauer developmental data. Conclusion: Our method is computationally efficient and applicable for identifying genes whose expression levels are time-dependent in a single biological group and for identifying the genes for which the time-profile depends on the group in a multi-group setting.
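The point about needing the joint null distribution can be illustrated with a generic single-step maxT permutation adjustment. Permuting whole samples keeps the correlation between genes intact, and each gene's observed statistic is compared with the permutation distribution of the maximum statistic over all genes. The sketch below is not the procedure of the paper: it swaps the spline-based goodness-of-fit statistic for a simple absolute difference of two group means, and the function names and 0/1 group labels are assumptions made for illustration.

```python
import random


def mean_diff_stats(data, labels):
    """Per-gene |mean(group 1) - mean(group 0)|.

    data: list of rows, one per gene, each a list of per-sample values.
    labels: list of 0/1 group labels, one per sample.
    """
    stats = []
    for row in data:
        g1 = [x for x, lab in zip(row, labels) if lab == 1]
        g0 = [x for x, lab in zip(row, labels) if lab == 0]
        stats.append(abs(sum(g1) / len(g1) - sum(g0) / len(g0)))
    return stats


def maxt_adjusted_pvalues(data, labels, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values from sample-label permutations."""
    rng = random.Random(seed)
    observed = mean_diff_stats(data, labels)
    exceed = [0] * len(data)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)  # permute whole samples, so gene-gene correlation is kept
        max_stat = max(mean_diff_stats(data, perm))
        for g, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[g] += 1
    # add-one correction keeps every p-value strictly positive
    return [(count + 1) / (n_perm + 1) for count in exceed]
```

Because every gene is compared against the same permutation maxima, a gene with a larger observed statistic can never receive a larger adjusted p-value than a gene with a smaller one.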
Dynamic evolution of Geranium mitochondrial genomes through multiple horizontal and intracellular gene transfers.

Science.gov (United States)

Park, Seongjun; Grewe, Felix; Zhu, Andan; Ruhlman, Tracey A; Sabir, Jamal; Mower, Jeffrey P; Jansen, Robert K

2015-10-01

The exchange of genetic material between cellular organelles through intracellular gene transfer (IGT) or between species by horizontal gene transfer (HGT) has played an important role in plant mitochondrial genome evolution. The mitochondrial genomes of Geraniaceae display a number of unusual phenomena including highly accelerated rates of synonymous substitutions, extensive gene loss and reduction in RNA editing. Mitochondrial DNA sequences assembled for 17 species of Geranium revealed substantial reduction in gene and intron content relative to the ancestor of the Geranium lineage. Comparative analyses of nuclear transcriptome data suggest that a number of these sequences have been functionally relocated to the nucleus via IGT. Evidence for rampant HGT was detected in several Geranium species containing foreign organellar DNA from diverse eudicots, including many transfers from parasitic plants. One lineage has experienced multiple, independent HGT episodes, many of which occurred within the past 5.5 Myr. Both duplicative and recapture HGT were documented in Geranium lineages. The mitochondrial genome of Geranium brycei contains at least four independent HGT tracts that are absent in its nearest relative. Furthermore, G. brycei mitochondria carry two copies of the cox1 gene that differ in intron content, providing insight into contrasting hypotheses on cox1 intron evolution. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

KAUST Repository

Permina, Elizaveta A.

2013-01-01

Identification of bacterial modulons from series of gene expression measurements on microarrays is a principal problem, especially relevant for inadequately studied but practically important species. Usage of a priori information on regulatory interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set of genes essential for a regulon is used to control modulon updating. Essential genes for a regulon were selected as a subset of regulon genes highly related by different measures to each other. Using Escherichia coli as a model, we studied how modulon identification depends on the data, including the microarray experiments set, the adopted relevance measure and the regulon itself. We have found that results of modulon identification are highly dependent on all parameters studied and thus the resulting modulon varies substantially depending on the identification procedure. Yet, modulons that were identified correctly displayed higher stability during iterations, which allows developing a procedure for reliable modulon identification in the case of less studied species where the known regulatory interactions are sparse. Copyright © 2013 Taylor & Francis.
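The iterative update described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' procedure: similarity is taken to be Pearson correlation with the mean profile of the current members, the threshold of 0.8 is arbitrary, and the "control" by essential genes is modeled as rejecting any update that would drop one of them. The names `pearson` and `grow_modulon` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def grow_modulon(profiles, seed, essential, threshold=0.8, max_iter=20):
    """Iteratively update a seed regulon into a modulon.

    profiles:  dict mapping gene name -> list of expression values
    seed:      initial regulon genes (must be non-empty)
    essential: genes that must remain members (assumed control rule)
    """
    members = set(seed)
    if not members:
        raise ValueError("seed regulon must contain at least one gene")
    n_cond = len(next(iter(profiles.values())))
    for _ in range(max_iter):
        # Mean expression profile of the current members.
        centroid = [
            sum(profiles[g][j] for g in members) / len(members)
            for j in range(n_cond)
        ]
        updated = {
            g for g, p in profiles.items() if pearson(p, centroid) >= threshold
        }
        if not set(essential) <= updated:
            break  # update would lose an essential gene: keep previous members
        if updated == members:
            break  # converged
        members = updated
    return members
```

Running the loop from different seeds, thresholds, or data subsets and comparing the resulting sets is one simple way to probe the stability that the abstract reports for correctly identified modulons.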
When Is Hub Gene Selection Better than Standard Meta-Analysis?

Science.gov (United States)

Langfelder, Peter; Mischel, Paul S.; Horvath, Steve

2013-01-01

Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) Finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as good as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques applied to
When is hub gene selection better than standard meta-analysis?

Directory of Open Access Journals (Sweden)

Peter Langfelder

Full Text Available Since hub nodes have been found to play important roles in many networks, highly connected hub genes are expected to play an important role in biology as well. However, the empirical evidence remains ambiguous. An open question is whether (or when) hub gene selection leads to more meaningful gene lists than a standard statistical analysis based on significance testing when analyzing genomic data sets (e.g., gene expression or DNA methylation data). Here we address this question for the special case when multiple genomic data sets are available. This is of great practical importance since for many research questions multiple data sets are publicly available. In this case, the data analyst can decide between a standard statistical approach (e.g., based on meta-analysis) and a co-expression network analysis approach that selects intramodular hubs in consensus modules. We assess the performance of these two types of approaches according to two criteria. The first criterion evaluates the biological insights gained and is relevant in basic research. The second criterion evaluates the validation success (reproducibility) in independent data sets and often applies in clinical diagnostic or prognostic applications. We compare meta-analysis with consensus network analysis based on weighted correlation network analysis (WGCNA) in three comprehensive and unbiased empirical studies: (1) Finding genes predictive of lung cancer survival, (2) finding methylation markers related to age, and (3) finding mouse genes related to total cholesterol. The results demonstrate that intramodular hub gene status with respect to consensus modules is more useful than a meta-analysis p-value when identifying biologically meaningful gene lists (reflecting criterion 1). However, standard meta-analysis methods perform as good as (if not better than) a consensus network approach in terms of validation success (criterion 2). The article also reports a comparison of meta-analysis techniques
On the Relationship Between Confidence Sets and Exchangeable Weights in Multiple Linear Regression.

Science.gov (United States)

Pek, Jolynn; Chalmers, R Philip; Monette, Georges

2016-01-01

When statistical models are employed to provide a parsimonious description of empirical relationships, the extent to which strong conclusions can be drawn rests on quantifying the uncertainty in parameter estimates. In multiple linear regression (MLR), regression weights carry two kinds of uncertainty represented by confidence sets (CSs) and exchangeable weights (EWs). Confidence sets quantify uncertainty in estimation whereas the set of EWs quantifies uncertainty in the substantive interpretation of regression weights. As CSs and EWs share certain commonalities, we clarify the relationship between these two kinds of uncertainty about regression weights. We introduce a general framework describing how CSs and the set of EWs for regression weights are estimated from the likelihood-based and Wald-type approach, and establish the analytical relationship between CSs and sets of EWs. With empirical examples on posttraumatic growth of caregivers (Cadell et al., 2014; Schneider, Steele, Cadell & Hemsworth, 2011) and on graduate grade point average (Kuncel, Hezlett & Ones, 2001), we illustrate the usefulness of CSs and EWs for drawing strong scientific conclusions. We discuss the importance of considering both CSs and EWs as part of the scientific process, and provide an Online Appendix with R code for estimating Wald-type CSs and EWs for k regression weights.

Prosecutor: parameter-free inference of gene function for prokaryotes using DNA microarray data, genomic context and multiple gene annotation sources

Directory of Open Access Journals (Sweden)

van Hijum Sacha AFT

2008-10-01

Full Text Available Abstract Background Despite a plethora of functional genomic efforts, the function of many genes in sequenced genomes remains unknown. The increasing amount of microarray data for many species allows employing the guilt-by-association principle to predict function on a large scale: genes exhibiting similar expression patterns are more likely to participate in shared biological processes. Results We developed Prosecutor, an application that enables researchers to rapidly infer gene function based on available gene expression data and functional annotations. Our parameter-free functional prediction method uses a sensitive algorithm to achieve a high association rate of linking genes with unknown function to annotated genes. Furthermore, Prosecutor utilizes additional biological information such as genomic context and known regulatory mechanisms that are specific for prokaryotes. We analyzed publicly available transcriptome data sets and used literature sources to validate putative functions suggested by Prosecutor. We supply the complete results of our analysis for 11 prokaryotic organisms on a dedicated website. Conclusion The Prosecutor software and supplementary datasets available at http://www.prosecutor.nl allow researchers working on any of the analyzed organisms to quickly identify the putative functions of their genes of interest. A de novo analysis allows new organisms to be studied.
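The guilt-by-association principle mentioned above can be made concrete with a toy sketch. It is not Prosecutor's algorithm, which the abstract describes as parameter-free and which also uses genomic context and regulatory information. Here, as a simplifying assumption, an unannotated gene receives the most common annotation among its `top_k` most correlated annotated genes; `top_k`, `pearson`, and `predict_function` are hypothetical names introduced only for this example.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def predict_function(query_profile, annotated, top_k=3):
    """Guess a function label for an unannotated gene.

    query_profile: expression values of the gene of unknown function
    annotated:     dict mapping gene name -> (profile, function label)
    top_k:         number of most similar annotated genes that vote
                   (an illustrative parameter, not part of Prosecutor)
    """
    ranked = sorted(
        annotated.items(),
        key=lambda item: pearson(query_profile, item[1][0]),
        reverse=True,
    )
    votes = [label for _, (_, label) in ranked[:top_k]]
    return Counter(votes).most_common(1)[0][0]
```

A practical tool would need a significance assessment for each association rather than a fixed vote count, which is the kind of choice the abstract's parameter-free method is designed to avoid.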
Interactions between SNPs affecting inflammatory response genes are associated with multiple myeloma disease risk and survival

DEFF Research Database (Denmark)

Nielsen, Kaspar René; Rodrigo-Domingo, Maria; Steffensen, Rudi

2017-01-01

The origin of multiple myeloma depends on interactions with stromal cells in the course of normal B-cell differentiation and evolution of immunity. The concept of the present study is that genes involved in MM pathogenesis, such as immune response genes, can be identified by screening for single......3L1 gene promoters. The occurrence of single polymorphisms, haplotypes and SNP-SNP interactions were statistically analyzed for association with disease risk and outcome following high-dose therapy. Identified genes that carried SNPs or haplotypes that were identified as risk or prognostic factors......= .005). The 'risk genes' were analyzed for expression in normal B-cell subsets (N = 6) from seven healthy donors and we found TNFA and IL-6 expressed both in naïve and in memory B cells when compared to preBI, II, immature and plasma cells. The 'prognosis genes' CHI3L1, IL-6 and IL-10 were differential...
The evolution of multiple isotypic IgM heavy chain genes in the shark.

Science.gov (United States)

Lee, Victor; Huang, Jing Li; Lui, Ming Fai; Malecek, Karolina; Ohta, Yuko; Mooers, Arne; Hsu, Ellen

2008-06-01

The IgM H chain gene organization of cartilaginous fishes consists of 15-200 miniloci, each with a few gene segments (V(H)-D1-D2-J(H)) and one C gene. This is a gene arrangement ancestral to the complex IgH locus that exists in all other vertebrate classes. To understand the molecular evolution of this system, we studied the nurse shark, which has relatively fewer loci, and characterized the IgH isotypes for organization, functionality, and the somatic diversification mechanisms that act upon them. Gene numbers differ slightly between individuals (approximately 15), but five active IgM subclasses are always present. Each gene undergoes rearrangement that is strictly confined within the minilocus; in B cells there is no interaction between adjacent loci located ≥120 kb apart. Without combinatorial events, the shark IgM H chain repertoire is based on junctional diversity and, subsequently, somatic hypermutation. We suggest that the significant contribution by junctional diversification reflects the selected novelty introduced by RAG in the early vertebrate ancestor, whereas combinatorial diversity coevolved with the complex translocon organization. Moreover, unlike other cartilaginous fishes, there are no germline-joined VDJ at any nurse shark mu locus, and we suggest that such genes, when functional, are species-specific and may have specialized roles. With an entire complement of IgM genes available for the first time, phylogenetic analyses were performed to examine how the multiple Ig loci evolved. We found that all domains changed at comparable rates, but V(H) appears to be under strong positive selection for increased amino acid sequence diversity, and surprisingly, so does Cμ2.
Frequent expression loss of Inter-alpha-trypsin inhibitor heavy chain (ITIH) genes in multiple human solid tumors: A systematic expression analysis

Directory of Open Access Journals (Sweden)

Werbowetski-Ogilvie Tamra

2008-01-01

Full Text Available Background: The inter-alpha-trypsin inhibitors (ITI) are a family of plasma protease inhibitors, assembled from a light chain – bikunin, encoded by AMBP – and five homologous heavy chains (encoded by ITIH1, ITIH2, ITIH3, ITIH4, and ITIH5), contributing to extracellular matrix stability by covalent linkage to hyaluronan. So far, ITIH molecules have been shown to play a particularly important role in inflammation and carcinogenesis. Methods: We systematically investigated differential gene expression of the ITIH gene family, as well as AMBP and the interacting partner TNFAIP6, in 13 different human tumor entities (of breast, endometrium, ovary, cervix, stomach, small intestine, colon, rectum, lung, thyroid, prostate, kidney, and pancreas) using cDNA dot blot analysis (Cancer Profiling Array, CPA), semiquantitative RT-PCR and immunohistochemistry. Results: We found that ITIH genes are clearly downregulated in multiple human solid tumors, including breast, colon and lung cancer. Thus, ITIH genes may represent a family of putative tumor suppressor genes that should be analyzed in greater detail in the future. For an initial detailed analysis we chose ITIH2 expression in human breast cancer. Loss of ITIH2 expression in 70% of cases (n = 50, CPA) could be confirmed by real-time PCR in an additional set of breast cancers (n = 36). Next we studied ITIH2 expression on the protein level by analyzing a comprehensive tissue micro array including 185 invasive breast cancer specimens. We found a strong correlation (p …). Conclusion: Altogether, this is the first systematic analysis on the differential expression of ITIH genes in human cancer, showing frequent downregulation that may be associated with initiation and/or progression of these malignancies.
Combining Gene Signatures Improves Prediction of Breast Cancer Survival

Science.gov (United States)

Zhao, Xi; Naume, Bjørn; Langerød, Anita; Frigessi, Arnoldo; Kristensen, Vessela N.; Børresen-Dale, Anne-Lise; Lingjærde, Ole Christian

2011-01-01

Background Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study. Principal Findings To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction. Conclusion Combining the predictive strength of multiple gene signatures improves prediction of breast
Combining gene signatures improves prediction of breast cancer survival.

Directory of Open Access Journals (Sweden)

Xi Zhao

Full Text Available BACKGROUND: Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study. PRINCIPAL FINDINGS: To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction. CONCLUSION: Combining the predictive strength of multiple gene signatures improves
Methods for monitoring multiple gene expression

Energy Technology Data Exchange (ETDEWEB)

Berka, Randy [Davis, CA; Bachkirova, Elena [Davis, CA; Rey, Michael [Davis, CA

2012-05-01

The present invention relates to methods for monitoring differential expression of a plurality of genes in a first filamentous fungal cell relative to expression of the same genes in one or more second filamentous fungal cells using microarrays containing Trichoderma reesei ESTs or SSH clones, or a combination thereof. The present invention also relates to computer readable media and substrates containing such array features for monitoring expression of a plurality of genes in filamentous fungal cells.
Methods for monitoring multiple gene expression

Energy Technology Data Exchange (ETDEWEB)

Berka, Randy; Bachkirova, Elena; Rey, Michael

2013-10-01

The present invention relates to methods for monitoring differential expression of a plurality of genes in a first filamentous fungal cell relative to expression of the same genes in one or more second filamentous fungal cells using microarrays containing Trichoderma reesei ESTs or SSH clones, or a combination thereof. The present invention also relates to computer readable media and substrates containing such array features for monitoring expression of a plurality of genes in filamentous fungal cells.
Multiple ETS family proteins regulate PF4 gene expression by binding to the same ETS binding site.

Directory of Open Access Journals (Sweden)

Yoshiaki Okada

Full Text Available In previous studies on the mechanism underlying megakaryocyte-specific gene expression, several ETS motifs were found in each megakaryocyte-specific gene promoter. Although these studies suggested that several ETS family proteins regulate megakaryocyte-specific gene expression, only a few ETS family proteins have been identified. Platelet factor 4 (PF4) is a megakaryocyte-specific gene and its promoter includes multiple ETS motifs. We had previously shown that ETS-1 binds to an ETS motif in the PF4 promoter. However, the functions of the other ETS motifs are still unclear. The goal of this study was to investigate a novel functional ETS motif in the PF4 promoter and identify proteins binding to the motif. In electrophoretic mobility shift assays and a chromatin immunoprecipitation assay, FLI-1, ELF-1, and GABP bound to the -51 ETS site. Expression of FLI-1, ELF-1, and GABP activated the PF4 promoter in HepG2 cells. Mutation of the -51 ETS site attenuated FLI-1-, ELF-1-, and GABP-mediated transactivation of the promoter. siRNA analysis demonstrated that FLI-1, ELF-1, and GABP regulate PF4 gene expression in HEL cells. Among these three proteins, only FLI-1 synergistically activated the promoter with GATA-1. In addition, only FLI-1 expression was increased during megakaryocytic differentiation. Finally, the importance of the -51 ETS site for the activation of the PF4 promoter during physiological megakaryocytic differentiation was confirmed by a novel reporter gene assay using an in vitro ES cell differentiation system. Together, these data suggest that FLI-1, ELF-1, and GABP regulate PF4 gene expression through the -51 ETS site in megakaryocytes and implicate the differentiation stage-specific regulation of PF4 gene expression by multiple ETS factors.
Rare germline alterations in cancer-related genes associated with the risk of multiple primary tumor development

DEFF Research Database (Denmark)

Villacis, Rolando A. R.; Basso, Tatiane R; Canto, Luisa M

2017-01-01

Multiple primary tumors (MPT) have been described in carriers of inherited cancer predisposition genes. However, the genetic etiology of a large proportion of MPT cases remains unclear. We reviewed 267 patients with hereditary cancer predisposition syndromes (HCPS) that underwent genetic counseli...
Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

Directory of Open Access Journals (Sweden)

Gaora Peadar Ó

2010-10-01

Full Text Available Background: Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Results: Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p …). Conclusion: Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of
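Fisher's exact test, as used for pathway over-representation, reduces to a hypergeometric tail probability. The sketch below shows that generic one-sided calculation only; how the authors build their contingency tables and combine the test with canonical correlation analysis is not stated in the abstract, so the function name `enrichment_pvalue` and its argument layout are assumptions made for illustration.

```python
from math import comb


def enrichment_pvalue(n_universe, n_set, n_changed, n_overlap):
    """One-sided Fisher's exact (hypergeometric upper-tail) p-value.

    Probability of observing at least n_overlap members of a gene set of
    size n_set among n_changed genes drawn without replacement from a
    universe of n_universe genes.
    """
    total = comb(n_universe, n_changed)
    upper = min(n_set, n_changed)
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For example, `enrichment_pvalue(20000, 50, 1270, 10)` would ask how surprising it is to find 10 of a pathway's 50 genes among 1270 changed genes out of a hypothetical 20000 measured; only the 1270 figure comes from the abstract, and the other numbers are placeholders.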
GESearch: An Interactive GUI Tool for Identifying Gene Expression Signature

Directory of Open Access Journals (Sweden)

Ning Ye

2015-01-01

Full Text Available The huge amount of gene expression data generated by microarray and next-generation sequencing technologies presents challenges for extracting biological meaning. When searching for coexpressed genes, the data mining process is largely affected by the selection of algorithms. Thus, it is highly desirable to provide multiple algorithm options in a user-friendly analytical toolkit to explore gene expression signatures. For this purpose, we developed GESearch, an interactive graphical user interface (GUI) toolkit, which is written in MATLAB and supports a variety of gene expression data files. This analytical toolkit provides four models, including the mean, the regression, the delegate, and the ensemble models, to identify coexpressed genes, and enables users to filter data and to select gene expression patterns by browsing the display window or by importing knowledge-based genes. Subsequently, the utility of this analytical toolkit is demonstrated by analyzing two real-life microarray datasets from cell-cycle experiments. Overall, we have developed an interactive GUI toolkit that allows for choosing among multiple algorithms for analyzing gene expression signatures.
Identification and Validation of a New Set of Five Genes for Prediction of Risk in Early Breast Cancer

Directory of Open Access Journals (Sweden)

Giorgio Mustacchi

2013-05-01

Full Text Available Molecular tests predicting the outcome of breast cancer patients based on gene expression levels can be used to assist in making treatment decisions after consideration of conventional markers. In this study we identified a subset of 20 mRNAs differentially regulated in breast cancer by analyzing several publicly available array gene expression data sets using the R/Bioconductor package. Using RTqPCR we evaluated 261 consecutive invasive breast cancer cases, not selected for age, adjuvant treatment, nodal and estrogen receptor status, from paraffin embedded sections. The biological samples dataset was split into a training set (137 cases) and a validation set (124 cases). The gene signature was developed on the training set and a multivariate stepwise Cox analysis selected five genes independently associated with DFS: FGF18 (HR = 1.13, p = 0.05), BCL2 (HR = 0.57, p = 0.001), PRC1 (HR = 1.51, p = 0.001), MMP9 (HR = 1.11, p = 0.08), SERF1A (HR = 0.83, p = 0.007). These five genes were combined into a linear score (signature) weighted according to the coefficients of the Cox model, as: 0.125 × FGF18 − 0.560 × BCL2 + 0.409 × PRC1 + 0.104 × MMP9 − 0.188 × SERF1A (HR = 2.7, 95% CI = 1.9–4.0, p < 0.001). The signature was then evaluated on the validation set, assessing the discrimination ability by a Kaplan Meier analysis using the same cut-offs classifying patients at low, intermediate or high risk of disease relapse as defined on the training set (p < 0.001). Our signature, after further clinical validation, could be proposed as a prognostic signature for disease free survival in breast cancer patients where the indication for adjuvant chemotherapy added to endocrine treatment is uncertain.
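The linear score quoted above is simple to compute once expression values are available. The sketch below uses only the five coefficients given in the abstract; the function name `signature_score`, the dictionary input format, and the assumption that values are on the same normalized scale used to fit the Cox model are illustrative. The risk-group cut-offs are not reported in the abstract and are therefore not included.

```python
# Cox-model coefficients as quoted in the abstract.
COEFFICIENTS = {
    "FGF18": 0.125,
    "BCL2": -0.560,
    "PRC1": 0.409,
    "MMP9": 0.104,
    "SERF1A": -0.188,
}


def signature_score(expression):
    """Weighted sum of the five genes' expression values.

    expression: dict mapping gene symbol -> normalized expression value
                (the normalization scheme is an assumption; the abstract
                does not specify it).
    """
    missing = sorted(set(COEFFICIENTS) - set(expression))
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())
```

Positive coefficients (FGF18, PRC1, MMP9) raise the score and negative ones (BCL2, SERF1A) lower it, which matches the direction of the hazard ratios listed for the individual genes.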
Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

Science.gov (United States)

Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

2009-01-01

Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494
Genetic diversity and population structure of Lantana camara in India indicates multiple introductions and gene flow.

Science.gov (United States)

Ray, A; Quader, S

2014-05-01

Lantana camara is a highly invasive plant, which has spread over 60 countries and island groups of Asia, Africa and Australia. In India, it was introduced in the early nineteenth century, since when it has expanded and gradually established itself in almost every available ecosystem. We investigated the genetic diversity and population structure of this plant in India in order to understand its introduction, subsequent range expansion and gene flow. A total of 179 individuals were sequenced at three chloroplast loci and 218 individuals were genotyped for six nuclear microsatellites. Both chloroplasts (nine haplotypes) and microsatellites (83 alleles) showed high genetic diversity. Besides, each type of marker confirmed the presence of private polymorphism. We uncovered low to medium population structure in both markers, and found a faint signal of isolation by distance with microsatellites. Bayesian clustering analyses revealed multiple divergent genetic clusters. Taken together, these findings (i.e. high genetic diversity with private alleles and multiple genetic clusters) suggest that Lantana was introduced multiple times and gradually underwent spatial expansion with recurrent gene flow. © 2013 German Botanical Society and The Royal Botanical Society of the Netherlands.
Overexpression of multiple detoxification genes in deltamethrin resistant Laodelphax striatellus (Hemiptera: Delphacidae) in China.

Directory of Open Access Journals (Sweden)

Lu Xu

BACKGROUND: The small brown planthopper (SBPH), Laodelphax striatellus (Fallén), is one of the major rice pests in Asia and has developed resistance to multiple classes of insecticides. Understanding resistance mechanisms is essential to the management of this pest. Biochemical and molecular assays were performed in this study to systematically characterize deltamethrin resistance mechanisms with laboratory-selected resistant and susceptible strains of SBPH. METHODOLOGY/PRINCIPAL FINDINGS: Deltamethrin resistant strains of SBPH (JH-del) were derived from a field population by continuous selection (up to 30 generations) in the laboratory, while a susceptible strain (JHS) was obtained from the same population by removing insecticide pressure for 30 generations. The role of detoxification enzymes in the resistance was investigated using synergism and enzyme activity assays with strains of different resistance levels. Furthermore, 71 cytochrome P450, 93 esterase and 12 glutathione-S-transferase cDNAs were cloned based on transcriptome data of a field-collected population. Semi-quantitative RT-PCR screening analysis of the 176 identified detoxification genes demonstrated that multiple P450 and esterase genes were overexpressed (>2-fold) in JH-del strains (G4 and G30) when compared to JHS, and the results of quantitative PCR coincided with the semi-quantitative RT-PCR results. Target mutation at the IIS3-IIS6 regions encoded by the voltage-gated sodium channel gene was ruled out as conferring the observed resistance. CONCLUSION/SIGNIFICANCE: As the first attempt to discover genes potentially involved in SBPH pyrethroid resistance, this study putatively identified several candidate genes of detoxification enzymes that were significantly overexpressed in the resistant strain, which matched the synergism and enzyme activity testing. The biochemical and molecular evidence suggests that the high level pyrethroid resistance in L. striatellus could be due to
Genome-Wide Detection and Analysis of Multifunctional Genes

Science.gov (United States)

Pritykin, Yuri; Ghersi, Dario; Singh, Mona

2015-01-01

Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655
Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

Directory of Open Access Journals (Sweden)

Edberg Jeffrey C

2010-03-01

Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American individuals, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
Analysis of Multiple Genomic Sequence Alignments: A Web Resource, Online Tools, and Lessons Learned From Analysis of Mammalian SCL Loci

Science.gov (United States)

Chapman, Michael A.; Donaldson, Ian J.; Gilbert, James; Grafham, Darren; Rogers, Jane; Green, Anthony R.; Göttgens, Berthold

2004-01-01

Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments. PMID:14718377
Bridging cancer biology with the clinic: relative expression of a GRHL2-mediated gene-set pair predicts breast cancer metastasis.

Directory of Open Access Journals (Sweden)

Xinan Yang

Identification and characterization of crucial gene target(s) that will allow focused therapeutics development remains a challenge. We have interrogated the putative therapeutic targets associated with the transcription factor Grainy head-like 2 (GRHL2), a critical epithelial regulatory factor. We demonstrate the possibility to define the molecular functions of critical genes in terms of their personalized expression profiles, allowing appropriate functional conclusions to be derived. A novel methodology, relative expression analysis with gene-set pairs (RXA-GSP), is designed to explore the potential clinical utility of cancer-biology discovery. Observing that Grhl2-overexpression leads to increased metastatic potential in vitro, we established a model assuming Grhl2-induced or -inhibited genes confer poor or favorable prognosis respectively for cancer metastasis. Training on public gene expression profiles of 995 breast cancer patients, this method prioritized one gene-set pair (GRHL2, CDH2, FN1, CITED2, MKI67 versus CTNNB1 and CTNNA3) from all 2717 possible gene-set pairs (GSPs). The identified GSP significantly dichotomized 295 independent patients for metastasis-free survival (log-rank tested p = 0.002; severe empirical p = 0.035). It also showed evidence of clinical prognostication in another independent 388 patients collected from three studies (log-rank tested p = 3.3e-6). This GSP is independent of most traditional prognostic indicators, and is only significantly associated with the histological grade of breast cancer (p = 0.0017), a GRHL2-associated clinical character (p = 6.8e-6, Spearman correlation), suggesting that this GSP is reflective of GRHL2-mediated events. Furthermore, a literature review indicates the therapeutic potential of the identified genes. This research demonstrates a novel strategy to integrate both biological experiments and clinical gene expression profiles for extracting and elucidating the genomic

Univariate and multiple linear regression analyses for 23 single nucleotide polymorphisms in 14 genes predisposing to chronic glomerular diseases and IgA nephropathy in Han Chinese.

Science.gov (United States)

Wang, Hui; Sui, Weiguo; Xue, Wen; Wu, Junyong; Chen, Jiejing; Dai, Yong

2014-09-01

Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possible reason for the failure to replicate some single-locus results is that the underlying genetics of IgAN is based on multiple genes with minor effects. To learn the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the 23 SNP genotypes of 21 Han males were detected and analyzed with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. Analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.
Multiple mutualist effects on genomewide expression in the tripartite association between Medicago truncatula, nitrogen-fixing bacteria and mycorrhizal fungi.

Science.gov (United States)

Afkhami, Michelle E; Stinchcombe, John R

2016-10-01

While all species interact with multiple mutualists, the fitness consequences and molecular mechanisms underlying these interactions remain largely unknown. We combined factorial ecological experiments with genomewide expression analyses to examine the phenotypic and transcriptomic responses of model legume Medicago truncatula to rhizobia and mycorrhizal fungi. We found synergistic effects of these mutualists on plant performance and examined unique features of plant gene expression responses to multiple mutualists. There were genomewide signatures of mutualists and multiple mutualists on expression, with partners often affecting unique sets of genes. Mycorrhizal fungi had stronger effects on plant expression than rhizobia, with 70% of differentially expressed genes affected by fungi. Fungal and bacterial mutualists had joint effects on 10% of differentially expressed genes, including unexpected, nonadditive effects on some genes with important functions such as nutrient metabolism. For a subset of genes, interacting with multiple mutualists even led to reversals in the direction of expression (shifts from up to downregulation) compared to interacting with single mutualists. Rhizobia also affected the expression of several mycorrhizal genes, including those involved in nutrient transfer to host plants, indicating that partner species can also impact each other's molecular phenotypes. Collectively, these data illustrate the diverse molecular mechanisms and transcriptional responses associated with the synergistic benefits of multiple mutualists. © 2016 John Wiley & Sons Ltd.
Analysis of gene expression profiles of soft tissue sarcoma using a combination of knowledge-based filtering with integration of multiple statistics.

Directory of Open Access Journals (Sweden)

Anna Takahashi

The diagnosis and treatment of soft tissue sarcomas (STS) have been difficult. Of the diverse histological subtypes, undifferentiated pleomorphic sarcoma (UPS) is particularly difficult to diagnose accurately, and its classification per se is still controversial. Recent advances in genomic technologies provide an excellent way to address such problems. However, it is often difficult, if not impossible, to identify definitive disease-associated genes using genome-wide analysis alone, primarily because of multiple testing problems. In the present study, we analyzed microarray data from 88 STS patients using a combination method that used knowledge-based filtering and a simulation based on the integration of multiple statistics to reduce multiple testing problems. We identified 25 genes, including hypoxia-related genes (e.g., MIF, SCD1, P4HA1, ENO1, and STAT1) and cell cycle- and DNA repair-related genes (e.g., TACC3, PRDX1, PRKDC, and H2AFY). These genes showed significant differential expression among histological subtypes, including UPS, and showed associations with overall survival. STAT1 showed a strong association with overall survival in UPS patients (logrank p = 1.84 × 10^-6 and adjusted p value 2.99 × 10^-3 after the permutation test). According to the literature, the 25 genes selected are useful not only as markers of differential diagnosis but also as prognostic/predictive markers and/or therapeutic targets for STS. Our combination method can identify genes that are potential prognostic/predictive factors and/or therapeutic targets in STS and possibly in other cancers. These disease-associated genes deserve further preclinical and clinical validation.
Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives

International Nuclear Information System (INIS)

Warmflash, Aryeh; Siggia, Eric D; Francois, Paul

2012-01-01

The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input–output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.
Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives.

Science.gov (United States)

Warmflash, Aryeh; Francois, Paul; Siggia, Eric D

2012-10-01

The computational evolution of gene networks functions like a forward genetic screen to generate, without preconceptions, all networks that can be assembled from a defined list of parts to implement a given function. Frequently networks are subject to multiple design criteria that cannot all be optimized simultaneously. To explore how these tradeoffs interact with evolution, we implement Pareto optimization in the context of gene network evolution. In response to a temporal pulse of a signal, we evolve networks whose output turns on slowly after the pulse begins, and shuts down rapidly when the pulse terminates. The best performing networks under our conditions do not fall into categories such as feed forward and negative feedback that also encode the input-output relation we used for selection. Pareto evolution can more efficiently search the space of networks than optimization based on a single ad hoc combination of the design criteria.
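The core idea of Pareto optimization mentioned in this abstract, keeping every candidate that is not beaten on all criteria at once instead of ranking by one combined score, can be sketched as follows. This is a generic non-dominated filter and not the authors' evolutionary algorithm; the two objectives in the usage note (a cost for turning on too fast and a cost for shutting off too slowly) are hypothetical stand-ins for the design criteria described above, both treated as quantities to minimize.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization):
    `a` is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the non-dominated subset of `points`, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, with hypothetical cost pairs `[(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]`, `pareto_front` keeps `(1, 5)`, `(2, 2)` and `(5, 1)`: each of these is best on some tradeoff, whereas `(3, 3)` and `(4, 4)` are both dominated by `(2, 2)`.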
Application of Multiple-Population Genetic Algorithm in Optimizing the Train-Set Circulation Plan Problem

Directory of Open Access Journals (Sweden)

Yu Zhou

2017-01-01

The train-set circulation plan problem (TCPP) belongs to the class of rolling stock scheduling (RSS) problems and is similar to the aircraft routing problem (ARP) in airline operations and the vehicle routing problem (VRP) in the logistics field. However, TCPP involves additional complexity due to the maintenance constraint of train-sets: train-sets must conduct maintenance tasks after running for a certain time and distance. The TCPP is nondeterministic polynomial hard (NP-hard). There is no available algorithm that can obtain the optimal global solution, and many factors such as the utilization mode and the maintenance mode impact the solution of the TCPP. This paper proposes a train-set circulation optimization model to minimize the total connection time and maintenance costs and describes the design of an efficient multiple-population genetic algorithm (MPGA) to solve this model. A realistic high-speed railway (HSR) case is selected to verify our model and algorithm, and then a comparison of different algorithms is carried out. Furthermore, a new maintenance mode is proposed, and related implementation requirements are discussed.
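The general shape of a multiple-population genetic algorithm can be sketched on a toy problem. The abstract does not describe the paper's encoding, operators or parameters, so everything below is an assumption for illustration: the bit-string objective (counting 1-bits) merely stands in for a real circulation-plan cost, and the ring migration scheme, the per-population mutation rates and all function names are hypothetical choices.

```python
import random


def fitness(bits):
    """Toy objective: number of 1-bits (a stand-in for a real plan cost)."""
    return sum(bits)


def evolve(population, rng, mutation_rate):
    """One generation: elitism, tournament selection, one-point crossover,
    and bit-flip mutation."""
    def pick():
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    children = [max(population, key=fitness)[:]]  # keep the best unchanged
    while len(children) < len(population):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        child = [1 - g if rng.random() < mutation_rate else g for g in child]
        children.append(child)
    return children


def mpga(n_pops=4, pop_size=20, n_bits=30, generations=60,
         migrate_every=5, seed=0):
    """Evolve several subpopulations in parallel with periodic migration."""
    rng = random.Random(seed)
    pops = [[[rng.randint(0, 1) for _ in range(n_bits)]
             for _ in range(pop_size)] for _ in range(n_pops)]
    # Each subpopulation uses its own mutation rate to diversify the search.
    rates = [0.01 * (i + 1) for i in range(n_pops)]
    for gen in range(1, generations + 1):
        pops = [evolve(p, rng, r) for p, r in zip(pops, rates)]
        if gen % migrate_every == 0:
            # Ring migration: the best of population i replaces the worst
            # individual of population i + 1.
            bests = [max(p, key=fitness)[:] for p in pops]
            for i, best in enumerate(bests):
                target = pops[(i + 1) % n_pops]
                worst = min(range(len(target)),
                            key=lambda j: fitness(target[j]))
                target[worst] = best
    return max((ind for p in pops for ind in p), key=fitness)
```

Running several subpopulations with different settings and exchanging their best individuals is one common way to reduce premature convergence compared with a single population; a real TCPP solver would additionally need an encoding and repair step that respects the maintenance constraints.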
PRKCA and multiple sclerosis: association in two independent populations.

Directory of Open Access Journals (Sweden)

Janna Saarela

2006-03-01

Multiple sclerosis (MS) is a chronic disease of the central nervous system responsible for a large portion of neurological disabilities in young adults. Similar to what occurs in numerous complex diseases, both unknown environmental factors and genetic predisposition are required to generate MS. We ascertained a set of 63 Finnish MS families, originating from a high-risk region of the country, to identify a susceptibility gene within the previously established 3.4-Mb region on 17q24. Initial single nucleotide polymorphism (SNP)-based association implicated the PRKCA (protein kinase C alpha) gene, and this association was replicated in an independent set of 148 Finnish MS families (p = 0.0004; remaining significant after correction for multiple testing). Further, a dense set of 211 SNPs evenly covering the PRKCA gene and the flanking regions was selected from the dbSNP database and analyzed in two large, independent MS cohorts: 211 Finnish and 554 Canadian MS families. A multipoint SNP analysis indicated linkage to PRKCA and its telomeric flanking region in both populations, and SNP haplotype and genotype combination analyses revealed an allelic variant of PRKCA, which covers the region between introns 3 and 8, to be over-represented in Finnish MS cases (odds ratio = 1.34, 95% confidence interval 1.07-1.68). A second allelic variant, covering the same region of the PRKCA gene, showed somewhat stronger evidence for association in the Canadian families (odds ratio = 1.64, 95% confidence interval 1.39-1.94). Initial functional relevance for disease predisposition was suggested by the expression analysis: the transcript levels of PRKCA showed correlation with the copy number of the Finnish and Canadian "risk" haplotypes in CD4-negative mononuclear cells of five Finnish multiplex families and in lymphoblast cell lines of 11 Centre d'Etude du Polymorphisme Humain (CEPH) individuals of European origin.
Heterogeneic dynamics of the structures of multiple gene clusters in two pathogenetically different lines originating from the same phytoplasma.

Science.gov (United States)

Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou

2008-04-01

Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.
Phylogenetic Relationships of Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Gobioninae) Inferred from Multiple Nuclear Gene Sequences

Directory of Open Access Journals (Sweden)

Keun-Yong Kim

2013-01-01

Gobionine species belonging to the genera Pseudorasbora, Pseudopungtungia, and Pungtungia (Teleostei; Cypriniformes; Cyprinidae) have been heavily studied because of problems of taxonomy, threats of extinction, invasion, and human health. Nucleotide sequences of three nuclear genes, that is, recombination activating protein gene 1 (rag1), recombination activating gene 2 (rag2), and early growth response 1 gene (egr1), from Pseudorasbora, Pseudopungtungia, and Pungtungia species residing in China, Japan, and Korea, were analyzed to elucidate their intergeneric and interspecific phylogenetic relationships. In the phylogenetic tree inferred from their multiple gene sequences, Pseudorasbora, Pseudopungtungia and Pungtungia species ramified into three phylogenetically distinct clades: the “tenuicorpa” clade composed of Pseudopungtungia tenuicorpa, the “parva” clade composed of all Pseudorasbora species/subspecies, and the “herzi” clade composed of Pseudopungtungia nigra and Pungtungia herzi. The genus Pseudorasbora was recovered as monophyletic, while the genus Pseudopungtungia was recovered as polyphyletic. Our phylogenetic result implies the unstable taxonomic status of the genus Pseudopungtungia.
Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

Science.gov (United States)

Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

2015-01-01

The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.
Maximizing biomarker discovery by minimizing gene signatures

Directory of Open Access Journals (Sweden)

Chang Chang

2011-12-01

Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can vary considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed an innovative new method termed Minimize Feature's Size (MFS), based on multiple-level similarity analyses and association between the genes and disease for breast cancer endpoints, by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), with the aim of developing effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures within an endpoint and between the two endpoints of breast cancer at probe and gene levels. The results indicate that disease-related genes can be preferably selected as the components of a gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with similar predictive power compared with the gene signatures from MAQC-II. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significances can be detected among different gene signatures, reflecting the
Ultrahigh-dimensional variable selection method for whole-genome gene-gene interaction analysis

Directory of Open Access Journals (Sweden)

Ueki Masao

2012-05-01

Background: Genome-wide gene-gene interaction analysis using single nucleotide polymorphisms (SNPs) is an attractive way to identify genetic components that confer susceptibility to complex human diseases. Individual hypothesis testing for SNP-SNP pairs, as in a common genome-wide association study (GWAS), however, involves difficulty in setting an overall p-value due to the complicated correlation structure, namely, the multiple testing problem that causes unacceptable false negative results. The number of SNP-SNP pairs being larger than the sample size, the so-called large p small n problem, precludes simultaneous analysis using multiple regression. A method that overcomes the above issues is thus needed. Results: We adopt an up-to-date method for ultrahigh-dimensional variable selection termed sure independence screening (SIS) for appropriate handling of the enormous number of SNP-SNP interactions by including them as predictor variables in logistic regression. We propose a ranking strategy using promising dummy coding methods and a subsequent variable selection procedure in the SIS method suitably modified for gene-gene interaction analysis. We also implemented the procedures in a software program, EPISIS, using the cost-effective GPGPU (general-purpose computing on graphics processing units) technology. EPISIS can complete an exhaustive search for SNP-SNP interactions in a standard GWAS dataset within several hours. The proposed method works successfully in simulation experiments and in application to real WTCCC (Wellcome Trust Case–control Consortium) data. Conclusions: Based on the machine-learning principle, the proposed method gives a powerful and flexible genome-wide search for various patterns of gene-gene interaction.
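The screening step at the heart of sure independence screening can be sketched in a few lines: score every candidate predictor on its own against the outcome, then keep only the top-ranked ones for a later, more careful model. The sketch below is a simplified illustration and not the EPISIS procedure: it assumes a plain product of allele counts as the interaction term (the abstract's dummy coding schemes are not specified) and uses absolute Pearson correlation as the marginal score; all function and variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def abs_correlation(x, y):
    """Absolute Pearson correlation; 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / sqrt(sxx * syy)


def sis_interactions(genotypes, phenotype, keep):
    """Rank all SNP-SNP product terms by marginal correlation with the
    phenotype and return the `keep` highest-ranked pairs.

    genotypes: dict of SNP name -> list of 0/1/2 allele counts
    phenotype: list of 0/1 case-control labels
    """
    scored = []
    for a, b in combinations(sorted(genotypes), 2):
        # Assumed interaction coding: product of the two allele counts.
        product = [u * v for u, v in zip(genotypes[a], genotypes[b])]
        scored.append((abs_correlation(product, phenotype), (a, b)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [pair for _, pair in scored[:keep]]
```

Because each pair is scored independently, the work grows with the number of pairs and parallelizes naturally, which is consistent with the abstract's choice of GPU computing for the exhaustive search.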
MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets.

Science.gov (United States)

Kim, Taehyung; Tyndel, Marc S; Huang, Haiming; Sidhu, Sachdev S; Bader, Gary D; Gfeller, David; Kim, Philip M

2012-03-01

Peptide recognition domains and transcription factors play crucial roles in cellular signaling. They bind linear stretches of amino acids or nucleotides, respectively, with high specificity. Experimental techniques that assess the binding specificity of these domains, such as microarrays or phage display, can retrieve thousands of distinct ligands, providing detailed insight into binding specificity. In particular, the advent of next-generation sequencing has recently increased the throughput of such methods by several orders of magnitude. These advances have helped reveal the presence of distinct binding specificity classes that co-exist within a set of ligands interacting with the same target. Here, we introduce a software system called MUSI that can rapidly analyze very large data sets of binding sequences to determine the relevant binding specificity patterns. Our pipeline provides two major advances. First, it can detect previously unrecognized multiple specificity patterns in any data set. Second, it offers integrated processing of very large data sets from next-generation sequencing machines. The results are visualized as multiple sequence logos describing the different binding preferences of the protein under investigation. We demonstrate the performance of MUSI by analyzing recent phage display data for human SH3 domains as well as microarray data for mouse transcription factors.
A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

Science.gov (United States)

He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

2017-03-01

comprehensive gene data set of sex pheromone biosynthesis and degradation enzyme related genes in DBM created by genome- and transcriptome-wide identification, characterization and expression profiling. Our findings provide a basis to better understand the function of genes with tissue enriched expression. The results also provide information on the genes involved in sex pheromone biosynthesis and degradation, and may be useful to identify potential gene targets for pest control strategies by disrupting the insect-insect communication using pheromone-based behavioral antagonists.
Frequent expression loss of Inter-alpha-trypsin inhibitor heavy chain (ITIH) genes in multiple human solid tumors: A systematic expression analysis

International Nuclear Information System (INIS)

Hamm, Alexander; Knuechel, Ruth; Dahl, Edgar; Veeck, Juergen; Bektas, Nuran; Wild, Peter J; Hartmann, Arndt; Heindrichs, Uwe; Kristiansen, Glen; Werbowetski-Ogilvie, Tamra; Del Maestro, Rolando

2008-01-01

The inter-alpha-trypsin inhibitors (ITI) are a family of plasma protease inhibitors, assembled from a light chain – bikunin, encoded by AMBP – and five homologous heavy chains (encoded by ITIH1, ITIH2, ITIH3, ITIH4, and ITIH5), contributing to extracellular matrix stability by covalent linkage to hyaluronan. So far, ITIH molecules have been shown to play a particularly important role in inflammation and carcinogenesis. We systematically investigated differential gene expression of the ITIH gene family, as well as AMBP and the interacting partner TNFAIP6 in 13 different human tumor entities (of breast, endometrium, ovary, cervix, stomach, small intestine, colon, rectum, lung, thyroid, prostate, kidney, and pancreas) using cDNA dot blot analysis (Cancer Profiling Array, CPA), semiquantitative RT-PCR and immunohistochemistry. We found that ITIH genes are clearly downregulated in multiple human solid tumors, including breast, colon and lung cancer. Thus, ITIH genes may represent a family of putative tumor suppressor genes that should be analyzed in greater detail in the future. For an initial detailed analysis we chose ITIH2 expression in human breast cancer. Loss of ITIH2 expression in 70% of cases (n = 50, CPA) could be confirmed by real-time PCR in an additional set of breast cancers (n = 36). Next we studied ITIH2 expression on the protein level by analyzing a comprehensive tissue micro array including 185 invasive breast cancer specimens. We found a strong correlation (p < 0.001) between ITIH2 expression and estrogen receptor (ER) expression indicating that ER may be involved in the regulation of this ECM molecule. Altogether, this is the first systematic analysis on the differential expression of ITIH genes in human cancer, showing frequent downregulation that may be associated with initiation and/or progression of these malignancies
Global temperature response to the major volcanic eruptions in multiple reanalysis data sets

Directory of Open Access Journals (Sweden)

M. Fujiwara

2015-12-01

Full Text Available The global temperature responses to the eruptions of Mount Agung in 1963, El Chichón in 1982, and Mount Pinatubo in 1991 are investigated using nine currently available reanalysis data sets (JRA-55, MERRA, ERA-Interim, NCEP-CFSR, JRA-25, ERA-40, NCEP-1, NCEP-2, and 20CR). Multiple linear regression is applied to the zonal and monthly mean time series of temperature for two periods, 1979–2009 (for eight reanalysis data sets) and 1958–2001 (for four reanalysis data sets), by considering explanatory factors of seasonal harmonics, linear trends, Quasi-Biennial Oscillation, solar cycle, and El Niño Southern Oscillation. The residuals are used to define the volcanic signals for the three eruptions separately, and common and different responses among the older and newer reanalysis data sets are highlighted for each eruption. In response to the Mount Pinatubo eruption, most reanalysis data sets show strong warming signals (up to 2–3 K for 1-year average) in the tropical lower stratosphere and weak cooling signals (down to −1 K) in the subtropical upper troposphere. For the El Chichón eruption, warming signals in the tropical lower stratosphere are somewhat smaller than those for the Mount Pinatubo eruption. The response to the Mount Agung eruption is asymmetric about the equator with strong warming in the Southern Hemisphere midlatitude upper troposphere to lower stratosphere. Comparison of the results from several different reanalysis data sets confirms the atmospheric temperature response to these major eruptions qualitatively, but also shows quantitative differences even among the most recent reanalysis data sets. The consistencies and differences among different reanalysis data sets provide a measure of the confidence and uncertainty in our current understanding of the volcanic response. The results of this intercomparison study may be useful for validation of climate model responses to volcanic forcing and for assessing proposed
A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database

Directory of Open Access Journals (Sweden)

Tripputi Mark

2006-10-01

Full Text Available Abstract Background Many of the most popular pre-processing methods for Affymetrix expression arrays, such as RMA, gcRMA, and PLIER, simultaneously analyze data across a set of predetermined arrays to improve precision of the final measures of expression. One problem associated with these algorithms is that expression measurements for a particular sample are highly dependent on the set of samples used for normalization and results obtained by normalization with a different set may not be comparable. A related problem is that an organization producing and/or storing large amounts of data in a sequential fashion will need to either re-run the pre-processing algorithm every time an array is added or store them in batches that are pre-processed together. Furthermore, pre-processing of large numbers of arrays requires loading all the feature-level data into memory which is a difficult task even with modern computers. We utilize a scheme that produces all the information necessary for pre-processing using a very large training set that can be used for summarization of samples outside of the training set. All subsequent pre-processing tasks can be done on an individual array basis. We demonstrate the utility of this approach by defining a new version of the Robust Multi-chip Averaging (RMA) algorithm which we refer to as refRMA. Results We assess performance based on multiple sets of samples processed over HG U133A Affymetrix GeneChip® arrays. We show that the refRMA workflow, when used in conjunction with a large, biologically diverse training set, results in the same general characteristics as that of RMA in its classic form when comparing overall data structure, sample-to-sample correlation, and variation. Further, we demonstrate that the refRMA workflow and reference set can be robustly applied to naïve organ types and to benchmark data where its performance indicates respectable results. Conclusion Our results indicate that a biologically diverse
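
The idea of pre-processing one array at a time against information frozen from a training set can be sketched for a single step. The example below is only an illustration under stated assumptions, not the refRMA code: it assumes that one stored ingredient is a reference intensity distribution with one value per probe, and it shows quantile normalization of a new array against that frozen distribution. The function name `normalize_to_reference` is hypothetical.

```python
def normalize_to_reference(intensities, reference_quantiles):
    """Quantile-normalize one array against a frozen reference distribution.

    intensities:         probe intensities of the new array.
    reference_quantiles: reference distribution derived once from a training
                         set (assumed here to have one value per probe).
    Each probe receives the reference value that has the same rank, so no
    other array needs to be loaded into memory.
    """
    if len(intensities) != len(reference_quantiles):
        raise ValueError("array and reference must have the same length")
    reference = sorted(reference_quantiles)
    # Probe indices ordered from lowest to highest intensity; ties keep
    # their original order because Python's sort is stable.
    order = sorted(range(len(intensities)), key=lambda k: intensities[k])
    normalized = [0.0] * len(intensities)
    for rank, probe_index in enumerate(order):
        normalized[probe_index] = reference[rank]
    return normalized
```

With this design, adding a new sample never changes the normalized values of earlier samples, which is the comparability property the abstract asks for.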
miR-137 inhibits the invasion of melanoma cells through downregulation of multiple oncogenic target genes.

Science.gov (United States)

Luo, Chonglin; Tetteh, Paul W; Merz, Patrick R; Dickes, Elke; Abukiwan, Alia; Hotz-Wagenblatt, Agnes; Holland-Cunz, Stefan; Sinnberg, Tobias; Schittek, Birgit; Schadendorf, Dirk; Diederichs, Sven; Eichmüller, Stefan B

2013-03-01

MicroRNAs are small noncoding RNAs that regulate gene expression and have important roles in various types of cancer. Previously, miR-137 was reported to act as a tumor suppressor in different cancers, including malignant melanoma. In this study, we show that low miR-137 expression is correlated with poor survival in stage IV melanoma patients. We identified and validated two genes (c-Met and YB1) as direct targets of miR-137 and confirmed two previously known targets, namely enhancer of zeste homolog 2 (EZH2) and microphthalmia-associated transcription factor (MITF). Functional studies showed that miR-137 suppressed melanoma cell invasion through the downregulation of multiple target genes. The decreased invasion caused by miR-137 overexpression could be phenocopied by small interfering RNA knockdown of EZH2, c-Met, or Y box-binding protein 1 (YB1). Furthermore, miR-137 inhibited melanoma cell migration and proliferation. Finally, miR-137 induced apoptosis in melanoma cell lines and decreased BCL2 levels. In summary, our study confirms that miR-137 acts as a tumor suppressor in malignant melanoma and reveals that miR-137 regulates multiple targets including c-Met, YB1, EZH2, and MITF.
Association of a novel point mutation in MSH2 gene with familial multiple primary cancers

Directory of Open Access Journals (Sweden)

Hai Hu

2017-10-01

Full Text Available Abstract Background Multiple primary cancers (MPC) have been identified as two or more cancers without any subordinate relationship that occur either simultaneously or metachronously in the same or different organs of an individual. Lynch syndrome is an autosomal dominant genetic disorder that increases the risk of many types of cancers. Lynch syndrome patients who suffer more than two cancers can also be considered as MPC; patients of this kind provide unique resources to learn how genetic mutation causes MPC in different tissues. Methods We performed whole genome sequencing on blood cells and two tumor samples of a Lynch syndrome patient who was diagnosed with five primary cancers. The mutational landscape of the tumors, including somatic point mutations and copy number alterations, was characterized. We also compared Lynch syndrome with sporadic cancers and proposed a model to illustrate the mutational process by which Lynch syndrome progresses to MPC. Results We revealed a novel pathologic mutation on the MSH2 gene (G504 splicing) that associates with Lynch syndrome. Systematic comparison of the mutation landscape revealed that multiple cancers in the proband were evolutionarily independent. Integrative analysis showed that truncating mutations of DNA mismatch repair (MMR) genes were significantly enriched in the patient. A mutation progress model that included germline mutations of MMR genes, double hits of the MMR system, mutations in tissue-specific driver genes, and rapid accumulation of additional passenger mutations was proposed to illustrate how MPC occurs in Lynch syndrome patients. Conclusion Our findings demonstrate that both germline and somatic alterations are driving forces of carcinogenesis, which may resolve the carcinogenic theory of Lynch syndrome.
Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

International Nuclear Information System (INIS)

Monticone, Massimiliano; Giaretti, Walter; Pfeffer, Ulrich; Daga, Antonio; Candiani, Simona; Romeo, Francesco; Mirisola, Valentina; Viaggi, Silvia; Melloni, Ilaria; Pedemonte, Simona; Zona, Gianluigi

2012-01-01

Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling the mechanisms causing this process is a logical goal for impairing the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real-time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane, and 9 of these encode proteins involved in the process of cell adhesion. This study suggests that a novel set of genes coding for membrane-associated proteins participates in the diffuse infiltrative/invasive process of GBM cells within the CNS; these proteins should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells.

Science.gov (United States)

Monticone, Massimiliano; Daga, Antonio; Candiani, Simona; Romeo, Francesco; Mirisola, Valentina; Viaggi, Silvia; Melloni, Ilaria; Pedemonte, Simona; Zona, Gianluigi; Giaretti, Walter; Pfeffer, Ulrich; Castagnola, Patrizio

2012-08-17

Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling the mechanisms causing this process is a logical goal for impairing the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real-time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane, and 9 of these encode proteins involved in the process of cell adhesion. This study suggests that a novel set of genes coding for membrane-associated proteins participates in the diffuse infiltrative/invasive process of GBM cells within the CNS; these proteins should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.
In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

Energy Technology Data Exchange (ETDEWEB)

Pandi, Narayanan Sathiya, E-mail: sathiyapandi@gmail.com; Suganya, Sivagurunathan; Rajendran, Suriliyandi

2013-10-04

Highlights: •The identified stomach lineage specific gene set (SLSGS) was found to be underexpressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling as a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of the stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with those of other organ tissues. The obtained SLSGS was found to be underexpressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the underexpression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and the SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen-mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and a prognostic factor in GC.
In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

International Nuclear Information System (INIS)

Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

2013-01-01

Highlights: •The identified stomach lineage specific gene set (SLSGS) was found to be underexpressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling as a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of the stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with those of other organ tissues. The obtained SLSGS was found to be underexpressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the underexpression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and the SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen-mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and a prognostic factor in GC.
Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects.

Science.gov (United States)

Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling; Wang, Xianhui; Kang, Le

2017-06-01

The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain-containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. © The Authors 2017. Published by Oxford University Press.
Multiple organ gigantism caused by mutation in VmPPD gene in blackgram (Vigna mungo).

Science.gov (United States)

Naito, Ken; Takahashi, Yu; Chaitieng, Bubpa; Hirano, Kumi; Kaga, Akito; Takagi, Kyoko; Ogiso-Tanaka, Eri; Thavarasook, Charaspon; Ishimoto, Masao; Tomooka, Norihiko

2017-03-01

Seed size is one of the most important traits in leguminous crops. We obtained a recessive mutant of blackgram that had greatly enlarged leaves, stems and seeds. The mutant produced 100% bigger leaves, 50% more biomass and 70% larger seeds, though it produced 40% fewer seeds. We designated the mutant as multiple-organ-gigantism (mog) and found that the mog phenotype was due to an increase in cell numbers but not in cell size. We also found that the mog mutant showed a rippled leaf (rl) phenotype, which was probably caused by a pleiotropic effect of the mutation. We performed map-based cloning and successfully identified an 8 bp deletion in the coding sequence of the VmPPD gene, an orthologue of Arabidopsis PEAPOD (PPD) that regulates arrest of cell divisions in meristematic cells. We found no other mutations in the neighboring genes between the mutant and the wild type. We also knocked down GmPPD genes and reproduced both the mog and rl phenotypes in soybean. Controlling PPD genes to produce the mog phenotype is highly valuable for breeding since larger seed size could directly increase the commercial value of grain legumes.
Acute effect of passive rest intervals and stretching exercise on multiple set performance

Directory of Open Access Journals (Sweden)

Antonio Claudio do Rosário Souza

2009-01-01

Full Text Available http://dx.doi.org/10.5007/1980-0037.2009v11n4p435 The objective of this study was to determine the acute effect of passive rest intervals and static stretching between resistance exercise sets on the number of maximal repetitions (RM), rating of perceived exertion (RPE), and cumulative number of repetitions in multiple sets with a workload adjusted by the 8RM test. Fourteen trained male subjects (24.4 ± 2.1 years; 79.1 ± 7.1 kg; 175.4 ± 5.6 cm) were studied. On the first two visits, the subjects were submitted to the 8RM test and re-test using chest press (CP) and squat (SQ) exercises. On the two subsequent visits, all subjects were randomly assigned to two experimental situations: (a) an 8RM test with a passive rest interval (PI); (b) an 8RM test with static stretching (SS). The subjects performed three sets of CP and SQ, interspersed with 2 minutes of passive rest or 30 seconds of static stretching. ANOVA revealed a significant decrease (p < 0.05) in the second (PI = 6 ± 0.8 vs. SS = 5.2 ± 1.0 repetitions) and third (PI = 4.1 ± 0.8 vs. SS = 3.3 ± 0.6 repetitions) sets for CP and only in the third set (PI = 4.9 ± 0.8 vs. SS = 4.2 ± 1.0 repetitions) for SQ. For RPE, the Wilcoxon test showed significant differences (p < 0.05) between all sets for CP and SQ. For the cumulative number of repetitions, the paired t-test revealed a significant decrease (p < 0.05) for CP (PI = 18.3 ± 1.5 vs. SS = 16.8 ± 1.6 repetitions). These results indicate that static stretching between resistance exercise sets decreases 8RM test performance.
Multiple-scattering theory with a truncated basis set

International Nuclear Information System (INIS)

Zhang, X.; Butler, W.H.

1992-01-01

Multiple-scattering theory (MST) is an extremely efficient technique for calculating the electronic structure of an assembly of atoms. The wave function in MST is expanded in terms of spherical waves centered on each atom and indexed by their orbital and azimuthal quantum numbers, l and m. The secular equation which determines the characteristic energies can be truncated at a value of the orbital angular momentum l_max, for which the higher angular momentum phase shifts, δ_l (l > l_max), are sufficiently small. Generally, the wave-function coefficients which are calculated from the secular equation are also truncated at l_max. Here we point out that this truncation of the wave function is not necessary and is in fact inconsistent with the truncation of the secular equation. A consistent procedure is described in which the states with higher orbital angular momenta are retained but with their phase shifts set to zero. We show that this treatment gives smooth, continuous, and correctly normalized wave functions and that the total charge density calculated from the corresponding Green function agrees with the Lloyd formula result. We also show that this augmented wave function can be written as a linear combination of Andersen's muffin-tin orbitals in the case of muffin-tin potentials, and can be used to generalize the muffin-tin orbital idea to full-cell potentials.
PSP: rapid identification of orthologous coding genes under positive selection across multiple closely related prokaryotic genomes.

Science.gov (United States)

Su, Fei; Ou, Hong-Yu; Tao, Fei; Tang, Hongzhi; Xu, Ping

2013-12-27

With genomic sequences of many closely related bacterial strains made available by deep sequencing, it is now possible to investigate trends in prokaryotic microevolution. Positive selection is a sub-process of microevolution, in which a particular mutation is favored, causing the allele frequency to continuously shift in one direction. Wide scanning of prokaryotic genomes has shown that positive selection at the molecular level is much more frequent than expected. Genes with significant positive selection may play key roles in bacterial adaptation to different environmental pressures. However, selection pressure analyses are computationally intensive and awkward to configure. Here we describe an open access web server, designated PSP (Positive Selection analysis for Prokaryotic genomes), for performing evolutionary analysis on orthologous coding genes; it is specially designed for rapid comparison of dozens of closely related prokaryotic genomes. Remarkably, PSP facilitates functional exploration at multiple levels by assignments and enrichments of KO, GO or COG terms. To illustrate this user-friendly tool, we analyzed Escherichia coli and Bacillus cereus genomes and found that several genes, which play key roles in human infection and antibiotic resistance, show significant evidence of positive selection. PSP is freely available to all users without any login requirement at: http://db-mml.sjtu.edu.cn/PSP/. PSP ultimately allows researchers to do genome-scale analysis for evolutionary selection across multiple prokaryotic genomes rapidly and easily, and to identify the genes undergoing positive selection, which may play key roles in host-pathogen interactions and/or environmental adaptation.
Normalization and gene p-value estimation: issues in microarray data processing.

Science.gov (United States)

Fundel, Katrin; Küffner, Robert; Aigner, Thomas; Zimmer, Ralf

2008-05-28

Numerous methods exist for basic processing, e.g. normalization, of microarray gene expression data. These methods have an important effect on the final analysis outcome. Therefore, it is crucial to select methods appropriate for a given dataset in order to assure the validity and reliability of expression data analysis. Furthermore, biological interpretation requires expression values for genes, which are often represented by several spots or probe sets on a microarray. How to best integrate spot/probe set values into gene values has so far been a somewhat neglected problem. We present a case study comparing different between-array normalization methods with respect to the identification of differentially expressed genes. Our results show that it is feasible and necessary to use prior knowledge on gene expression measurements to select an adequate normalization method for the given data. Furthermore, we provide evidence that combining spot/probe set p-values into gene p-values for detecting differentially expressed genes has advantages compared to combining expression values for spots/probe sets into gene expression values. The comparison of different methods suggests using Stouffer's method for this purpose. The study has been conducted on gene expression experiments investigating human joint cartilage samples of osteoarthritis related groups: a cDNA microarray (83 samples, four groups) and an Affymetrix (26 samples, two groups) data set. The apparently straightforward steps of gene expression data analysis, e.g. between-array normalization and detection of differentially regulated genes, can be accomplished by numerous different methods. We analyzed multiple methods and the possible effects and thereby demonstrate the importance of the individual decisions taken during data processing. We give guidelines for evaluating normalization outcomes. An overview of these effects via appropriate measures and plots compared to prior knowledge is essential for the biological
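
Stouffer's method, which the record above recommends for turning several spot/probe set p-values into one gene p-value, can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes one-sided p-values strictly between 0 and 1, equal weights for all probe sets, and independence between them; the function name `stouffer_combine` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values):
    """Combine one-sided p-values with Stouffer's unweighted Z-score method.

    Each p-value is converted to a standard normal z-score, the z-scores are
    summed and divided by sqrt(k), and the result is converted back to a
    p-value. Inputs must lie strictly between 0 and 1, otherwise
    NormalDist.inv_cdf raises statistics.StatisticsError.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Small p-values map to large positive z-scores.
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]
    combined_z = sum(z_scores) / sqrt(len(z_scores))
    return 1.0 - std_normal.cdf(combined_z)
```

For example, two probe sets of the same gene with p = 0.05 each combine to a gene-level p-value of roughly 0.01, whereas a single probe set is returned unchanged up to rounding.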
Prediction of protein interaction hot spots using rough set-based multiple criteria linear programming.

Science.gov (United States)

Chen, Ruoying; Zhang, Zhiwang; Wu, Di; Zhang, Peng; Zhang, Xinyang; Wang, Yong; Shi, Yong

2011-01-21

Protein-protein interactions are fundamentally important in many biological processes, and there is a pressing need to understand the principles of protein-protein interactions. Mutagenesis studies have found that only a small fraction of surface residues, known as hot spots, are responsible for the physical binding in protein complexes. However, revealing hot spots by mutagenesis experiments is usually time consuming and expensive. In order to complement the experimental efforts, we propose a new computational approach in this paper to predict hot spots. Our method, Rough Set-based Multiple Criteria Linear Programming (RS-MCLP), integrates rough sets theory and multiple criteria linear programming to choose dominant features and computationally predict hot spots. Our approach is benchmarked on a dataset of 904 alanine-mutated residues and the results show that our RS-MCLP method performs better than other methods, e.g., MCLP, Decision Tree, Bayes Net, and the existing HotSprint database. In addition, we reveal several biological insights based on our analysis. We find that four features (the change of accessible surface area, percentage of the change of accessible surface area, size of a residue, and atomic contacts) are critical in predicting hot spots. Furthermore, by analyzing the distribution of amino acids, we find that three residues (Tyr, Trp, and Phe) are abundant in hot spots. Copyright © 2010 Elsevier Ltd. All rights reserved.
Resistance to Downy Mildew in Lettuce 'La Brillante' is Conferred by Dm50 Gene and Multiple QTL.

Science.gov (United States)

Simko, Ivan; Ochoa, Oswaldo E; Pel, Mathieu A; Tsuchida, Cayla; Font I Forcada, Carolina; Hayes, Ryan J; Truco, Maria-Jose; Antonise, Rudie; Galeano, Carlos H; Michelmore, Richard W

2015-09-01

Many cultivars of lettuce (Lactuca sativa L.) are susceptible to downy mildew, a nearly globally ubiquitous disease caused by Bremia lactucae. We previously determined that Batavia type cultivar 'La Brillante' has a high level of field resistance to the disease in California. Testing of a mapping population developed from a cross between 'Salinas 88' and La Brillante in multiple field and laboratory experiments revealed that at least five loci conferred resistance in La Brillante. The presence of a new dominant resistance gene (designated Dm50) that confers complete resistance to specific isolates was detected in laboratory tests of seedlings inoculated with multiple diverse isolates. Dm50 is located in the major resistance cluster on linkage group 2 that contains at least eight major, dominant Dm genes conferring resistance to downy mildew. However, this Dm gene is ineffective against the isolates of B. lactucae prevalent in the field in California and the Netherlands. A quantitative trait locus (QTL) located at the Dm50 chromosomal region (qDM2.2) was detected, though, when the amount of disease was evaluated a month before plants reached harvest maturity. Four additional QTL for resistance to B. lactucae were identified on linkage groups 4 (qDM4.1 and qDM4.2), 7 (qDM7.1), and 9 (qDM9.2). The largest effect was associated with qDM7.1 (up to 32.9% of the total phenotypic variance) that determined resistance in multiple field experiments. Markers identified in the present study will facilitate introduction of these resistance loci into commercial cultivars of lettuce.
Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

Directory of Open Access Journals (Sweden)

Monticone Massimiliano

2012-08-01

Full Text Available Abstract Background Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). Methods We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Results Here, we report the identification of a set of 34 genes differentially expressed between the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. Conclusions This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work
Evaluation of gene importance in microarray data based upon probability of selection

Directory of Open Access Journals (Sweden)

Fu Li M

2005-03-01

Full Text Available Abstract Background Microarray devices permit a genome-scale evaluation of gene function. This technology has catalyzed biomedical research and development in recent years. As many important diseases can be traced down to the gene level, a long-standing research problem is to identify specific gene expression patterns linked to metabolic characteristics that contribute to disease development and progression. The microarray approach offers an expedited solution to this problem. However, recognizing disease-related gene expression patterns embedded in the microarray data has proved challenging. In selecting a small set of biologically significant genes for classifier design, the high data dimensionality inherent in this problem creates a substantial amount of uncertainty. Results Here we present a model for probability analysis of selected genes in order to determine their importance. Our contribution is that we show how to derive the P value of each selected gene in multiple gene selection trials based on different combinations of data samples and how to conduct a reliability analysis accordingly. The importance of a gene is indicated by its associated P value in that a smaller value implies higher information content from information theory. On the microarray data concerning the subtype classification of small round blue cell tumors, we demonstrate that the method is capable of finding the smallest set of genes (19 genes) with optimal classification performance, compared with results reported in the literature. Conclusion In classifier design based on microarray data, the probability value derived from gene selection based on multiple combinations of data samples enables an effective mechanism for reducing the tendency of fitting local data particularities.
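The abstract above does not give the formula used to derive a gene's P value from repeated selection trials. As a rough illustration of the general idea only, the sketch below substitutes a simple binomial null model: if a gene would be picked by chance with probability `null_rate` in each independent trial, the P value is the probability of it being selected at least as often as observed. The function name, the null model, and the example numbers are assumptions and should not be read as the author's derivation.

```python
from math import comb


def selection_p_value(times_selected, n_trials, null_rate):
    """Upper-tail binomial probability P(X >= times_selected).

    X ~ Binomial(n_trials, null_rate) models how often a gene would be
    selected by chance across independent gene selection trials.
    """
    if not 0 <= times_selected <= n_trials:
        raise ValueError("times_selected must lie between 0 and n_trials")
    if not 0.0 <= null_rate <= 1.0:
        raise ValueError("null_rate must be a probability")
    return sum(
        comb(n_trials, k) * null_rate**k * (1.0 - null_rate) ** (n_trials - k)
        for k in range(times_selected, n_trials + 1)
    )


# Hypothetical setting: 50 genes chosen per trial out of 2000 candidates,
# so the chance rate per trial is 50 / 2000; the gene was picked in 18 of 20 trials.
p_value = selection_p_value(times_selected=18, n_trials=20, null_rate=50 / 2000)
```

Under this assumed model, a gene that is selected in nearly every resampled trial receives a very small P value, matching the abstract's statement that smaller values indicate more important genes.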
Differential effects of multiplicity of infection on Helicobacter pylori-induced signaling pathways and interleukin-8 gene transcription.

Science.gov (United States)

Ritter, Birgit; Kilian, Petra; Reboll, Marc Rene; Resch, Klaus; DiStefano, Johanna Kay; Frank, Ronald; Beil, Winfried; Nourbakhsh, Mahtab

2011-02-01

Interleukin-8 (IL-8) plays a central role in the pathogenesis of Helicobacter pylori infection. We used four different H. pylori strains isolated from patients with gastritis or duodenal ulcer disease to examine their differential effects on signaling pathways and IL-8 gene response in gastric epithelial cells. IL-8 mRNA level is elevated in response to high (100) multiplicity of infection (MOI) independent of cagA, vacA, and dupA gene characteristics. At lower MOIs (1 or 10), only cagA(+) strains significantly induce IL-8 gene expression. This is based on differential regulation of IL-8 promoter activity. Analysis of intracellular signaling pathways indicates that H. pylori clinical isolates induce IL-8 gene transcription through NF-κB p65, but by an MOI-dependent differential activation of MAPK pathways. Thus, the major virulence factors of H. pylori, CagA, VacA, and DupA, might play a minor role in the level of IL-8 gene response to a high bacterial load.
A set of vectors for introduction of antibiotic resistance genes by in vitro Cre-mediated recombination

Directory of Open Access Journals (Sweden)

Vassetzky Yegor S

2008-12-01

Full Text Available Abstract Background Introduction of new antibiotic resistance genes into plasmids of interest is a frequent task in molecular cloning practice. Classical approaches involving digestion with restriction endonucleases and ligation are time-consuming. Findings We have created a set of insertion vectors (pINS) carrying genes that provide resistance to various antibiotics (puromycin, blasticidin and G418) and containing a loxP site. Each vector (pINS-Puro, pINS-Blast or pINS-Neo) contains either a chloramphenicol or a kanamycin resistance gene and is unable to replicate in most E. coli strains as it contains a conditional R6Kγ replication origin. Introduction of the antibiotic resistance genes into the vector of interest is achieved by Cre-mediated recombination between the replication-incompetent pINS and a replication-competent target vector. The recombination mix is then transformed into E. coli and selected by the resistance marker (kanamycin or chloramphenicol) present in pINS, which allows recovery of the recombinant plasmids with 100% efficiency. Conclusion Here we propose a simple strategy that allows the introduction of various antibiotic-resistance genes into any plasmid containing a replication origin, an ampicillin resistance gene and a loxP site.
Comparison of Deep Learning With Multiple Machine Learning Methods and Metrics Using Diverse Drug Discovery Data Sets.

Science.gov (United States)

Korotcov, Alexandru; Tkachenko, Valery; Russo, Daniel P; Ekins, Sean

2017-12-04

Machine learning methods have been applied to many data sets in pharmaceutical research for several decades. The relative ease and availability of fingerprint type molecular descriptors paired with Bayesian methods resulted in the widespread use of this approach for a diverse array of end points relevant to drug discovery. Deep learning is the latest machine learning algorithm attracting attention for many pharmaceutical applications from docking to virtual screening. Deep learning is based on an artificial neural network with multiple hidden layers and has found considerable traction for many artificial intelligence applications. We have previously suggested the need for a comparison of different machine learning methods with deep learning across an array of varying data sets that is applicable to pharmaceutical research. End points relevant to pharmaceutical research include absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) properties, as well as activity against pathogens and drug discovery data sets. In this study, we have used data sets for solubility, probe-likeness, hERG, KCNQ1, bubonic plague, Chagas, tuberculosis, and malaria to compare different machine learning methods using FCFP6 fingerprints. These data sets represent whole cell screens, individual proteins, physicochemical properties as well as a data set with a complex end point. Our aim was to assess whether deep learning offered any improvement in testing when assessed using an array of metrics including AUC, F1 score, Cohen's kappa, Matthews correlation coefficient and others. Based on ranked normalized scores for the metrics or data sets, Deep Neural Networks (DNN) ranked higher than SVM, which in turn was ranked higher than all the other machine learning methods. Visualizing these properties for training and test sets using radar type plots indicates when models are inferior or perhaps overtrained. These results also suggest the need for assessing deep learning further
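Three of the metrics named in the record above (F1 score, Cohen's kappa, and the Matthews correlation coefficient) can all be computed from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions. The function name `binary_metrics` and the example counts are assumptions for illustration and are not taken from the study.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Return F1, Cohen's kappa and MCC for a binary confusion matrix."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1.0 - expected) if expected != 1.0 else 0.0

    # Matthews correlation coefficient; defined as 0 when a marginal is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical test-set counts (illustrative only).
scores = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

For this balanced example, F1 is 0.8 while kappa and MCC are both 0.6, which shows why reporting several metrics side by side, as the study does, gives a fuller picture than any single score.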
DLC1 tumor suppressor gene inhibits migration and invasion of multiple myeloma cells through RhoA GTPase pathway

Czech Academy of Sciences Publication Activity Database

Ullmannová-Benson, Veronika; Guan, M.; Zhou, X. G.; Tripathi, V.; Yang, V.; Zimonjic, D. B.; Popescu, C.

2009-01-01

Vol. 23, No. 2 (2009), pp. 383-390 ISSN 0887-6924 Institutional research plan: CEZ:AV0Z50200510 Keywords: multiple myeloma * tumor suppressor gene * promoter methylation Subject RIV: EC - Immunology Impact factor: 8.296, year: 2009
Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization

CSIR Research Space (South Africa)

Gcebe, N

2017-04-01

Full Text Available Journal of Systematic and Evolutionary Microbiology: DOI 10.1099/ijsem.0.001678 Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization Gcebe N Rutten V Gey...
The presence of p53 influences the expression of multiple human cytomegalovirus genes at early times postinfection.

Science.gov (United States)

Hannemann, Holger; Rosenke, Kyle; O'Dowd, John M; Fortunato, Elizabeth A

2009-05-01

Human cytomegalovirus (HCMV) is a common cause of morbidity and mortality in immunocompromised and immunosuppressed individuals. During infection, HCMV is known to employ host transcription factors to facilitate viral gene expression. To further understand the previously observed delay in viral replication and protein expression in p53 knockout cells, we conducted microarray analyses of p53(+/+) and p53(-/-) immortalized fibroblast cell lines. At a multiplicity of infection (MOI) of 1 at 24 h postinfection (p.i.), the expression of 22 viral genes was affected by the absence of p53. Eleven of these 22 genes (group 1) were examined by real-time reverse transcriptase, or quantitative, PCR (q-PCR). Additionally, five genes previously determined to have p53 bound to their nearest p53-responsive elements (group 2) and three control genes without p53 binding sites in their upstream sequences (group 3) were also examined. At an MOI of 1, >3-fold regulation was found for five group 1 genes. The expression of group 2 and 3 genes was not changed. At an MOI of 5, all genes from group 1 and four of five genes from group 2 were found to be regulated. The expression of control genes from group 3 remained unchanged. A q-PCR time course of four genes revealed that p53 influences viral gene expression most at immediate-early and early times p.i., suggesting a mechanism for the reduced and delayed production of virions in p53(-/-) cells.
Alteration of Multiple Leukocyte Gene Expression Networks is Linked with Magnetic Resonance Markers of Prognosis After Acute ST-Elevation Myocardial Infarction.

Science.gov (United States)

Teren, A; Kirsten, H; Beutner, F; Scholz, M; Holdt, L M; Teupser, D; Gutberlet, M; Thiery, J; Schuler, G; Eitel, I

2017-02-03

Prognostically relevant pathways of leukocyte involvement in human myocardial ischemic-reperfusion injury are largely unknown. We enrolled 136 patients with ST-elevation myocardial infarction (STEMI) after primary angioplasty within 12 h after onset of symptoms. Following reperfusion, whole blood was collected within a median time interval of 20 h (interquartile range: 15-25 h) for genome-wide gene expression analysis. Subsequent CMR scans were performed using a standard protocol to determine infarct size (IS), area at risk (AAR), myocardial salvage index (MSI) and the extent of late microvascular obstruction (lateMO). We found 398 genes associated with lateMO and two genes with IS. Neither AAR nor MSI showed significant correlations with gene expression. Genes correlating with lateMO were strongly related to several canonical pathways, including positive regulation of T-cell activation (p = 3.44 × 10⁻⁵), and regulation of inflammatory response (p = 1.86 × 10⁻³). Network analysis of multiple gene expression alterations associated with larger lateMO identified the following functional consequences: facilitated utilisation and decreased concentration of free fatty acid, repressed cell differentiation, enhanced phagocyte movement, increased cell death, vascular disease and compensatory vasculogenesis. In conclusion, the extent of lateMO after acute, reperfused STEMI correlated with altered activation of multiple genes related to fatty acid utilisation, lymphocyte differentiation, phagocyte mobilisation, cell survival, and vascular dysfunction.

A replica exchange transition interface sampling method with multiple interface sets for investigating networks of rare events

Science.gov (United States)

Swenson, David W. H.; Bolhuis, Peter G.

2014-07-01

The multiple state transition interface sampling (TIS) framework in principle allows the simulation of a large network of complex rare event transitions, but in practice suffers from convergence problems. To improve convergence, we combine multiple state TIS [J. Rogal and P. G. Bolhuis, J. Chem. Phys. 129, 224107 (2008)] with replica exchange TIS [T. S. van Erp, Phys. Rev. Lett. 98, 268301 (2007)]. In addition, we introduce multiple interface sets, which allow more than one order parameter to be defined for each state. We illustrate the methodology on a model system of multiple independent dimers, each with two states. For reaction networks with up to 64 microstates, we determine the kinetics in the microcanonical ensemble, and discuss the convergence properties of the sampling scheme. For this model, we find that the kinetics depend on the instantaneous composition of the system. We explain this dependence in terms of the system's potential and kinetic energy.
The progress of PET based reporter gene imaging

International Nuclear Information System (INIS)

Zhao Wei; Zhang Xiuli

2005-01-01

More than two decades of intense research have allowed gene therapy to move from the laboratory to the clinical setting, where its use for the treatment of human pathologies has increased considerably in recent years. However, many crucial questions remain to be solved in this challenging field. In vivo imaging with positron emission tomography (PET), combining an appropriate PET reporter gene and PET reporter probe, could provide invaluable qualitative and quantitative information to answer multiple unsolved questions about gene therapy. PET imaging could be used to define parameters not available by other techniques that are of substantial interest not only for the proper understanding of the gene therapy process, but also for its future development and clinical application in humans. (authors)
Array data extractor (ADE): a LabVIEW program to extract and merge gene array data.

Science.gov (United States)

Kurtenbach, Stefan; Kurtenbach, Sarah; Zoidl, Georg

2013-12-01

Large data sets from gene expression array studies are publicly available offering information highly valuable for research across many disciplines ranging from fundamental to clinical research. Highly advanced bioinformatics tools have been made available to researchers, but a demand for user-friendly software allowing researchers to quickly extract expression information for multiple genes from multiple studies persists. Here, we present a user-friendly LabVIEW program to automatically extract gene expression data for a list of genes from multiple normalized microarray datasets. Functionality was tested for 288 class A G protein-coupled receptors (GPCRs) and expression data from 12 studies comparing normal and diseased human hearts. Results confirmed known regulation of a beta 1 adrenergic receptor and further indicate novel research targets. Although existing software allows for complex data analyses, the LabVIEW based program presented here, "Array Data Extractor (ADE)", provides users with a tool to retrieve meaningful information from multiple normalized gene expression datasets in a fast and easy way. Further, the graphical programming language used in LabVIEW allows applying changes to the program without the need of advanced programming knowledge.
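The core operation described in the record above, pulling expression values for a list of genes out of several normalized datasets and merging them into one table, can be sketched in a few lines. ADE itself is a LabVIEW program, so the Python below is only an illustration of the idea. The function name `extract_genes`, the dictionary-based data layout, and the study names and values are all assumptions.

```python
def extract_genes(datasets, genes):
    """Collect expression values for `genes` from every dataset.

    datasets: mapping of dataset name -> {gene symbol: normalized value}
    genes:    iterable of gene symbols to look up
    Returns a mapping gene -> {dataset name: value, or None if absent}.
    """
    return {
        gene: {name: table.get(gene) for name, table in datasets.items()}
        for gene in genes
    }


# Hypothetical normalized values from two studies (illustrative only).
datasets = {
    "study_1": {"ADRB1": 7.2, "ADRB2": 6.1},
    "study_2": {"ADRB1": 6.8},
}
merged = extract_genes(datasets, ["ADRB1", "ADRB2"])
```

Recording a missing gene as `None` rather than dropping it keeps the merged table rectangular, so it is easy to see which studies did not measure a given gene.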
Integrating Multiple Data Sources for Combinatorial Marker Discovery: A Study in Tumorigenesis.

Science.gov (United States)

Bandyopadhyay, Sanghamitra; Mallik, Saurav

2018-01-01

Identification of combinatorial markers from multiple data sources is a challenging task in bioinformatics. Here, we propose a novel computational framework for identifying significant combinatorial markers ( s) using both gene expression and methylation data. The gene expression and methylation data are integrated into a single continuous data as well as a (post-discretized) boolean data based on their intrinsic (i.e., inverse) relationship. A novel combined score of methylation and expression data (viz., ) is introduced which is computed on the integrated continuous data for identifying initial non-redundant set of genes. Thereafter, (maximal) frequent closed homogeneous genesets are identified using a well-known biclustering algorithm applied on the integrated boolean data of the determined non-redundant set of genes. A novel sample-based weighted support ( ) is then proposed that is consecutively calculated on the integrated boolean data of the determined non-redundant set of genes in order to identify the non-redundant significant genesets. The top few resulting genesets are identified as potential s. Since our proposed method generates a smaller number of significant non-redundant genesets than those by other popular methods, the method is much faster than the others. Application of the proposed technique on an expression and a methylation data for Uterine tumor or Prostate Carcinoma produces a set of significant combination of markers. We expect that such a combination of markers will produce lower false positives than individual markers.
AS3MT-mediated tolerance to arsenic evolved by multiple independent horizontal gene transfers from bacteria to eukaryotes

DEFF Research Database (Denmark)

Palmgren, Michael; Engström, Karin; Hallström, Björn M.

2017-01-01

the evolutionary origin of AS3MT and assessed the ability of different genotypes to produce methylated arsenic metabolites. Phylogenetic analysis suggests that multiple, independent horizontal gene transfers between different bacteria, and from bacteria to eukaryotes, increased tolerance to environmental arsenic...
No evidence that polymorphisms of the vanishing white matter disease genes are risk factors in multiple sclerosis

NARCIS (Netherlands)

Pronk, J.C.; Scheper, G.C.; Andel, R.J.; van Berkel, C.G.M.; Polman, C.H.; Uitdehaag, B.M.J.; van der Knaap, M.S.

2008-01-01

Febrile infections are known to cause exacerbations in the white matter disorders 'vanishing white matter' (VWM) and multiple sclerosis (MS). We hypothesized that polymorphisms in EIF2B1-5, the genes involved in VWM, might be risk factors for the development of MS or temperature sensitivity in
Diversification of Root Hair Development Genes in Vascular Plants.

Science.gov (United States)

Huang, Ling; Shi, Xinhui; Wang, Wenjia; Ryu, Kook Hui; Schiefelbein, John

2017-07-01

The molecular genetic program for root hair development has been studied intensively in Arabidopsis (Arabidopsis thaliana). To understand the extent to which this program might operate in other plants, we conducted a large-scale comparative analysis of root hair development genes from diverse vascular plants, including eudicots, monocots, and a lycophyte. Combining phylogenetics and transcriptomics, we discovered conservation of a core set of root hair genes across all vascular plants, which may derive from an ancient program for unidirectional cell growth coopted for root hair development during vascular plant evolution. Interestingly, we also discovered preferential diversification in the structure and expression of root hair development genes, relative to other root hair- and root-expressed genes, among these species. These differences enabled the definition of sets of genes and gene functions that were acquired or lost in specific lineages during vascular plant evolution. In particular, we found substantial divergence in the structure and expression of genes used for root hair patterning, suggesting that the Arabidopsis transcriptional regulatory mechanism is not shared by other species. To our knowledge, this study provides the first comprehensive view of gene expression in a single plant cell type across multiple species. © 2017 American Society of Plant Biologists. All Rights Reserved.
Understanding Autoimmune Mechanisms in Multiple Sclerosis Using Gene Expression Microarrays: Treatment Effect and Cytokine-related Pathways

Directory of Open Access Journals (Sweden)

A. Achiron

2004-01-01

Full Text Available Multiple sclerosis (MS) is a central nervous system disease in which activated autoreactive T-cells invade the blood brain barrier and initiate an inflammatory response that leads to myelin destruction and axonal loss. The etiology of MS, as well as the mechanisms associated with its unexpected onset, the unpredictable clinical course spanning decades, and the different rates of progression leading to disability over time, remains an enigma. We have applied gene expression microarray technology in peripheral blood mononuclear cells (PBMC) to better understand MS pathogenesis and better target treatment approaches. A signature of 535 genes was found to distinguish immunomodulatory treatment effects between 13 treated and 13 untreated MS patients. In addition, the expression pattern of 1109 gene transcripts that were previously reported to significantly differentiate between MS patients and healthy subjects was further analyzed to study the effect of cytokine-related pathways on disease pathogenesis. When relative gene expression for 26 MS patients was compared to 18 healthy controls, 30 genes related to various cytokine-associated pathways were identified. These genes belong to a variety of families such as interleukins, small inducible cytokine subfamily and tumor necrosis factor ligand and receptor. Further analysis disclosed seven cytokine-associated genes within the immunomodulatory treatment signature, and two cytokine-associated genes, SCYA4 (small inducible cytokine A4) and FCAR (Fc fragment of IgA, CD89), that were common to both the MS gene expression signature and the immunomodulatory treatment gene expression signature. Our results indicate that cytokine-associated genes are involved in various pathogenic pathways in MS and are also related to immunomodulatory treatment effects.
STRIDE: Species Tree Root Inference from Gene Duplication Events.

Science.gov (United States)

Emms, David M; Kelly, Steven

2017-12-01

The correct interpretation of any phylogenetic tree is dependent on that tree being correctly rooted. We present STRIDE, a fast, effective, and outgroup-free method for identification of gene duplication events and species tree root inference in large-scale molecular phylogenetic analyses. STRIDE identifies sets of well-supported in-group gene duplication events from a set of unrooted gene trees, and analyses these events to infer a probability distribution over an unrooted species tree for the location of its root. We show that STRIDE correctly identifies the root of the species tree in multiple large-scale molecular phylogenetic data sets spanning a wide range of timescales and taxonomic groups. We demonstrate that the novel probability model implemented in STRIDE can accurately represent the ambiguity in species tree root assignment for data sets where information is limited. Furthermore, application of STRIDE to outgroup-free inference of the origin of the eukaryotic tree resulted in a root probability distribution that provides additional support for leading hypotheses for the origin of the eukaryotes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Involvement of Multiple Gene-Silencing Pathways in a Paramutation-like Phenomenon in Arabidopsis

Directory of Open Access Journals (Sweden)

Zhimin Zheng

2015-05-01

Full Text Available Paramutation is an epigenetic phenomenon that has been observed in a number of multicellular organisms. The epigenetically silenced state of paramutated alleles is not only meiotically stable but also “infectious” to active homologous alleles. The molecular mechanism of paramutation remains unclear, but components involved in RNA-directed DNA methylation (RdDM) are required. Here, we report a multi-copy pRD29A-LUC transgene in Arabidopsis thaliana that behaves like a paramutation locus. The silent state of LUC is induced by mutations in the DNA glycosylase gene ROS1. The silent alleles of LUC are not only meiotically stable but also able to transform active LUC alleles into silent ones, in the absence of ros1 mutations. Maintaining silencing at the LUC gene requires action of multiple pathways besides RdDM. Our study identified specific factors that are involved in the paramutation-like phenomenon and established a model system for the study of paramutation in Arabidopsis.
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes

Directory of Open Access Journals (Sweden)

Wasnick Michael

2008-03-01

Full Text Available Abstract Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any
Role of inflammation gene polymorphisms on pain and response to radiotherapy in multiple myeloma patients with painful bone destructions

OpenAIRE

Rudžianskienė, Milda; Inčiūra, Arturas; Gerbutavičius, Rolandas; Dambrauskienė, Rūta; Rudžianskas, Viktoras; Juozaitytė, Elona

2016-01-01

Background: Previous research has demonstrated that the severity of pain perception and its response to analgesia are highly dependent on polymorphisms in genes encoding cytokines. We evaluated 12 single nucleotide polymorphisms (SNP) in 6 genes encoding cytokines in multiple myeloma patients (n = 81) and assessed their influence on pain severity and response to palliative radiotherapy. Methods: Pain intensity was assessed by Visual Analogue Scale. The total dose of opioids was convert...
Detecting microRNA activity from gene expression data

LENUS (Irish Health Repository)

Madden, Stephen F

2010-05-18

Abstract Background MicroRNAs (miRNAs) are non-coding RNAs that regulate gene expression by binding to the messenger RNA (mRNA) of protein coding genes. They control gene expression by either inhibiting translation or inducing mRNA degradation. A number of computational techniques have been developed to identify the targets of miRNAs. In this study we used predicted miRNA-gene interactions to analyse mRNA gene expression microarray data to predict miRNAs associated with particular diseases or conditions. Results Here we combine correspondence analysis, between group analysis and co-inertia analysis (CIA) to determine which miRNAs are associated with differences in gene expression levels in microarray data sets. Using a database of miRNA target predictions from TargetScan, TargetScanS, PicTar4way, PicTar5way, and miRanda and combining these data with gene expression levels from sets of microarrays, this method produces a ranked list of miRNAs associated with a specified split in samples. We applied this to three different microarray datasets: a papillary thyroid carcinoma dataset, an in-house dataset of lipopolysaccharide treated mouse macrophages, and a multi-tissue dataset. In each case we were able to identify miRNAs of biological importance. Conclusions We describe a technique to integrate gene expression data and miRNA target predictions from multiple sources.
Detecting microRNA activity from gene expression data.

LENUS (Irish Health Repository)

Madden, Stephen F

2010-01-01

BACKGROUND: MicroRNAs (miRNAs) are non-coding RNAs that regulate gene expression by binding to the messenger RNA (mRNA) of protein-coding genes. They control gene expression by either inhibiting translation or inducing mRNA degradation. A number of computational techniques have been developed to identify the targets of miRNAs. In this study we used predicted miRNA-gene interactions to analyse mRNA gene expression microarray data to predict miRNAs associated with particular diseases or conditions. RESULTS: Here we combine correspondence analysis, between-group analysis and co-inertia analysis (CIA) to determine which miRNAs are associated with differences in gene expression levels in microarray data sets. Using a database of miRNA target predictions from TargetScan, TargetScanS, PicTar4way, PicTar5way, and miRanda and combining these data with gene expression levels from sets of microarrays, this method produces a ranked list of miRNAs associated with a specified split in samples. We applied this to three different microarray datasets: a papillary thyroid carcinoma dataset, an in-house dataset of lipopolysaccharide-treated mouse macrophages, and a multi-tissue dataset. In each case we were able to identify miRNAs of biological importance. CONCLUSIONS: We describe a technique to integrate gene expression data and miRNA target predictions from multiple sources.
Gene panel testing for inherited cancer risk.

Science.gov (United States)

Hall, Michael J; Forman, Andrea D; Pilarski, Robert; Wiesner, Georgia; Giri, Veda N

2014-09-01

Next-generation sequencing technologies have ushered in the capability to assess multiple genes in parallel for genetic alterations that may contribute to inherited risk for cancers in families. Thus, gene panel testing is now an option in the setting of genetic counseling and testing for cancer risk. This article describes the many gene panel testing options clinically available to assess inherited cancer susceptibility, the potential advantages and challenges associated with various types of panels, clinical scenarios in which gene panels may be particularly useful in cancer risk assessment, and testing and counseling considerations. Given the potential issues for patients and their families, gene panel testing for inherited cancer risk is recommended to be offered in conjunction or consultation with an experienced cancer genetic specialist, such as a certified genetic counselor or geneticist, as an integral part of the testing process. Copyright © 2014 by the National Comprehensive Cancer Network.
Butyrate induces profound changes in gene expression related to multiple signal pathways in bovine kidney epithelial cells

Directory of Open Access Journals (Sweden)

Li CongJun

2006-09-01

Background: Global gene expression profiles of bovine kidney epithelial cells regulated by sodium butyrate were investigated with high-density oligonucleotide microarrays. The bovine microarray with 86,191 distinct 60mer oligonucleotides, each with 4 replicates, was designed and produced with Maskless Array Synthesizer technology. These oligonucleotides represent approximately 45,383 unique cattle sequences. Results: 450 genes significantly regulated by butyrate with a median False Discovery Rate (FDR) = 0% were identified. The majority of these genes were repressed by butyrate and associated with cell cycle control. The expression levels of 30 selected genes identified by the microarray were confirmed using real-time PCR. The results from real-time PCR positively correlated (R = 0.867) with the results from the microarray. Conclusion: This study presented the genes related to multiple signal pathways such as cell cycle control and apoptosis. The profound changes in gene expression elucidate the molecular basis for the pleiotropic effects of butyrate on biological processes. These findings enable better recognition of the full range of beneficial roles butyrate may play during cattle energy metabolism, cell growth and proliferation, and possibly in fighting gastrointestinal pathogens.
Association of interleukin-1 gene variations with moderate to severe chronic periodontitis in multiple ethnicities

Science.gov (United States)

Wu, X; Offenbacher, S; López, N J; Chen, D; Wang, H-Y; Rogus, J; Zhou, J; Beck, J; Jiang, S; Bao, X; Wilkins, L; Doucette-Stamm, L; Kornman, K

2015-01-01

Background and Objective Genetic markers associated with disease are often non-functional and generally tag one or more functional “causative” variants in linkage disequilibrium. Markers may not show tight linkage to the causative variants across multiple ethnicities due to evolutionary divergence, and therefore may not be informative across different population groups. Validated markers of disease suggest causative variants exist in the gene and, if the causative variants can be identified, it is reasonable to hypothesize that such variants will be informative across diverse populations. The aim of this study was to test that hypothesis using functional Interleukin-1 (IL-1) gene variations across multiple ethnic populations to replace the non-functional markers originally associated with chronic adult periodontitis in Caucasians. Material and Methods Adult chronic periodontitis cases and controls from four ethnic groups (Caucasians, African Americans, Hispanics and Asians) were recruited in the USA, Chile and China. Genotypes of IL1B gene single nucleotide polymorphisms (SNPs), including three functional SNPs (rs16944, rs1143623, rs4848306) in the promoter and one intronic SNP (rs1143633), were determined using a single base extension method or TaqMan 5′ nuclease assay. Logistic regression and other statistical analyses were used to examine the association between moderate to severe periodontitis and IL1B gene variations, including SNPs, haplotypes and composite genotypes. Genotype patterns associated with disease in the discovery study were then evaluated in independent validation studies. Results Significant associations were identified in the discovery study, consisting of Caucasians and African Americans, between moderate to severe adult chronic periodontitis and functional variations in the IL1B gene, including a pattern of four IL1B SNPs (OR = 1.87, p < 0.0001). The association between the disease and this IL1B composite genotype pattern was validated
Could age modify the effect of genetic variants in IL6 and TNF-α genes in multiple myeloma?

Science.gov (United States)

Martino, Alessandro; Buda, Gabriele; Maggini, Valentina; Lapi, Francesco; Lupia, Antonella; Di Bello, Domenica; Orciuolo, Enrico; Galimberti, Sara; Barale, Roberto; Petrini, Mario; Rossi, Anna Maria

2012-05-01

Cytokines play a central role in multiple myeloma (MM) pathogenesis; thus, genetic variations within cytokine-coding genes could influence MM susceptibility and therapy outcome. We investigated the impact of 8 SNPs in these genes in 202 MM cases and 235 controls, also evaluating their impact on therapy outcome in a subset of 91 patients. Despite the overall negative findings, we found a significant age-modified effect of IL6 and TNF-α SNPs on MM risk and therapy outcome, respectively. Therefore, this observation suggests that genetic variation in inflammation-related genes could be an important mediator of the complex interplay between ageing and cancer. Copyright © 2012 Elsevier Ltd. All rights reserved.
Using RNA-Seq Data to Evaluate Reference Genes Suitable for Gene Expression Studies in Soybean.

Directory of Open Access Journals (Sweden)

Aldrin Kay-Yuen Yim

Differential gene expression profiles often provide important clues for gene functions. While reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is an important tool, the validity of the results depends heavily on the choice of proper reference genes. In this study, we employed new and published RNA-sequencing (RNA-Seq) datasets (26 sequencing libraries in total) to evaluate reference genes reported in previous soybean studies. In silico PCR showed that 13 out of 37 previously reported primer sets have multiple targets, and 4 of them have amplicons with different sizes. Using a probabilistic approach, we identified new and improved candidate reference genes. We further performed 2 validation tests (with 26 RNA samples) on 8 commonly used reference genes and 7 newly identified candidates, using RT-qPCR. In general, the new candidate reference genes exhibited more stable expression levels under the tested experimental conditions. The three newly identified candidate reference genes Bic-C2, F-box protein2, and VPS-like gave the best overall performance, together with the commonly used ELF1b. It is expected that the proposed probabilistic model could serve as an important tool to identify stable reference genes when more soybean RNA-Seq data from different growth stages and treatments are used.
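The in silico PCR check mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which tool or matching rules the authors used, so everything here is an assumption for illustration: the function names (`reverse_complement`, `find_all`, `in_silico_pcr`), the exact-match rule, the forward-strand-only search, and the toy sequences are all hypothetical. A primer pair with more than one hit, or with hits of different lengths, would be flagged as having multiple targets.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_all(text, pattern):
    """Return every start index at which pattern occurs in text (overlaps allowed)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        start = text.find(pattern, start + 1)
    return positions


def in_silico_pcr(forward, reverse, transcripts, max_len=1000):
    """List (transcript_id, amplicon_length) for exact primer matches.

    Simplifying assumptions (not from the source): primers must match
    exactly, only the given strand of each transcript is searched, and
    amplicons longer than max_len are ignored.
    """
    fwd = forward.upper()
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement downstream of the forward primer.
    rev_site = reverse_complement(reverse)
    hits = []
    for name, seq in transcripts.items():
        seq = seq.upper()
        for f in find_all(seq, fwd):
            for r in find_all(seq, rev_site):
                length = r + len(rev_site) - f
                if r >= f + len(fwd) and length <= max_len:
                    hits.append((name, length))
    return hits


if __name__ == "__main__":
    # Hypothetical toy transcripts, not real soybean sequences.
    toy = {"t1": "GGATGCAAAACCGTTT", "t2": "ATGCTTCCGT", "t3": "AAAAAAAA"}
    for name, length in in_silico_pcr("ATGC", "ACGG", toy):
        print(name, length)
```

Dedicated tools typically also tolerate mismatches and search both strands; this sketch omits both to stay short.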
Combining qualitative and quantitative operational research methods to inform quality improvement in pathways that span multiple settings

Science.gov (United States)

Crowe, Sonya; Brown, Katherine; Tregay, Jenifer; Wray, Jo; Knowles, Rachel; Ridout, Deborah A; Bull, Catherine; Utley, Martin

2017-01-01

Background Improving integration and continuity of care across sectors within resource constraints is a priority in many health systems. Qualitative operational research methods of problem structuring have been used to address quality improvement in services involving multiple sectors but not in combination with quantitative operational research methods that enable targeting of interventions according to patient risk. We aimed to combine these methods to augment and inform an improvement initiative concerning infants with congenital heart disease (CHD) whose complex care pathway spans multiple sectors. Methods Soft systems methodology was used to consider systematically changes to services from the perspectives of community, primary, secondary and tertiary care professionals and a patient group, incorporating relevant evidence. Classification and regression tree (CART) analysis of national audit datasets was conducted along with data visualisation designed to inform service improvement within the context of limited resources. Results A ‘Rich Picture’ was developed capturing the main features of services for infants with CHD pertinent to service improvement. This was used, along with a graphical summary of the CART analysis, to guide discussions about targeting interventions at specific patient risk groups. Agreement was reached across representatives of relevant health professions and patients on a coherent set of targeted recommendations for quality improvement. These fed into national decisions about service provision and commissioning. Conclusions When tackling complex problems in service provision across multiple settings, it is important to acknowledge and work with multiple perspectives systematically and to consider targeting service improvements in response to confined resources. Our research demonstrates that applying a combination of qualitative and quantitative operational research methods is one approach to doing so that warrants further

Two new loci and gene sets related to sex determination and cancer progression are associated with susceptibility to testicular germ cell tumor.

Science.gov (United States)

Kristiansen, Wenche; Karlsson, Robert; Rounge, Trine B; Whitington, Thomas; Andreassen, Bettina K; Magnusson, Patrik K; Fosså, Sophie D; Adami, Hans-Olov; Turnbull, Clare; Haugen, Trine B; Grotmol, Tom; Wiklund, Fredrik

2015-07-15

Genome-wide association (GWA) studies have reported 19 distinct susceptibility loci for testicular germ cell tumor (TGCT). A GWA study for TGCT was performed by genotyping 610 240 single-nucleotide polymorphisms (SNPs) in 1326 cases and 6687 controls from Sweden and Norway. No novel genome-wide significant associations were observed in this discovery stage. We put forward 27 SNPs from 15 novel regions and 12 SNPs previously reported, for replication in 710 case-parent triads and 289 cases and 290 controls. Predefined biological pathways and processes, in addition to a custom-built sex-determination gene set, were subject to enrichment analyses using Meta-Analysis Gene Set Enrichment of Variant Associations (M) and Improved Gene Set Enrichment Analysis for Genome-wide Association Study (I). In the combined meta-analysis, we observed genome-wide significant association for rs7501939 on chromosome 17q12 (OR = 0.78, 95% CI = 0.72-0.84, P = 1.1 × 10^-9) and rs2195987 on chromosome 19p12 (OR = 0.76, 95% CI: 0.69-0.84, P = 3.2 × 10^-8). The marker rs7501939 on chromosome 17q12 is located in an intron of the HNF1B gene, encoding a member of the homeodomain-containing superfamily of transcription factors. The sex-determination gene set (false discovery rate, FDRM cancer and apoptosis, was associated with TGCT (FDR utero are implicated in the development of TGCT. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Polymorphisms in genes encoding leptin, ghrelin and their receptors in German multiple sclerosis patients.

Science.gov (United States)

Rey, Linda K; Wieczorek, Stefan; Akkad, Denis A; Linker, Ralf A; Chan, Andrew; Hoffjan, Sabine

2011-01-01

Multiple sclerosis (MS) is a neuro-inflammatory, autoimmune disease influenced by environmental and polygenic components. There is growing evidence that the peptide hormone leptin, known to regulate energy homeostasis, as well as its antagonist ghrelin play an important role in inflammatory processes in autoimmune diseases, including MS. Recently, single nucleotide polymorphisms (SNPs) in the genes encoding leptin, ghrelin and their receptors were evaluated, amongst others, in Wegener's granulomatosis and Churg-Strauss syndrome. The Lys656Asn SNP in the LEPR gene showed a significant but contrasting association with these vasculitides. We therefore aimed at investigating these polymorphisms in a German MS case-control cohort. Twelve SNPs in the LEP, LEPR, GHRL and GHSR genes were genotyped in 776 MS patients and 878 control subjects. We found an association of a haplotype in the GHSR gene with MS that could not be replicated in a second cohort. Otherwise, no significant differences in allele or genotype frequencies were observed between patients and controls in this particular cohort. Thus, the present results do not support the hypothesis that genetic variation in the leptin/ghrelin system contributes substantially to the pathogenesis of MS. However, a modest effect of GHSR variation cannot be ruled out and needs to be further evaluated in future studies. Copyright © 2011 Elsevier Ltd. All rights reserved.
Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

Directory of Open Access Journals (Sweden)

Sakellariou Argiris

2012-10-01

Background: A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. Results: We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and the affinity propagation (AP) clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. Conclusions: mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes, which classify unknown samples from both small and large patient cohorts with high accuracy.
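The first stage described in this abstract, filtering genes by multiple hypothesis testing before clustering, can be sketched as follows. The abstract does not state which correction procedure mAP-KL uses, so the Benjamini-Hochberg step-up procedure below is a stand-in chosen for illustration. The function names and the toy p-values are likewise hypothetical, and the affinity propagation and Krzanowski & Lai steps are not shown.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(gene_p_values, alpha=0.05):
    """Keep genes whose adjusted p-value is at most alpha, ranked by raw p-value."""
    genes = list(gene_p_values)
    adjusted = benjamini_hochberg([gene_p_values[g] for g in genes])
    kept = [g for g, q in zip(genes, adjusted) if q <= alpha]
    return sorted(kept, key=lambda g: gene_p_values[g])


if __name__ == "__main__":
    # Hypothetical per-gene p-values from some differential expression test.
    toy = {"geneA": 0.001, "geneB": 0.2, "geneC": 0.01, "geneD": 0.8}
    print(select_genes(toy))
```

In a pipeline of the kind the abstract describes, the genes that survive this filter would then be clustered, and one exemplar per cluster kept as the final signature.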
Microarray analysis identifies a common set of cellular genes modulated by different HCV replicon clones

Directory of Open Access Journals (Sweden)

Gerosolimo Germano

2008-06-01

Background: Hepatitis C virus (HCV) RNA synthesis and protein expression affect cell homeostasis by modulation of gene expression. The impact of HCV replication on global cell transcription has not been fully evaluated. Thus, we analysed the expression profiles of different clones of human hepatoma-derived Huh-7 cells carrying a self-replicating HCV RNA which express all viral proteins (HCV replicon system). Results: First, we compared the expression profile of HCV replicon clone 21-5 with both the Huh-7 parental cells and the 21-5 cured (21-5c) cells. In the latter, the HCV RNA has been eliminated by IFN-α treatment. To confirm data, we also analyzed microarray results from both the 21-5 and two other HCV replicon clones, 22-6 and 21-7, compared to the Huh-7 cells. The study was carried out by using the Applied Biosystems (AB) Human Genome Survey Microarray v1.0 which provides 31,700 probes that correspond to 27,868 human genes. Microarray analysis revealed a specific transcriptional program induced by HCV in replicon cells with respect to both IFN-α-cured and Huh-7 cells. From the original datasets of differentially expressed genes, we selected by Venn diagrams a final list of 38 genes modulated by HCV in all clones. Most of the 38 genes have never been described before and showed high fold-change associated with significant p-value, strongly supporting data reliability. Classification of the 38 genes by Panther System identified functional categories that were significantly enriched in this gene set, such as histones and ribosomal proteins as well as extracellular matrix and intracellular protein traffic. The dataset also included new genes involved in lipid metabolism, extracellular matrix and cytoskeletal network, which may be critical for HCV replication and pathogenesis. Conclusion: Our data provide a comprehensive analysis of alterations in gene expression induced by HCV replication and reveal modulation of new genes potentially useful
Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting.

Science.gov (United States)

Zhao, Wei; Ware, Erin B; He, Zihuai; Kardia, Sharon L R; Faul, Jessica D; Smith, Jennifer A

2017-09-29

Obesity, which develops over time, is one of the leading causes of chronic diseases such as cardiovascular disease. However, hundreds of BMI (body mass index)-associated genetic loci identified through large-scale genome-wide association studies (GWAS) only explain about 2.7% of BMI variation. Most common human traits are believed to be influenced by both genetic and environmental factors. Past studies suggest a variety of environmental features that are associated with obesity, including socioeconomic status and psychosocial factors. This study combines both gene/regions and environmental factors to explore whether social/psychosocial factors (childhood and adult socioeconomic status, social support, anger, chronic burden, stressful life events, and depressive symptoms) modify the effect of sets of genetic variants on BMI in European American and African American participants in the Health and Retirement Study (HRS). In order to incorporate longitudinal phenotype data collected in the HRS and investigate entire sets of single nucleotide polymorphisms (SNPs) within gene/region simultaneously, we applied a novel set-based test for gene-environment interaction in longitudinal studies (LGEWIS). Childhood socioeconomic status (parental education) was found to modify the genetic effect in the gene/region around SNP rs9540493 on BMI in European Americans in the HRS. The most significant SNP (rs9540488) by childhood socioeconomic status interaction within the rs9540493 gene/region was suggestively replicated in the Multi-Ethnic Study of Atherosclerosis (MESA) (p = 0.07).
Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting

Directory of Open Access Journals (Sweden)

Wei Zhao

2017-09-01

Obesity, which develops over time, is one of the leading causes of chronic diseases such as cardiovascular disease. However, hundreds of BMI (body mass index)-associated genetic loci identified through large-scale genome-wide association studies (GWAS) only explain about 2.7% of BMI variation. Most common human traits are believed to be influenced by both genetic and environmental factors. Past studies suggest a variety of environmental features that are associated with obesity, including socioeconomic status and psychosocial factors. This study combines both gene/regions and environmental factors to explore whether social/psychosocial factors (childhood and adult socioeconomic status, social support, anger, chronic burden, stressful life events, and depressive symptoms) modify the effect of sets of genetic variants on BMI in European American and African American participants in the Health and Retirement Study (HRS). In order to incorporate longitudinal phenotype data collected in the HRS and investigate entire sets of single nucleotide polymorphisms (SNPs) within gene/region simultaneously, we applied a novel set-based test for gene-environment interaction in longitudinal studies (LGEWIS). Childhood socioeconomic status (parental education) was found to modify the genetic effect in the gene/region around SNP rs9540493 on BMI in European Americans in the HRS. The most significant SNP (rs9540488) by childhood socioeconomic status interaction within the rs9540493 gene/region was suggestively replicated in the Multi-Ethnic Study of Atherosclerosis (MESA) (p = 0.07).
Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

Directory of Open Access Journals (Sweden)

Qiusheng Kong

Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.
Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

Science.gov (United States)

Kong, Qiusheng; Yuan, Jingxian; Gao, Lingyun; Zhao, Liqiang; Cheng, Fei; Huang, Yuan; Bie, Zhilong

2015-01-01

Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.
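The kind of stability ranking that geNorm performs can be illustrated with a short sketch of its M value: for each candidate gene, the average, over all other candidates, of the standard deviation of the log2 expression ratios across samples. Lower M means more stable expression. This is a simplified reimplementation for illustration and not the geNorm software itself. The function name `genorm_m`, the input layout, and the toy numbers are assumptions, and the stepwise exclusion of the least stable gene is omitted.

```python
import math
import statistics


def genorm_m(expression):
    """Compute a geNorm-style stability value M for each candidate gene.

    expression maps gene name -> list of relative expression quantities
    (all > 0), with samples in the same order for every gene.
    """
    genes = list(expression)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in each sample.
            log_ratios = [
                math.log2(a / b) for a, b in zip(expression[j], expression[k])
            ]
            # Pairwise variation: sample standard deviation of those ratios.
            pairwise_variation.append(statistics.stdev(log_ratios))
        # M is the mean pairwise variation against all other candidates.
        m_values[j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


if __name__ == "__main__":
    # Hypothetical relative quantities for three candidates in three samples.
    toy = {"ref1": [1, 2, 4], "ref2": [2, 4, 8], "ref3": [1, 4, 1]}
    for gene, m in sorted(genorm_m(toy).items(), key=lambda item: item[1]):
        print(gene, round(m, 3))
```

In the toy data, ref1 and ref2 keep a constant ratio across samples and so share the lowest M, while ref3 varies independently and ranks last.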
The pituitary tumor transforming gene 1 (PTTG-1): An immunological target for multiple myeloma

Directory of Open Access Journals (Sweden)

Gagliano Nicoletta

2008-04-01

Background: Multiple Myeloma is a cancer of B plasma cells, which produce non-specific antibodies and proliferate uncontrolled. Due to the potential relapse and non-specificity of current treatments, immunotherapy promises to be more specific and may induce long-term immunity in patients. The pituitary tumor transforming gene 1 (PTTG-1) has been shown to be a novel oncogene, expressed in the testis, thymus, colon, lung and placenta (undetectable in most other tissues). Furthermore, it is overexpressed in many tumors such as the pituitary adenoma, breast, gastrointestinal cancers, leukemia, lymphoma, and lung cancer and it seems to be associated with tumorigenesis, angiogenesis and cancer progression. The purpose was to investigate the presence/rate of expression of PTTG-1 in multiple myeloma patients. Methods: We analyzed the PTTG-1 expression at the transcriptional and the protein level, by PCR, immunocytochemical methods, Dot-blot and ELISA performed on patients' sera in 19 multiple myeloma patients, 6 different multiple myeloma cell lines and in normal human tissue. Results: We did not find PTTG-1 presence in the normal human tissue panel, but PTTG-1 mRNA was detectable in 12 of the 19 patients, giving evidence of a 63% rate of expression (data confirmed by ELISA). Four of the 6 investigated cell lines (66.6%) were positive for PTTG-1. Investigations of protein expression gave evidence of 26.3% cytoplasmic expression and 16% surface expression in the plasma cells of multiple myeloma patients. Protein presence was also confirmed by Dot-blot in both cell lines and patients. Conclusion: We established PTTG-1's presence at both the transcriptional and protein levels. These data suggest that PTTG-1 is aberrantly expressed in multiple myeloma plasma cells, is highly immunogenic and is a suitable target for immunotherapy of multiple myeloma.
Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients.

Science.gov (United States)

Chang, Lixian; Yuan, Weiping; Zeng, Huimin; Zhou, Quanquan; Wei, Wei; Zhou, Jianfeng; Li, Miaomiao; Wang, Xiaomin; Xu, Mingjiang; Yang, Fengchun; Yang, Yungui; Cheng, Tao; Zhu, Xiaofan

2014-05-15

Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients' clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology.
Multiple plasmid-borne virulence genes of Clavibacter michiganensis ssp. capsici critical for disease development in pepper.

Science.gov (United States)

Hwang, In Sun; Oh, Eom-Ji; Kim, Donghyuk; Oh, Chang-Sik

2018-02-01

Clavibacter michiganensis ssp. capsici is a Gram-positive plant-pathogenic bacterium causing bacterial canker disease in pepper. Virulence genes and mechanisms of C. michiganensis ssp. capsici in pepper have not yet been studied. To identify virulence genes of C. michiganensis ssp. capsici, comparative genome analyses with C. michiganensis ssp. capsici and its related C. michiganensis subspecies, and functional analysis of its putative virulence genes during infection were performed. The C. michiganensis ssp. capsici type strain PF008 carries one chromosome (3.056 Mb) and two plasmids (39 kb pCM1 Cmc and 145 kb pCM2 Cmc ). The genome analyses showed that this bacterium lacks a chromosomal pathogenicity island and celA gene that are important for disease development by C. michiganensis ssp. michiganensis in tomato, but carries most putative virulence genes in both plasmids. Virulence of pCM1 Cmc -cured C. michiganensis ssp. capsici was greatly reduced compared with the wild-type strain in pepper. The complementation analysis with pCM1 Cmc -located putative virulence genes showed that at least five genes, chpE, chpG, ppaA1, ppaB1 and pelA1, encoding serine proteases or pectate lyase contribute to disease development in pepper. In conclusion, C. michiganensis ssp. capsici has a unique genome structure, and its multiple plasmid-borne genes play critical roles in virulence in pepper, either separately or together. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
A case study for effects of operational taxonomic units from intracellular endoparasites and ciliates on the eukaryotic phylogeny: phylogenetic position of the haptophyta in analyses of multiple slowly evolving genes.

Directory of Open Access Journals (Sweden)

Hisayoshi Nozaki

Recent multigene phylogenetic analyses have contributed much to our understanding of eukaryotic phylogeny. However, the phylogenetic positions of various lineages within the eukaryotes have remained unresolved or in conflict between different phylogenetic studies. These phylogenetic ambiguities might have resulted from mixtures or integration from various factors including limited taxon sampling, missing data in the alignment, saturations of rapidly evolving genes, mixed analyses of short- and long-branched operational taxonomic units (OTUs), intracellular endoparasite and ciliate OTUs with unusual substitution etc. In order to evaluate the effects from intracellular endoparasite and ciliate OTUs co-analyzed on the eukaryotic phylogeny and simplify the results, we here used two different sets of data matrices of multiple slowly evolving genes with small amounts of missing data and examined the phylogenetic position of the secondary photosynthetic chromalveolates Haptophyta, one of the most abundant groups of oceanic phytoplankton and significant primary producers. In both sets, a robust sister relationship between Haptophyta and SAR (stramenopiles, alveolates, rhizarians, or SA [stramenopiles and alveolates]) was resolved when intracellular endoparasite/ciliate OTUs were excluded, but not in their presence. Based on comparisons of character optimizations on a fixed tree (with a clade composed of haptophytes and SAR or SA), disruption of the monophyly between haptophytes and SAR (or SA) in the presence of intracellular endoparasite/ciliate OTUs can be considered to be a result of multiple evolutionary reversals of character positions that supported the synapomorphy of the haptophyte and SAR (or SA) clade in the absence of intracellular endoparasite/ciliate OTUs.
Single-cell multiple gene expression analysis based on single-molecule-detection microarray assay for multi-DNA determination

Energy Technology Data Exchange (ETDEWEB)

Li, Lu [School of Chemistry and Chemical Engineering, Shandong University, Jinan 250100 (China); Wang, Xianwei [School of Life Sciences, Shandong University, Jinan 250100 (China); Zhang, Xiaoli [School of Chemistry and Chemical Engineering, Shandong University, Jinan 250100 (China); Wang, Jinxing [School of Life Sciences, Shandong University, Jinan 250100 (China); Jin, Wenrui, E-mail: jwr@sdu.edu.cn [School of Chemistry and Chemical Engineering, Shandong University, Jinan 250100 (China)

2015-01-07

Highlights: • A single-molecule-detection (SMD) microarray for 10 samples is fabricated. • The SMD-based microarray assay (SMA) can determine 8 DNAs for each sample. • The limit of detection of SMA is as low as 1.3 × 10⁻¹⁶ mol L⁻¹. • The SMA can be applied in single-cell multiple gene expression analysis. - Abstract: We report a novel ultra-sensitive and highly selective single-molecule-detection microarray assay (SMA) for multiple DNA determination. In the SMA, a capture DNA (DNAc) microarray consisting of 10 subarrays with 9 spots for each subarray is fabricated on a silanized glass coverslip as the substrate. On the subarrays, the spot-to-spot spacing is 500 μm and each spot has a diameter of ∼300 μm. The sequences of the DNAcs on the 9 spots of a subarray differ, so that 8 types of target DNAs (DNAts) can be determined. Thus, 8 types of DNAts are captured to their complementary DNAcs at 8 spots of a subarray, respectively, and then labeled with quantum dots (QDs) attached to 8 types of detection DNAs (DNAds) with different sequences. The ninth spot is used to detect the blank value. In order to determine the same 8 types of DNAts in 10 samples, the 10 DNAc-modified subarrays on the microarray are identical. Fluorescence single-molecule images of the QD-labeled DNAts on each spot of the subarray are acquired using a home-made single-molecule microarray reader. The amounts of the DNAts are quantified by counting the bright dots from the QDs. For a microarray, 8 types of DNAts in 10 samples can be quantified in parallel. The limit of detection of the SMA for DNA determination is as low as 1.3 × 10⁻¹⁶ mol L⁻¹. The SMA for multi-DNA determination can also be applied in single-cell multiple gene expression analysis through quantification of complementary DNAs (cDNAs) corresponding to multiple messenger RNAs (mRNAs) in single cells. To do so, total RNA in single cells is extracted and reverse-transcribed into cDNAs. Three
A genome-wide study of DNA methylation patterns and gene expression levels in multiple human and chimpanzee tissues.

Directory of Open Access Journals (Sweden)

Athma A Pai

2011-02-01

The modification of DNA by methylation is an important epigenetic mechanism that affects the spatial and temporal regulation of gene expression. Methylation patterns have been described in many contexts within and across a range of species. However, the extent to which changes in methylation might underlie inter-species differences in gene regulation, in particular between humans and other primates, has not yet been studied. To this end, we studied DNA methylation patterns in livers, hearts, and kidneys from multiple humans and chimpanzees, using tissue samples for which genome-wide gene expression data were also available. Using the multi-species gene expression and methylation data for 7,723 genes, we were able to study the role of promoter DNA methylation in the evolution of gene regulation across tissues and species. We found that inter-tissue methylation patterns are often conserved between humans and chimpanzees. However, we also found a large number of gene expression differences between species that might be explained, at least in part, by corresponding differences in methylation levels. In particular, we estimate that, in the tissues we studied, inter-species differences in promoter methylation might underlie as much as 12%-18% of differences in gene expression levels between humans and chimpanzees.
Human mast cell tryptase: Multiple cDNAs and genes reveal a multigene serine protease family

International Nuclear Information System (INIS)

Vanderslice, P.; Ballinger, S.M.; Tam, E.K.; Goldstein, S.M.; Craik, C.S.; Caughey, G.H.

1990-01-01

Three different cDNAs and a gene encoding human skin mast cell tryptase have been cloned and sequenced in their entirety. The deduced amino acid sequences reveal a 30-amino acid prepropeptide followed by a 245-amino acid catalytic domain. The C-terminal undecapeptide of the human preprosequence is identical in dog tryptase and appears to be part of a prosequence unique among serine proteases. The differences among the three human tryptase catalytic domains include the loss of a consensus N-glycosylation site in one cDNA, which may explain some of the heterogeneity in size and susceptibility to deglycosylation seen in tryptase preparations. All three tryptase cDNAs are distinct from a recently reported cDNA obtained from a human lung mast cell library. A skin tryptase cDNA was used to isolate a human tryptase gene, the exons of which match one of the skin-derived cDNAs. The organization of the ∼1.8-kilobase-pair tryptase gene is unique and is not closely related to that of any other mast cell or leukocyte serine protease. The 5' regulatory regions of the gene share features with those of other serine proteases, including mast cell chymase, but are unusual in being separated from the protein-coding sequence by an intron. High-stringency hybridization of a human genomic DNA blot with a fragment of the tryptase gene confirms the presence of multiple tryptase genes. These findings provide genetic evidence that human mast cell tryptases are the products of a multigene family
Scuba: scalable kernel-based gene prioritization.

Science.gov (United States)

Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

2018-01-25

The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge; however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large-scale predictions are required. Importantly, it is able to deal efficiently both with a large number of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba also integrates a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .
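The idea of integrating several data sources through kernels can be illustrated with a minimal sketch. This is not Scuba's algorithm: the fixed weights, the mean-similarity score, and every name below are assumptions made for illustration, whereas Scuba learns the kernel combination by optimizing the margin distribution.

```python
# Illustrative only: combine per-source kernel matrices with fixed weights
# (Scuba learns the combination instead) and rank candidate genes by their
# mean similarity to the known disease genes.

def combine_kernels(kernels, weights):
    """Weighted sum of equally sized square kernel matrices (lists of lists)."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for k, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]

def rank_candidates(kernel, known, candidates):
    """Sort candidate gene indices by mean kernel similarity to known genes."""
    scores = {c: sum(kernel[c][g] for g in known) / len(known) for c in candidates}
    return sorted(candidates, key=lambda c: scores[c], reverse=True)

# Two hypothetical data sources describing four genes (indices 0..3).
k_expression = [[1.0, 0.9, 0.1, 0.2],
                [0.9, 1.0, 0.1, 0.1],
                [0.1, 0.1, 1.0, 0.3],
                [0.2, 0.1, 0.3, 1.0]]
k_interaction = [[1.0, 0.5, 0.3, 0.6],
                 [0.5, 1.0, 0.2, 0.4],
                 [0.3, 0.2, 1.0, 0.1],
                 [0.6, 0.4, 0.1, 1.0]]

combined = combine_kernels([k_expression, k_interaction], [0.5, 0.5])
ranking = rank_candidates(combined, known=[0], candidates=[1, 2, 3])
```

With gene 0 as the only known disease gene, the candidates are ordered by their combined similarity to it. A learned weighting would replace the hard-coded `[0.5, 0.5]`.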
Hindsight regulates photoreceptor axon targeting through transcriptional control of jitterbug/Filamin and multiple genes involved in axon guidance in Drosophila.

Science.gov (United States)

Oliva, Carlos; Molina-Fernandez, Claudia; Maureira, Miguel; Candia, Noemi; López, Estefanía; Hassan, Bassem; Aerts, Stein; Cánovas, José; Olguín, Patricio; Sierralta, Jimena

2015-09-01

During axon targeting, a stereotyped pattern of connectivity is achieved by the integration of intrinsic genetic programs and the response to extrinsic long and short-range directional cues. How this coordination occurs is the subject of intense study. Transcription factors play a central role due to their ability to regulate the expression of multiple genes required to sense and respond to these cues during development. Here we show that the transcription factor HNT regulates layer-specific photoreceptor axon targeting in Drosophila through transcriptional control of jbug/Filamin and multiple genes involved in axon guidance and cytoskeleton organization. Using a microarray analysis we identified 235 genes whose expression levels were changed by HNT overexpression in the eye primordia. We analyzed nine candidate genes involved in cytoskeleton regulation and axon guidance, six of which displayed significantly altered gene expression levels in hnt mutant retinas. Functional analysis confirmed the role of OTK/PTK7 in photoreceptor axon targeting and uncovered Tiggrin, an integrin ligand, and Jbug/Filamin, a conserved actin-binding protein, as new factors that participate in photoreceptor axon targeting. Moreover, we provided in silico and molecular evidence that supports jbug/Filamin as a direct transcriptional target of HNT and that HNT acts partially through Jbug/Filamin in vivo to regulate axon guidance. Our work broadens the understanding of how HNT regulates the coordinated expression of a group of genes to achieve the correct connectivity pattern in the Drosophila visual system. © 2015 Wiley Periodicals, Inc. Develop Neurobiol 75: 1018-1032, 2015.
Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

KAUST Repository

AlShahrani, Mona; Hoehndorf, Robert

2018-01-01

In recent years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
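Once diseases and genes are represented as vectors, prioritization reduces to ranking genes by a similarity score. The sketch below uses plain cosine similarity on made-up two-dimensional vectors; the embeddings, gene names, and the choice of cosine similarity are assumptions for illustration, not SmuDGE's learned representations or its trained similarity measure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_genes(disease_vec, gene_vecs):
    """Return gene names sorted from most to least similar to the disease vector."""
    return sorted(gene_vecs, key=lambda g: cosine(disease_vec, gene_vecs[g]), reverse=True)

# Hypothetical embeddings; real ones would come from feature learning.
disease = [1.0, 0.0]
genes = {
    "geneA": [0.9, 0.1],
    "geneB": [0.0, 1.0],
    "geneC": [0.5, 0.5],
}
ranked = rank_genes(disease, genes)
```

Here `geneA` points in nearly the same direction as the disease vector and is ranked first, while the orthogonal `geneB` is ranked last.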
Semantic Disease Gene Embeddings (SmuDGE): phenotype-based disease gene prioritization without phenotypes

KAUST Repository

Alshahrani, Mona

2018-04-30

In recent years, several methods have been developed to incorporate information about phenotypes into computational disease gene prioritization methods. These methods commonly compute the similarity between a disease's (or patient's) phenotypes and a database of gene-to-phenotype associations to find the phenotypically most similar match. A key limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is highly incomplete in humans as well as in many model organisms such as the mouse. Results: We developed SmuDGE, a method that uses feature learning to generate vector-based representations of phenotypes associated with an entity. SmuDGE can be used as a trainable semantic similarity measure to compare two sets of phenotypes (such as between a disease and gene, or a disease and patient). More importantly, SmuDGE can generate phenotype representations for entities that are only indirectly associated with phenotypes through an interaction network; for this purpose, SmuDGE exploits background knowledge in interaction networks comprising multiple types of interactions. We demonstrate that SmuDGE can match or outperform semantic similarity in phenotype-based disease gene prioritization, and furthermore significantly extends the coverage of phenotype-based methods to all genes in a connected interaction network.
The cfr and cfr-like multiple resistance genes

DEFF Research Database (Denmark)

Vester, Birte

2018-01-01

. The cfr gene is found in various bacteria in many geographical locations and placed on plasmids or associated with transposons. Cfr-related genes providing similar resistance have been identified in Bacillales, and now also in the pathogens Clostridium difficile and Enterococcus faecium. In addition......, the presence of the cfr gene has been detected in harbours and food markets....

Combining qualitative and quantitative operational research methods to inform quality improvement in pathways that span multiple settings.

Science.gov (United States)

Crowe, Sonya; Brown, Katherine; Tregay, Jenifer; Wray, Jo; Knowles, Rachel; Ridout, Deborah A; Bull, Catherine; Utley, Martin

2017-08-01

Improving integration and continuity of care across sectors within resource constraints is a priority in many health systems. Qualitative operational research methods of problem structuring have been used to address quality improvement in services involving multiple sectors but not in combination with quantitative operational research methods that enable targeting of interventions according to patient risk. We aimed to combine these methods to augment and inform an improvement initiative concerning infants with congenital heart disease (CHD) whose complex care pathway spans multiple sectors. Soft systems methodology was used to consider systematically changes to services from the perspectives of community, primary, secondary and tertiary care professionals and a patient group, incorporating relevant evidence. Classification and regression tree (CART) analysis of national audit datasets was conducted along with data visualisation designed to inform service improvement within the context of limited resources. A 'Rich Picture' was developed capturing the main features of services for infants with CHD pertinent to service improvement. This was used, along with a graphical summary of the CART analysis, to guide discussions about targeting interventions at specific patient risk groups. Agreement was reached across representatives of relevant health professions and patients on a coherent set of targeted recommendations for quality improvement. These fed into national decisions about service provision and commissioning. When tackling complex problems in service provision across multiple settings, it is important to acknowledge and work with multiple perspectives systematically and to consider targeting service improvements in response to confined resources. Our research demonstrates that applying a combination of qualitative and quantitative operational research methods is one approach to doing so that warrants further consideration. Published by the BMJ Publishing Group
International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

Science.gov (United States)

Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

2015-01-01

This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
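A variant copy ratio, as described above, is the fraction of 16S rRNA gene copies that carry a given nucleotide at a variable position. As a rough sketch of how a maximum likelihood estimate could be obtained from read counts, the code below scans every possible number of variant-carrying copies and keeps the one with the highest binomial log-likelihood. The error model, the `error_rate` default, and the example counts are assumptions for illustration and are not the statistical procedure developed in the study.

```python
import math

def log_likelihood(variant_reads, total_reads, variant_copies, total_copies, error_rate=0.01):
    """Binomial log-likelihood (up to a constant) of the observed variant reads,
    assuming a symmetric per-read error rate (an illustrative assumption)."""
    frac = variant_copies / total_copies
    p = frac * (1 - error_rate) + (1 - frac) * error_rate
    return variant_reads * math.log(p) + (total_reads - variant_reads) * math.log(1 - p)

def most_likely_variant_copies(variant_reads, total_reads, total_copies, error_rate=0.01):
    """Number of gene copies carrying the variant that maximizes the likelihood."""
    return max(
        range(total_copies + 1),
        key=lambda c: log_likelihood(variant_reads, total_reads, c, total_copies, error_rate),
    )

# Hypothetical counts: 290 of 1000 reads show the variant in a genome
# assumed, for this example, to carry 7 copies of the gene.
best = most_likely_variant_copies(variant_reads=290, total_reads=1000, total_copies=7)
```

Restricting the estimate to whole copy numbers reflects the fact that the ratio can only take values k/N for N gene copies; deeper sequencing narrows the likelihood around the true k.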
International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

Directory of Open Access Journals (Sweden)

Nathan D. Olson

2015-03-01

This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies.
A novel gene encoding a TIG multiple domain protein is a positional candidate for autosomal recessive polycystic kidney disease.

Science.gov (United States)

Xiong, Huaqi; Chen, Yongxiong; Yi, Yajun; Tsuchiya, Karen; Moeckel, Gilbert; Cheung, Joseph; Liang, Dan; Tham, Kyi; Xu, Xiaohu; Chen, Xing-Zhen; Pei, York; Zhao, Zhizhuang Jeo; Wu, Guanqing

2002-07-01

Autosomal recessive polycystic kidney disease (ARPKD) is a common hereditary renal cystic disease in infants and children. By genetic linkage analyses, the gene responsible for this disease, termed polycystic kidney and hepatic disease 1 (PKHD1), was mapped on human chromosome 6p21.1-p12, and has been further localized to a 1-cM genetic interval flanked by the D6S1714/D6S243 (telomeric) and D6S1024 (centromeric) markers. We recently identified a novel gene in this genetic interval from kidney cDNA, using cloning strategies. The gene PKHD1 (PKHD1-tentative) encodes a novel 3396-amino-acid protein with no apparent homology with any known proteins. We named its gene product "tigmin" because it contains multiple TIG domains, which usually are seen in proteins containing immunoglobulin-like folds. PKHD1 encodes an 11.6-kb transcript and is composed of 61 exons spanning an approximately 365-kb genomic region on chromosome 6p12-p11.2 adjacent to the marker D6S1714. Northern blot analyses demonstrated that the gene has discrete bands with one peak signal at approximately 11 kb, indicating that PKHD1 is likely to have multiple alternative transcripts. PKHD1 is highly expressed in adult and infant kidneys and weakly expressed in liver in northern blot analysis. This expression pattern parallels the tissue involvement observed in ARPKD. In situ hybridization analysis further revealed that the expression of PKHD1 in the kidney is mainly localized to the epithelial cells of the collecting duct, the specific tubular segment involved in cyst formation in ARPKD. These features of PKHD1 make it a strong positional candidate gene for ARPKD.
Development of a set of SNP markers present in expressed genes of the apple.

Science.gov (United States)

Chagné, David; Gasic, Ksenija; Crowhurst, Ross N; Han, Yuepeng; Bassett, Heather C; Bowatte, Deepa R; Lawrence, Timothy J; Rikkerink, Erik H A; Gardiner, Susan E; Korban, Schuyler S

2008-11-01

Molecular markers associated with gene coding regions are useful tools for bridging functional and structural genomics. Due to their high abundance in plant genomes, single nucleotide polymorphisms (SNPs) are present within virtually all genomic regions, including most coding sequences. The objective of this study was to develop a set of SNPs for the apple by taking advantage of the wealth of genomics resources available for the apple, including a large collection of expressed sequence tags (ESTs). Using bioinformatics tools, a search for SNPs within an EST database of approximately 350,000 sequences developed from a variety of apple accessions was conducted. This resulted in the identification of a total of 71,482 putative SNPs. As the apple genome is reported to be an ancient polyploid, attempts were made to verify whether those SNPs detected in silico were attributable either to allelic polymorphisms or to gene duplication or paralogous or homeologous sequence variations. To this end, a set of 464 PCR primer pairs was designed, PCR amplification was carried out on two subsets of plants, and the PCR products were sequenced. The SNPs retrieved from these sequences were then mapped onto apple genetic maps, including a newly constructed map of a Royal Gala x A689-24 cross and a Malling 9 x Robusta 5 map, using a bin mapping strategy. The SNP genotyping was performed using the high-resolution melting (HRM) technique. A total of 93 new markers containing 210 coding SNPs were successfully mapped. This new set of SNP markers for the apple offers new opportunities for understanding the genetic control of important horticultural traits using quantitative trait loci (QTL) or linkage disequilibrium analysis. These also serve as useful markers for aligning physical and genetic maps, and as potential transferable markers across the Rosaceae family.
A novel system for simultaneous or sequential integration of multiple gene-loading vectors into a defined site of a human artificial chromosome.

Science.gov (United States)

Suzuki, Teruhiko; Kazuki, Yasuhiro; Oshimura, Mitsuo; Hara, Takahiko

2014-01-01

Human artificial chromosomes (HACs) are gene-delivery vectors suitable for introducing large DNA fragments into mammalian cells. Although a HAC theoretically incorporates multiple gene expression cassettes of unlimited DNA size, its application has been limited because the conventional gene-loading system accepts only one gene-loading vector (GLV) into a HAC. We report a novel method for the simultaneous or sequential integration of multiple GLVs into a HAC vector (designated as the SIM system) via combined usage of Cre, FLP, Bxb1, and φC31 recombinase/integrase. As a proof of principle, we first attempted simultaneous integration of three GLVs encoding EGFP, Venus, and TdTomato into a gene-loading site of a HAC in CHO cells. These cells successfully expressed all three fluorescent proteins. Furthermore, microcell-mediated transfer of HACs enabled the expression of those fluorescent proteins in recipient cells. We next demonstrated that GLVs could be introduced into a HAC one-by-one via reciprocal usage of recombinase/integrase. Lastly, we introduced a fourth GLV into a HAC after simultaneous integration of three GLVs by FLP-mediated DNA recombination. The SIM system expands the applicability of HAC vectors and is useful for various biomedical studies, including cell reprogramming.
A novel system for simultaneous or sequential integration of multiple gene-loading vectors into a defined site of a human artificial chromosome.

Directory of Open Access Journals (Sweden)

Teruhiko Suzuki

Human artificial chromosomes (HACs) are gene-delivery vectors suitable for introducing large DNA fragments into mammalian cells. Although a HAC theoretically incorporates multiple gene expression cassettes of unlimited DNA size, its application has been limited because the conventional gene-loading system accepts only one gene-loading vector (GLV) into a HAC. We report a novel method for the simultaneous or sequential integration of multiple GLVs into a HAC vector (designated as the SIM system) via combined usage of Cre, FLP, Bxb1, and φC31 recombinase/integrase. As a proof of principle, we first attempted simultaneous integration of three GLVs encoding EGFP, Venus, and TdTomato into a gene-loading site of a HAC in CHO cells. These cells successfully expressed all three fluorescent proteins. Furthermore, microcell-mediated transfer of HACs enabled the expression of those fluorescent proteins in recipient cells. We next demonstrated that GLVs could be introduced into a HAC one-by-one via reciprocal usage of recombinase/integrase. Lastly, we introduced a fourth GLV into a HAC after simultaneous integration of three GLVs by FLP-mediated DNA recombination. The SIM system expands the applicability of HAC vectors and is useful for various biomedical studies, including cell reprogramming.
Two-locus linkage analysis in multiple sclerosis (MS)

Energy Technology Data Exchange (ETDEWEB)

Tienari, P.J. (National Public Health Institute, Helsinki (Finland) Univ. of Helsinki (Finland)); Terwilliger, J.D.; Ott, J. (Columbia Univ., New York (United States)); Palo, J. (Univ. of Helsinki (Finland)); Peltonen, L. (National Public Health Institute, Helsinki (Finland))

1994-01-15

One of the major challenges in genetic linkage analyses is the study of complex diseases. The authors demonstrate here the use of two-locus linkage analysis in multiple sclerosis (MS), a multifactorial disease with a complex mode of inheritance. In a set of Finnish multiplex families, they have previously found evidence for linkage between MS susceptibility and two independent loci, the myelin basic protein gene (MBP) on chromosome 18 and the HLA complex on chromosome 6. This set of families provides a unique opportunity to perform linkage analysis conditional on two loci contributing to the disease. In the two-trait-locus/two-marker-locus analysis, the presence of another disease locus is parametrized and the analysis more appropriately treats information from the unaffected family member than single-disease-locus analysis. As exemplified here in MS, the two-locus analysis can be a powerful method for investigating susceptibility loci in complex traits, best suited for analysis of specific candidate genes, or for situations in which preliminary evidence for linkage already exists or is suggested. 41 refs., 6 tabs.
Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge

Directory of Open Access Journals (Sweden)

Wang Shu-Qiang

2012-07-01

Background: A key challenge in the post-genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse-grained qualitative models, and cannot provide a quantitative view of regulatory systems. Results: A binding-affinity-based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of the transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that only employ microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. Conclusions: We test the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of the TF. Moreover, the kinetic parameters introduced in the proposed model are more biologically interpretable than those of previous models.
A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records.

Science.gov (United States)

Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter

2014-09-24

Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data
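As a rough illustration of random-set scoring in general (not necessarily the exact model implemented in the record above), the sketch below standardizes the mean score of a gene set against the value expected for a randomly drawn set of the same size. The function name `random_set_z`, the gene labels, and the toy scores are assumptions introduced here for illustration.

```python
from math import sqrt


def random_set_z(scores, members):
    """Z-score of a gene set's mean score under a random-set null.

    scores  : dict mapping every gene to a numeric relevance score
    members : iterable of genes forming the set of interest
    The null assumes the set is drawn without replacement from all genes.
    """
    members = list(members)
    n_all, n_set = len(scores), len(members)
    if not 0 < n_set < n_all:
        raise ValueError("set must be a non-empty proper subset of all genes")
    mu = sum(scores.values()) / n_all
    # Population variance of the gene-level scores.
    var = sum((s - mu) ** 2 for s in scores.values()) / n_all
    if var == 0:
        raise ValueError("all scores are identical; z-score undefined")
    set_mean = sum(scores[g] for g in members) / n_set
    # Variance of a sample mean with finite-population correction.
    se = sqrt(var / n_set * (n_all - n_set) / (n_all - 1))
    return (set_mean - mu) / se


toy_scores = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5}  # made-up values
z = random_set_z(toy_scores, ["d", "e"])
```

A large positive `z` means the set's members score higher than a random set of equal size would be expected to.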
SignalSpider: Probabilistic pattern discovery on multiple normalized ChIP-Seq signal profiles

KAUST Repository

Wong, Kachun

2014-09-05

Motivation: Chromatin immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-Seq) measures the genome-wide occupancy of transcription factors in vivo. Different combinations of DNA-binding protein occupancies may result in a gene being expressed in different tissues or at different developmental stages. To fully understand the functions of genes, it is essential to develop probabilistic models on multiple ChIP-Seq profiles to decipher the combinatorial regulatory mechanisms by multiple transcription factors. Results: In this work, we describe a probabilistic model (SignalSpider) to decipher the combinatorial binding events of multiple transcription factors. Compared with similar existing methods, we found SignalSpider performs better in clustering promoter and enhancer regions. Notably, SignalSpider can learn higher-order combinatorial patterns from multiple ChIP-Seq profiles. We have applied SignalSpider to the normalized ChIP-Seq profiles from the ENCODE consortium and learned model instances. We observed different higher-order enrichment and depletion patterns across sets of proteins. Those clustering patterns are supported by Gene Ontology (GO) enrichment, evolutionary conservation and chromatin interaction enrichment, offering biological insights for further focused studies. We also proposed a specific enrichment map visualization method to reveal the genome-wide transcription factor combinatorial patterns from the models built, which extend our existing fine-scale knowledge on gene regulation to a genome-wide level. Availability and implementation: The matrix-algebra-optimized executables and source codes are available at the authors' websites: http://www.cs.toronto.edu/~wkc/SignalSpider. Contact: Supplementary information: Supplementary data are available at Bioinformatics online.
Statistical approach for selection of biologically informative genes.

Science.gov (United States)

Das, Samarendra; Rai, Anil; Mishra, D C; Rai, Shesh N

2018-05-20

Selection of informative genes from high dimensional gene expression data has emerged as an important research area in genomics. Many gene selection techniques proposed so far are based on either a relevancy or a redundancy measure. Further, the performance of these techniques has been adjudged through post selection classification accuracy computed through a classifier using the selected genes. This performance metric may be statistically sound but may not be biologically relevant. A statistical approach, i.e. Boot-MRMR, was proposed based on a composite measure of maximum relevance and minimum redundancy, which is both statistically sound and biologically relevant for informative gene selection. For comparative evaluation of the proposed approach, we developed two biologically sufficient criteria, i.e. Gene Set Enrichment with QTL (GSEQ) and biological similarity score based on Gene Ontology (GO). Further, a systematic and rigorous evaluation of the proposed technique with 12 existing gene selection techniques was carried out using five gene expression datasets. This evaluation was based on a broad spectrum of statistically sound (e.g. subject classification) and biologically relevant (based on QTL and GO) criteria under a multiple criteria decision-making framework. The performance analysis showed that the proposed technique selects informative genes which are more biologically relevant. The proposed technique is also found to be quite competitive with the existing techniques with respect to subject classification and computational time. Our results also showed that under the multiple criteria decision-making setup, the proposed technique is best for informative gene selection over the available alternatives. Based on the proposed approach, an R Package, i.e. BootMRMR, has been developed and is available at https://cran.r-project.org/web/packages/BootMRMR. This study will provide a practical guide to select statistical techniques for selecting informative genes
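To make the "maximum relevance, minimum redundancy" idea concrete, here is a minimal sketch of plain greedy mRMR selection. It is not the Boot-MRMR procedure or the BootMRMR package API: the bootstrap step is omitted, absolute Pearson correlation stands in for both the relevance and the redundancy measure, and the names `pearson` and `mrmr_select` and the toy data are assumptions made for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mrmr_select(expr, labels, k):
    """Greedily pick k genes maximizing relevance minus mean redundancy.

    expr   : dict gene -> list of expression values (one per sample)
    labels : numeric class labels, one per sample
    """
    relevance = {g: abs(pearson(v, labels)) for g, v in expr.items()}
    selected = []
    candidates = set(expr)

    def score(gene):
        if not selected:
            return relevance[gene]
        redundancy = sum(abs(pearson(expr[gene], expr[s])) for s in selected)
        return relevance[gene] - redundancy / len(selected)

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: g2 is almost a copy of g1, g3 carries partly independent signal.
toy_expr = {
    "g1": [1, 2, 3, 4, 5, 6],
    "g2": [1, 2, 3, 4, 5, 7],
    "g3": [3, 2, 1, 5, 4, 3],
}
toy_labels = [0, 0, 0, 1, 1, 1]
chosen = mrmr_select(toy_expr, toy_labels, 2)
```

In the toy data the near-duplicate gene is passed over in favour of the less redundant one, which is the behaviour the composite measure is meant to encourage.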
SuhB Is a Regulator of Multiple Virulence Genes and Essential for Pathogenesis of Pseudomonas aeruginosa

Science.gov (United States)

Li, Kewei; Xu, Chang; Jin, Yongxin; Sun, Ziyu; Liu, Chang; Shi, Jing; Chen, Gukui; Chen, Ronghao; Jin, Shouguang; Wu, Weihui

2013-01-01

ABSTRACT During initial colonization and chronic infection, pathogenic bacteria encounter distinct host environments. Adjusting gene expression accordingly is essential for the pathogenesis. Pseudomonas aeruginosa has evolved complicated regulatory networks to regulate different sets of virulence factors to facilitate colonization and persistence. The type III secretion system (T3SS) and motility are associated with acute infections, while biofilm formation and the type VI secretion system (T6SS) are associated with chronic persistence. To identify novel regulatory genes required for pathogenesis, we screened a P. aeruginosa transposon (Tn) insertion library and found suhB to be an essential gene for the T3SS gene expression. The expression of suhB was upregulated in a mouse acute lung infection model, and loss of suhB resulted in avirulence. Suppression of T3SS gene expression in the suhB mutant is linked to a defective translation of the T3SS master regulator, ExsA. Further studies demonstrated that suhB mutation led to the upregulation of GacA and its downstream small RNAs, RsmY and RsmZ, triggering T6SS expression and biofilm formation while inhibiting the T3SS. Our results demonstrate that an in vivo-inducible gene, suhB, reciprocally regulates genes associated with acute and chronic infections and plays an essential role in the pathogenesis of P. aeruginosa. PMID:24169572
In the Context of Multiple Intelligences Theory, Intelligent Data Analysis of Learning Styles Was Based on Rough Set Theory

Science.gov (United States)

Narli, Serkan; Ozgen, Kemal; Alkan, Huseyin

2011-01-01

The present study aims to identify the relationship between individuals' multiple intelligence areas and their learning styles with mathematical clarity using the concept of rough sets which is used in areas such as artificial intelligence, data reduction, discovery of dependencies, prediction of data significance, and generating decision…
Confidence in Phase Definition for Periodicity in Genes Expression Time Series.

Science.gov (United States)

El Anbari, Mohammed; Fadda, Abeer; Ptitsyn, Andrey

2015-01-01

Circadian oscillation in baseline gene expression plays an important role in the regulation of multiple cellular processes. Most of the knowledge of circadian gene expression is based on studies measuring gene expression over time. Our ability to dissect molecular events in time is determined by the sampling frequency of such experiments. However, the real peaks of gene activity can be at any time on or between the time points at which samples are collected. Thus, some genes with a peak activity near the observation point have their phase of oscillation detected with better precision than those which peak between observation time points. Separating genes for which we can confidently identify peak activity from ambiguous genes can improve the analysis of time series gene expression. In this study we propose a new statistical method to quantify the phase confidence of circadian genes. The numerical performance of the proposed method has been tested using three real gene expression data sets.
Gene expression profiles of prostate cancer reveal involvement of multiple molecular pathways in the metastatic process

International Nuclear Information System (INIS)

Chandran, Uma R; Ma, Changqing; Dhir, Rajiv; Bisceglia, Michelle; Lyons-Weiler, Maureen; Liang, Wenjing; Michalopoulos, George; Becich, Michael; Monzon, Federico A

2007-01-01

Prostate cancer is characterized by heterogeneity in the clinical course that often does not correlate with morphologic features of the tumor. Metastasis reflects the most adverse outcome of prostate cancer, and to date there are no morphologic features or serum biomarkers that can reliably predict which patients are at higher risk of developing metastatic disease. Understanding the differences in the biology of metastatic and organ confined primary tumors is essential for developing new prognostic markers and therapeutic targets. Using Affymetrix oligonucleotide arrays, we analyzed gene expression profiles of 24 androgen-ablation resistant metastatic samples obtained from 4 patients and a previously published dataset of 64 primary prostate tumor samples. Differential gene expression was analyzed after removing potentially uninformative stromal genes, addressing the differences in cellular content between primary and metastatic tumors. The metastatic samples are highly heterogeneous in expression; however, differential expression analysis shows that 415 genes are upregulated and 364 genes are downregulated at least 2 fold in every patient with metastasis. The expression profile of metastatic samples reveals changes in expression of a unique set of genes representing both the androgen ablation related pathways and other metastasis related gene networks such as cell adhesion, bone remodelling and cell cycle. The differentially expressed genes include metabolic enzymes, transcription factors such as Forkhead Box M1 (FoxM1) and cell adhesion molecules such as Osteopontin (SPP1). We hypothesize that these genes have a role in the biology of metastatic disease and that they represent potential therapeutic targets for prostate cancer.
Maltreatment in multiple-birth children.

Science.gov (United States)

Lang, Cathleen A; Cox, Matthew J; Flores, Glenn

2013-12-01

The rate of multiple births has increased over the last two decades. In 1982, an increased frequency of injuries among this patient population was noted, but few studies have evaluated the increased incidence of maltreatment in twins. The study aim was to evaluate the features of all multiple-birth children with substantiated physical abuse and/or neglect over a four-year period at a major children's hospital. A retrospective chart review was conducted of multiple-gestation children in which at least one child in the multiple set experienced child maltreatment from January 2006 to December 2009. Data regarding the child, injuries, family, and perpetrators were abstracted. We evaluated whether family and child characteristics were associated with maltreatment, and whether types of injuries were similar within multiple sets. For comparison, data from the same time period for single-birth maltreated children also were abstracted, including child age, gestational age at birth, and injury type. There were 19 sets of multiple births in which at least one child had abusive injuries and/or neglect. In 10 of 19 sets (53%), all multiples were found to have a form of maltreatment, and all children in these multiple sets shared at least one injury type. Parents lived together in 63% of cases. Fathers and mothers were the alleged perpetrator in 42% of the cases. Multiple-gestation-birth maltreated children were significantly more likely than single-birth maltreated children to have abdominal trauma (13% vs. 1%, respectively; pchildren often, but not always, were abused. In sets with two maltreated children, children usually shared the same modes of maltreatment. Multiples are significantly more likely than singletons to be younger and experience fractures and abdominal trauma. The findings support the current standard practice of evaluating all children in a multiple set when one is found to be abused or neglected. Copyright © 2013 Elsevier Ltd. All rights reserved.
Iron-related gene variants and brain iron in multiple sclerosis and healthy individuals

Directory of Open Access Journals (Sweden)

Jesper Hagemeier

2018-01-01

Full Text Available Brain iron homeostasis is known to be disturbed in multiple sclerosis (MS), yet little is known about the association of common gene variants linked to iron regulation and pathological tissue changes in the brain. In this study, we investigated the association of genetic determinants linked to iron regulation with deep gray matter (GM) magnetic susceptibility in both healthy controls (HC) and MS patients. Four hundred (400) patients with MS and 150 age- and sex-matched HCs were enrolled and underwent a 3 T MRI examination. Three (3) single nucleotide polymorphisms (SNPs) associated with iron regulation were genotyped: two SNPs in the human hereditary hemochromatosis protein gene HFE: rs1800562 (C282Y mutation) and rs1799945 (H63D mutation), as well as the rs1049296 SNP in the transferrin gene (C2 mutation). The effects of disease and genetic status were studied using quantitative susceptibility mapping (QSM) voxel-based analysis (VBA) and region-of-interest (ROI) analysis of the deep GM. The general linear model framework was used to compare groups. Analyses were corrected for age and sex, and adjusted for false discovery rate. We found moderate increases in susceptibility in the right putamen of participants with the C282Y (+6.1 ppb) and H63D (+6.9 ppb) gene variants vs. non-carriers, as well as a decrease in thalamic susceptibility of progressive MS patients with the C282Y mutation (left: −5.3 ppb, right: −6.7 ppb, p < 0.05). Female MS patients had lower susceptibility in the caudate (−6.0 ppb) and putamen (left: −3.9 ppb, right: −4.6 ppb) than men, but only when they had a wild-type allele (p < 0.05). Iron-gene linked increases in putamen susceptibility (in HC and relapsing remitting MS) and decreases in thalamus susceptibility (in progressive MS), coupled with apparent sex interactions, indicate that brain iron in healthy and disease states may be influenced by genetic factors.
Genes and co-expression modules common to drought and bacterial stress responses in Arabidopsis and rice.

Directory of Open Access Journals (Sweden)

Rafi Shaik

Full Text Available Plants are simultaneously exposed to multiple stresses resulting in enormous changes in the molecular landscape within the cell. Identification and characterization of the synergistic and antagonistic components of stress response mechanisms contributing to the cross talk between stresses is of high priority to explore and enhance multiple stress responses. To this end, we performed meta-analysis of drought (abiotic) and bacterial (biotic) stress response in rice and Arabidopsis by analyzing a total of 386 microarray samples belonging to 20 microarray studies and identified approximately 3100 and 900 DEGs in rice and Arabidopsis, respectively. About 38.5% (1214) and 28.7% (272) DEGs were common to drought and bacterial stresses in rice and Arabidopsis, respectively. A majority of these common DEGs showed conserved expression status in both stresses. Gene ontology enrichment analysis clearly demarcated the response and regulation of various plant hormones and related biological processes. Fatty acid metabolism and biosynthesis of alkaloids were upregulated, and nitrogen metabolism and photosynthesis were downregulated, in both stress conditions. WRKY transcription family genes were highly enriched in all upregulated gene sets while 'CO-like' TF family showed inverse relationship of expression between drought and bacterial stresses. Weighted gene co-expression network analysis divided DEG sets into multiple modules that show high co-expression and identified stress specific hub genes with high connectivity. Detection of consensus modules based on DEGs common to drought and bacterial stress revealed 9 and 4 modules in rice and Arabidopsis, respectively, with conserved and reversed co-expression patterns.
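The overlap step described in this record (DEGs common to two stresses, split by whether their direction of change agrees) can be sketched with ordinary set operations. The function name `compare_deg_sets`, the "up"/"down" encoding, and the gene labels below are hypothetical and chosen only for illustration; the study's own pipeline is not shown in the abstract.

```python
def compare_deg_sets(stress_a, stress_b):
    """Split shared DEGs into conserved and reversed expression status.

    stress_a, stress_b : dict gene -> "up" or "down" for each stress
    Returns (shared, conserved, reversed_) as sets of gene names.
    """
    shared = set(stress_a) & set(stress_b)
    conserved = {g for g in shared if stress_a[g] == stress_b[g]}
    reversed_ = shared - conserved
    return shared, conserved, reversed_


# Made-up example: three DEGs per stress, two of them shared.
drought = {"geneA": "up", "geneB": "down", "geneC": "up"}
bacterial = {"geneA": "up", "geneB": "up", "geneD": "down"}

shared, conserved, reversed_ = compare_deg_sets(drought, bacterial)
# Fraction of all DEGs (union of both lists) that are common to both stresses.
percent_common = 100 * len(shared) / len(set(drought) | set(bacterial))
```

Which denominator the reported percentages use (the union of DEGs or one stress's list) is not stated in the abstract, so the union used here is an assumption.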
Multiple drug resistance protein (MDR-1), multidrug resistance-related protein (MRP) and lung resistance protein (LRP) gene expression in childhood acute lymphoblastic leukemia

Directory of Open Access Journals (Sweden)

Elvis Terci Valera

Full Text Available CONTEXT: Despite the advances in the cure rate for acute lymphoblastic leukemia, approximately 25% of affected children suffer relapses. Expression of genes for the multiple drug resistance protein (MDR-1), multidrug resistance-related protein (MRP), and lung resistance protein (LRP) may confer the phenotype of resistance to the treatment of neoplasias. OBJECTIVE: To analyze the expression of the MDR-1, MRP and LRP genes in children with a diagnosis of acute lymphoblastic leukemia via the semiquantitative reverse transcription polymerase chain reaction (RT-PCR), and to determine the correlation between expression and event-free survival and clinical and laboratory variables. DESIGN: A retrospective clinical study. SETTING: Laboratory of Pediatric Oncology, Department of Pediatrics, Faculdade de Medicina de Ribeirão Preto, Universidade de São Paulo, Brazil. METHODS: Bone marrow aspirates from 30 children with a diagnosis of acute lymphoblastic leukemia were assessed for the expression of messenger RNA for the MDR-1, MRP and LRP genes by semi-quantitative RT-PCR. RESULTS: In the three groups studied, only the increased expression of LRP was related to worsened event-free survival (p = 0.005). The presence of the common acute lymphoblastic leukemia antigen (CALLA) was correlated with increased LRP expression (p = 0.009) and increased risk of relapse or death (p = 0.05). The relative risk of relapse or death was six times higher among children with high LRP expression upon diagnosis (p = 0.05), as confirmed by multivariate analysis of the three genes studied (p = 0.035). DISCUSSION: Cell resistance to drugs is a determinant of the response to chemotherapy and its detection via RT-PCR may be of clinical importance. CONCLUSIONS: Evaluation of the expression of genes for resistance to antineoplastic drugs in childhood acute lymphoblastic leukemia upon diagnosis, and particularly the expression of the LRP gene, may be of clinical relevance, and should be the

Transcriptional differences between normal and glioma-derived glial progenitor cells identify a core set of dysregulated genes.

Science.gov (United States)

Auvergne, Romane M; Sim, Fraser J; Wang, Su; Chandler-Militello, Devin; Burch, Jaclyn; Al Fanek, Yazan; Davis, Danielle; Benraiss, Abdellatif; Walter, Kevin; Achanta, Pragathi; Johnson, Mahlon; Quinones-Hinojosa, Alfredo; Natesan, Sridaran; Ford, Heide L; Goldman, Steven A

2013-06-27

Glial progenitor cells (GPCs) are a potential source of malignant gliomas. We used A2B5-based sorting to extract tumorigenic GPCs from human gliomas spanning World Health Organization grades II-IV. Messenger RNA profiling identified a cohort of genes that distinguished A2B5+ glioma tumor progenitor cells (TPCs) from A2B5+ GPCs isolated from normal white matter. A core set of genes and pathways was substantially dysregulated in A2B5+ TPCs, which included the transcription factor SIX1 and its principal cofactors, EYA1 and DACH2. Small hairpin RNAi silencing of SIX1 inhibited the expansion of glioma TPCs in vitro and in vivo, suggesting a critical and unrecognized role of the SIX1-EYA1-DACH2 system in glioma genesis or progression. By comparing the expression patterns of glioma TPCs with those of normal GPCs, we have identified a discrete set of pathways by which glial tumorigenesis may be better understood and more specifically targeted. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Performance of single and concatenated sets of mitochondrial genes at inferring metazoan relationships relative to full mitogenome data.

Directory of Open Access Journals (Sweden)

Justin C Havird

Full Text Available Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the "supergene" set) to determine which single genes performed "best" at, and the minimum number of genes required to, recover the "supergene" topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the "supergene" topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the "supergene" topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three "best" performing mt PC genes outperformed that of the three longest mt PC genes (i.e., ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa.
Pleural effusion in 11:14 translocation q1 multiple myeloma in the setting of proteasome inhibitor presents therapeutic complexity.

Science.gov (United States)

Ghannam, Malik; Bryan, Maria; Kuross, Erik; Berry, Brent

2018-01-01

Primary malignant pleural effusion has been reported in about 134 cases of multiple myeloma (MM). Associated pleural effusions in cases of MM portend a poor prognosis and identifying them is highly relevant. Reported is the case of a man diagnosed with MM who developed primary myelomatous pleural effusion in the setting of multiple relapses and subsequent mortality within 2 months of the pleural effusion diagnosis. A 61-year-old African American man was diagnosed with MM in 2011. He received induction therapy of lenalidomide and dexamethasone and an autologous stem cell transplant in 2012. Over the next 5 years, the patient went through alternating periods of remission and relapse that were treated with two rounds of thoracic spine radiation therapy and chemotherapeutic agents. In September 2017, the patient presented with worsening dyspnea and was found to have pleural effusion. Fluid analysis showed plasma cell dyscrasia. Fluid drainage was performed, then the patient was discharged after 1 week which was followed by rapid re-accumulation of fluid and rehospitalization about 10 days after discharge. The patient passed away a few weeks after the second admission. Pleural effusion carries a differential diagnosis which may include malignancy but is commonly thought to be less specific to multiple myeloma but should still remain in the differential diagnosis. To our knowledge, this is the first case of myelomatous pleural effusion (MPE) that was reported after multiple relapses of MM. MPE is a very rare complication of MM, and its presence is a strong indicator of imminent mortality and need for comfort care in case of multiple relapses. End-stage pleural effusion in MM in the setting of proteasome inhibitor adds more therapeutic and diagnostic challenges.
The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

Directory of Open Access Journals (Sweden)

Byregowda Munishamappa

2010-03-01

.8% in molecular function. Further, 19 genes were identified differentially expressed between FW- responsive genotypes and 20 between SMD- responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in public domain, at the time of analysis, and a set of 5,085 unigenes were defined that were used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs were used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all the 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained in case of 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across legume and model plant species analysed as well as some putative pigeonpea specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
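For readers unfamiliar with the polymorphic information content (PIC) statistic quoted above, the following sketch computes it from allele counts with the commonly used formula PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j². Whether the authors used exactly this form is an assumption; the function name `pic` and the example counts are illustrative only.

```python
def pic(allele_counts):
    """Polymorphic information content of one marker from its allele counts."""
    total = sum(allele_counts)
    if total <= 0:
        raise ValueError("at least one allele observation is required")
    p = [c / total for c in allele_counts]
    # Probability that two random alleles are identical.
    homozygosity = sum(q * q for q in p)
    # Correction for uninformative matings between identical heterozygotes.
    cross = sum(
        2 * p[i] ** 2 * p[j] ** 2
        for i in range(len(p))
        for j in range(i + 1, len(p))
    )
    return 1 - homozygosity - cross


two_equal_alleles = pic([20, 20])           # two alleles at equal frequency
four_equal_alleles = pic([10, 10, 10, 10])  # four alleles at equal frequency
```

PIC rises with the number of alleles and with how evenly they are distributed, so a marker with four alleles can still score low if one allele dominates.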
SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes.

Science.gov (United States)

Pruesse, Elmar; Peplies, Jörg; Glöckner, Frank Oliver

2012-07-15

In the analysis of homologous sequences, computation of multiple sequence alignments (MSAs) has become a bottleneck. This is especially troublesome for marker genes like the ribosomal RNA (rRNA) where already millions of sequences are publicly available and individual studies can easily produce hundreds of thousands of new sequences. Methods have been developed to cope with such numbers, but further improvements are needed to meet accuracy requirements. In this study, we present the SILVA Incremental Aligner (SINA) used to align the rRNA gene databases provided by the SILVA ribosomal RNA project. SINA uses a combination of k-mer searching and partial order alignment (POA) to maintain very high alignment accuracy while satisfying high throughput performance demands. SINA was evaluated in comparison with the commonly used high throughput MSA programs PyNAST and mothur. The three BRAliBase III benchmark MSAs could be reproduced with 99.3, 97.6 and 96.1% accuracy. A larger benchmark MSA comprising 38 772 sequences could be reproduced with 98.9 and 99.3% accuracy using reference MSAs comprising 1000 and 5000 sequences. SINA was able to achieve higher accuracy than PyNAST and mothur in all performed benchmarks. Alignment of up to 500 sequences using the latest SILVA SSU/LSU Ref datasets as reference MSA is offered at http://www.arb-silva.de/aligner. This page also links to Linux binaries, user manual and tutorial. SINA is made available under a personal use license.
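The k-mer searching step mentioned above can be pictured as ranking reference sequences by how many k-mers they share with the query before any alignment is attempted. The sketch below shows only that general idea; it is not SINA's implementation, and the function names, the choice of k, and the toy sequences are assumptions for illustration.

```python
def kmer_set(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def closest_references(query, references, k=4, top=2):
    """Rank references by the number of k-mers shared with the query.

    references : dict name -> sequence
    Returns the `top` best (name, shared_kmer_count) pairs.
    """
    query_kmers = kmer_set(query, k)
    counts = [
        (name, len(query_kmers & kmer_set(seq, k)))
        for name, seq in references.items()
    ]
    # Highest count first; names break ties deterministically.
    counts.sort(key=lambda item: (-item[1], item[0]))
    return counts[:top]


toy_refs = {
    "ref1": "ACGTACGTGG",
    "ref2": "TTTTTTTTTT",
    "ref3": "ACGTACCCCC",
}
hits = closest_references("ACGTACGTAA", toy_refs)
```

In a real pipeline the selected references would then seed the alignment step; here the ranking is the whole example.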
Properties of a herpes simplex virus multiple immediate-early gene-deleted recombinant as a vaccine vector

International Nuclear Information System (INIS)

Watanabe, Daisuke; Brockman, Mark A.; Ndung'u, Thumbi; Mathews, Lydia; Lucas, William T.; Murphy, Cynthia G.; Felber, Barbara K.; Pavlakis, George N.; Deluca, Neal A.; Knipe, David M.

2007-01-01

Herpes simplex virus (HSV) recombinants induce durable immune responses in rhesus macaques and mice and have induced partial protection in rhesus macaques against mucosal challenge with virulent simian immunodeficiency virus (SIV). In this study, we evaluated the properties of a new generation HSV vaccine vector, an HSV-1 multiple immediate-early (IE) gene deletion mutant virus, d106, which contains deletions in the ICP4, ICP27, ICP22, and ICP47 genes. Because several of the HSV IE genes have been implicated in immune evasion, inactivation of the genes encoding these proteins was expected to result in enhanced immunogenicity. The d106 virus expresses few HSV gene products and shows minimal cytopathic effect in cultured cells. When d106 was inoculated into mice, viral DNA accumulated at high levels in draining lymph nodes, consistent with an ability to transduce dendritic cells and activate their maturation and movement to lymph nodes. A d106 recombinant expressing Escherichia coli β-galactosidase induced durable β-gal-specific IgG and CD8+ T cell responses in naive and HSV-immune mice. Finally, d106-based recombinants have been constructed that express simian immunodeficiency virus (SIV) gag, env, or a rev-tat-nef fusion protein for several days in cultured cells. Thus, d106 shows many of the properties desirable in a vaccine vector: limited expression of HSV gene products and cytopathogenicity, high level expression of transgenes, ability to induce durable immune responses, and an ability to transduce dendritic cells and induce their maturation and migration to lymph nodes.
Meta-analysis of Drosophila circadian microarray studies identifies a novel set of rhythmically expressed genes.

Directory of Open Access Journals (Sweden)

Kevin P Keegan

2007-11-01

Full Text Available Five independent groups have reported microarray studies that identify dozens of rhythmically expressed genes in the fruit fly Drosophila melanogaster. Limited overlap among the lists of discovered genes makes it difficult to determine which, if any, exhibit truly rhythmic patterns of expression. We reanalyzed data from all five reports and found two sources for the observed discrepancies, the use of different expression pattern detection algorithms and underlying variation among the datasets. To improve upon the methods originally employed, we developed a new analysis that involves compilation of all existing data, application of identical transformation and standardization procedures followed by ANOVA-based statistical prescreening, and three separate classes of post hoc analysis: cross-correlation to various cycling waveforms, autocorrelation, and a previously described fast Fourier transform-based technique. Permutation-based statistical tests were used to derive significance measures for all post hoc tests. We find application of our method, most significantly the ANOVA prescreening procedure, significantly reduces the false discovery rate relative to that observed among the results of the original five reports while maintaining desirable statistical power. We identify a set of 81 cycling transcripts previously found in one or more of the original reports as well as a novel set of 133 transcripts not found in any of the original studies. We introduce a novel analysis method that compensates for variability observed among the original five Drosophila circadian array reports. Based on the statistical fidelity of our meta-analysis results, and the results of our initial validation experiments (quantitative RT-PCR), we predict many of our newly found genes to be bona fide cyclers, and suggest that they may lead to new insights into the pathways through which clock mechanisms regulate behavioral rhythms.
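One of the post hoc analyses named above, cross-correlation to a cycling waveform with a permutation-based significance measure, can be sketched as follows. This is a simplified stand-in rather than the authors' pipeline: it uses only cosine templates with an assumed 24 h period, omits the ANOVA prescreen, autocorrelation, and Fourier steps, and all function names and parameters are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation (0.0 if either input is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cycling_score(values, times, period=24.0, n_phases=24):
    """Best correlation between the series and cosine templates over a phase grid."""
    best = -1.0
    for k in range(n_phases):
        phase = period * k / n_phases
        template = [math.cos(2 * math.pi * (t - phase) / period) for t in times]
        best = max(best, pearson(values, template))
    return best


def permutation_pvalue(values, times, n_perm=200, seed=0):
    """Share of time-shuffled series scoring at least as high as the observed one."""
    rng = random.Random(seed)
    observed = cycling_score(values, times)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if cycling_score(shuffled, times) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Synthetic transcript sampled every 4 h for two days, peaking at hour 8.
sample_times = list(range(0, 48, 4))
sample_values = [math.cos(2 * math.pi * (t - 8) / 24) for t in sample_times]
p_value = permutation_pvalue(sample_values, sample_times)
```

Shuffling the time labels destroys any rhythm while preserving the distribution of expression values, which is why the shuffled scores form a usable null.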
Cartilage-selective genes identified in genome-scale analysis of non-cartilage and cartilage gene expression

Directory of Open Access Journals (Sweden)

Cohn Zachary A

2007-06-01

Background: Cartilage plays a fundamental role in the development of the human skeleton. Early in embryogenesis, mesenchymal cells condense and differentiate into chondrocytes to shape the early skeleton. Subsequently, the cartilage anlagen differentiate to form the growth plates, which are responsible for linear bone growth, and the articular chondrocytes, which facilitate joint function. However, despite the multiplicity of roles of cartilage during human fetal life, surprisingly little is known about its transcriptome. To address this, a whole genome microarray expression profile was generated using RNA isolated from 18–22 week human distal femur fetal cartilage and compared with a database of control normal human tissues aggregated at UCLA, termed Celsius. Results: 161 cartilage-selective genes were identified, defined as genes significantly expressed in cartilage with low expression and little variation across a panel of 34 non-cartilage tissues. Among these 161 genes were cartilage-specific genes such as cartilage collagen genes and 25 genes which have been associated with skeletal phenotypes in humans and/or mice. Many of the other cartilage-selective genes do not have established roles in cartilage or are novel, unannotated genes. Quantitative RT-PCR confirmed the unique pattern of gene expression observed by microarray analysis. Conclusion: Defining the gene expression pattern for cartilage has identified new genes that may contribute to human skeletogenesis as well as provided further candidate genes for skeletal dysplasias. The data suggest that fetal cartilage is a complex and transcriptionally active tissue and demonstrate that the set of genes selectively expressed in the tissue has been greatly underestimated.
Integrated assessment by multiple gene expression analysis of quercetin bioactivity on anticancer-related mechanisms in colon cancer cells in vitro

NARCIS (Netherlands)

Erk, van M.J.; Roepman, P.; Lende, van der T.R.; Stierum, R.H.; Aarts, J.M.M.J.G.; Bladeren, van P.J.; Ommen, van B.

2005-01-01

Background: Many different mechanisms are involved in nutrient-related prevention of colon cancer. In this study, a comprehensive assessment of the spectrum of possible biological actions of the bioactive compound quercetin is made using multiple gene expression analysis. Quercetin is a flavonoid
A multiple genome analysis of Mycobacterium tuberculosis reveals specific novel genes and mutations associated with pyrazinamide resistance

KAUST Repository

Sheen, Patricia

2017-10-11

Tuberculosis (TB) is a major global health problem and drug resistance compromises the efforts to control this disease. Pyrazinamide (PZA) is an important drug used in both first and second line treatment regimes. However, its complete mechanism of action and resistance remains unclear. We genotyped and sequenced the complete genomes of 68 M. tuberculosis strains isolated from unrelated TB patients in Peru. No clustering pattern of the strains was verified based on spoligotyping. We analyzed the association of PZA resistance with non-synonymous mutations and specific genes. We found mutations in pncA and novel genes significantly associated with PZA resistance in strains without pncA mutations. These included genes related to transportation of metal ions, pH regulation and immune system evasion. These results suggest potential alternate mechanisms of PZA resistance that have not been found in other populations, supporting that the antibacterial activity of PZA may hit multiple targets.
A multiple genome analysis of Mycobacterium tuberculosis reveals specific novel genes and mutations associated with pyrazinamide resistance

KAUST Repository

Sheen, Patricia; Requena, David; Gushiken, Eduardo; Gilman, Robert H.; Antiparra, Ricardo; Lucero, Bryan; Lizárraga, Pilar; Cieza, Basilio; Roncal, Elisa; Grandjean, Louis; Pain, Arnab; McNerney, Ruth; Clark, Taane G.; Moore, David; Zimic, Mirko

2017-01-01

Tuberculosis (TB) is a major global health problem and drug resistance compromises the efforts to control this disease. Pyrazinamide (PZA) is an important drug used in both first and second line treatment regimes. However, its complete mechanism of action and resistance remains unclear. We genotyped and sequenced the complete genomes of 68 M. tuberculosis strains isolated from unrelated TB patients in Peru. No clustering pattern of the strains was verified based on spoligotyping. We analyzed the association of PZA resistance with non-synonymous mutations and specific genes. We found mutations in pncA and novel genes significantly associated with PZA resistance in strains without pncA mutations. These included genes related to transportation of metal ions, pH regulation and immune system evasion. These results suggest potential alternate mechanisms of PZA resistance that have not been found in other populations, supporting that the antibacterial activity of PZA may hit multiple targets.
Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori.

Science.gov (United States)

Guo, Huizhen; Cheng, Tingcai; Chen, Zhiwei; Jiang, Liang; Guo, Youbing; Liu, Jianqiu; Li, Shenglong; Taniai, Kiyoko; Asaoka, Kiyoshi; Kadono-Okuda, Keiko; Arunkumar, Kallare P; Wu, Jiaqi; Kishino, Hirohisa; Zhang, Huijie; Seth, Rakesh K; Gopinathan, Karumathil P; Montagné, Nicolas; Jacquin-Joly, Emmanuelle; Goldsmith, Marian R; Xia, Qingyou; Mita, Kazuei

2017-03-01

Most lepidopteran species are herbivores, and interaction with host plants affects their gene expression and behavior as well as their genome evolution. Gustatory receptors (Grs) are expected to mediate host plant selection, feeding, oviposition and courtship behavior. However, due to their high diversity, sequence divergence and extremely low level of expression it has been difficult to identify precisely a complete set of Grs in Lepidoptera. By manual annotation and BAC sequencing, we improved annotation of 43 gene sequences compared with previously reported Grs in the most studied lepidopteran model, the silkworm, Bombyx mori, and identified 7 new tandem copies of BmGr30 on chromosome 7, bringing the total number of BmGrs to 76. Among these, we mapped 68 genes to chromosomes in a newly constructed chromosome distribution map and 8 genes to scaffolds; we also found new evidence for large clusters of BmGrs, especially from the bitter receptor family. RNA-seq analysis of diverse BmGr expression patterns in chemosensory organs of larvae and adults enabled us to draw a precise organ specific map of BmGr expression. Interestingly, most of the clustered genes were expressed in the same tissues and more than half of the genes were expressed in larval maxillae, larval thoracic legs and adult legs. For example, BmGr63 showed high expression levels in all organs in both larval and adult stages. By contrast, some genes showed expression limited to specific developmental stages or organs and tissues. BmGr19 was highly expressed in larval chemosensory organs (especially antennae and thoracic legs), the single exon genes BmGr53 and BmGr67 were expressed exclusively in larval tissues, the BmGr27-BmGr31 gene cluster on chr7 displayed a high expression level limited to adult legs and the candidate CO2 receptor BmGr2 was highly expressed in adult antennae, where few other Grs were expressed. Transcriptional analysis of the Grs in B. mori provides a valuable new reference for
Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

Energy Technology Data Exchange (ETDEWEB)

Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

2014-03-15

Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.
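Stepwise regression of dose against transcript levels, as described in the abstract above, can be illustrated with a small forward-selection sketch. This is an assumption-laden illustration rather than the study's procedure: the selection rule (largest R² gain, with a hypothetical minimum gain of 0.01), the function names and the synthetic data are all invented for the example.

```python
import numpy as np


def residual_sum_of_squares(X, y):
    """Least-squares fit of y on X plus an intercept; returns the residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def forward_stepwise(X, y, max_features=4, min_r2_gain=0.01):
    """Greedily add the column that raises R^2 most; stop when the gain is negligible."""
    total_ss = float(((y - y.mean()) ** 2).sum())
    selected, best_r2 = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = {
            j: 1.0 - residual_sum_of_squares(X[:, selected + [j]], y) / total_ss
            for j in candidates
        }
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_r2_gain:
            break
        selected.append(best)
        best_r2 = scores[best]
    return selected, best_r2


# Synthetic stand-in for expression data: 60 samples, 10 transcripts,
# with the "dose" driven by columns 3 and 7 only (an illustrative assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))
dose = 2.0 * X[:, 3] - 1.5 * X[:, 7] + rng.normal(scale=0.1, size=60)
selected, r2 = forward_stepwise(X, dose)
```

Capping the number of selected columns mirrors the abstract's point that a handful of transcripts can account for most of the variance; a real analysis would also need validation on held-out samples.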
An Extended TOPSIS Method for the Multiple Attribute Decision Making Problems Based on Interval Neutrosophic Set

Directory of Open Access Journals (Sweden)

Pingping Chi

2013-03-01

The interval neutrosophic set (INS) can more easily express incomplete, indeterminate and inconsistent information, and TOPSIS is one of the most commonly used and effective methods for multiple attribute decision making; however, in general, it can only process attribute values given as crisp numbers. In this paper, we extend TOPSIS to INSs: for multiple attribute decision making problems in which the attribute weights are unknown and the attribute values take the form of INSs, we propose an extended TOPSIS method. First, the definition of INS and the operational laws are given, and the distance between INSs is defined. Then, the attribute weights are determined based on the maximizing deviation method, and an extended TOPSIS method is developed to rank the alternatives. Finally, an illustrative example is given to verify the developed approach and to demonstrate its practicality and effectiveness.
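For orientation, the classical TOPSIS procedure on crisp numbers, which the abstract above takes as its starting point, can be sketched as follows. This is the standard crisp variant only and does not implement the interval neutrosophic extension; the function name, weights and example matrix are illustrative assumptions.

```python
import numpy as np


def topsis(matrix, weights, benefit):
    """Classical TOPSIS closeness scores (higher is better) for crisp attribute values.

    matrix  : alternatives in rows, attributes in columns
    weights : one non-negative weight per attribute
    benefit : True where larger values are better, False for cost attributes
    """
    matrix = np.asarray(matrix, dtype=float)
    weights = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Vector-normalise each column, then apply weights scaled to sum to 1.
    weighted = matrix / np.sqrt((matrix ** 2).sum(axis=0)) * (weights / weights.sum())
    ideal = np.where(benefit, weighted.max(axis=0), weighted.min(axis=0))
    anti_ideal = np.where(benefit, weighted.min(axis=0), weighted.max(axis=0))
    d_pos = np.sqrt(((weighted - ideal) ** 2).sum(axis=1))
    d_neg = np.sqrt(((weighted - anti_ideal) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)


# Hypothetical decision matrix: two benefit attributes and one cost attribute.
alternatives = [[9, 8, 2],
                [5, 5, 5],
                [1, 2, 9]]
scores = topsis(alternatives, weights=[1, 1, 1], benefit=[True, True, False])
```

The first alternative is best on every attribute and the third is worst on every attribute, so they coincide with the ideal and anti-ideal points respectively and the ranking follows directly.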
COGNATE: comparative gene annotation characterizer.

Science.gov (United States)

Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

2017-07-17

The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https
Zebrafish homologs of genes within 16p11.2, a genomic region associated with brain disorders, are active during brain development, and include two deletion dosage sensor genes

Directory of Open Access Journals (Sweden)

Alicia Blaker-Lee

2012-11-01

Deletion or duplication of one copy of the human 16p11.2 interval is tightly associated with impaired brain function, including autism spectrum disorders (ASDs), intellectual disability disorder (IDD) and other phenotypes, indicating the importance of gene dosage in this copy number variant region (CNV). The core of this CNV includes 25 genes; however, the number of genes that contribute to these phenotypes is not known. Furthermore, genes whose functional levels change with deletion or duplication (termed ‘dosage sensors’), which can associate the CNV with pathologies, have not been identified in this region. Using the zebrafish as a tool, a set of 16p11.2 homologs was identified, primarily on chromosomes 3 and 12. Use of 11 phenotypic assays, spanning the first 5 days of development, demonstrated that this set of genes is highly active, such that 21 out of the 22 homologs tested showed loss-of-function phenotypes. Most genes in this region were required for nervous system development – impacting brain morphology, eye development, axonal density or organization, and motor response. In general, human genes were able to substitute for the fish homolog, demonstrating orthology and suggesting conserved molecular pathways. In a screen for 16p11.2 genes whose function is sensitive to hemizygosity, the aldolase a (aldoaa) and kinesin family member 22 (kif22) genes were identified as giving clear phenotypes when RNA levels were reduced by ∼50%, suggesting that these genes are deletion dosage sensors. This study leads to two major findings. The first is that the 16p11.2 region comprises a highly active set of genes, which could present a large genetic target and might explain why multiple brain function, and other, phenotypes are associated with this interval. The second major finding is that there are (at least) two genes with deletion dosage sensor properties among the 16p11.2 set, and these could link this CNV to brain disorders such as ASD and IDD.
Bayesian Computational Approaches for Gene Regulation Studies of Bioethanol and Biohydrogen Production. Final Scientific/Technical Report

Energy Technology Data Exchange (ETDEWEB)

Newberg, Lee; McCue, Lee Anne; Van Roey, Patrick

2014-04-17

The project developed mathematical models and first-version software tools for the understanding of gene regulation across multiple related species. The project lays the foundation for understanding how certain alpha-proteobacterial species control their own genes for bioethanol and biohydrogen production, and sets the stage for exploiting bacteria for the production of fuels. Enabling such alternative sources of fuel is a high priority for the Department of Energy and the public.
In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer.

Science.gov (United States)

Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

2013-10-04

Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC. Copyright © 2013 Elsevier Inc. All rights reserved.
Multiple gene analyses identify distinct “bois noir” phytoplasma genotypes in the Republic of Macedonia

Directory of Open Access Journals (Sweden)

Emilija KOSTADINOVSKA

2015-01-01

“Bois noir” (BN) is a grapevine yellows disease, associated with phytoplasma strains related to ‘Candidatus Phytoplasma solani’, that causes severe losses to viticulture in the Euro-Mediterranean basin. Due to the complex ecological cycle of its etiological agent, BN epidemiology is only partially known, and no effective control strategies have been developed. Numerous studies have focused on molecular characterization of BN phytoplasma strains, to identify molecular markers useful to accurately describe their genetic diversity, geographic distribution and host range. In the present study, multiple gene analyses were carried out on 16S rRNA, tuf, vmp1, and stamp genes to study the genetic variability among 18 BN phytoplasma strains detected in diverse regions of the Republic of Macedonia. Restriction fragment length polymorphism (RFLP) assays showed the presence of one 16S rRNA (16SrXII-A), two tuf (tuf-type a, tuf-type b), five vmp1 (V2-TA, V3, V4, V14, V18), and three stamp (S1, S2, S3) gene patterns among the examined strains. Based on the collective RFLP patterns, seven genotypes (Mac1 to Mac7) were described as evidence for genetic heterogeneity, highlighting their prevalence and distribution in the investigated regions. Phylogenetic analyses on vmp1 and stamp genes underlined the affiliation of Macedonian BN phytoplasma strains to clusters associated with distinct ecologies.
Genetic variants of the alpha-synuclein gene SNCA are associated with multiple system atrophy.

Directory of Open Access Journals (Sweden)

Ammar Al-Chalabi

BACKGROUND: Multiple system atrophy (MSA) is a progressive neurodegenerative disorder characterized by parkinsonism, cerebellar ataxia and autonomic dysfunction. Pathogenic mechanisms remain obscure but the neuropathological hallmark is the presence of alpha-synuclein-immunoreactive glial cytoplasmic inclusions. Genetic variants of the alpha-synuclein gene, SNCA, are thus strong candidates for genetic association with MSA. One follow-up to a genome-wide association of Parkinson's disease has identified association of a SNP in SNCA with MSA. METHODOLOGY/FINDINGS: We evaluated 32 SNPs in the SNCA gene in a European population of 239 cases and 617 controls recruited as part of the Neuroprotection and Natural History in Parkinson Plus Syndromes (NNIPPS) study. We used 161 independently collected samples for replication. Two SNCA SNPs showed association with MSA: rs3822086 (P = 0.0044) and rs3775444 (P = 0.012), although only the first survived correction for multiple testing. In the MSA-C subgroup the association strengthened despite more than halving the number of cases: rs3822086 P = 0.0024, OR 2.153 (95% CI 1.3-3.6); rs3775444 P = 0.0017, OR 4.386 (95% CI 1.6-11.7). A 7-SNP haplotype incorporating three SNPs either side of rs3822086 strengthened the association with MSA-C further (best haplotype, P = 8.7 × 10⁻⁴). The association with rs3822086 was replicated in the independent samples (P = 0.035). CONCLUSIONS/SIGNIFICANCE: We report a genetic association between MSA and alpha-synuclein which has replicated in independent samples. The strongest association is with the cerebellar subtype of MSA. TRIAL REGISTRATION: ClinicalTrials.gov NCT00211224.

Multiple-integrations of HPV16 genome and altered transcription of viral oncogenes and cellular genes are associated with the development of cervical cancer.

Directory of Open Access Journals (Sweden)

Xulian Lu

The constitutive expression of the high-risk HPV E6 and E7 viral oncogenes is the major cause of cervical cancer. To comprehensively explore the composition of HPV16 early transcripts and their genomic annotation, cervical squamous epithelial tissues from 40 HPV16-infected patients were collected for analysis of papillomavirus oncogene transcripts (APOT). We observed different transcription patterns of HPV16 oncogenes in progression of cervical lesions to cervical cancer and identified one novel transcript. Multiple-integration events in the tissues of cervical carcinoma (CxCa) are significantly more frequent than in those of low-grade squamous intraepithelial lesions (LSIL) and high-grade squamous intraepithelial lesions (HSIL). Moreover, most cellular genes within or near these integration sites are cancer-associated genes. Taken together, this study suggests that multiple integrations of the HPV genome during persistent viral infection, which alter the expression patterns of viral oncogenes and integration-related cellular genes, play a crucial role in progression of cervical lesions to cervical cancer.
Identification of circular RNAs from the parental genes involved in multiple aspects of cellular metabolism in barley

DEFF Research Database (Denmark)

Shirvanehdeh, Behrooz Darbani; Noeparvar, Shahin; Borg, Søren

2016-01-01

circular RNAs as novel interactors in the regulation of gene expression in plants and imply the comprehensiveness of this regulatory pathway by identifying circular RNAs for a diverse set of genes. These genes are involved in several aspects of cellular metabolism as hormonal signaling, intracellular...... protein sorting, carbohydrate metabolism and cell-wall biogenesis, respiration, amino acid biosynthesis, transcription and translation, and protein ubiquitination. Additionally, these parental loci of circular RNAs, from both nuclear and mitochondrial genomes, encode for different transcript classes...... and elucidate their cellular-level alterations across tissues and in response to micronutrients iron and zinc. In further support of circular RNAs’ functional roles in plants, we report several cases where fluctuations of circRNAs do not correlate with the levels of their parental-loci encoded linear...
cDREM: inferring dynamic combinatorial gene regulation.

Science.gov (United States)

Wise, Aaron; Bar-Joseph, Ziv

2015-04-01

Genes are often combinatorially regulated by multiple transcription factors (TFs). Such combinatorial regulation plays an important role in development and facilitates the ability of cells to respond to different stresses. While a number of approaches have utilized sequence and ChIP-based datasets to study combinational regulation, these have often ignored the combinational logic and the dynamics associated with such regulation. Here we present cDREM, a new method for reconstructing dynamic models of combinatorial regulation. cDREM integrates time series gene expression data with (static) protein interaction data. The method is based on a hidden Markov model and utilizes the sparse group Lasso to identify small subsets of combinatorially active TFs, their time of activation, and the logical function they implement. We tested cDREM on yeast and human data sets. Using yeast we show that the predicted combinatorial sets agree with other high throughput genomic datasets and improve upon prior methods developed to infer combinatorial regulation. Applying cDREM to study human response to flu, we were able to identify several combinatorial TF sets, some of which were known to regulate immune response while others represent novel combinations of important TFs.
A gene encoding maize caffeoyl-CoA O-methyltransferase confers quantitative resistance to multiple pathogens.

Science.gov (United States)

Yang, Qin; He, Yijian; Kabahuma, Mercy; Chaya, Timothy; Kelly, Amy; Borrego, Eli; Bian, Yang; El Kasmi, Farid; Yang, Li; Teixeira, Paulo; Kolkman, Judith; Nelson, Rebecca; Kolomiets, Michael; L Dangl, Jeffery; Wisser, Randall; Caplan, Jeffrey; Li, Xu; Lauter, Nick; Balint-Kurti, Peter

2017-09-01

Alleles that confer multiple disease resistance (MDR) are valuable in crop improvement, although the molecular mechanisms underlying their functions remain largely unknown. A quantitative trait locus, qMdr9.02, associated with resistance to three important foliar maize diseases (southern leaf blight, gray leaf spot and northern leaf blight) has been identified on maize chromosome 9. Through fine-mapping, association analysis, expression analysis, insertional mutagenesis and transgenic validation, we demonstrate that ZmCCoAOMT2, which encodes a caffeoyl-CoA O-methyltransferase associated with the phenylpropanoid pathway and lignin production, is the gene within qMdr9.02 conferring quantitative resistance to both southern leaf blight and gray leaf spot. We suggest that resistance might be caused by allelic variation at the level of both gene expression and amino acid sequence, thus resulting in differences in levels of lignin and other metabolites of the phenylpropanoid pathway and regulation of programmed cell death.
Multiple var2csa-type PfEMP1 genes located at different chromosomal loci occur in many Plasmodium falciparum isolates

DEFF Research Database (Denmark)

Sander, Adam F; Salanti, Ali; Lavstsen, Thomas

2009-01-01

in the VAR2CSA protein, sequence variation in the DBL2X region of var2csa genes in 54 P. falciparum samples was analyzed. Chromosome mapping of var2csa loci was carried out and a quantitative PCR assay was developed to estimate the number of var2csa genes in P. falciparum isolates from the placenta of pregnant....... falciparum isolates. One gene is on chromosome 12 but additional var2csa-type genes are on different chromosomes in different isolates. Multiplicity of var2csa genes appears more common in infected placentae than in samples from non-pregnant donors indicating a possible advantage of this genotype...
Report of Chinese family with severe dermatitis, multiple allergies and metabolic wasting syndrome caused by novel homozygous desmoglein-1 gene mutation.

Science.gov (United States)

Cheng, Ruhong; Yan, Ming; Ni, Cheng; Zhang, Jia; Li, Ming; Yao, Zhirong

2016-10-01

Recently, homozygous mutations in the desmoglein-1 (DSG1) gene and heterozygous mutation in the desmoplakin (DSP) gene have been demonstrated to be associated with severe dermatitis, multiple allergies and metabolic wasting (SAM) syndrome (Mendelian Inheritance in Man no. 615508). We aimed to identify the molecular basis for a Chinese pedigree of SAM syndrome. A Chinese pedigree of SAM syndrome was subjected to mutation detection in the DSG1 gene. Sequence analysis of the DSG1 gene and quantitative reverse transcriptase polymerase chain reaction analysis for gene expression of DSG1 using cDNA derived from the epidermis of patients and controls were both performed. Skin biopsies were also taken from patients for pathological study and transmission electron microscopy observation. A novel homozygous splicing mutation, c.1892-1delG, at the exon-intron border of the DSG1 gene was shown to be associated with SAM syndrome. We report a new family with SAM syndrome of Asian descent and expand the spectrum of mutations in the DSG1 gene. © 2016 Japanese Dermatological Association.
Multiple-endpoints gene alteration-based (MEGA) assay: A toxicogenomics approach for water quality assessment of wastewater effluents.

Science.gov (United States)

Fukushima, Toshikazu; Hara-Yamamura, Hiroe; Nakashima, Koji; Tan, Lea Chua; Okabe, Satoshi

2017-12-01

Wastewater effluents contain a significant number of toxic contaminants, which, even at low concentrations, display a wide variety of toxic actions. In this study, we developed a multiple-endpoints gene alteration-based (MEGA) assay, a real-time PCR-based transcriptomic analysis, to assess the water quality of wastewater effluents for human health risk assessment and management. Twenty-one genes from the human hepatoblastoma cell line (HepG2), covering the basic health-relevant stress responses such as response to xenobiotics, genotoxicity, and cytotoxicity, were selected and incorporated into the MEGA assay. The genes related to the p53-mediated DNA damage response and cytochrome P450 were selected as markers for genotoxicity and response to xenobiotics, respectively. Additionally, the genes that were dose-dependently regulated by exposure to the wastewater effluents were chosen as markers for cytotoxicity. The alterations in the expression of an individual gene, induced by exposure to the wastewater effluents, were evaluated by real-time PCR and the results were validated by genotoxicity (e.g., comet assay) and cell-based cytotoxicity tests. In summary, the MEGA assay is a real-time PCR-based assay that targets cellular responses to contaminants present in wastewater effluents at the transcriptional level; it is rapid, cost-effective, and high-throughput and can thus complement any chemical analysis for water quality assessment and management. Copyright © 2017 Elsevier Ltd. All rights reserved.
Type IX Collagen Gene Mutations Can Result in Multiple Epiphyseal Dysplasia That Is Associated With Osteochondritis Dissecans and a Mild Myopathy

NARCIS (Netherlands)

Jackson, Gail C.; Marcus-Soekarman, Dominique; Stolte-Dijkstra, Irene; Verrips, Aad; Taylor, Jacqueline A.; Briggs, Michael D.

Multiple epiphyseal dysplasia (MED) is a clinically variable and genetically heterogeneous disease that is characterized by mild short stature and early onset osteoarthritis. Autosomal dominant forms are caused by mutations in the genes that encode type IX collagen, cartilage oligomeric matrix
Type IX collagen gene mutations can result in multiple epiphyseal dysplasia that is associated with osteochondritis dissecans and a mild myopathy.

NARCIS (Netherlands)

Jackson, G.C.; Marcus-Soekarman, D.; Stolte-Dijkstra, I.; Verrips, A.; Taylor, J.A.; Briggs, M.D.

2010-01-01

Multiple epiphyseal dysplasia (MED) is a clinically variable and genetically heterogeneous disease that is characterized by mild short stature and early onset osteoarthritis. Autosomal dominant forms are caused by mutations in the genes that encode type IX collagen, cartilage oligomeric matrix
A novel CpG island set identifies tissue-specific methylation at developmental gene loci.

Directory of Open Access Journals (Sweden)

Robert Illingworth

2008-01-01

CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.
Identification and Construction of Combinatory Cancer Hallmark-Based Gene Signature Sets to Predict Recurrence and Chemotherapy Benefit in Stage II Colorectal Cancer.

Science.gov (United States)

Gao, Shanwu; Tibiche, Chabane; Zou, Jinfeng; Zaman, Naif; Trifiro, Mark; O'Connor-McCourt, Maureen; Wang, Edwin

2016-01-01

Decisions regarding adjuvant therapy in patients with stage II colorectal cancer (CRC) have been among the most challenging and controversial in oncology over the past 20 years. To develop robust combinatory cancer hallmark-based gene signature sets (CSS sets) that more accurately predict prognosis and identify a subset of patients with stage II CRC who could gain survival benefits from adjuvant chemotherapy. Thirteen retrospective studies of patients with stage II CRC who had clinical follow-up and adjuvant chemotherapy were analyzed. Respective totals of 162 and 843 patients from 2 and 11 independent cohorts were used as the discovery and validation cohorts, respectively. A total of 1005 patients with stage II CRC were included in the 13 cohorts. Among them, 84 of 416 patients in 3 independent cohorts received fluorouracil-based adjuvant chemotherapy. Identification of CSS sets to predict relapse-free survival and identify a subset of patients with stage II CRC who could gain substantial survival benefits from fluorouracil-based adjuvant chemotherapy. Eight cancer hallmark-based gene signatures (30 genes each) were identified and used to construct CSS sets for determining prognosis. The CSS sets were validated in 11 independent cohorts of 767 patients with stage II CRC who did not receive adjuvant chemotherapy. The CSS sets accurately stratified patients into low-, intermediate-, and high-risk groups. Five-year relapse-free survival rates were 94%, 78%, and 45%, respectively, representing 60%, 28%, and 12% of patients with stage II disease. The 416 patients with CSS set-defined high-risk stage II CRC who received fluorouracil-based adjuvant chemotherapy showed a substantial gain in survival benefits from the treatment (ie, recurrence reduced by 30%-40% in 5 years). The CSS sets substantially outperformed other prognostic predictors of stage II CRC. They are more accurate and robust for prognostic predictions and facilitate the identification of patients with stage
Hybrid-Lambda: simulation of multiple merger and Kingman gene genealogies in species networks and species trees.

Science.gov (United States)

Zhu, Sha; Degnan, James H; Goldstien, Sharyn J; Eldon, Bjarki

2015-09-15

There has been increasing interest in coalescent models which admit multiple mergers of ancestral lineages, and in modeling hybridization and coalescence simultaneously. Hybrid-Lambda is a software package that simulates gene genealogies under multiple merger and Kingman's coalescent processes within species networks or species trees. Hybrid-Lambda allows different coalescent processes to be specified for different populations, and allows for time to be converted between generations and coalescent units, by specifying a population size for each population. In addition, Hybrid-Lambda can generate simulated datasets, assuming the infinitely many sites mutation model, and compute the F_ST statistic. As an illustration, we apply Hybrid-Lambda to infer the time of subdivision of certain marine invertebrates under different coalescent processes. Hybrid-Lambda makes it possible to investigate biogeographic concordance among high fecundity species exhibiting skewed offspring distribution.
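
The conversion between generations and coalescent units mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook convention for Kingman's coalescent in a diploid population, where one coalescent unit corresponds to 2N generations; the function names are hypothetical and are not part of Hybrid-Lambda, whose own scaling (in particular for multiple-merger models) may differ.

```python
def generations_to_coalescent_units(generations, pop_size, ploidy=2):
    """Convert generations to coalescent units.

    Assumption: one coalescent unit equals ploidy * pop_size generations
    (the usual Kingman scaling; not taken from Hybrid-Lambda itself).
    """
    return generations / (ploidy * pop_size)


def coalescent_units_to_generations(units, pop_size, ploidy=2):
    """Inverse conversion under the same assumed scaling."""
    return units * ploidy * pop_size


# Hypothetical example: a branch of 20,000 generations in a diploid
# population of effective size 10,000.
branch_in_units = generations_to_coalescent_units(20_000, 10_000)
```

Because the scaling depends on the population size, the same branch length in generations maps to different lengths in coalescent units for different populations, which is why a population size has to be given for each one.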
FunGene: the functional gene pipeline and repository.

Science.gov (United States)

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
FunGene: the Functional Gene Pipeline and Repository

Directory of Open Access Journals (Sweden)

Jordan A. Fish

2013-10-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Gene Ontology Consortium: going forward.

Science.gov (United States)

2015-01-01

The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Integrated microarray and ChIP analysis identifies multiple Foxa2 dependent target genes in the notochord.

Science.gov (United States)

Tamplin, Owen J; Cox, Brian J; Rossant, Janet

2011-12-15

The node and notochord are key tissues required for patterning of the vertebrate body plan. Understanding the gene regulatory network that drives their formation and function is therefore important. Foxa2 is a key transcription factor at the top of this genetic hierarchy and finding its targets will help us to better understand node and notochord development. We performed an extensive microarray-based gene expression screen using sorted embryonic notochord cells to identify early notochord-enriched genes. We validated their specificity to the node and notochord by whole mount in situ hybridization. This provides the largest available resource of notochord-expressed genes, and therefore candidate Foxa2 target genes in the notochord. Using existing Foxa2 ChIP-seq data from adult liver, we were able to identify a set of genes expressed in the notochord that had associated regions of Foxa2-bound chromatin. Given that Foxa2 is a pioneer transcription factor, we reasoned that these sites might represent notochord-specific enhancers. Candidate Foxa2-bound regions were tested for notochord specific enhancer function in a zebrafish reporter assay and 7 novel notochord enhancers were identified. Importantly, sequence conservation or predictive models could not have readily identified these regions. Mutation of putative Foxa2 binding elements in two of these novel enhancers abrogated reporter expression and confirmed their Foxa2 dependence. The combination of highly specific gene expression profiling and genome-wide ChIP analysis is a powerful means of understanding developmental pathways, even for small cell populations such as the notochord. Copyright © 2011 Elsevier Inc. All rights reserved.
A MultiSite Gateway™ vector set for the functional analysis of genes in the model Saccharomyces cerevisiae

Directory of Open Access Journals (Sweden)

Nagels Durand Astrid

2012-09-01

Background: Recombinatorial cloning using the Gateway™ technology has been the method of choice for high-throughput omics projects, resulting in the availability of entire ORFeomes in Gateway™ compatible vectors. The MultiSite Gateway™ system allows combining multiple genetic fragments such as promoter, ORF and epitope tag in one single reaction. To date, this technology has not been accessible in the yeast Saccharomyces cerevisiae, one of the most widely used experimental systems in molecular biology, due to the lack of appropriate destination vectors. Results: Here, we present a set of three-fragment MultiSite Gateway™ destination vectors that have been developed for gene expression in S. cerevisiae and that allow the assembly of any promoter, open reading frame, epitope tag arrangement in combination with any of four auxotrophic markers and three distinct replication mechanisms. As an example of its applicability, we used yeast three-hybrid to provide evidence for the assembly of a ternary complex of plant proteins involved in jasmonate signalling and consisting of the JAZ, NINJA and TOPLESS proteins. Conclusion: Our vectors make MultiSite Gateway™ cloning accessible in S. cerevisiae and implement a fast and versatile cloning method for the high-throughput functional analysis of (heterologous) proteins in one of the most widely used model organisms for molecular biology research.
Computerized detection of multiple sclerosis candidate regions based on a level set method using an artificial neural network

International Nuclear Information System (INIS)

Kuwazuru, Junpei; Magome, Taiki; Arimura, Hidetaka; Yamashita, Yasuo; Oki, Masafumi; Toyofuku, Fukai; Kakeda, Shingo; Yamamoto, Daisuke

2010-01-01

Yamamoto et al. developed a system for computer-aided detection of multiple sclerosis (MS) candidate regions. The level set method in their system used a constant threshold value for the edge indicator function, which is related to the speed function of the level set method. However, it would be more appropriate to adjust the threshold value for each MS candidate region, because edge magnitudes differ among MS candidates. The purpose of this study was to develop a method for computerized detection of MS candidate regions in MR images based on a level set method using an artificial neural network (ANN). We constructed the ANN to adjust the threshold value of the edge indicator function for each true positive (TP) and false positive (FP) region. The ANN provided a suitable threshold value for each candidate region so that TP regions could be segmented and FP regions removed. The proposed method detected MS regions at a sensitivity of 82.1% with 0.204 FPs per slice, and the similarity index of MS candidate regions was 0.717 on average. (author)
A Partial Least Square Approach for Modeling Gene-gene and Gene-environment Interactions When Multiple Markers Are Genotyped

Science.gov (United States)

Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C.

2008-01-01

Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense SNPs in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey’s 1-df model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women’s Health Initiative (WHI), this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with BMI. PMID:18615621
A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped.

Science.gov (United States)

Wang, Tao; Ho, Gloria; Ye, Kenny; Strickler, Howard; Elston, Robert C

2009-01-01

Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.

Large scale gene expression meta-analysis reveals tissue-specific, sex-biased gene expression in humans

Directory of Open Access Journals (Sweden)

Benjamin Mayne

2016-10-01

The severity and prevalence of many diseases are known to differ between the sexes. Organ specific sex-biased gene expression may underpin these and other sexually dimorphic traits. To further our understanding of sex differences in transcriptional regulation, we performed meta-analyses of sex biased gene expression in multiple human tissues. We analysed 22 publicly available human gene expression microarray data sets including over 2500 samples from 15 different tissues and 9 different organs. Briefly, by using an inverse-variance method we determined the effect size difference of gene expression between males and females. We found the greatest sex differences in gene expression in the brain, specifically in the anterior cingulate cortex (1818 genes), followed by the heart (375 genes), kidney (224 genes), colon (218 genes) and thyroid (163 genes). More interestingly, we found different parts of the brain with varying numbers and identity of sex-biased genes, indicating that specific cortical regions may influence sexually dimorphic traits. The majority of sex-biased genes in other tissues such as the bladder, liver, lungs and pancreas were on the sex chromosomes or involved in sex hormone production. On average in each tissue, 32% of autosomal genes that were expressed in a sex-biased fashion contained androgen or estrogen hormone response elements. Interestingly, across all tissues, we found approximately two-thirds of autosomal genes that were sex-biased were not under direct influence of sex hormones. To our knowledge this is the largest analysis of sex-biased gene expression in human tissues to date. We identified many sex-biased genes that were not under the direct influence of sex chromosome genes or sex hormones. These may provide targets for future development of sex-specific treatments for diseases.
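
The inverse-variance method named in this abstract can be sketched as follows. This is a generic fixed-effect pooling of per-study effect sizes, shown only as an illustration; the abstract does not state whether a fixed-effect or random-effects model was used, and the function name and input values are hypothetical.

```python
import math


def inverse_variance_pool(effects, std_errors):
    """Pool per-study effect sizes, weighting each by 1 / variance.

    Assumption: a fixed-effect model; the study itself may have used a
    different variant of the inverse-variance approach.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical male-vs-female effect sizes for one gene in two data sets.
pooled_effect, pooled_se = inverse_variance_pool([0.5, 0.3], [0.1, 0.2])
```

Studies with smaller standard errors receive larger weights, so the pooled estimate here (about 0.46) sits closer to the more precise study.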
Streptococcus pneumoniae Supragenome Hybridization Arrays for Profiling of Genetic Content and Gene Expression.

Science.gov (United States)

Kadam, Anagha; Janto, Benjamin; Eutsey, Rory; Earl, Joshua P; Powell, Evan; Dahlgren, Margaret E; Hu, Fen Z; Ehrlich, Garth D; Hiller, N Luisa

2015-02-02

There is extensive genomic diversity among Streptococcus pneumoniae isolates. Approximately half of the comprehensive set of genes in the species (the supragenome or pangenome) is present in all the isolates (core set), and the remainder is unevenly distributed among strains (distributed set). The Streptococcus pneumoniae Supragenome Hybridization (SpSGH) array provides coverage for an extensive set of genes and polymorphisms encountered within this species, capturing this genomic diversity. Further, the capture is quantitative. In this manner, the SpSGH array allows for both genomic and transcriptomic analyses of diverse S. pneumoniae isolates on a single platform. In this unit, we present the SpSGH array, and describe in detail its design and implementation for both genomic and transcriptomic analyses. The methodology can be applied to construction and modification of SpSGH array platforms, as well as to other bacterial species as long as multiple whole-genome sequences are available that collectively capture the vast majority of the species supragenome. Copyright © 2015 John Wiley & Sons, Inc.
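
The core and distributed sets defined in this abstract map directly onto set operations. The sketch below is only an illustration of those definitions; the function name, strain labels and gene names are hypothetical and do not come from the SpSGH array design.

```python
def partition_supragenome(strain_genes):
    """Split the supragenome into core and distributed gene sets.

    strain_genes maps a strain name to the collection of genes it carries.
    Core = genes present in every strain; distributed = all other genes.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set(), set()
    supragenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    distributed = supragenome - core
    return supragenome, core, distributed


# Hypothetical gene content of three isolates.
strains = {
    "strain_A": {"gene1", "gene2", "gene3"},
    "strain_B": {"gene1", "gene2", "gene4"},
    "strain_C": {"gene1", "gene2", "gene5"},
}
supragenome, core, distributed = partition_supragenome(strains)
```

In this toy input the core set is gene1 and gene2, and the distributed set holds the three strain-specific genes.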
Genetic transformation and gene silencing mediated by multiple copies of a transgene in eastern white pine.

Science.gov (United States)

Tang, Wei; Newton, Ronald J; Weidner, Douglas A

2007-01-01

An efficient transgenic eastern white pine (Pinus strobus L.) plant regeneration system has been established using Agrobacterium tumefaciens strain GV3850-mediated transformation and the green fluorescent protein (gfp) gene as a reporter in this investigation. Stable integration of transgenes in the plant genome of pine was confirmed by polymerase chain reaction (PCR), Southern blot, and northern blot analyses. Transgene expression was analysed in pine T-DNA transformants carrying different numbers of copies of T-DNA insertions. Post-transcriptional gene silencing (PTGS) was mostly obtained in transgenic lines with more than three copies of T-DNA, but not in transgenic lines with one copy of T-DNA. In situ hybridization chromosome analysis of transgenic lines demonstrated that silenced transgenic lines had two or more T-DNA insertions in the same chromosome. These results suggest that two or more T-DNA insertions in the same chromosome facilitate efficient gene silencing in transgenic pine cells expressing green fluorescent protein. There were no differences in shoot differentiation and development between transgenic lines with multiple T-DNA copies and transgenic lines with one or two T-DNA copies.
PoCos: Population Covering Locus Sets for Risk Assessment in Complex Diseases.

Directory of Open Access Journals (Sweden)

Marzieh Ayati

2016-11-01

Susceptibility loci identified by GWAS generally account for a limited fraction of heritability. Predictive models based on identified loci also have modest success in risk assessment and therefore are of limited practical use. Many methods have been developed to overcome these limitations by incorporating prior biological knowledge. However, most of the information utilized by these methods is at the level of genes, limiting analyses to variants that are in or proximate to coding regions. We propose a new method that integrates protein-protein interaction (PPI) as well as expression quantitative trait loci (eQTL) data to identify sets of functionally related loci that are collectively associated with a trait of interest. We call such sets of loci "population covering locus sets" (PoCos). The contributions of the proposed approach are three-fold: (1) We consider all possible genotype models for each locus, thereby enabling identification of combinatorial relationships between multiple loci. (2) We develop a framework for the integration of PPI and eQTL into a heterogeneous network model, enabling efficient identification of functionally related variants that are associated with the disease. (3) We develop a novel method to integrate the genotypes of multiple loci in a PoCo into a representative genotype to be used in risk assessment. We test the proposed framework in the context of risk assessment for seven complex diseases: type 1 diabetes (T1D), type 2 diabetes (T2D), psoriasis (PS), bipolar disorder (BD), coronary artery disease (CAD), hypertension (HT), and multiple sclerosis (MS). Our results show that the proposed method significantly outperforms individual variant based risk assessment models as well as the state-of-the-art polygenic score. We also show that incorporation of eQTL data improves the performance of identified PoCos in risk assessment. We also assess the biological relevance of PoCos for three diseases that have similar biological mechanisms
Meta-analysis methods for combining multiple expression profiles: comparisons, statistical characterization and an application guideline.

Science.gov (United States)

Chang, Lun-Ching; Lin, Hui-Min; Sibille, Etienne; Tseng, George C

2013-12-21

As high-throughput genomic technologies become accurate and affordable, an increasing number of data sets have been accumulated in the public domain and genomic information integration and meta-analysis have become routine in biomedical research. In this paper, we focus on microarray meta-analysis, where multiple microarray studies with relevant biological hypotheses are combined in order to improve candidate marker detection. Many methods have been developed and applied in the literature, but their performance and properties have only been minimally investigated. There is currently no clear conclusion or guideline as to the proper choice of a meta-analysis method given an application; the decision essentially requires both statistical and biological considerations. We applied 12 microarray meta-analysis methods to combine multiple simulated expression profiles; these methods can be categorized by the hypothesis setting they target: (1) HS(A): DE genes with non-zero effect sizes in all studies, (2) HS(B): DE genes with non-zero effect sizes in one or more studies and (3) HS(r): DE genes with non-zero effect in a "majority" of studies. We then performed a comprehensive comparative analysis through six large-scale real applications using four quantitative statistical evaluation criteria: detection capability, biological association, stability and robustness. We elucidated the hypothesis settings behind the methods and further applied multi-dimensional scaling (MDS) and an entropy measure to characterize the meta-analysis methods and data structure, respectively. The aggregated results from the simulation study categorized the 12 methods into three hypothesis settings (HS(A), HS(B), and HS(r)). Evaluation in real data and results from MDS and entropy analyses provided an insightful and practical guideline to the choice of the most suitable method in a given application. All source files for simulation and real data are available on the author's publication website.
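
The difference between the HS(A) and HS(B) settings can be made concrete with two classical p-value combination rules. The abstract does not name the 12 methods, so the choice here is an assumption made for illustration: Fisher's method stands in for an HS(B)-type test (evidence in at least one study is enough), and the maximum p-value rule stands in for an HS(A)-type test (every study must show an effect). Both functions assume independent p-values strictly between 0 and 1, and the function names are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p) ~ chi-square with 2k df under H0.

    For even degrees of freedom the chi-square tail has a closed form,
    so no statistics library is needed.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    tail_sum = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail_sum


def max_p_combined_p(p_values):
    """Maximum p-value rule: P(max <= m) = m ** k for k independent uniforms."""
    return max(p_values) ** len(p_values)


# Hypothetical gene: strongly significant in one study, null in the other.
study_p_values = [0.01, 0.8]
p_fisher = fisher_combined_p(study_p_values)
p_max = max_p_combined_p(study_p_values)
```

For this gene Fisher's method gives a combined p-value below 0.05 while the maximum p-value rule gives 0.64, which is the kind of disagreement that makes the choice of hypothesis setting matter.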
SET oncoprotein accumulation regulates transcription through DNA demethylation and histone hypoacetylation.

Science.gov (United States)

Almeida, Luciana O; Neto, Marinaldo P C; Sousa, Lucas O; Tannous, Maryna A; Curti, Carlos; Leopoldino, Andreia M

2017-04-18

Epigenetic modifications are essential in the control of normal cellular processes and cancer development. DNA methylation and histone acetylation are major epigenetic modifications involved in gene transcription and abnormal events driving the oncogenic process. SET protein accumulates in many cancer types, including head and neck squamous cell carcinoma (HNSCC); SET is a member of the INHAT complex that inhibits gene transcription associating with histones and preventing their acetylation. We explored how SET protein accumulation impacts on the regulation of gene expression, focusing on DNA methylation and histone acetylation. DNA methylation profile of 24 tumour suppressors evidenced that SET accumulation decreased DNA methylation in association with loss of 5-methylcytidine, formation of 5-hydroxymethylcytosine and increased TET1 levels, indicating an active DNA demethylation mechanism. However, the expression of some suppressor genes was lowered in cells with high SET levels, suggesting that loss of methylation is not the main mechanism modulating gene expression. SET accumulation also downregulated the expression of 32 genes of a panel of 84 transcription factors, and SET directly interacted with chromatin at the promoter of the downregulated genes, decreasing histone acetylation. Gene expression analysis after cell treatment with 5-aza-2'-deoxycytidine (5-AZA) and Trichostatin A (TSA) revealed that histone acetylation reversed transcription repression promoted by SET. These results suggest a new function for SET in the regulation of chromatin dynamics. In addition, TSA diminished both SET protein levels and SET capability to bind to gene promoter, suggesting that administration of epigenetic modifier agents could be efficient to reverse SET phenotype in cancer.
[Double mutant alleles in the EXT1 gene not previously reported in a teenager with hereditary multiple exostoses].

Science.gov (United States)

Cammarata-Scalisi, Francisco; Cozar, Mónica; Grinberg, Daniel; Balcells, Susana; Asteggiano, Carla G; Martínez-Domenech, Gustavo; Bracho, Ana; Sánchez, Yanira; Stock, Frances; Delgado-Luengo, Wilmer; Zara-Chirinos, Carmen; Chacín, José Antonio

2015-04-01

Hereditary forms of multiple exostoses, now called EXT1/EXT2-CDG within Congenital Disorders of Glycosylation, are the most common benign bone tumors in humans. Clinically, they consist of several cartilage-capped bone tumors, usually benign and localized in the juxta-epiphyseal region of long bones, although wide body dissemination in severe cases is not uncommon. Onset of the disease is variable, ranging from 2-3 years up to 13-15 years, with an estimated incidence ranging from 1/18,000 to 1/50,000 cases in European countries. We present double mutant alleles in the EXT1 gene, not previously reported, in a teenager and her family with hereditary multiple exostoses.
Regeneration of multiple shoots from transgenic potato events facilitates the recovery of phenotypically normal lines: assessing a cry9Aa2 gene conferring insect resistance

Directory of Open Access Journals (Sweden)

Jacobs Jeanne ME

2011-10-01

Background: The recovery of high performing transgenic lines in clonal crops is limited by the occurrence of somaclonal variation during the tissue culture phase of transformation. This is usually circumvented by developing large populations of transgenic lines, each derived from the first shoot to regenerate from each transformation event. This study investigates a new strategy of assessing multiple shoots independently regenerated from different transformed cell colonies of potato (Solanum tuberosum L.). Results: A modified cry9Aa2 gene, under the transcriptional control of the CaMV 35S promoter, was transformed into four potato cultivars by Agrobacterium-mediated gene transfer, with an nptII gene conferring kanamycin resistance as the selectable marker. Following gene transfer, 291 transgenic lines were grown in greenhouse experiments to assess somaclonal variation and resistance to potato tuber moth (PTM), Phthorimaea operculella (Zeller). Independently regenerated lines were recovered from many transformed cell colonies and Southern analysis confirmed whether they were derived from the same transformed cell. Multiple lines regenerated from the same transformed cell exhibited a similar response to PTM, but frequently exhibited a markedly different spectrum of somaclonal variation. Conclusions: A new strategy for the genetic improvement of clonal crops involves the regeneration and evaluation of multiple shoots from each transformation event to facilitate the recovery of phenotypically normal transgenic lines. Most importantly, regenerated lines exhibiting the phenotypic appearance most similar to the parental cultivar are not necessarily derived from the first shoot regenerated from a transformed cell colony, but can frequently be a later regeneration event.
Identification of a set of endogenous reference genes for miRNA expression studies in Parkinson's disease blood samples.

Science.gov (United States)

Serafin, Alice; Foco, Luisa; Blankenburg, Hagen; Picard, Anne; Zanigni, Stefano; Zanon, Alessandra; Pramstaller, Peter P; Hicks, Andrew A; Schwienbacher, Christine

2014-10-10

Research on microRNAs (miRNAs) is becoming an increasingly attractive field, as these small RNA molecules are involved in several physiological functions and diseases. To date, only a few studies have assessed the expression of blood miRNAs related to Parkinson's disease (PD) using microarray and quantitative real-time PCR (qRT-PCR). Measuring miRNA expression involves normalization of qRT-PCR data using endogenous reference genes for calibration, but their choice remains a delicate problem with a serious impact on the resulting expression levels. The aim of the present study was to evaluate the suitability of a set of commonly used small RNAs as normalizers and to identify which of them might be considered reliable reference genes in qRT-PCR expression analyses on PD blood samples. The commonly used reference genes snoRNA RNU24, snRNA RNU6B, snoRNA Z30 and miR-103a-3p were selected from the literature. We then analyzed the effect of using these genes as references, alone or in any possible combination, on the measured expression levels of the target genes miR-30b-5p and miR-29a-3p, which have been previously reported to be deregulated in PD blood samples. We identified RNU24 and Z30 as a reliable and stable pair of reference genes in PD blood samples.
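As an illustration of the reference-gene normalization described in the record above, the following Python sketch applies the widely used 2^-ddCt calculation with RNU24 and Z30 as the reference pair. The Ct values, the function names, and the assumption of 100% amplification efficiency are hypothetical and are not taken from the study. Averaging the reference Ct values corresponds to taking the geometric mean of the reference quantities.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Normalize a target Ct against the mean Ct of the chosen reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(case, control, target, references):
    """Fold change of `target` in case vs control (2^-ddCt, assumes 100% efficiency)."""
    d_case = delta_ct(case[target], [case[r] for r in references])
    d_ctrl = delta_ct(control[target], [control[r] for r in references])
    return 2 ** -(d_case - d_ctrl)


# Hypothetical Ct values, for illustration only (not data from the study).
pd_sample = {"miR-30b-5p": 27.0, "RNU24": 22.0, "Z30": 24.0}
ctrl_sample = {"miR-30b-5p": 26.0, "RNU24": 22.0, "Z30": 24.0}

fold = relative_expression(pd_sample, ctrl_sample, "miR-30b-5p", ["RNU24", "Z30"])
print(fold)
```

With these made-up values the target appears at half the control level. Passing a different list of references changes the result, which is why the choice of normalizers matters.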
Development of new USER-based cloning vectors for multiple genes expression in Saccharomyces cerevisiae

DEFF Research Database (Denmark)

Kildegaard, Kanchana Rueksomtawin; Jensen, Niels Bjerg; Maury, Jerome

2013-01-01

auxotrophic and dominant markers for convenience of use. Our vector set also contains both integrating and multicopy vectors for stability of protein expression and high expression level. We will make the new vector system available to the yeast community and provide a comprehensive protocol for cloning...... the production strain with the proper phenotype and product yield. However, the sequential number of metabolic engineering is time-consuming. Furthermore, the number of available selectable markers is also limiting the number of genetic modifications. To overcome these limitations, we have developed a new set...... of shuttle vectors for convenience of use for high-throughput cloning and selectable marker recycling. The new USER-based cloning vectors consist of a unique USER site and a CRE-loxP-mediated marker recycling system. The USER site allows insertion of genes of interest along with a bidirectional promoter...
Sporulation genes associated with sporulation efficiency in natural isolates of yeast.

Science.gov (United States)

Tomar, Parul; Bhatia, Aatish; Ramdas, Shweta; Diao, Liyang; Bhanot, Gyan; Sinha, Himanshu

2013-01-01

Yeast sporulation efficiency is a quantitative trait and is known to vary among experimental populations and natural isolates. Some studies have uncovered the genetic basis of this variation and have identified the role of sporulation genes (IME1, RME1) and sporulation-associated genes (FKH2, PMS1, RAS2, RSF1, SWS2), as well as non-sporulation pathway genes (MKT1, TAO3) in maintaining this variation. However, these studies have been done mostly in experimental populations. Sporulation is a response to nutrient deprivation. Unlike laboratory strains, natural isolates have likely undergone multiple selections for quick adaptation to varying nutrient conditions. As a result, sporulation efficiency in natural isolates may have different genetic factors contributing to phenotypic variation. Using Saccharomyces cerevisiae strains in the genetically and environmentally diverse SGRP collection, we have identified genetic loci associated with sporulation efficiency variation in a set of sporulation and sporulation-associated genes. Using two independent methods for association mapping and correcting for population structure biases, our analysis identified two linked clusters containing 4 non-synonymous mutations in genes - HOS4, MCK1, SET3, and SPO74. Five regulatory polymorphisms in five genes such as MLS1 and CDC10 were also identified as putative candidates. Our results provide candidate genes contributing to phenotypic variation in the sporulation efficiency of natural isolates of yeast.
Multiple Cytochrome P450 genes: their constitutive overexpression and permethrin induction in insecticide resistant mosquitoes, Culex quinquefasciatus.

Science.gov (United States)

Liu, Nannan; Li, Ting; Reid, William R; Yang, Ting; Zhang, Lee

2011-01-01

Four cytochrome P450 cDNAs, CYP6AA7, CYP9J40, CYP9J34, and CYP9M10, were isolated from the mosquito Culex quinquefasciatus. The P450 gene expression and induction by permethrin were compared for three different mosquito populations bearing different resistance phenotypes, ranging from susceptible (S-Lab), through intermediate (HAmCq(G0), the field parental population) to highly resistant (HAmCq(G8), the 8th generation of permethrin-selected offspring of HAmCq(G0)). A strong correlation was found for P450 gene expression with the levels of resistance and following permethrin selection at the larval stage of mosquitoes, with the highest expression levels identified in HAmCq(G8), suggesting the importance of CYP6AA7, CYP9J40, CYP9J34, and CYP9M10 in the permethrin resistance of larval mosquitoes. Only CYP6AA7 showed a significant overexpression in HAmCq(G8) adult mosquitoes. Other P450 genes had similar expression levels among the mosquito populations tested, suggesting different P450 genes may be involved in the response to insecticide pressure in different developmental stages. The expression of CYP6AA7, CYP9J34, and CYP9M10 was further induced by permethrin in resistant mosquitoes. Taken together, these results indicate that multiple P450 genes are up-regulated in insecticide resistant mosquitoes through both constitutive overexpression and induction mechanisms, thus increasing the overall expression levels of P450 genes.
Efficient simulation of voxelized phantom in GATE with embedded SimSET multiple photon history generator

Science.gov (United States)

Lin, Hsin-Hon; Chuang, Keh-Shih; Lin, Yi-Hsing; Ni, Yu-Ching; Wu, Jay; Jan, Meei-Ling

2014-10-01

GEANT4 Application for Tomographic Emission (GATE) is a powerful Monte Carlo simulator that combines the advantages of the general-purpose GEANT4 simulation code and the specific software tool implementations dedicated to emission tomography. However, the detailed physical modelling of GEANT4 is highly computationally demanding, especially when tracking particles through voxelized phantoms. To circumvent the relatively slow simulation of voxelized phantoms in GATE, another efficient Monte Carlo code can be used to simulate photon interactions and transport inside a voxelized phantom. The simulation system for emission tomography (SimSET), a dedicated Monte Carlo code for PET/SPECT systems, is well-known for its efficiency in simulation of voxel-based objects. An efficient Monte Carlo workflow integrating GATE and SimSET for simulating pinhole SPECT has been proposed to improve voxelized phantom simulation. Although the workflow achieves a desirable increase in speed, it sacrifices the ability to simulate decaying radioactive sources such as non-pure positron emitters or multiple emission isotopes with complex decay schemes and lacks the modelling of time-dependent processes due to the inherent limitations of the SimSET photon history generator (PHG). Moreover, a large volume of disk storage is needed to store the huge temporal photon history file produced by SimSET that must be transported to GATE. In this work, we developed a multiple photon emission history generator (MPHG) based on SimSET/PHG to support a majority of the medically important positron emitters. We incorporated the new generator codes inside GATE to improve the simulation efficiency of voxelized phantoms in GATE, while eliminating the need for the temporal photon history file. The validation of this new code based on a MicroPET R4 system was conducted for 124I and 18F with mouse-like and rat-like phantoms. Comparison of GATE/MPHG with GATE/GEANT4 indicated there is a slight difference in energy
A novel joint analysis framework improves identification of differentially expressed genes in cross disease transcriptomic analysis

Directory of Open Access Journals (Sweden)

Wenyi Qin

2018-02-01

Motivation: Detecting differentially expressed (DE) genes between disease and normal control groups is one of the most common analyses in genome-wide transcriptomic data. Since most studies do not have many samples, researchers have used meta-analysis to group different datasets for the same disease. Even then, in many cases the statistical power is still not enough. Given that many diseases share the same disease genes, it is desirable to design a statistical framework that can identify diseases’ common and specific DE genes simultaneously to improve the identification power. Results: We developed a novel empirical Bayes based mixture model to identify DE genes in a specific study by leveraging the shared information across multiple different disease expression data sets. The effectiveness of joint analysis was demonstrated through comprehensive simulation studies and two real data applications. The simulation results showed that our method consistently outperformed single data set analysis and two other meta-analysis methods in identification power. In real data analysis, overall our method demonstrated better identification power in detecting DE genes and prioritized more disease related genes and disease related pathways than single data set analysis. Over 150% more disease related genes are identified by our method in application to Huntington’s disease. We expect that our method would provide researchers a new way of utilizing available data sets from different diseases when sample size of the focused disease is limited.
Mapping of multiple criteria for priority setting of health interventions: an aid for decision makers

Directory of Open Access Journals (Sweden)

Tromp Noor

2012-12-01

Background: In rationing decisions in health, many criteria like costs, effectiveness, equity and feasibility concerns play a role. These criteria stem from different disciplines that all aim to inform health care rationing decisions, but a single underlying concept that incorporates all criteria does not yet exist. Therefore, we aim to develop a conceptual mapping of criteria, based on the World Health Organization’s Health Systems Performance and Health Systems Building Blocks frameworks. This map can be an aid to decision makers to identify the relevant criteria for priority setting in their specific context. Methods: We made an inventory of all possible criteria for priority setting on the basis of literature review. We categorized the criteria according to both health system frameworks that spell out a country’s health system goals and input. We reason that the criteria that decision makers use in priority setting exercises are a direct manifestation of this. Results: Our map includes thirty-one criteria that are distributed among five categories that reflect the goals of a health system (i.e. to improve level of health, fair distribution of health, responsiveness, social & financial risk protection and efficiency) and one category that reflects feasibility based on the health system building blocks (i.e. service delivery, health care workforce, information, medical products, vaccines & technologies, financing and leadership/governance). Conclusions: This conceptual mapping of criteria, based on well-established health system frameworks, will further develop the field of priority setting by assisting decision makers in the identification of multiple criteria for selection of health interventions.
A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

Directory of Open Access Journals (Sweden)

Alamar Santiago

2009-09-01

Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new
A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

Science.gov (United States)

Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

2009-01-01

Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new EST collection denotes an
An ancient dental gene set governs development and continuous regeneration of teeth in sharks.

Science.gov (United States)

Rasch, Liam J; Martin, Kyle J; Cooper, Rory L; Metscher, Brian D; Underwood, Charlie J; Fraser, Gareth J

2016-07-15

The evolution of oral teeth is considered a major contributor to the overall success of jawed vertebrates. This is especially apparent in cartilaginous fishes including sharks and rays, which develop elaborate arrays of highly specialized teeth, organized in rows and retain the capacity for life-long regeneration. Perpetual regeneration of oral teeth has been either lost or highly reduced in many other lineages including important developmental model species, so cartilaginous fishes are uniquely suited for deep comparative analyses of tooth development and regeneration. Additionally, sharks and rays can offer crucial insights into the characters of the dentition in the ancestor of all jawed vertebrates. Despite this, tooth development and regeneration in chondrichthyans is poorly understood and remains virtually uncharacterized from a developmental genetic standpoint. Using the emerging chondrichthyan model, the catshark (Scyliorhinus spp.), we characterized the expression of genes homologous to those known to be expressed during stages of early dental competence, tooth initiation, morphogenesis, and regeneration in bony vertebrates. We have found that expression patterns of several genes from Hh, Wnt/β-catenin, Bmp and Fgf signalling pathways indicate deep conservation over ~450 million years of tooth development and regeneration. We describe how these genes participate in the initial emergence of the shark dentition and how they are redeployed during regeneration of successive tooth generations. We suggest that at the dawn of the vertebrate lineage, teeth (i) were most likely continuously regenerative structures, and (ii) utilised a core set of genes from members of key developmental signalling pathways that were instrumental in creating a dental legacy redeployed throughout vertebrate evolution. These data lay the foundation for further experimental investigations utilizing the unique regenerative capacity of chondrichthyan models to answer evolutionary
Gene Cluster Responsible for Secretion of and Immunity to Multiple Bacteriocins, the NKR-5-3 Enterocins

Science.gov (United States)

Ishibashi, Naoki; Himeno, Kohei; Masuda, Yoshimitsu; Perez, Rodney Honrada; Iwatani, Shun; Wilaipun, Pongtep; Leelawatcharamas, Vichien; Nakayama, Jiro; Sonomoto, Kenji

2014-01-01

Enterococcus faecium NKR-5-3, isolated from Thai fermented fish, is characterized by the unique ability to produce five bacteriocins, namely, enterocins NKR-5-3A, -B, -C, -D, and -Z (Ent53A, Ent53B, Ent53C, Ent53D, and Ent53Z). Genetic analysis with a genome library revealed that the bacteriocin structural genes (enkA [ent53A], enkC [ent53C], enkD [ent53D], and enkZ [ent53Z]) that encode these peptides (except for Ent53B) are located in close proximity to each other. This NKR-5-3ACDZ (Ent53ACDZ) enterocin gene cluster (approximately 13 kb long) includes certain bacteriocin biosynthetic genes such as an ABC transporter gene (enkT), two immunity genes (enkIaz and enkIc), a response regulator (enkR), and a histidine protein kinase (enkK). Heterologous-expression studies of enkT and ΔenkT mutant strains showed that enkT is responsible for the secretion of Ent53A, Ent53C, Ent53D, and Ent53Z, suggesting that EnkT is a wide-range ABC transporter that contributes to the effective production of these bacteriocins. In addition, EnkIaz and EnkIc were found to confer self-immunity to the respective bacteriocins. Furthermore, bacteriocin induction assays performed with the ΔenkRK mutant strain showed that EnkR and EnkK are regulatory proteins responsible for bacteriocin production and that, together with Ent53D, they constitute a three-component regulatory system. Thus, the Ent53ACDZ gene cluster is essential for the biosynthesis and regulation of NKR-5-3 enterocins, and this is, to our knowledge, the first report that demonstrates the secretion of multiple bacteriocins by an ABC transporter. PMID:25149515
Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

Directory of Open Access Journals (Sweden)

Qingzhong Liu

Microarray data has a high dimension of variables but available datasets usually have only a small number of samples, thereby making the study of such datasets interesting and challenging. In the task of analyzing microarray data for the purpose of, e.g., predicting gene-disease association, feature selection is very important because it provides a way to handle the high dimensionality by exploiting information redundancy induced by associations among genetic markers. Judicious feature selection in microarray data analysis can result in significant reduction of cost while maintaining or improving the classification or prediction accuracy of learning machines that are employed to sort out the datasets. In this paper, we propose a gene selection method called Recursive Feature Addition (RFA), which combines supervised learning and statistical similarity measures. We compare our method with the following gene selection methods: Support Vector Machine Recursive Feature Elimination (SVMRFE), Leave-One-Out Calculation Sequential Forward Selection (LOOCSFS), and Gradient based Leave-one-out Gene Selection (GLGS). To evaluate the performance of these gene selection methods, we employ several popular learning classifiers on the MicroArray Quality Control phase II on predictive modeling (MAQC-II) breast cancer dataset and the MAQC-II multiple myeloma dataset. Experimental results show that gene selection is strictly paired with the learning classifier. Overall, our approach outperforms other compared methods. The biological functional analysis based on the MAQC-II breast cancer dataset convinced us to apply our method for phenotype prediction. Additionally, learning classifiers also play important roles in the classification of microarray data and our experimental results indicate that the Nearest Mean Scale Classifier (NMSC) is a good choice due to its prediction reliability and its stability across the three performance measurements: Testing accuracy, MCC values, and

Assessment set for evaluation of clinical outcomes in multiple sclerosis: psychometric properties

Directory of Open Access Journals (Sweden)

Rasova K

2012-10-01

Kamila Rasova,1 Patricia Martinkova,2 Jana Vyskotova,3 Michaela Sedova2. 1Department of Rehabilitation, 3rd Faculty of Medicine, Charles University in Prague and Faculty Hospital Královské Vinohrady, Prague, Czech Republic; 2Center of Biomedical Informatics and Department of Medical Informatics and Biostatistics, Institute of Computer Science, AS CR, Prague, Czech Republic; 3Faculty of Medicine, Ostrava University, Ostrava, Czech Republic. Purpose: Multiple sclerosis (MS) manifests itself in a wide range of symptoms. Physiotherapy plays an important role in the treatment of those symptoms connected with mobility. For this therapy to be at its most effective it should be based on a systematic examination that is able to describe and classify damaged clinical functions meaningfully. The purpose of this study was to develop and validate a battery of tests and composite tests that can be used to systematically evaluate clinical features of MS treatable by physiotherapy. Methods: The authors assembled a proposed battery of tests comprising known, standard, and validated assessments (low-contrast letter acuity testing; the Motricity Index; the Modified Ashworth Scale; the Berg Balance Scale; scales of postural reactions, tremor, dysdiadochokinesia, and dysmetria; the Nine-Hole Peg Test; the Timed 25-Foot Walk; and the 3-minute version of the Paced Auditory Serial Addition Test) and one test (knee hyperextension) of the authors’ own. Normalization was calculated and six composite assessments were measured. Seventeen ambulatory subjects with MS were tested twice with the assessment set before undergoing physiotherapy, and 12 were also tested with the assessment set after the physiotherapy. The test–retest reliability, stability, internal consistency of composite measurements, sensitivity to changes after therapy, and correlation between measurements and the Kurtzke Expanded Disability Status Scale score were evaluated for all tests in the assessment set
Inhibition of estrogen-responsive gene activation by the retinoid X receptor beta: evidence for multiple inhibitory pathways.

OpenAIRE

Segars, J H; Marks, M S; Hirschfeld, S; Driggers, P H; Martinez, E; Grippo, J F; Brown, M; Wahli, W; Ozato, K

1993-01-01

The retinoid X receptor beta (RXR beta; H-2RIIBP) forms heterodimers with various nuclear hormone receptors and binds multiple hormone response elements, including the estrogen response element (ERE). In this report, we show that endogenous RXR beta contributes to ERE binding activity in nuclear extracts of the human breast cancer cell line MCF-7. To define a possible regulatory role of RXR beta regarding estrogen-responsive transcription in breast cancer cells, RXR beta and a reporter gene d...
Syntenic block overlap multiplicities with a panel of reference genomes provide a signature of ancient polyploidization events.

Science.gov (United States)

Zheng, Chunfang; Santos Muñoz, Daniella; Albert, Victor A; Sankoff, David

2015-01-01

Following whole genome duplication (WGD), there is a compact distribution of gene similarities within the genome reflecting duplicate pairs of all the genes in the genome. With time, the distribution broadens and loses volume due to variable decay of duplicate gene similarity and to the process of duplicate gene loss. If there are two WGD, the older one becomes so reduced and broad that it merges with the tail of the distributions resulting from more recent events, and it becomes difficult to distinguish them. The goal of this paper is to advance statistical methods of identifying, or at least counting, the WGD events in the lineage of a given genome. For a set of 15 angiosperm genomes, we analyze all 15 × 14 = 210 ordered pairs of target genome versus reference genome, using SynMap to find syntenic blocks. We consider all sets of B ≥ 2 syntenic blocks in the target genome that overlap in the reference genome as evidence of WGD activity in the target, whether it be one event or several. We hypothesize that in fitting an exponential function to the tail of the empirical distribution f (B) of block multiplicities, the size of the exponent will reflect the amount of WGD in the history of the target genome. By amalgamating the results from all reference genomes, a range of values of SynMap parameters, and alternative cutoff points for the tail, we find a clear pattern whereby multiple-WGD core eudicots have the smallest (negative) exponents, followed by core eudicots with only the single "γ" triplication in their history, followed by a non-core eudicot with a single WGD, followed by the monocots, with a basal angiosperm, the WGD-free Amborella having the largest exponent. The hypothesis that the exponent of the fit to the tail of the multiplicity distribution is a signature of the amount of WGD is verified, but there is also a clear complicating factor in the monocot clade, where a history of multiple WGD is not reflected in a small exponent.
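The tail-fitting idea in the record above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the exponential was fitted, so the sketch uses an ordinary least-squares line through log f(B) for B at or above a chosen cutoff, and the function name, the cutoff value, and the synthetic counts are all hypothetical.

```python
import math


def fit_tail_exponent(multiplicity_counts, cutoff):
    """Least-squares fit of log f(B) = a + b*B over B >= cutoff; returns the exponent b."""
    points = [(b, math.log(n)) for b, n in multiplicity_counts.items()
              if b >= cutoff and n > 0]
    if len(points) < 2:
        raise ValueError("need at least two tail points to fit a line")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Synthetic counts following f(B) = 1000 * exp(-0.5 * B); not data from the paper.
counts = {b: 1000 * math.exp(-0.5 * b) for b in range(2, 10)}
exponent = fit_tail_exponent(counts, cutoff=3)
print(round(exponent, 6))
```

On real block-multiplicity counts the fit would be repeated for each reference genome, parameter setting, and tail cutoff, and the resulting exponents compared across target genomes, as the abstract describes.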
Gene selection for the reconstruction of stem cell differentiation trees: a linear programming approach.

Science.gov (United States)

Ghadie, Mohamed A; Japkowicz, Nathalie; Perkins, Theodore J

2015-08-15

Stem cell differentiation is largely guided by master transcriptional regulators, but it also depends on the expression of other types of genes, such as cell cycle genes, signaling genes, metabolic genes, trafficking genes, etc. Traditional approaches to understanding gene expression patterns across multiple conditions, such as principal components analysis or K-means clustering, can group cell types based on gene expression, but they do so without knowledge of the differentiation hierarchy. Hierarchical clustering can organize cell types into a tree, but in general this tree is different from the differentiation hierarchy itself. Given the differentiation hierarchy and gene expression data at each node, we construct a weighted Euclidean distance metric such that the minimum spanning tree with respect to that metric is precisely the given differentiation hierarchy. We provide a set of linear constraints that are provably sufficient for the desired construction and a linear programming approach to identify sparse sets of weights, effectively identifying genes that are most relevant for discriminating different parts of the tree. We apply our method to microarray gene expression data describing 38 cell types in the hematopoiesis hierarchy, constructing a weighted Euclidean metric that uses just 175 genes. However, we find that there are many alternative sets of weights that satisfy the linear constraints. Thus, in the style of random-forest training, we also construct metrics based on random subsets of the genes and compare them to the metric of 175 genes. We then report on the selected genes and their biological functions. Our approach offers a new way to identify genes that may have important roles in stem cell differentiation. tperkins@ohri.ca Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
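To make the construction in the record above concrete, the following Python sketch checks whether a given set of gene weights yields a weighted Euclidean metric whose minimum spanning tree equals a given hierarchy. It is a toy illustration, not the authors' code: the linear programming step that finds sparse weights is omitted, the minimum spanning tree is computed with Prim's algorithm, and the cell types, expression values, and weights are hypothetical.

```python
import math


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance between two expression profiles."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def minimum_spanning_tree(profiles, weights):
    """Prim's algorithm; returns the tree as a set of undirected (frozenset) edges."""
    names = list(profiles)
    in_tree = {names[0]}
    edges = set()
    while len(in_tree) < len(names):
        # Cheapest edge from a node inside the tree to a node outside it.
        _, u, v = min(
            (weighted_distance(profiles[u], profiles[v], weights), u, v)
            for u in in_tree for v in names if v not in in_tree
        )
        in_tree.add(v)
        edges.add(frozenset((u, v)))
    return edges


# Toy data: three hypothetical cell types measured on two hypothetical genes.
profiles = {
    "stem": (0.0, 0.0),
    "progenitor": (1.0, 3.0),
    "mature": (2.0, 0.5),
}
hierarchy = {frozenset(("stem", "progenitor")), frozenset(("progenitor", "mature"))}

uniform = minimum_spanning_tree(profiles, (1.0, 1.0))
reweighted = minimum_spanning_tree(profiles, (1.0, 0.0))
print(uniform == hierarchy, reweighted == hierarchy)
```

In this toy case equal weights link "stem" directly to "mature", whereas putting all weight on the first gene recovers the intended hierarchy. A zero weight drops a gene from the metric, which is the sense in which sparse weights select genes.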
How Settings Change People: Applying Behavior Setting Theory to Consumer-Run Organizations

Science.gov (United States)

Brown, Louis D.; Shepherd, Matthew D.; Wituk, Scott A.; Meissen, Greg

2007-01-01

Self-help initiatives stand as a classic context for organizational studies in community psychology. Behavior setting theory stands as a classic conception of organizations and the environment. This study explores both, applying behavior setting theory to consumer-run organizations (CROs). Analysis of multiple data sets from all CROs in Kansas…
PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets.

Science.gov (United States)

Martínez-Bartolomé, Salvador; Medina-Aunon, J Alberto; López-García, Miguel Ángel; González-Tejedo, Carmen; Prieto, Gorka; Navajas, Rosana; Salazar-Donate, Emilio; Fernández-Costa, Carolina; Yates, John R; Albar, Juan Pablo

2018-04-06

Mass-spectrometry-based proteomics has evolved into a high-throughput technology in which numerous large-scale data sets are generated from diverse analytical platforms. Furthermore, several scientific journals and funding agencies have emphasized the storage of proteomics data in public repositories to facilitate its evaluation, inspection, and reanalysis (1). As a consequence, public proteomics data repositories are growing rapidly. However, tools are needed to integrate multiple proteomics data sets to compare different experimental features or to perform quality control analysis. Here, we present a new Java stand-alone tool, Proteomics Assay COMparator (PACOM), that is able to import, combine, and simultaneously compare numerous proteomics experiments to check the integrity of the proteomic data as well as verify data quality. With PACOM, the user can detect sources of error that may have been introduced in any step of a proteomics workflow and that influence the final results. Data sets can be easily compared and integrated, and data quality and reproducibility can be visually assessed through a rich set of graphical representations of proteomics data features as well as a wide variety of data filters. Its flexibility and easy-to-use interface make PACOM a unique tool for daily use in a proteomics laboratory. PACOM is available at https://github.com/smdb21/pacom .
Reprogramming of gene expression during compression wood formation in pine: Coordinated modulation of S-adenosylmethionine, lignin and lignan related genes

Science.gov (United States)

2012-01-01

Background: Transcript profiling of differentiating secondary xylem has allowed us to draw a general picture of the genes involved in wood formation. However, our knowledge is still limited about the regulatory mechanisms that coordinate and modulate the different pathways providing substrates during xylogenesis. The development of compression wood in conifers constitutes an exceptional model for these studies. Although differential expression of a few genes in differentiating compression wood compared to normal or opposite wood has been reported, the broad range of features that distinguish this reaction wood suggests that the expression of a larger set of genes would be modified. Results: By combining the construction of different cDNA libraries with microarray analyses we have identified a total of 496 genes in maritime pine (Pinus pinaster, Ait.) that change in expression during differentiation of compression wood (331 up-regulated and 165 down-regulated compared to opposite wood). Samples from different provenances collected in different years and geographic locations were integrated into the analyses to mitigate the effects of multiple sources of variability. This strategy allowed us to define a group of genes that are consistently associated with compression wood formation. Correlating with the deposition of a thicker secondary cell wall that characterizes compression wood development, the expression of a number of genes involved in synthesis of cellulose, hemicellulose, lignin and lignans was up-regulated. Further analysis of a set of these genes involved in S-adenosylmethionine metabolism, ammonium recycling, and lignin and lignan biosynthesis showed changes in expression levels in parallel to the levels of lignin accumulation in cells undergoing xylogenesis in vivo and in vitro. Conclusions: The comparative transcriptomic analysis reported here has revealed a broad spectrum of coordinated transcriptional modulation of genes involved in biosynthesis of
Reprogramming of gene expression during compression wood formation in pine: Coordinated modulation of S-adenosylmethionine, lignin and lignan related genes

Directory of Open Access Journals (Sweden)

Villalobos David P

2012-06-01

Background: Transcript profiling of differentiating secondary xylem has allowed us to draw a general picture of the genes involved in wood formation. However, our knowledge is still limited about the regulatory mechanisms that coordinate and modulate the different pathways providing substrates during xylogenesis. The development of compression wood in conifers constitutes an exceptional model for these studies. Although differential expression of a few genes in differentiating compression wood compared to normal or opposite wood has been reported, the broad range of features that distinguish this reaction wood suggests that the expression of a larger set of genes would be modified. Results: By combining the construction of different cDNA libraries with microarray analyses we have identified a total of 496 genes in maritime pine (Pinus pinaster, Ait.) that change in expression during differentiation of compression wood (331 up-regulated and 165 down-regulated compared to opposite wood). Samples from different provenances collected in different years and geographic locations were integrated into the analyses to mitigate the effects of multiple sources of variability. This strategy allowed us to define a group of genes that are consistently associated with compression wood formation. Correlating with the deposition of a thicker secondary cell wall that characterizes compression wood development, the expression of a number of genes involved in synthesis of cellulose, hemicellulose, lignin and lignans was up-regulated. Further analysis of a set of these genes involved in S-adenosylmethionine metabolism, ammonium recycling, and lignin and lignan biosynthesis showed changes in expression levels in parallel to the levels of lignin accumulation in cells undergoing xylogenesis in vivo and in vitro. Conclusions: The comparative transcriptomic analysis reported here has revealed a broad spectrum of coordinated transcriptional modulation of genes
Setting conservation priorities.

Science.gov (United States)

Wilson, Kerrie A; Carwardine, Josie; Possingham, Hugh P

2009-04-01

A generic framework for setting conservation priorities based on the principles of classic decision theory is provided. This framework encapsulates the key elements of any problem, including the objective, the constraints, and knowledge of the system. Within the context of this framework the broad array of approaches for setting conservation priorities are reviewed. While some approaches prioritize assets or locations for conservation investment, it is concluded here that prioritization is incomplete without consideration of the conservation actions required to conserve the assets at particular locations. The challenges associated with prioritizing investments through time in the face of threats (and also spatially and temporally heterogeneous costs) can be aided by proper problem definition. Using the authors' general framework for setting conservation priorities, multiple criteria can be rationally integrated and where, how, and when to invest conservation resources can be scheduled. Trade-offs are unavoidable in priority setting when there are multiple considerations, and budgets are almost always finite. The authors discuss how trade-offs, risks, uncertainty, feedbacks, and learning can be explicitly evaluated within their generic framework for setting conservation priorities. Finally, they suggest ways that current priority-setting approaches may be improved.
Platform dependence of inference on gene-wise and gene-set involvement in human lung development

Directory of Open Access Journals (Sweden)

Kho Alvin T

2009-06-01

Background: With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression Bead Chips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age. Results: We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that despite overall cross-platform differences at the single gene level, increased correlation between the two platforms was found in genes with higher expression level, higher probe overlap, and lower p-value. We also demonstrated that biological function as determined via KEGG pathways or GO categories is more consistent across platforms than single gene analysis. Conclusion: We conclude that while the comparability of the platforms at the single gene level may be increased by increasing sample size, they are highly comparable ontologically even for subtle differences in a relatively small sample size. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.
Multiple zebrafish atoh1 genes specify a diversity of neuronal types in the zebrafish cerebellum.

Science.gov (United States)

Kidwell, Chelsea U; Su, Chen-Ying; Hibi, Masahiko; Moens, Cecilia B

2018-06-01

A single Atoh1 basic-helix-loop-helix transcription factor specifies multiple neuron types in the mammalian cerebellum and anterior hindbrain. The zebrafish genome encodes three paralogous atoh1 genes whose functions in cerebellum and anterior hindbrain development we explore here. With use of a transgenic reporter, we report that zebrafish atoh1c-expressing cells are organized in two distinct domains that are separated both by space and developmental time. An early isthmic expression domain gives rise to an extracerebellar population in rhombomere 1 and an upper rhombic lip domain gives rise to granule cell progenitors that migrate to populate all four granule cell territories of the fish cerebellum. Using genetic mutants we find that of the three zebrafish atoh1 paralogs, atoh1c and atoh1a are required for the full complement of granule neurons. Surprisingly, the two genes are expressed in non-overlapping granule cell progenitor populations, indicating that fish use duplicate atoh1 genes to generate granule cell diversity that is not detected in mammals. Finally, live imaging of granule cell migration in wildtype and atoh1c mutant embryos reveals that while atoh1c is not required for granule cell specification per se, it is required for granule cells to delaminate and migrate away from the rhombic lip. Copyright © 2018 Elsevier Inc. All rights reserved.
Candidate genes for COPD in two large data sets.

Science.gov (United States)

Bakke, P S; Zhu, G; Gulsvik, A; Kong, X; Agusti, A G N; Calverley, P M A; Donner, C F; Levy, R D; Make, B J; Paré, P D; Rennard, S I; Vestbo, J; Wouters, E F M; Anderson, W; Lomas, D A; Silverman, E K; Pillai, S G

2011-02-01

Lack of reproducibility of findings has been a criticism of genetic association studies on complex diseases, such as chronic obstructive pulmonary disease (COPD). We selected 257 polymorphisms of 16 genes with reported or potential relationships to COPD and genotyped these variants in a case-control study that included 953 COPD cases and 956 control subjects. We explored the association of these polymorphisms to three COPD phenotypes: a COPD binary phenotype and two quantitative traits (post-bronchodilator forced expiratory volume in 1 s (FEV₁) % predicted and FEV₁/forced vital capacity (FVC)). The polymorphisms significantly associated to these phenotypes in this first study were tested in a second, family-based study that included 635 pedigrees with 1,910 individuals. Significant associations to the binary COPD phenotype in both populations were seen for STAT1 (rs13010343) and NFKBIB/SIRT2 (rs2241704) (p<0.05). Single-nucleotide polymorphisms rs17467825 and rs1155563 of the GC gene were significantly associated with FEV₁ % predicted and FEV₁/FVC, respectively, in both populations (p<0.05). This study has replicated associations to COPD phenotypes in the STAT1, NFKBIB/SIRT2 and GC genes in two independent populations, the associations of the former two genes representing novel findings.
An 80-gene set to predict response to preoperative chemoradiotherapy for rectal cancer by principle component analysis.

Science.gov (United States)

Empuku, Shinichiro; Nakajima, Kentaro; Akagi, Tomonori; Kaneko, Kunihiko; Hijiya, Naoki; Etoh, Tsuyoshi; Shiraishi, Norio; Moriyama, Masatsugu; Inomata, Masafumi

2016-05-01

Preoperative chemoradiotherapy (CRT) for locally advanced rectal cancer not only improves the postoperative local control rate, but also induces downstaging. However, it has not been established how to individually select patients who receive effective preoperative CRT. The aim of this study was to identify a predictor of response to preoperative CRT for locally advanced rectal cancer. This study is additional to our multicenter phase II study evaluating the safety and efficacy of preoperative CRT using oral fluorouracil (UMIN ID: 03396). From April 2009 to August 2011, 26 biopsy specimens obtained prior to CRT were analyzed by cyclopedic microarray analysis. Response to CRT was evaluated according to a histological grading system using surgically resected specimens. To decide on the number of genes for dividing into responder and non-responder groups, we statistically analyzed the data using a dimension reduction method, principal component analysis. Of the 26 cases, 11 were responders and 15 non-responders. No significant difference was found in clinical background data between the two groups. We determined that the optimal number of genes for the prediction of response was 80 of 40,000 and the functions of these genes were analyzed. When comparing non-responders with responders, genes expressed at a high level functioned in alternative splicing, whereas those expressed at a low level functioned in the septin complex. Thus, an 80-gene expression set that predicts response to preoperative CRT for locally advanced rectal cancer was identified using a novel statistical method.
Causal relationship between the AHSG gene and BMD through fetuin-A and BMI: multiple mediation analysis.

Science.gov (United States)

Sritara, C; Thakkinstian, A; Ongphiphadhanakul, B; Chailurkit, L; Chanprasertyothin, S; Ratanachaiwong, W; Vathesatogkit, P; Sritara, P

2014-05-01

Using mediation analysis, a causal relationship between the AHSG gene and bone mineral density (BMD) through fetuin-A and body mass index (BMI) mediators was suggested. Fetuin-A, a multifunctional protein of hepatic origin, is associated with bone mineral density. It is unclear if this association is causal. This study aimed at clarification of this issue. A cross-sectional study was conducted among 1,741 healthy workers from the Electricity Generating Authority of Thailand (EGAT) cohort. The alpha-2-Heremans-Schmid glycoprotein (AHSG) rs2248690 gene was genotyped. Three mediation models were constructed using seemingly unrelated regression analysis. First, the ln[fetuin-A] group was regressed on the AHSG gene. Second, the BMI group was regressed on the AHSG gene and the ln[fetuin-A] group. Finally, the BMD model was constructed by fitting BMD on two mediators (ln[fetuin-A] and BMI) and the independent AHSG variable. All three analyses were adjusted for confounders. The prevalence of the minor T allele for the AHSG locus was 15.2%. The AHSG locus was highly related to serum fetuin-A levels (P Multiple mediation analyses showed that AHSG was significantly associated with BMD through the ln[fetuin-A] and BMI pathway, with beta coefficients of 0.0060 (95% CI 0.0038, 0.0083) and 0.0030 (95% CI 0.0020, 0.0045) at the total hip and lumbar spine, respectively. About 27.3 and 26.0% of total genetic effects on hip and spine BMD, respectively, were explained by the mediation effects of fetuin-A and BMI. Our study suggested evidence of a causal relationship between the AHSG gene and BMD through fetuin-A and BMI mediators.
Selection and validation of a set of reliable reference genes for quantitative RT-PCR studies in the brain of the Cephalopod Mollusc Octopus vulgaris

Directory of Open Access Journals (Sweden)

Biffali Elio

2009-07-01

Background: Quantitative real-time polymerase chain reaction (RT-qPCR) is valuable for studying the molecular events underlying physiological and behavioral phenomena. Normalization of real-time PCR data is critical for a reliable mRNA quantification. Here we identify reference genes to be utilized in RT-qPCR experiments to normalize and monitor the expression of target genes in the brain of the cephalopod mollusc Octopus vulgaris, an invertebrate. Such an approach is novel for this taxon and of advantage in future experiments given the complexity of the behavioral repertoire of this species when compared with its relatively simple neural organization. Results: We chose 16S and 18S rRNA, actB, EEF1A, tubA and ubi as candidate reference genes (housekeeping genes, HKG). The expression of 16S and 18S was highly variable and did not meet the requirements of candidate HKG. The expression of the other genes was almost stable and uniform among samples. We analyzed the expression of HKG in two different sets of animals using tissues taken from the central nervous system (brain parts) and mantle (here considered as control tissue) by BestKeeper, geNorm and NormFinder. We found that HKG expressions differed considerably with respect to brain area and octopus samples in an HKG-specific manner. However, when the mantle is treated as control tissue and the entire central nervous system is considered, NormFinder revealed tubA and ubi as the most suitable HKG pair. These two genes were utilized to evaluate the relative expression of the genes FoxP, creb, dat and TH in O. vulgaris. Conclusion: We analyzed the expression profiles of some genes here identified for O. vulgaris by applying RT-qPCR analysis for the first time in cephalopods. We validated candidate reference genes and found the expression of ubi and tubA to be the most appropriate to evaluate the expression of target genes in the brain of different octopuses. Our results also underline the
The analysis of correlation between IL-1B gene expression and genotyping in multiple sclerosis patients.

Science.gov (United States)

Heidary, Masoumeh; Rakhshi, Nahid; Pahlevan Kakhki, Majid; Behmanesh, Mehrdad; Sanati, Mohammad Hossein; Sanadgol, Nima; Kamaladini, Hossein; Nikravesh, Abbas

2014-08-15

IL-1B is released by monocytes, astrocytes and brain endothelial cells and seems to be involved in inflammatory reactions of the central nervous system (CNS) in multiple sclerosis (MS). This study aims to evaluate the expression level of IL-1B mRNA in peripheral blood mononuclear cells (PBMCs), genotype the rs16944 SNP and determine the effect of this SNP on the expression level of IL-1B in MS patients. We found that the expression level of IL-1B in PBMCs of MS patients was 3.336 times that in controls, but the rs16944 SNP in the promoter region of IL-1B did not affect the expression level of this gene, and there was no association of this SNP with MS in the examined population. Also, our data did not reveal any correlation between normalized expression of the IL-1B gene and the age of participants, age of onset, or disease duration. Copyright © 2014 Elsevier B.V. All rights reserved.
Covariance approximation for large multivariate spatial data sets with an application to multiple climate model errors

KAUST Repository

Sang, Huiyan

2011-12-01

This paper investigates the cross-correlations across multiple climate model errors. We build a Bayesian hierarchical model that accounts for the spatial dependence of individual models as well as cross-covariances across different climate models. Our method allows for a nonseparable and nonstationary cross-covariance structure. We also present a covariance approximation approach to facilitate the computation in the modeling and analysis of very large multivariate spatial data sets. The covariance approximation consists of two parts: a reduced-rank part to capture the large-scale spatial dependence, and a sparse covariance matrix to correct the small-scale dependence error induced by the reduced rank approximation. We pay special attention to the case that the second part of the approximation has a block-diagonal structure. Simulation results of model fitting and prediction show substantial improvement of the proposed approximation over the predictive process approximation and the independent blocks analysis. We then apply our computational approach to the joint statistical modeling of multiple climate model errors. © 2012 Institute of Mathematical Statistics.
ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences

Directory of Open Access Journals (Sweden)

Pesole Graziano

2005-10-01

Background: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems – hence the need to develop novel strategies. Results: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice site prediction (mainly, over-predictions due to independent single EST alignments to the genomic sequence), our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Conclusion: Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.
Reconstruction of the primordial power spectrum of curvature perturbations using multiple data sets

DEFF Research Database (Denmark)

Hunt, Paul; Sarkar, Subir

2014-01-01

Detailed knowledge of the primordial power spectrum of curvature perturbations is essential both in order to elucidate the physical mechanism ('inflation') which generated it, and for estimating the cosmological parameters from observations of the cosmic microwave background and large-scale struc... content of the universe. Moreover the deconvolution problem is ill-conditioned so a regularisation scheme must be employed to control error propagation. We demonstrate that 'Tikhonov regularisation' can robustly reconstruct the primordial spectrum from multiple cosmological data sets, a significant... advantage being that both its uncertainty and resolution are then quantified. Using Monte Carlo simulations we investigate several regularisation parameter selection methods and find that generalised cross-validation and Mallows' Cp method give optimal results. We apply our inversion procedure to data from...
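The role of regularisation in an ill-conditioned inversion can be illustrated with a minimal sketch. It uses the standard-form Tikhonov penalty (an identity penalty matrix), which is an assumption made for illustration; the abstract does not state which penalty the authors use, and the 2x2 matrix, noisy data, and value of `lam` below are hypothetical.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def tikhonov(A, b, lam):
    """Minimise ||A x - b||^2 + lam * ||x||^2.

    The minimiser satisfies the regularised normal equations
    (A^T A + lam * I) x = A^T b.
    """
    m, n = len(A), len(A[0])
    lhs = [[sum(A[k][i] * A[k][j] for k in range(m)) + (lam if i == j else 0.0)
            for j in range(n)] for i in range(n)]
    rhs = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    return solve(lhs, rhs)

# Hypothetical ill-conditioned problem whose noise-free solution is x = (1, 1).
A = [[1.0, 1.0],
     [1.0, 1.0001]]
b_noisy = [2.0, 2.0011]   # second entry perturbed by 0.001

x_plain = tikhonov(A, b_noisy, lam=0.0)    # no regularisation
x_reg = tikhonov(A, b_noisy, lam=1e-3)     # mild regularisation

print("unregularised:", x_plain)
print("regularised:  ", x_reg)
```

A perturbation of 0.001 in the data moves the unregularised solution far from (1, 1), while the regularised solution stays close to it. Choosing `lam` is the parameter-selection problem the abstract addresses with generalised cross-validation and Mallows' Cp; the sketch fixes it by hand.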
DNA copy-number alterations underlie gene expression differences between microsatellite stable and unstable colorectal cancers

DEFF Research Database (Denmark)

Jorissen, Robert N; Lipton, Lara; Gibbs, Peter

2008-01-01

Purpose: About 15% of colorectal cancers harbor microsatellite instability (MSI). MSI-associated gene expression changes have been identified in colorectal cancers, but little overlap exists between signatures hindering an assessment of overall consistency. Little is known about the causes...... and downstream effects of differential gene expression. Experimental Design: DNA microarray data on 89 MSI and 140 microsatellite-stable (MSS) colorectal cancers from this study and 58 MSI and 77 MSS cases from three published reports were randomly divided into test and training sets. MSI-associated gene......-number data. Results: MSI-associated gene expression changes in colorectal cancers were found to be highly consistent across multiple studies of primary tumors and cancer cell lines from patients of different ethnicities (P

Kernel Machine SNP-set Testing under Multiple Candidate Kernels

Science.gov (United States)

Wu, Michael C.; Maity, Arnab; Lee, Seunggeun; Simmons, Elizabeth M.; Harmon, Quaker E.; Lin, Xinyi; Engel, Stephanie M.; Molldrem, Jeffrey J.; Armistead, Paul M.

2013-01-01

Joint testing for the cumulative effect of multiple single nucleotide polymorphisms grouped on the basis of prior biological knowledge has become a popular and powerful strategy for the analysis of large scale genetic association studies. The kernel machine (KM) testing framework is a useful approach that has been proposed for testing associations between multiple genetic variants and many different types of complex traits by comparing pairwise similarity in phenotype between subjects to pairwise similarity in genotype, with similarity in genotype defined via a kernel function. An advantage of the KM framework is its flexibility: choosing different kernel functions allows for different assumptions concerning the underlying model and can allow for improved power. In practice, it is difficult to know which kernel to use a priori since this depends on the unknown underlying trait architecture and selecting the kernel which gives the lowest p-value can lead to inflated type I error. Therefore, we propose practical strategies for KM testing when multiple candidate kernels are present based on constructing composite kernels and based on efficient perturbation procedures. We demonstrate through simulations and real data applications that the procedures protect the type I error rate and can lead to substantially improved power over poor choices of kernels and only modest differences in power versus using the best candidate kernel. PMID:23471868
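The ingredients named in the abstract above (a kernel measuring pairwise genotype similarity, a composite of several candidate kernels, and a statistic comparing it with phenotype similarity) can be sketched as follows. The genotype coding (0/1/2 minor-allele counts), the trace scaling before averaging, the quadratic-form statistic, and all data are assumptions made for illustration; the sketch does not compute p-values and does not reproduce the authors' perturbation procedure.

```python
def linear_kernel(g1, g2):
    """Linear kernel: inner product of two genotype vectors."""
    return float(sum(a * b for a, b in zip(g1, g2)))

def ibs_kernel(g1, g2):
    """Identity-by-state kernel: fraction of alleles shared across SNPs."""
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))

def kernel_matrix(genotypes, kernel):
    """n x n matrix of pairwise similarities between subjects."""
    return [[kernel(gi, gj) for gj in genotypes] for gi in genotypes]

def composite_kernel(matrices):
    """Average several kernel matrices after scaling each to unit trace.

    The trace scaling is an assumption that keeps one kernel from
    dominating purely because of its numerical scale.
    """
    n = len(matrices[0])
    out = [[0.0] * n for _ in range(n)]
    for K in matrices:
        trace = sum(K[i][i] for i in range(n))
        for i in range(n):
            for j in range(n):
                out[i][j] += K[i][j] / (trace * len(matrices))
    return out

def score_statistic(y, K):
    """Quadratic form r^T K r with r the mean-centred phenotype."""
    mean = sum(y) / len(y)
    r = [v - mean for v in y]
    n = len(y)
    return sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))

# Hypothetical data: 4 subjects, 3 SNPs, one quantitative trait.
genotypes = [(0, 1, 2), (0, 1, 1), (2, 0, 0), (1, 0, 0)]
phenotype = [1.2, 1.0, -0.9, -1.3]

K_linear = kernel_matrix(genotypes, linear_kernel)
K_ibs = kernel_matrix(genotypes, ibs_kernel)
K_composite = composite_kernel([K_linear, K_ibs])

print("Q (linear):   ", score_statistic(phenotype, K_linear))
print("Q (IBS):      ", score_statistic(phenotype, K_ibs))
print("Q (composite):", score_statistic(phenotype, K_composite))
```

Testing one composite kernel gives a single statistic, which avoids the inflated type I error the abstract attributes to picking whichever candidate kernel yields the smallest p-value.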
Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery.

Science.gov (United States)

Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R

2016-05-27

Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena) for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway driven, disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome, as an exemplar, and recover two widely used Psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena . In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and
Multiple Linear Regression for Reconstruction of Gene Regulatory Networks in Solving Cascade Error Problems.

Science.gov (United States)

Salleh, Faridah Hani Mohamed; Zainudin, Suhaila; Arif, Shereena M

2017-01-01

Gene regulatory network (GRN) reconstruction is the process of identifying regulatory gene interactions from experimental data through computational analysis. One of the main reasons for the reduced performance of previous GRN methods had been inaccurate prediction of cascade motifs. Cascade error is defined as the wrong prediction of cascade motifs, where an indirect interaction is misinterpreted as a direct interaction. Despite the active research on various GRN prediction methods, the discussion on specific methods to solve problems related to cascade errors is still lacking. In fact, the experiments conducted by the past studies were not specifically geared towards proving the ability of GRN prediction methods in avoiding the occurrences of cascade errors. Hence, this research aims to propose Multiple Linear Regression (MLR) to infer GRN from gene expression data and to avoid wrongly inferring an indirect interaction (A → B → C) as a direct interaction (A → C). Since the number of observations of the real experiment datasets was far less than the number of predictors, some predictors were eliminated by extracting the random subnetworks from global interaction networks via an established extraction method. In addition, the experiment was extended to assess the effectiveness of MLR in dealing with cascade error by using a novel experimental procedure that had been proposed in this work. The experiment revealed that the number of cascade errors had been very minimal. Apart from that, the Belsley collinearity test proved that multicollinearity did affect the datasets used in this experiment greatly. All the tested subnetworks obtained satisfactory results, with AUROC values above 0.5.
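Why a multiple regression can separate a direct edge from an indirect one in a cascade A → B → C can be shown with a minimal simulated sketch. The effect sizes, noise levels, sample size, and random seed are hypothetical, and the two-predictor least-squares fit is a generic illustration rather than the authors' experimental procedure.

```python
import random

def center(values):
    """Subtract the mean so that the intercept drops out of the fit."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def simple_slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    x, y = center(x), center(y)
    return dot(x, y) / dot(x, x)

def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y on x1 and x2 jointly.

    Solves the 2x2 normal equations for centred data by Cramer's rule.
    """
    x1, x2, y = center(x1), center(x2), center(y)
    s11, s22, s12 = dot(x1, x1), dot(x2, x2), dot(x1, x2)
    c1, c2 = dot(x1, y), dot(x2, y)
    det = s11 * s22 - s12 * s12
    return (s22 * c1 - s12 * c2) / det, (s11 * c2 - s12 * c1) / det

# Simulated cascade: A regulates B, B regulates C, and A has no direct effect on C.
rng = random.Random(0)
gene_a = [rng.gauss(0.0, 1.0) for _ in range(500)]
gene_b = [0.8 * a + rng.gauss(0.0, 0.5) for a in gene_a]
gene_c = [0.9 * b + rng.gauss(0.0, 0.5) for b in gene_b]

marginal_a = simple_slope(gene_a, gene_c)
joint_a, joint_b = two_predictor_slopes(gene_a, gene_b, gene_c)

print("C ~ A alone, slope of A:", round(marginal_a, 3))
print("C ~ A + B,  slope of A:", round(joint_a, 3))
print("C ~ A + B,  slope of B:", round(joint_b, 3))
```

Regressing C on A alone gives a clearly non-zero slope, which a pairwise method could read as a direct edge A → C. Once B is included as a predictor, the coefficient on A shrinks towards zero while B keeps its effect, which is the behaviour that lets MLR avoid the cascade error.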
DLRS: gene tree evolution in light of a species tree.

Science.gov (United States)

Sjöstrand, Joel; Sennblad, Bengt; Arvestad, Lars; Lagergren, Jens

2012-11-15

PrIME-DLRS (or colloquially: 'Delirious') is a phylogenetic software tool to simultaneously infer and reconcile a gene tree given a species tree. It accounts for duplication and loss events and a relaxed molecular clock, and is intended for the study of homologous gene families, for example in a comparative genomics setting involving multiple species. PrIME-DLRS uses a Bayesian MCMC framework, where the input is a known species tree with divergence times and a multiple sequence alignment, and the output is a posterior distribution over gene trees and model parameters. PrIME-DLRS is available for Java SE 6+ under the New BSD License, and JAR files and source code can be downloaded from http://code.google.com/p/jprime/. There is also a slightly older C++ version available as a binary package for Ubuntu, with download instructions at http://prime.sbc.su.se. The C++ source code is available upon request. Contact: joel.sjostrand@scilifelab.se or jens.lagergren@scilifelab.se. PrIME-DLRS is based on a sound probabilistic model (Åkerborg et al., 2009) and has been thoroughly validated on synthetic and biological datasets (Supplementary Material online).
Endogenous interferon-β-inducible gene expression and interferon-β-treatment are associated with reduced T cell responses to myelin basic protein in multiple sclerosis

DEFF Research Database (Denmark)

Börnsen, Lars; Christensen, Jeppe Romme; Ratzer, Rikke

2015-01-01

Autoreactive CD4+ T-cells are considered to play a major role in the pathogenesis of multiple sclerosis. In experimental autoimmune encephalomyelitis, an animal model of multiple sclerosis, exogenous and endogenous type I interferons restrict disease severity. Recombinant interferon-β is used for treatment of multiple sclerosis, and some untreated multiple sclerosis patients have increased expression levels of type I interferon-inducible genes in immune cells. The role of endogenous type I interferons in multiple sclerosis is controversial: some studies found an association of high expression levels... ...-induced CD4+ T-cell autoreactivity in interferon-β-treated multiple sclerosis patients may be mediated by monocyte-derived interleukin-10.
Positron emission tomography and gene therapy: basic concepts and experimental approaches for in vivo gene expression imaging.

Science.gov (United States)

Peñuelas, Iván; Boán, JoséF; Martí-Climent, Josep M; Sangro, Bruno; Mazzolini, Guillermo; Prieto, Jesús; Richter, José A

2004-01-01

More than two decades of intense research have allowed gene therapy to move from the laboratory to the clinical setting, where its use for the treatment of human pathologies has increased considerably in recent years. However, many crucial questions remain to be solved in this challenging field. In vivo imaging with positron emission tomography (PET), combining an appropriate PET reporter gene and PET reporter probe, could provide invaluable qualitative and quantitative information to answer multiple unsolved questions about gene therapy. PET imaging could be used to define parameters not available by other techniques that are of substantial interest not only for the proper understanding of the gene therapy process, but also for its future development and clinical application in humans. This review focuses on the molecular biology basis of gene therapy and molecular imaging, describing the fundamentals of in vivo gene expression imaging by PET, and the application of PET to gene therapy, as a technology that can be used in many different ways. It could be applied to avoid invasive procedures for gene therapy monitoring; to accurately diagnose the pathology for better planning of the most adequate therapeutic approach; as treatment evaluation to image the functional effects of gene therapy at the biochemical level; and as a quantitative noninvasive way to monitor the location, magnitude and persistence of gene expression over time. It would also contribute to a better understanding of vector biology and pharmacology devoted to the development of safer and more efficient vectors.
CSA: An efficient algorithm to improve circular DNA multiple alignment

Directory of Open Access Journals (Sweden)

Pereira Luísa

2009-07-01

Background: The comparison of homologous sequences from different species is an essential approach to reconstruct the evolutionary history of species and of the genes they harbour in their genomes. Several complete mitochondrial and nuclear genomes are now available, increasing the importance of using multiple sequence alignment algorithms in comparative genomics. MtDNA has long been used in phylogenetic analysis and errors in the alignments can lead to errors in the interpretation of evolutionary information. Although a large number of multiple sequence alignment algorithms have been proposed to date, they all deal with linear DNA and cannot directly handle circular DNA. Researchers interested in aligning circular DNA sequences must first rotate them to the "right" place using an essentially manual process, before they can use multiple sequence alignment tools. Results: In this paper we propose an efficient algorithm that identifies the most interesting region to cut circular genomes in order to improve phylogenetic analysis when using standard multiple sequence alignment algorithms. This algorithm identifies the largest chain of non-repeated longest subsequences common to a set of circular mitochondrial DNA sequences. All the sequences are then rotated and made linear for multiple alignment purposes. To evaluate the effectiveness of this new tool, three different sets of mitochondrial DNA sequences were considered. Other tests considering randomly rotated sequences were also performed. The software package Arlequin was used to evaluate the standard genetic measures of the alignments obtained with and without the use of the CSA algorithm with two well known multiple alignment algorithms, the CLUSTALW and the MAVID tools, and also the visualization tool SinicView. Conclusion: The results show that a circularization and rotation pre-processing step significantly improves the efficiency of publicly available multiple sequence alignment
Therapeutic genes for anti-HIV/AIDS gene therapy.

Science.gov (United States)

Bovolenta, Chiara; Porcellini, Simona; Alberici, Luca

2013-01-01

The multiple therapeutic approaches developed so far to cope with HIV-1 infection, such as anti-retroviral drugs, germicides and several attempts at therapeutic vaccination, have provided significant amelioration in terms of quality of life and survival rate of AIDS patients. Nevertheless, no approach has demonstrated efficacy in eradicating this lethal, if untreated, infection. The curative power of gene therapy has been proven for the treatment of monogenic immunodeficiencies, where permanent gene modification of host cells is sufficient to correct the defect for life. A similar concept is clearly not applicable to gene therapy of infectious immunodeficiencies such as AIDS, where there is not a single gene to be corrected; rather, engineered cells must gain immunotherapeutic or antiviral features to grant either short- or long-term efficacy, mostly by acquisition of antiviral genes or payloads. Anti-HIV/AIDS gene therapy is one of the most promising, although challenging, strategies to eradicate HIV-1 infection. In fact, genetic modification of hematopoietic stem cells with one or multiple therapeutic genes is expected to give rise to blood cell progeny resistant to viral infection and thereby able to prevail over infected unprotected cells. Ultimately, protected cells will re-establish a functional immune system able to control HIV-1 replication. More than a hundred gene therapy clinical trials against AIDS employing different viral vectors and transgenes have been approved or are currently ongoing worldwide. This review gives an overview of the anti-HIV-1 gene therapy field, evaluating the strengths and weaknesses of the transgenes and payloads used in the past and of those potentially exploitable in the future.
Multiple giant cell lesions in a patient with Noonan syndrome with multiple lentigines

NARCIS (Netherlands)

van den Berg, Henk; Schreuder, Willem Hans; Jongmans, Marjolijn; van Bommel-Slee, Danielle; Witsenburg, Bart; de Lange, Jan

2016-01-01

A patient with Noonan syndrome with multiple lentigines (NSML) and multiple giant cell lesions (MGCL) in mandibles and maxillae is described. A mutation p.Thr468Met in the PTPN11-gene was found. This is the second reported NSML patient with MGCL. Our case adds to the assumption that, despite a
Solving problems by interrogating sets of knowledge systems: Toward a theory of multiple knowledge systems

Science.gov (United States)

Dekorvin, Andre

1989-01-01

The main purpose is to develop a theory for multiple knowledge systems. A knowledge system could be a sensor or an expert system, but it must specialize in one feature. The problem is that we have an exhaustive list of possible answers to some query (such as what object is it). By collecting different feature values, in principle, it should be possible to give an answer to the query, or at least narrow down the list. Since a sensor, or for that matter an expert system, does not in most cases yield a precise value for the feature, uncertainty must be built into the model. Also, researchers must have a formal mechanism to be able to put the information together. Researchers chose to use the Dempster-Shafer approach to handle the problems mentioned above. Researchers introduce the concept of a state of recognition and point out that there is a relation between receiving updates and defining a set valued Markov Chain. Also, deciding what the value of the next set valued variable is can be phrased in terms of classical decision making theory such as minimizing the maximum regret. Other related problems are examined.
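The abstract above names the Dempster-Shafer approach for fusing uncertain reports from several knowledge systems. A minimal sketch of Dempster's rule of combination follows; the frame of discernment, the two "sensors" and their mass values are invented for illustration and do not come from the report.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of candidate answers to its mass.
    Mass falling on an empty intersection counts as conflict and is
    removed by renormalising the remaining mass.
    """
    combined = {}
    conflict = 0.0
    for s1, w1 in m1.items():
        for s2, w2 in m2.items():
            common = s1 & s2
            if common:
                combined[common] = combined.get(common, 0.0) + w1 * w2
            else:
                conflict += w1 * w2
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Hypothetical query "what object is it?" with three possible answers.
frame = frozenset({"car", "truck", "bike"})

# Each source specialises in one feature and leaves some mass on the
# whole frame to express ignorance (assumed numbers).
shape_sensor = {frozenset({"car", "truck"}): 0.8, frame: 0.2}
size_sensor = {frozenset({"car", "bike"}): 0.6, frame: 0.4}

fused = combine(shape_sensor, size_sensor)
for subset, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(subset), round(mass, 2))
```

Neither source alone singles out one answer, yet the fused mass concentrates on the only answer both find plausible, which matches the "narrow down the list" goal described in the abstract.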
Network Diffusion-Based Prioritization of Autism Risk Genes Identifies Significantly Connected Gene Modules

Directory of Open Access Journals (Sweden)

Ettore Mosca

2017-09-01

Autism spectrum disorder (ASD) is marked by a strong genetic heterogeneity, which is underlined by the low overlap between ASD risk gene lists proposed in different studies. In this context, molecular networks can be used to analyze the results of several genome-wide studies in order to underline those network regions harboring genetic variations associated with ASD, the so-called “disease modules.” In this work, we used a recent network diffusion-based approach to jointly analyze multiple ASD risk gene lists. We defined genome-scale prioritizations of human genes in relation to ASD genes from multiple studies, found significantly connected gene modules associated with ASD and predicted genes functionally related to ASD risk genes. Most of them play a role in synapsis and neuronal development and function; many are related to syndromes that can be in comorbidity with ASD and the remaining are involved in epigenetics, cell cycle, cell adhesion and cancer.
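The abstract above does not specify which diffusion scheme was used, so the following sketch assumes one common choice, random walk with restart, purely for illustration. The five-gene path network, the gene labels and the restart probability are all hypothetical.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, iterations=200):
    """Diffuse seed scores over an undirected network.

    adjacency maps each node to a list of neighbours; every node must have
    at least one neighbour. The returned scores sum to 1.
    """
    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(iterations):
        updated = {n: restart * start[n] for n in nodes}
        for n in nodes:
            # Each node passes its remaining mass equally to its neighbours.
            share = (1.0 - restart) * scores[n] / len(adjacency[n])
            for neighbour in adjacency[n]:
                updated[neighbour] += share
        scores = updated
    return scores


# Hypothetical interaction network: G1 - G2 - G3 - G4 - G5.
network = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4"],
    "G4": ["G3", "G5"],
    "G5": ["G4"],
}

# Treat G1 as a known risk gene and rank the others by diffused score.
scores = random_walk_with_restart(network, seeds={"G1"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Genes closer to the seed in the network receive higher scores, which is the intuition behind using diffusion to prioritise candidates near known risk genes.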
A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements.

Directory of Open Access Journals (Sweden)

Eugeny A Elisaphenko

2008-06-01

X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons including those with simple tandem repeats detectable in their structure have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA.
The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice.

Science.gov (United States)

Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K

2015-01-01

Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
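The abstract above mentions a modified gene set enrichment analysis against KEGG pathways without giving its details. As a stand-in, and not the authors' method, the sketch below uses a plain hypergeometric over-representation test, a widely used baseline for asking whether a pathway is enriched in a gene list. All counts are invented.

```python
from math import comb


def hypergeom_enrichment_p(universe, pathway, hits, overlap):
    """P(X >= overlap) under the hypergeometric null.

    universe: number of genes considered in total
    pathway:  number of those genes annotated to the pathway
    hits:     size of the candidate gene list
    overlap:  candidate genes that fall in the pathway
    """
    total = comb(universe, hits)
    upper = min(pathway, hits)
    favourable = sum(
        comb(pathway, k) * comb(universe - pathway, hits - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical numbers: 20 genes overall, 5 in the pathway,
# 5 candidate drivers of which 3 belong to the pathway.
p_value = hypergeom_enrichment_p(universe=20, pathway=5, hits=5, overlap=3)
print(round(p_value, 4))
```

In a real analysis the universe would be the set of genes that could have been detected by the screens, and the resulting p-values would need correction for testing many pathways.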
Deriving Trading Rules Using Gene Expression Programming

Directory of Open Access Journals (Sweden)

Adrian VISOIU

2011-01-01

This paper presents how buy and sell trading rules are generated using gene expression programming with special setup. Market concepts are presented and market analysis is discussed with emphasis on technical analysis and quantitative methods. The use of genetic algorithms in deriving trading rules is presented. Gene expression programming is applied in a form where multiple types of operators and operands are used. This gives birth to multiple gene contexts and references between genes in order to keep the linear structure of the gene expression programming chromosome. The setup of multiple gene contexts is presented. The case study shows how to use the proposed gene setup to derive trading rules encoded by Boolean expressions, using a dataset with the reference exchange rates between the Euro and the Romanian leu. The conclusions highlight the positive results obtained in deriving useful trading rules.
Integrated Enrichment Analysis of Variants and Pathways in Genome-Wide Association Studies Indicates Central Role for IL-2 Signaling Genes in Type 1 Diabetes, and Cytokine Signaling Genes in Crohn's Disease

Science.gov (United States)

Carbonetto, Peter; Stephens, Matthew

2013-01-01

Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14
Multiple Genes Cause Postmating Prezygotic Reproductive Isolation in the Drosophila virilis Group.

Science.gov (United States)

Ahmed-Braimah, Yasir H

2016-12-07

Understanding the genetic basis of speciation is a central problem in evolutionary biology. Studies of reproductive isolation have provided several insights into the genetic causes of speciation, especially in taxa that lend themselves to detailed genetic scrutiny. Reproductive barriers have usually been divided into those that occur before zygote formation (prezygotic) and after (postzygotic), with the latter receiving a great deal of attention over several decades. Reproductive barriers that occur after mating but before zygote formation [postmating prezygotic (PMPZ)] are especially understudied at the genetic level. Here, I present a phenotypic and genetic analysis of a PMPZ reproductive barrier between two species of the Drosophila virilis group: D. americana and D. virilis. This species pair shows strong PMPZ isolation, especially when D. americana males mate with D. virilis females: ∼99% of eggs laid after these heterospecific copulations are not fertilized. Previous work has shown that the paternal loci contributing to this incompatibility reside on two chromosomes, one of which (chromosome 5) likely carries multiple factors. The other (chromosome 2) is fixed for a paracentric inversion that encompasses nearly half the chromosome. Here, I present two results. First, I show that PMPZ in this species cross is largely due to defective sperm storage in heterospecific copulations. Second, using advanced intercross and backcross mapping approaches, I identify genomic regions that carry genes capable of rescuing heterospecific fertilization. I conclude that paternal incompatibility between D. americana males and D. virilis females is underlain by four or more genes on chromosomes 2 and 5. Copyright © 2016 Ahmed-Braimah.
Discovering time-lagged rules from microarray data using gene profile classifiers

Directory of Open Access Journals (Sweden)

Ponzoni Ignacio

2011-04-01

Background: Gene regulatory networks have an essential role in every process of life. In this regard, the amount of genome-wide time series data is becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes. Results: This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method which inferred highly-related statistically-significant gene associations. Conclusions: A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.
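To make the notion of a time-delayed relationship between two expression profiles concrete, here is a deliberately simple sketch. It is not GRNCOP2: the mean-threshold discretisation, the agreement score and the profiles are assumptions chosen for illustration only.

```python
def discretize(profile):
    """Mark each time point as active (1) or inactive (0) relative to the mean."""
    threshold = sum(profile) / len(profile)
    return [1 if value > threshold else 0 for value in profile]


def lagged_agreement(regulator, target, lag):
    """Fraction of time points where the target's state at t + lag
    matches the regulator's state at t."""
    reg_states = discretize(regulator)
    tgt_states = discretize(target)
    pairs = list(zip(reg_states[: len(reg_states) - lag], tgt_states[lag:]))
    return sum(1 for r, t in pairs if r == t) / len(pairs)


# Hypothetical profiles: the target repeats the regulator two steps later.
regulator = [0, 9, 9, 0, 0, 9, 9, 0, 0, 9]
target = [0, 0, 0, 9, 9, 0, 0, 9, 9, 0]

agreement_by_lag = {lag: lagged_agreement(regulator, target, lag) for lag in range(4)}
best_lag = max(agreement_by_lag, key=agreement_by_lag.get)
print(agreement_by_lag, "best lag:", best_lag)
```

A score like this only captures activation-style rules; an inhibitory relationship would show up as unusually low agreement and would need to be scored separately.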
Empirical validation of the S-Score algorithm in the analysis of gene expression data

Directory of Open Access Journals (Sweden)

Archer Kellie J

2006-03-01

Background: Current methods of analyzing Affymetrix GeneChip® microarray data require the estimation of probe set expression summaries, followed by application of statistical tests to determine which genes are differentially expressed. The S-Score algorithm described by Zhang and colleagues is an alternative method that allows tests of hypotheses directly from probe level data. It is based on an error model in which the detected signal is proportional to the probe pair signal for highly expressed genes, but approaches a background level (rather than 0) for genes with low levels of expression. This model is used to calculate relative change in probe pair intensities that converts probe signals into multiple measurements with equalized errors, which are summed over a probe set to form the S-Score. Assuming no expression differences between chips, the S-Score follows a standard normal distribution, allowing direct tests of hypotheses to be made. Using spike-in and dilution datasets, we validated the S-Score method against comparisons of gene expression utilizing the more recently developed methods RMA, dChip, and MAS5. Results: The S-Score showed excellent sensitivity and specificity in detecting low-level gene expression changes. Rank ordering of S-Score values more accurately reflected known fold-change values compared to other algorithms. Conclusion: The S-Score method, utilizing probe level data directly, offers significant advantages over comparisons using only probe set expression summaries.
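Because the abstract above states that the S-Score is standard normal when there is no expression difference, a score can be converted to a p-value directly. The sketch below shows only that final step; it does not compute S-Scores from probe data, and the probe set names, score values and the 0.01 cutoff are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p_value(s_score):
    """Two-sided p-value for a statistic that is standard normal under the null."""
    return erfc(abs(s_score) / sqrt(2.0))


# Hypothetical S-Score values for three probe sets (illustrative only).
example_scores = {"probe_set_1": 0.4, "probe_set_2": -2.5, "probe_set_3": 3.3}

p_values = {name: two_sided_p_value(s) for name, s in example_scores.items()}
flagged = sorted(name for name, p in p_values.items() if p < 0.01)

for name, p in p_values.items():
    print(name, round(p, 4))
print("flagged at p < 0.01:", flagged)
```

When thousands of probe sets are tested at once, a multiple-testing correction would normally be applied before flagging genes.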
Novel functional polymorphism in IGF-1 gene associated with multiple sclerosis: A new insight to MS.

Science.gov (United States)

Shahbazi, Majid; Abdolmohammadi, Reza; Ebadi, Hamid; Farazmandfar, Touraj

2017-04-01

Interactions between several genes and the environment may play a role in susceptibility to multiple sclerosis (MS). IGF-1 plays a key role in proliferation, maintenance and survival of nerve cells. Therefore, we hypothesized that IGF-1 may be a target for prediction and control of MS. We aimed to analyze the IGF-1 gene promoter sequence and to investigate the effect of single nucleotide variants on IGF-1 expression and their association with MS. We enrolled 339 MS patients and 431 healthy controls. A specific region in the IGF-1 gene promoter was investigated by SSCP analysis. All samples were genotyped by SSP-PCR. In-vitro and in-vivo IGF-1 production was measured by ELISA assay. IGF-1 expression in PBMCs was measured using real-time PCR. We identified a T to C single nucleotide substitution at position -1089 and a C to T at position -383 from the transcription start site in the IGF-1 gene promoter. There was a significant association between MS and genotypes IGF-1(-383) C/T (p=0.001) and IGF-1(-383) C/C (pMS (p=0.001). In-vitro and in-vivo IGF-1 levels showed that IGF-1 production in samples with genotype IGF-1(-383) C/C was significantly lower than in T/T (p=0.004) but not T/C (p=0.220). Given the roles of IGF-1 in the CNS and our results, this study suggests that low IGF-1 level may be associated with susceptibility to MS. Copyright © 2017 Elsevier B.V. All rights reserved.
Calibration of Multiple In Silico Tools for Predicting Pathogenicity of Mismatch Repair Gene Missense Substitutions

Science.gov (United States)

Thompson, Bryony A.; Greenblatt, Marc S.; Vallee, Maxime P.; Herkert, Johanna C.; Tessereau, Chloe; Young, Erin L.; Adzhubey, Ivan A.; Li, Biao; Bell, Russell; Feng, Bingjian; Mooney, Sean D.; Radivojac, Predrag; Sunyaev, Shamil R.; Frebourg, Thierry; Hofstra, Robert M.W.; Sijmons, Rolf H.; Boucher, Ken; Thomas, Alun; Goldgar, David E.; Spurdle, Amanda B.; Tavtigian, Sean V.

2015-01-01

Classification of rare missense substitutions observed during genetic testing for patient management is a considerable problem in clinical genetics. The Bayesian integrated evaluation of unclassified variants is a solution originally developed for BRCA1/2. Here, we take a step toward an analogous system for the mismatch repair (MMR) genes (MLH1, MSH2, MSH6, and PMS2) that confer colon cancer susceptibility in Lynch syndrome by calibrating in silico tools to estimate prior probabilities of pathogenicity for MMR gene missense substitutions. A qualitative five-class classification system was developed and applied to 143 MMR missense variants. This identified 74 missense substitutions suitable for calibration. These substitutions were scored using six different in silico tools (Align-Grantham Variation Grantham Deviation, multivariate analysis of protein polymorphisms [MAPP], Mut-Pred, PolyPhen-2.1, Sorting Intolerant From Tolerant, and Xvar), using curated MMR multiple sequence alignments where possible. The output from each tool was calibrated by regression against the classifications of the 74 missense substitutions; these calibrated outputs are interpretable as prior probabilities of pathogenicity. MAPP was the most accurate tool and MAPP + PolyPhen-2.1 provided the best-combined model (R2 = 0.62 and area under receiver operating characteristic = 0.93). The MAPP + PolyPhen-2.1 output is sufficiently predictive to feed as a continuous variable into the quantitative Bayesian integrated evaluation for clinical classification of MMR gene missense substitutions. PMID:22949387
