WorldWideScience

Sample records for prognostic gene clusters

  1. A 65‑gene signature for prognostic prediction in colon adenocarcinoma.

    Science.gov (United States)

    Jiang, Hui; Du, Jun; Gu, Jiming; Jin, Liugen; Pu, Yong; Fei, Bojian

    2018-04-01

    The aim of the present study was to examine the molecular factors associated with the prognosis of colon cancer. Gene expression datasets were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus databases to screen differentially expressed genes (DEGs) between colon cancer samples and normal samples. Survival‑related genes were selected from the DEGs using the Cox regression method. A co‑expression network of survival‑related genes was then constructed, and functional clusters were extracted from this network. The significantly enriched functions and pathways of the genes in the network were identified. Using Bayesian discriminant analysis, a prognostic prediction system was established to distinguish the positive from negative prognostic samples. The discrimination efficacy of the system was validated in the GSE17538 dataset using Kaplan‑Meier survival analysis. A total of 636 and 1,892 DEGs between the colon cancer samples and normal samples were screened from the TCGA and GSE44861 dataset, respectively. There were 155 survival‑related genes selected. The co‑expression network of survival‑related genes included 138 genes, 534 lines (connections) and five functional clusters, including the signaling pathway, cellular response to cAMP, and immune system process functional clusters. The molecular function, cellular components and biological processes were the significantly enriched functions. The peroxisome proliferator‑activated receptor signaling pathway, Wnt signaling pathway, B cell receptor signaling pathway, and cytokine‑cytokine receptor interactions were the significant pathways. A prognostic prediction system based on a 65‑gene signature was established using this co‑expression network. Its discriminatory effect was validated in the TCGA dataset (P=3.56e‑12) and the GSE17538 dataset (P=1.67e‑6). The 65‑gene signature included kallikrein‑related peptidase 6 (KLK6), collagen type XI α1 (COL11A1), cartilage

  2. Investigating a multigene prognostic assay based on significant pathways for Luminal A breast cancer through gene expression profile analysis.

    Science.gov (United States)

    Gao, Haiyan; Yang, Mei; Zhang, Xiaolan

    2018-04-01

    The present study aimed to investigate potential recurrence-risk biomarkers based on significant pathways for Luminal A breast cancer through gene expression profile analysis. Initially, the gene expression profiles of Luminal A breast cancer patients were downloaded from The Cancer Genome Atlas database. The differentially expressed genes (DEGs) were identified using the limma package, and hierarchical clustering analysis was conducted on the DEGs. In addition, the functional pathways were screened using Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses and rank ratio calculation. The multigene prognostic assay was developed based on the statistically significant pathways; its prognostic function was tested on the training set and verified using the gene expression data and survival data of Luminal A breast cancer patients downloaded from the Gene Expression Omnibus. A total of 300 DEGs were identified between good and poor outcome groups, including 176 upregulated genes and 124 downregulated genes. Hierarchical clustering analysis verified that the DEGs can effectively distinguish Luminal A samples with different prognoses. Nine pathways were identified as significant, and a total of 18 DEGs involved in these 9 pathways were identified as prognostic biomarkers. According to the survival analysis and receiver operating characteristic curve, the obtained 18-gene prognostic assay exhibited good prognostic function with high sensitivity and specificity in both the training and test samples. In conclusion, the 18-gene prognostic assay, including the key genes transcription factor 7-like 2, adenomatous polyposis coli (APC) and lymphocyte enhancer factor-1, may provide a new method for predicting outcomes and may be conducive to the promotion of precision medicine for Luminal A breast cancer.

  3. Genes with a spike expression are clustered in chromosome (sub)bands and spike (sub)bands have a powerful prognostic value in patients with multiple myeloma

    Science.gov (United States)

    Kassambara, Alboukadel; Hose, Dirk; Moreaux, Jérôme; Walker, Brian A.; Protopopov, Alexei; Reme, Thierry; Pellestor, Franck; Pantesco, Véronique; Jauch, Anna; Morgan, Gareth; Goldschmidt, Hartmut; Klein, Bernard

    2012-01-01

    Background: Genetic abnormalities are common in patients with multiple myeloma, and may deregulate gene products involved in tumor survival, proliferation, metabolism and drug resistance. In particular, translocations may result in a high expression of targeted genes (termed spike expression) in tumor cells. We identified spike genes in multiple myeloma cells of patients with newly-diagnosed myeloma and investigated their prognostic value. Design and Methods: Genes with a spike expression in multiple myeloma cells were identified using box plot probe set signal distribution and two selection filters. Results: In a cohort of 206 newly diagnosed patients with multiple myeloma, 2587 genes/expressed sequence tags with a spike expression were identified. Some spike genes were associated with transcription factors such as MAF or MMSET and, as expected, with known recurrent translocations. Spike genes were not associated with increased DNA copy number and, for a majority of them, involved unknown mechanisms. Of spiked genes, 36.7% clustered significantly in 149 out of 862 documented chromosome (sub)bands, of which 53 had prognostic value (35 bad, 18 good). Their prognostic value was summarized with a spike band score that delineated 23.8% of patients with a poor median overall survival (27.4 months versus not reached, P...). The spike band score was independent of other gene expression profiling-based risk scores, t(4;14), or del17p in an independent validation cohort of 345 patients. Conclusions: We present a new approach to identify spike genes and their relationship to patients’ survival. PMID:22102711

  4. Radiogenomics of hepatocellular carcinoma: multiregion analysis-based identification of prognostic imaging biomarkers by integrating gene data—a preliminary study

    Science.gov (United States)

    Xia, Wei; Chen, Ying; Zhang, Rui; Yan, Zhuangzhi; Zhou, Xiaobo; Zhang, Bo; Gao, Xin

    2018-02-01

    Our objective was to identify prognostic imaging biomarkers for hepatocellular carcinoma in contrast-enhanced computed tomography (CECT) with biological interpretations by associating imaging features and gene modules. We retrospectively analyzed 371 patients who had gene expression profiles. For the 38 patients with CECT imaging data, automatic intra-tumor partitioning was performed, resulting in three spatially distinct subregions. We extracted a total of 37 quantitative imaging features describing intensity, geometry, and texture from each subregion. Imaging features were selected after robustness and redundancy analysis. Gene modules acquired from clustering were chosen for their prognostic significance. By constructing an association map between imaging features and gene modules with Spearman rank correlations, the imaging features that significantly correlated with gene modules were obtained. These features were evaluated with Cox’s proportional hazard models and Kaplan-Meier estimates to determine their prognostic capabilities for overall survival (OS). Eight imaging features were significantly correlated with prognostic gene modules, and two of them were associated with OS. Among these, the geometry feature volume fraction of the subregion, which was significantly correlated with all prognostic gene modules representing cancer-related interpretation, was predictive of OS (Cox p = 0.022, hazard ratio = 0.24). The texture feature cluster prominence in the subregion, which was correlated with the prognostic gene module representing lipid metabolism and complement activation, also had the ability to predict OS (Cox p = 0.021, hazard ratio = 0.17). Imaging features depicting the volume fraction and textural heterogeneity in subregions have the potential to be predictors of OS with interpretable biological meaning.
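
    The association map between imaging features and gene modules described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: all numeric values are made-up toy data, the module name and the helper names (ranks, spearman, association_map) are hypothetical, and Spearman's coefficient is computed as the Pearson correlation of average ranks.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def association_map(features, modules):
    """Correlate every imaging feature with every gene module score."""
    return {
        (f_name, m_name): spearman(f_values, m_values)
        for f_name, f_values in features.items()
        for m_name, m_values in modules.items()
    }


# Toy data for five hypothetical patients (illustrative values only).
features = {
    "volume_fraction": [0.1, 0.4, 0.2, 0.8, 0.5],
    "cluster_prominence": [5.0, 3.0, 4.0, 1.0, 2.0],
}
modules = {"module_A": [1.0, 3.5, 2.0, 9.0, 4.0]}

amap = association_map(features, modules)
for pair, rho in amap.items():
    print(pair, round(rho, 3))
```

    In practice each coefficient would also need a significance test with multiple-testing correction before a feature is called "significantly correlated"; that step is omitted here.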

  5. Gene cluster statistics with gene families.

    Science.gov (United States)

    Raghupathy, Narayanan; Durand, Dannie

    2009-05-01

    Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data

  6. Identification and Validation of a Diagnostic and Prognostic Multi-Gene Biomarker Panel for Pancreatic Ductal Adenocarcinoma.

    Science.gov (United States)

    Klett, Hagen; Fuellgraf, Hannah; Levit-Zerdoun, Ella; Hussung, Saskia; Kowar, Silke; Küsters, Simon; Bronsert, Peter; Werner, Martin; Wittel, Uwe; Fritsch, Ralph; Busch, Hauke; Boerries, Melanie

    2018-01-01

    Late diagnosis and systemic dissemination essentially contribute to the invariably poor prognosis of pancreatic ductal adenocarcinoma (PDAC). Therefore, the development of diagnostic biomarkers for PDAC is urgently needed to improve patient stratification and outcome in the clinic. By studying the transcriptomes of independent PDAC patient cohorts of tumor and non-tumor tissues, we identified 81 robustly regulated genes through a novel, generally applicable meta-analysis. Consensus clustering on co-expression values revealed four distinct clusters with genes originating from exocrine/endocrine pancreas, stromal and tumor cells. Three clusters were strongly associated with survival of PDAC patients based on the TCGA database, underlining the prognostic potential of the identified genes. With the added information on survival impact and robustness within the meta-analysis, we extracted a 17-gene subset for further validation. We show that it not only discriminated PDAC from non-tumor tissue and stroma in fresh-frozen as well as formalin-fixed paraffin embedded samples, but also detected pancreatic precursor lesions and singled out pancreatitis samples. Moreover, the classifier discriminated PDAC from other cancers in the TCGA database. In addition, we experimentally validated the classifier in PDAC patients at the transcript level using qPCR and exemplified its use at the protein level for three proteins (AHNAK2, LAMC2, TFF1) using immunohistochemistry and for two secreted proteins (TFF1, SERPINB5) using ELISA-based protein detection in blood plasma. In conclusion, we present a novel robust diagnostic and prognostic gene signature for PDAC with future potential applicability in the clinic.

  7. Identification and Validation of a Diagnostic and Prognostic Multi-Gene Biomarker Panel for Pancreatic Ductal Adenocarcinoma

    Directory of Open Access Journals (Sweden)

    Hagen Klett

    2018-04-01

    Late diagnosis and systemic dissemination essentially contribute to the invariably poor prognosis of pancreatic ductal adenocarcinoma (PDAC). Therefore, the development of diagnostic biomarkers for PDAC is urgently needed to improve patient stratification and outcome in the clinic. By studying the transcriptomes of independent PDAC patient cohorts of tumor and non-tumor tissues, we identified 81 robustly regulated genes through a novel, generally applicable meta-analysis. Consensus clustering on co-expression values revealed four distinct clusters with genes originating from exocrine/endocrine pancreas, stromal and tumor cells. Three clusters were strongly associated with survival of PDAC patients based on the TCGA database, underlining the prognostic potential of the identified genes. With the added information on survival impact and robustness within the meta-analysis, we extracted a 17-gene subset for further validation. We show that it not only discriminated PDAC from non-tumor tissue and stroma in fresh-frozen as well as formalin-fixed paraffin embedded samples, but also detected pancreatic precursor lesions and singled out pancreatitis samples. Moreover, the classifier discriminated PDAC from other cancers in the TCGA database. In addition, we experimentally validated the classifier in PDAC patients at the transcript level using qPCR and exemplified its use at the protein level for three proteins (AHNAK2, LAMC2, TFF1) using immunohistochemistry and for two secreted proteins (TFF1, SERPINB5) using ELISA-based protein detection in blood plasma. In conclusion, we present a novel robust diagnostic and prognostic gene signature for PDAC with future potential applicability in the clinic.

  8. Diametrical clustering for identifying anti-correlated gene clusters.

    Science.gov (United States)

    Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman

    2003-09-01

    Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, with each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
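
    The two alternating steps named in this abstract, re-partitioning the genes and recomputing each cluster's dominant singular vector, can be sketched as below. This is a rough illustration and not the authors' implementation: the abstract does not specify normalization, initialization, or stopping rules, so centering each profile to unit length, seeding prototypes from chosen genes, assigning by squared dot product, and using power iteration are all assumptions, and the toy profiles are invented.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(profile):
    """Center a profile to zero mean and scale it to unit length.

    Assumes the profile is not constant (otherwise the norm is zero).
    """
    m = sum(profile) / len(profile)
    centered = [x - m for x in profile]
    norm = math.sqrt(dot(centered, centered))
    return [x / norm for x in centered]


def dominant_vector(members, start, iterations=100):
    """Power iteration on sum_i x_i x_i^T.

    The result approximates the dominant singular vector (in condition
    space) of the matrix whose rows are the member profiles.
    """
    v = start[:]
    for _ in range(iterations):
        w = [0.0] * len(v)
        for x in members:
            s = dot(x, v)
            for k, xk in enumerate(x):
                w[k] += s * xk
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def diametrical_clustering(profiles, seeds, max_rounds=50):
    """Alternate between assignment and prototype update until stable."""
    data = [normalize(p) for p in profiles]
    prototypes = [data[s][:] for s in seeds]
    labels = None
    for _ in range(max_rounds):
        # (i) Re-partition: the squared dot product ignores the sign, so
        # correlated and anti-correlated genes score equally well.
        new_labels = [
            max(range(len(prototypes)), key=lambda j: dot(x, prototypes[j]) ** 2)
            for x in data
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # (ii) Update each prototype from its current members.
        for j in range(len(prototypes)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                prototypes[j] = dominant_vector(members, prototypes[j])
    return labels, prototypes


# Toy expression profiles over four conditions (illustrative values only).
profiles = [
    [1.0, 2.0, 3.0, 4.0],   # rising
    [4.0, 3.0, 2.0, 1.0],   # falling: anti-correlated with the first
    [1.0, 3.0, 1.0, 3.0],   # oscillating
    [3.0, 1.0, 3.0, 1.0],   # anti-phase oscillation
    [2.0, 4.0, 6.0, 8.5],   # roughly rising
]
labels, prototypes = diametrical_clustering(profiles, seeds=[0, 2])
print(labels)
```

    The key design point is the squared similarity in the assignment step: a rising and a falling profile land in the same cluster, which an ordinary correlation-based method would separate.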

  9. Prognostic Biomarker Identification Through Integrating the Gene Signatures of Hepatocellular Carcinoma Properties

    Directory of Open Access Journals (Sweden)

    Jialin Cai

    2017-05-01

    Many molecular classification and prognostic gene signatures for hepatocellular carcinoma (HCC) patients have been established based on genome-wide gene expression profiling; however, their generalizability is unclear. Herein, we systematically assessed the prognostic effects of these gene signatures and identified valuable prognostic biomarkers by integrating these gene signatures. With two independent HCC datasets (GSE14520, N = 242 and GSE54236, N = 78), 30 published gene signatures were evaluated, and 11 were significantly associated with the overall survival (OS) of postoperative HCC patients in both datasets. The random survival forest models suggested that the gene signatures were superior to clinical characteristics for predicting the prognosis of the patients. Based on the 11 gene signatures, a functional protein-protein interaction (PPI) network with 1406 nodes and 10,135 edges was established. With tissue microarrays of HCC patients (N = 60), we determined the prognostic values of the core genes in the network and found that RAD21, CDK1, and HDAC2 expression levels were negatively associated with OS for HCC patients. The multivariate Cox regression analyses suggested that CDK1 was an independent prognostic factor, which was validated in an independent case cohort (N = 78). In cellular models, inhibition of CDK1 by siRNA or a specific inhibitor, RO-3306, reduced cellular proliferation and viability for HCC cells. These results suggest that the prognostic predictive capacities of these gene signatures are reproducible and that CDK1 is a potential prognostic biomarker or therapeutic target for HCC patients.

  10. Tumor Microenvironment Gene Signature as a Prognostic Classifier and Therapeutic Target

    Science.gov (United States)

    2016-06-01

    AWARD NUMBER: W81XWH-14-1-0107. TITLE: Tumor Microenvironment Gene Signature as a Prognostic Classifier and Therapeutic Target. ...gene signature that correlates with poor survival in ovarian cancer patients. We are refining this gene signature to develop biomarkers for the

  11. Persistence drives gene clustering in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Rocha Eduardo PC

    2008-01-01

    Background: Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products nor the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those genes present in a majority of organisms – persistent genes – and those present in very few organisms – rare genes. Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, in a permanent state of gene flux that keeps the size of bacterial genomes fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering.

  12. Oral tongue cancer gene expression profiling: Identification of novel potential prognosticators by oligonucleotide microarray analysis

    International Nuclear Information System (INIS)

    Estilo, Cherry L; Boyle, Jay O; Kraus, Dennis H; Patel, Snehal; Shaha, Ashok R; Wong, Richard J; Huryn, Joseph M; Shah, Jatin P; Singh, Bhuvanesh; O-charoenrat, Pornchai; Talbot, Simon; Socci, Nicholas D; Carlson, Diane L; Ghossein, Ronald; Williams, Tijaana; Yonekawa, Yoshihiro; Ramanathan, Yegnanarayana

    2009-01-01

    The present study is aimed at identifying potential candidate genes as prognostic markers in human oral tongue squamous cell carcinoma (SCC) by large scale gene expression profiling. The gene expression profiles of patients (n=37) with oral tongue SCC were analyzed using Affymetrix HG-U95Av2 high-density oligonucleotide arrays. Patients (n=20) from whom tumor and matched normal mucosa were available were grouped into stage (early vs. late) and nodal disease (node positive vs. node negative) subgroups, and genes differentially expressed in tumor vs. normal and between the subgroups were identified. Three genes, GLUT3, HSAL2, and PACE4, were selected for their potential biological significance and examined in a larger cohort of 49 patients via quantitative real-time RT-PCR. Hierarchical clustering analyses failed to show significant segregation of patients. In patients (n=20) with available tumor and matched normal mucosa, 77 genes were found to be differentially expressed (P<0.05) in the tongue tumor samples compared to their matched normal controls. Among the 45 over-expressed genes, MMP-1, encoding interstitial collagenase, showed the highest level of increase (average: 34.18-fold). Using the criterion of two-fold or greater as overexpression, 30.6%, 24.5% and 26.5% of patients showed high levels of GLUT3, HSAL2 and PACE4, respectively. Univariate analyses demonstrated that GLUT3 over-expression correlated with depth of invasion (P<0.0001), tumor size (P=0.024), pathological stage (P=0.009) and recurrence (P=0.038). HSAL2 was positively associated with depth of invasion (P=0.015) and advanced T stage (P=0.047). In survival studies, only GLUT3 showed a prognostic value with disease-free (P=0.049), relapse-free (P=0.002) and overall survival (P=0.003). PACE4 mRNA expression failed to show correlation with any of the relevant parameters. The characterization of genes identified to be significant predictors of prognosis by oligonucleotide microarray and further validation by

  13. Intra-Gene DNA Methylation Variability Is a Clinically Independent Prognostic Marker in Women's Cancers.

    Science.gov (United States)

    Bartlett, Thomas E; Jones, Allison; Goode, Ellen L; Fridley, Brooke L; Cunningham, Julie M; Berns, Els M J J; Wik, Elisabeth; Salvesen, Helga B; Davidson, Ben; Trope, Claes G; Lambrechts, Sandrina; Vergote, Ignace; Widschwendter, Martin

    2015-01-01

    We introduce a novel per-gene measure of intra-gene DNA methylation variability (IGV) based on the Illumina Infinium HumanMethylation450 platform, which is prognostic independently of well-known predictors of clinical outcome. Using IGV, we derive a robust gene-panel prognostic signature for ovarian cancer (OC, n = 221), which validates in two independent data sets from Mayo Clinic (n = 198) and TCGA (n = 358), with significance of p = 0.004 in both sets. The OC prognostic signature gene-panel is comprised of four gene groups, which represent distinct biological processes. We show the IGV measurements of these gene groups are most likely a reflection of a mixture of intra-tumour heterogeneity and transcription factor (TF) binding/activity. IGV can be used to predict clinical outcome in patients individually, providing a surrogate read-out of hard-to-measure disease processes.

  14. Intra-Gene DNA Methylation Variability Is a Clinically Independent Prognostic Marker in Women's Cancers.

    Directory of Open Access Journals (Sweden)

    Thomas E Bartlett

    We introduce a novel per-gene measure of intra-gene DNA methylation variability (IGV) based on the Illumina Infinium HumanMethylation450 platform, which is prognostic independently of well-known predictors of clinical outcome. Using IGV, we derive a robust gene-panel prognostic signature for ovarian cancer (OC, n = 221), which validates in two independent data sets from Mayo Clinic (n = 198) and TCGA (n = 358), with significance of p = 0.004 in both sets. The OC prognostic signature gene-panel is comprised of four gene groups, which represent distinct biological processes. We show the IGV measurements of these gene groups are most likely a reflection of a mixture of intra-tumour heterogeneity and transcription factor (TF) binding/activity. IGV can be used to predict clinical outcome in patients individually, providing a surrogate read-out of hard-to-measure disease processes.

  15. The prognostic value of temporal in vitro and in vivo derived hypoxia gene-expression signatures in breast cancer

    International Nuclear Information System (INIS)

    Starmans, Maud H.W.; Chu, Kenneth C.; Haider, Syed; Nguyen, Francis; Seigneuric, Renaud; Magagnin, Michael G.; Koritzinsky, Marianne; Kasprzyk, Arek; Boutros, Paul C.; Wouters, Bradly G.

    2012-01-01

    Background and purpose: Recent data suggest that in vitro and in vivo derived hypoxia gene-expression signatures have prognostic power in breast and possibly other cancers. However, both tumour hypoxia and the biological adaptation to this stress are highly dynamic. Assessment of time-dependent gene-expression changes in response to hypoxia may thus provide additional biological insights and assist in predicting the impact of hypoxia on patient prognosis. Materials and methods: Transcriptome profiling was performed for three cell lines derived from diverse tumour-types after hypoxic exposure at eight time-points, which include a normoxic time-point. Time-dependent sets of co-regulated genes were identified from these data. Subsequently, gene ontology (GO) and pathway analyses were performed. The prognostic power of these novel signatures was assessed in parallel with previous in vitro and in vivo derived hypoxia signatures in a large breast cancer microarray meta-dataset (n = 2312). Results: We identified seven recurrent temporal and two general hypoxia signatures. GO and pathway analyses revealed regulation of both common and unique underlying biological processes within these signatures. None of the new or previously published in vitro signatures consisting of hypoxia-induced genes were prognostic in the large breast cancer dataset. In contrast, signatures of repressed genes, as well as the in vivo derived signatures of hypoxia-induced genes showed clear prognostic power. Conclusions: Only a subset of hypoxia-induced genes in vitro demonstrates prognostic value when evaluated in a large clinical dataset. Despite clear evidence of temporal patterns of gene-expression in vitro, the subset of prognostic hypoxia regulated genes cannot be identified based on temporal pattern alone. In vivo derived signatures appear to identify the prognostic hypoxia induced genes. The prognostic value of hypoxia-repressed genes is likely a surrogate for the known importance of

  16. Systematic assessment of prognostic gene signatures for breast cancer shows distinct influence of time and ER status

    International Nuclear Information System (INIS)

    Zhao, Xi; Rødland, Einar Andreas; Sørlie, Therese; Vollan, Hans Kristian Moen; Russnes, Hege G; Kristensen, Vessela N; Lingjærde, Ole Christian; Børresen-Dale, Anne-Lise

    2014-01-01

    The aim was to assess and compare prognostic power of nine breast cancer gene signatures (Intrinsic, PAM50, 70-gene, 76-gene, Genomic-Grade-Index, 21-gene-Recurrence-Score, EndoPredict, Wound-Response and Hypoxia) in relation to ER status and follow-up time. A gene expression dataset from 947 breast tumors was used to evaluate the signatures for prediction of Distant Metastasis Free Survival (DMFS). A total of 912 patients had available DMFS status. The recently published METABRIC cohort was used as an additional validation set. Survival predictions were fairly concordant across most signatures. Prognostic power declined with follow-up time. During the first 5 years of follow-up, all signatures except for Hypoxia were predictive for DMFS in ER-positive disease, and 76-gene, Hypoxia and Wound-Response were prognostic in ER-negative disease. After 5 years, the signatures had little prognostic power. Gene signatures provide significant prognostic information beyond tumor size, node status and histological grade. Generally, these signatures performed better for ER-positive disease, indicating that risk within each ER stratum is driven by distinct underlying biology. Most of the signatures were strong risk predictors for DMFS during the first 5 years of follow-up. Combining gene signatures with histological grade or tumor size could improve the prognostic power, perhaps also of long-term survival.

  17. Intra-Gene DNA Methylation Variability Is a Clinically Independent Prognostic Marker in Women’s Cancers

    Science.gov (United States)

    Bartlett, Thomas E.; Jones, Allison; Goode, Ellen L.; Fridley, Brooke L.; Cunningham, Julie M.; Berns, Els M. J. J.; Wik, Elisabeth; Salvesen, Helga B.; Davidson, Ben; Trope, Claes G.; Lambrechts, Sandrina; Vergote, Ignace; Widschwendter, Martin

    2015-01-01

    We introduce a novel per-gene measure of intra-gene DNA methylation variability (IGV) based on the Illumina Infinium HumanMethylation450 platform, which is prognostic independently of well-known predictors of clinical outcome. Using IGV, we derive a robust gene-panel prognostic signature for ovarian cancer (OC, n = 221), which validates in two independent data sets from Mayo Clinic (n = 198) and TCGA (n = 358), with significance of p = 0.004 in both sets. The OC prognostic signature gene-panel is comprised of four gene groups, which represent distinct biological processes. We show the IGV measurements of these gene groups are most likely a reflection of a mixture of intra-tumour heterogeneity and transcription factor (TF) binding/activity. IGV can be used to predict clinical outcome in patients individually, providing a surrogate read-out of hard-to-measure disease processes. PMID:26629914

  18. Effects of sample size on robustness and prediction accuracy of a prognostic gene signature

    Directory of Open Access Journals (Sweden)

    Kim Seon-Young

    2009-05-01

    Background: Little overlap between independently developed gene signatures and poor inter-study applicability of gene signatures are two of the major concerns raised in the development of microarray-based prognostic gene signatures. One recent study suggested that thousands of samples are needed to generate a robust prognostic gene signature. Results: A data set of 1,372 samples was generated by combining eight breast cancer gene expression data sets produced using the same microarray platform and, using this data set, the effects of varying sample sizes on several performance measures of a prognostic gene signature were investigated. The overlap between independently developed gene signatures increased linearly with more samples, attaining an average overlap of 16.56% with 600 samples. The concordance between outcomes predicted by different gene signatures also increased with more samples, up to 94.61% with 300 samples. The accuracy of outcome prediction also increased with more samples. Finally, analysis using only Estrogen Receptor-positive (ER+) patients attained higher prediction accuracy than analysis using both ER+ and ER-negative patients, suggesting that sub-type specific analysis can lead to the development of better prognostic gene signatures. Conclusion: Increasing sample sizes generated a gene signature with better stability, better concordance in outcome prediction, and better prediction accuracy. However, the degree of performance improvement with increased sample size differed between the degree of overlap and the degree of concordance in outcome prediction, suggesting that the sample size required for a study should be determined according to the specific aims of the study.
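
    The two stability measures discussed in this abstract, overlap between signatures and concordance between predicted outcomes, can be illustrated with simple arithmetic. The abstract does not give the exact definitions used in the study, so the formulas below (shared genes divided by the size of the smaller signature, and the share of samples given the same label) are assumptions, and the gene identifiers and labels are placeholders.

```python
def overlap_percent(signature_a, signature_b):
    """Percentage of genes shared, relative to the smaller signature."""
    a, b = set(signature_a), set(signature_b)
    if not a or not b:
        return 0.0
    return 100.0 * len(a & b) / min(len(a), len(b))


def concordance_percent(predicted_a, predicted_b):
    """Percentage of samples that receive the same predicted outcome."""
    if len(predicted_a) != len(predicted_b):
        raise ValueError("both predictions must cover the same samples")
    if not predicted_a:
        return 0.0
    agree = sum(1 for x, y in zip(predicted_a, predicted_b) if x == y)
    return 100.0 * agree / len(predicted_a)


# Placeholder signatures and predictions (not real data).
signature_1 = ["G1", "G2", "G3", "G4"]
signature_2 = ["G3", "G4", "G5", "G6"]
overlap = overlap_percent(signature_1, signature_2)

outcomes_1 = ["good", "poor", "good", "good"]
outcomes_2 = ["good", "poor", "poor", "good"]
concordance = concordance_percent(outcomes_1, outcomes_2)

print(overlap, concordance)
```

    The example also shows why the two measures can diverge, as the abstract reports: two signatures sharing only half their genes can still agree on most predicted outcomes.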

  19. A Combinatory Approach for Selecting Prognostic Genes in Microarray Studies of Tumour Survivals

    Directory of Open Access Journals (Sweden)

    Qihua Tan

    2009-01-01

    Unlike significance analysis of gene expression, which looks for genes that are differentially regulated, feature selection in microarray-based prognostic gene expression analysis aims at finding a subset of marker genes that are not only differentially expressed but also informative for prediction. Unfortunately, feature selection in the microarray literature is dominated by the simple heuristic univariate gene filter paradigm, which selects differentially expressed genes according to their statistical significance. We introduce a combinatory feature selection strategy that integrates differential gene expression analysis with the Gram-Schmidt process to identify prognostic genes that are both statistically significant and highly informative for predicting tumour survival outcomes. Empirical application to leukemia and ovarian cancer survival data through within- and cross-study validations shows that the feature space can be largely reduced while achieving improved testing performance.
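    To make the role of the Gram-Schmidt process concrete, the sketch below shows a generic orthogonal forward selection: pick the feature most aligned with the outcome, then project that feature out of both the outcome and the remaining features so that redundant genes stop scoring. This is an illustrative assumption about how such a selection step can work, not the authors' published procedure; the function names and the squared-cosine score are hypothetical choices.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(v, basis):
    """Remove from v its component along `basis` (one Gram-Schmidt step)."""
    denom = dot(basis, basis)
    if denom == 0.0:
        return list(v)
    coef = dot(v, basis) / denom
    return [a - coef * b for a, b in zip(v, basis)]


def gram_schmidt_select(features, target, k):
    """Greedy selection of up to k features.

    features: dict mapping gene name -> list of mean-centred values.
    target:   mean-centred outcome vector of the same length.
    """
    remaining = {name: list(vals) for name, vals in features.items()}
    y = list(target)
    chosen = []
    for _ in range(min(k, len(remaining))):
        best, best_score = None, -1.0
        norm_y = dot(y, y)
        for name, x in remaining.items():
            norm_x = dot(x, x)
            if norm_x < 1e-12 or norm_y < 1e-12:
                continue  # nothing left to explain, or feature is redundant
            score = dot(x, y) ** 2 / (norm_x * norm_y)  # squared cosine
            if score > best_score:
                best, best_score = name, score
        if best is None:
            break
        basis = remaining.pop(best)
        chosen.append(best)
        # Orthogonalise the outcome and the remaining features.
        y = project_out(y, basis)
        remaining = {n: project_out(x, basis) for n, x in remaining.items()}
    return chosen


if __name__ == "__main__":
    genes = {
        "g1": [-2, -1, 0, 1, 2],
        "g2": [-4, -2, 0, 2, 4],   # exact multiple of g1 (redundant)
        "g3": [1, -2, 0, 2, -1],   # orthogonal to g1
    }
    outcome = [-1.5, -2.0, 0.0, 2.0, 1.5]  # g1 + 0.5 * g3
    print(gram_schmidt_select(genes, outcome, 3))
```

    A univariate filter would rank the redundant gene `g2` as highly as `g1`; after orthogonalisation it carries no remaining information and is skipped.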

  20. A prognostic profile of hypoxia-induced genes for localised high-grade soft tissue sarcoma

    DEFF Research Database (Denmark)

    Aggerholm-Pedersen, Ninna; Sørensen, Brita Singers; Overgaard, Jens

    2016-01-01

    BACKGROUND: For decades, tumour hypoxia has been pursued as a cancer treatment target. However, prognostic and predictive biomarkers are essential for the use of this target in the clinic. This study investigates the prognostic value of a hypoxia-induced gene profile in localised soft tissue sarcoma (STS). METHODS: The hypoxia-induced gene quantification was performed by real-time quantitative PCR (RT-qPCR) of formalin-fixed, paraffin-embedded tissue samples. The gene expression cut-points were determined in a test cohort of 55 STS patients and used to allocate each patient into a more...

  1. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor A Gene in Colorectal Cancer

    International Nuclear Information System (INIS)

    Hansen, Torben F.; Spindler, Karen-Lise G.; Andersen, Rikke F.; Lindebjerg, Jan; Kølvraa, Steen; Brandslund, Ivan; Jakobsen, Anders

    2010-01-01

    New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study was to investigate the prognostic importance of haplotypes in the VEGF-A gene in patients with CRC. The study included 486 patients surgically resected for stage II and III CRC, divided into two independent cohorts. Three SNPs in the VEGF-A gene were analyzed by polymerase chain reaction. Haplotypes were estimated using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. The Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible for this effect, was present in approximately 30% of the patients and demonstrated a significant relationship with poor survival, and it remained an independent prognostic marker after multivariate analysis, hazard ratio 2.46 (95% confidence interval 1.49–4.06), p < 0.001. Validation was provided by consistent findings in a second and independent cohort. Haplotype combinations call for further investigation.
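    The Kaplan-Meier plots mentioned above rest on the product-limit estimator, which can be sketched in a few lines. The follow-up times and event flags in the usage example are made up for illustration and are not data from the study; the function name `kaplan_meier` is likewise an assumption.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: truthy if the event (e.g. death) was observed, falsy if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored patients both leave the risk set
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up in months; 1 = event observed, 0 = censored.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 1, 0]):
        print(t, round(s, 3))
```

    Comparing two such curves (for example, carriers versus non-carriers of a haplotype) is what the log rank test formalises; that test is not implemented here.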

  2. Building prognostic models for breast cancer patients using clinical variables and hundreds of gene expression signatures

    Directory of Open Access Journals (Sweden)

    Liu Yufeng

    2011-01-01

    Background: Multiple breast cancer gene expression profiles have been developed that appear to provide similar abilities to predict outcome and may outperform clinical-pathologic criteria; however, the extent to which seemingly disparate profiles provide additive prognostic information is not known, nor do we know whether prognostic profiles perform equally across clinically defined breast cancer subtypes. We evaluated whether combining the prognostic powers of standard breast cancer clinical variables with a large set of gene expression signatures could improve on our ability to predict patient outcomes. Methods: Using clinical-pathological variables and a collection of 323 gene expression "modules", including 115 previously published signatures, we build multivariate Cox proportional hazards models using a dataset of 550 node-negative systemically untreated breast cancer patients. Models predictive of pathological complete response (pCR) to neoadjuvant chemotherapy were also built using this approach. Results: We identified statistically significant prognostic models for relapse-free survival (RFS) at 7 years for the entire population, and for the subgroups of patients with ER-positive, or Luminal tumors. Furthermore, we found that combined models that included both clinical and genomic parameters improved prognostication compared with models with either clinical or genomic variables alone. Finally, we were able to build statistically significant combined models for pathological complete response (pCR) predictions for the entire population. Conclusions: Integration of gene expression signatures and clinical-pathological factors is an improved method over either variable type alone. Highly prognostic models could be created when using all patients, and for the subset of patients with lymph node-negative and ER-positive breast cancers. Other variables beyond gene expression and clinical-pathological variables, like gene mutation status or DNA

  3. Co-evolution of secondary metabolite gene clusters and their host

    DEFF Research Database (Denmark)

    Kjærbølling, Inge; Vesth, Tammi Camilla; Frisvad, Jens Christian

    Secondary metabolite gene cluster evolution is mainly driven by two events: gene duplication and annexation and horizontal gene transfer. Here we use comparative genomics of Aspergillus species to investigate the evolution of secondary metabolite (SM) gene clusters across a wide spectrum of species... We investigate the dynamic evolutionary relationship between the cluster and the host by examining the genes within the cluster and the number of homologous genes found within the host and in closely related species.

  4. Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes

    Directory of Open Access Journals (Sweden)

    Gardiner Donald M

    2007-09-01

    Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of

  5. Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

    Directory of Open Access Journals (Sweden)

    Sakellariou Argiris

    2012-10-01

    Background: A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. Results: We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and the affinity propagation (AP) clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. Conclusions: mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes, which classify unknown samples from both small and large patient cohorts with high accuracy.

  6. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis; but they are unable to deal with noise and high dimensionality associated with the microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  7. Fast gene ontology based clustering for microarray experiments.

    Science.gov (United States)

    Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

    2008-11-21

    Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  8. Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis

    Science.gov (United States)

    Koh, Esther G. L.; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V.; Brenner, Sydney; Venkatesh, Byrappa

    2003-01-01

    The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes. PMID:12547909

  9. Conditions for the Evolution of Gene Clusters in Bacterial Genomes

    Science.gov (United States)

    Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.

    2010-01-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
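    A Moran process with selection for clustering and rearrangement by inversion can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' model: genomes are linear tuples of gene labels, fitness is assumed to be 1 + s when the pathway genes are contiguous and 1 otherwise, and the names `span`, `fitness`, `invert` and `moran_step` are hypothetical.

```python
import random


def span(genome, pathway):
    """Distance between the outermost pathway genes on a linear genome."""
    positions = [i for i, gene in enumerate(genome) if gene in pathway]
    return max(positions) - min(positions)


def fitness(genome, pathway, s):
    """Assumed fitness: 1 + s if the pathway genes are contiguous, else 1."""
    clustered = span(genome, pathway) == len(pathway) - 1
    return 1.0 + s if clustered else 1.0


def invert(genome, rng):
    """Reverse a randomly chosen segment of the genome (an inversion event)."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    return genome[:i] + genome[i:j][::-1] + genome[j:]


def moran_step(population, pathway, s, inversion_rate, rng):
    """One birth-death event; population size stays constant.

    The parent is drawn in proportion to fitness, the offspring may carry an
    inversion, and it replaces an individual chosen uniformly at random.
    """
    weights = [fitness(g, pathway, s) for g in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    child = invert(parent, rng) if rng.random() < inversion_rate else parent
    population[rng.randrange(len(population))] = child


if __name__ == "__main__":
    rng = random.Random(0)
    pathway = {2, 5, 7}
    population = [tuple(range(8)) for _ in range(50)]
    for _ in range(20000):
        moran_step(population, pathway, 0.5, 0.01, rng)
    clustered = sum(fitness(g, pathway, 0.5) > 1.0 for g in population)
    print(clustered, "of", len(population), "genomes have a clustered pathway")
```

    Varying the pathway size, population size and inversion rate in a sketch like this is one way to explore which conditions allow clusters to appear and persist, which is the question the abstract poses.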

  10. Single-gene prognostic signatures for advanced stage serous ovarian cancer based on 1257 patient samples.

    Science.gov (United States)

    Zhang, Fan; Yang, Kai; Deng, Kui; Zhang, Yuanyuan; Zhao, Weiwei; Xu, Huan; Rong, Zhiwei; Li, Kang

    2018-04-16

    We sought to identify stable single-gene prognostic signatures based on a large collection of advanced stage serous ovarian cancer (AS-OvCa) gene expression data and explore their functions. The empirical Bayes (EB) method was used to remove the batch effect and integrate 8 ovarian cancer datasets. Univariate Cox regression was used to evaluate the association between gene and overall survival (OS). The Database for Annotation, Visualization and Integrated Discovery (DAVID) tool was used for the functional annotation of genes for Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The batch effect was removed by the EB method, and 1257 patient samples were used for further analysis. We selected 341 single-gene prognostic signatures with FDR [...] matrix organization, focal adhesion and DNA replication, which are closely associated with cancer. We used the EB method to remove the batch effect of 8 datasets, integrated these datasets and identified stable prognosis signatures for AS-OvCa.

  11. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Background: Genes are not randomly distributed on a chromosome, as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
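    The claim that the number of clusters is "higher than expected by chance" can be checked with a permutation test. The sketch below assumes a simplified definition that may differ from the study's: a cluster is a run of at least `min_size` consecutive tissue-specific genes in chromosomal order, and the null model shuffles the tissue-specific labels across gene positions. The function names are hypothetical.

```python
import random


def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive flagged genes.

    flags: sequence of truthy/falsy values in chromosomal gene order, where a
    truthy value marks a tissue-specific gene.
    """
    clusters = 0
    run = 0
    for flagged in flags:
        if flagged:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # a run that reaches the end of the chromosome
        clusters += 1
    return clusters


def permutation_p_value(flags, min_size=2, n_perm=1000, seed=0):
    """One-sided p-value for observing at least as many clusters by chance."""
    rng = random.Random(seed)
    observed = count_clusters(flags, min_size)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled, min_size) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical chromosome: 1 = testis-specific gene, 0 = other gene.
    chromosome = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
    print("clusters:", count_clusters(chromosome))
    print("p-value:", permutation_p_value(chromosome))
```

    Tandem duplicates would need to be collapsed before counting, as the abstract notes, since duplicated neighbours inflate the cluster count for reasons unrelated to co-regulation.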

  12. Conditions for the evolution of gene clusters in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Sara Ballouz

    2010-02-01

    Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters.

  13. Fast Gene Ontology based clustering for microarray experiments

    Directory of Open Access Journals (Sweden)

    Ovaska Kristian

    2008-11-01

    Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion: Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.

  14. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-01-01

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides, however three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however were strongly upregulated during disease development in banana, suggesting that they may encode

  15. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  16. Gene duplication, modularity and adaptation in the evolution of the aflatoxin gene cluster

    Directory of Open Access Journals (Sweden)

    Jakobek Judy L

    2007-07-01

    Background: The biosynthesis of aflatoxin (AF) involves over 20 enzymatic reactions in a complex polyketide pathway that converts acetate and malonate to the intermediates sterigmatocystin (ST) and O-methylsterigmatocystin (OMST), the respective penultimate and ultimate precursors of AF. Although these precursors are chemically and structurally very similar, their accumulation differs at the species level for Aspergilli. Notable examples are A. nidulans that synthesizes only ST, A. flavus that makes predominantly AF, and A. parasiticus that generally produces either AF or OMST. Whether these differences are important in the evolutionary/ecological processes of species adaptation and diversification is unknown. Equally unknown are the specific genomic mechanisms responsible for ordering and clustering of genes in the AF pathway of Aspergillus. Results: To elucidate the mechanisms that have driven formation of these clusters, we performed systematic searches of aflatoxin cluster homologs across five Aspergillus genomes. We found a high level of gene duplication and identified seven modules consisting of highly correlated gene pairs (aflA/aflB, aflR/aflS, aflX/aflY, aflF/aflE, aflT/aflQ, aflC/aflW, and aflG/aflL). With the exception of A. nomius, contrasts of mean Ka/Ks values across all cluster genes showed significant differences in selective pressure between section Flavi and non-section Flavi species. A. nomius mean Ka/Ks values were more similar to partial clusters in A. fumigatus and A. terreus. Overall, mean Ka/Ks values were significantly higher for section Flavi than for non-section Flavi species. Conclusion: Our results implicate several genomic mechanisms in the evolution of ST, OMST and AF cluster genes. Gene modules may arise from duplications of a single gene, whereby the function of the pre-duplication gene is retained in the copy (aflF/aflE) or the copies may partition the ancestral function (aflA/aflB). In some gene modules, the

  17. Large clusters of co-expressed genes in the Drosophila genome.

    Science.gov (United States)

    Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

    2002-12-12

    Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.

  18. Differential Retention of Gene Functions in a Secondary Metabolite Cluster.

    Science.gov (United States)

    Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W

    2017-08-01

    In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.

  19. Functional clustering of time series gene expression data by Granger causality

    Science.gov (United States)

    2012-01-01

    Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
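
The idea that a cause must precede its consequence can be illustrated with a single-lag Granger test. The sketch below is only an illustration under stated assumptions, not the clustering procedure of the paper: it compares a model that predicts a series from its own past with one that also uses the past of a second series, and reports an F statistic. The function names (`solve`, `rss`, `granger_f`), the single lag, and the simulated series are all hypothetical choices made here.

```python
import random

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(columns, target):
    """Residual sum of squares of an ordinary least-squares fit with an intercept."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(target))]
    k = len(X[0])
    XtX = [[sum(row[a] * row[c] for row in X) for c in range(k)] for a in range(k)]
    Xty = [sum(row[a] * t for row, t in zip(X, target)) for a in range(k)]
    beta = solve(XtX, Xty)
    return sum((t - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
               for row, t in zip(X, target))

def granger_f(cause, effect):
    """F statistic for 'cause Granger-causes effect' using one lag of each series."""
    now, own_lag, other_lag = effect[1:], effect[:-1], cause[:-1]
    rss_restricted = rss([own_lag], now)          # effect explained by its own past
    rss_full = rss([own_lag, other_lag], now)     # ... plus the past of the candidate cause
    dof = len(now) - 3                            # observations minus fitted parameters
    return (rss_restricted - rss_full) / (rss_full / dof)

# Simulated example (assumption): y follows x with a delay of one time point.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]
print(granger_f(x, y), granger_f(y, x))  # the first value should be far larger
```

In a clustering setting, statistics of this kind computed for every ordered gene pair could serve as edge weights of a directed network, which is the sort of topological information the abstract refers to.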

  20. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    Directory of Open Access Journals (Sweden)

    Zhimin Dai

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  1. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    Science.gov (United States)

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  2. Identification of Nitrogen-Fixing Genes and Gene Clusters from Metagenomic Library of Acid Mine Drainage

    Science.gov (United States)

    Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan

    2014-01-01

    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417

  3. A scale invariant clustering of genes on human chromosome 7

    Directory of Open Access Journals (Sweden)

    Kendal Wayne S

    2004-01-01

    Background: Vertebrate genes often appear to cluster within the background of nontranscribed genomic DNA. Here an analysis of the physical distribution of gene structures on human chromosome 7 was performed to confirm the presence of clustering, and to elucidate possible underlying statistical and biological mechanisms. Results: Clustering of genes was confirmed by virtue of a variance of the number of genes per unit physical length that exceeded the respective mean. Further evidence for clustering came from a power function relationship between the variance and mean that possessed an exponent of 1.51. This power function implied that the spatial distribution of genes on chromosome 7 was scale invariant, and that the underlying statistical distribution had a Poisson-gamma (PG) form. A PG distribution for the spatial scattering of genes was validated by stringent comparisons of both the predicted variance to mean power function and its cumulative distribution function to data derived from chromosome 7. Conclusion: The PG distribution was consistent with at least two different biological models: In the microrearrangement model, the number of genes per unit length of chromosome represented the contribution of a random number of smaller chromosomal segments that had originated by random breakage and reconstruction of more primitive chromosomes. Each of these smaller segments would have necessarily contained (on average) a gamma distributed number of genes. In the gene cluster model, genes would be scattered randomly to begin with. Over evolutionary timescales, tandem duplication, mutation, insertion, deletion and rearrangement could act at these gene sites through a stochastic birth death and immigration process to yield a PG distribution. On the basis of the gene position data alone it was not possible to identify the biological model which best explained the observed clustering.
However, the underlying PG statistical model implicated neutral
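
The variance-to-mean power function mentioned in this abstract can be estimated by counting genes in bins of several sizes and regressing log(variance) on log(mean). The following is a minimal sketch under assumptions made here: gene positions are plain numbers along a sequence of known length, bins are of equal width, and the helper names (`bin_counts`, `mean_and_variance`, `fit_power_law`, `variance_mean_exponent`) are hypothetical. It is not the analysis pipeline of the paper.

```python
import math
import random

def bin_counts(positions, length, n_bins):
    """Count how many positions fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for p in positions:
        idx = min(int(p / length * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts

def mean_and_variance(counts):
    """Sample mean and unbiased sample variance of the bin counts."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, var

def fit_power_law(means, variances):
    """Fit variance = a * mean**p on a log-log scale; returns (p, a)."""
    lx = [math.log(m) for m in means]
    ly = [math.log(v) for v in variances]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    slope = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
             / sum((u - mx) ** 2 for u in lx))
    return slope, math.exp(my - slope * mx)

def variance_mean_exponent(positions, length, bin_numbers):
    """Exponent p estimated from counts taken at several bin sizes."""
    stats = [mean_and_variance(bin_counts(positions, length, n)) for n in bin_numbers]
    return fit_power_law([m for m, _ in stats], [v for _, v in stats])[0]

# Simulated positions scattered uniformly at random (assumption): counts are
# roughly Poisson, so the exponent should be close to 1 rather than the 1.51
# reported for chromosome 7.
rng = random.Random(0)
positions = [rng.random() for _ in range(20000)]
print(variance_mean_exponent(positions, 1.0, [2000, 1000, 500, 250]))
```

An exponent clearly above 1, together with a variance exceeding the mean, is the signature of clustering described in the Results.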

  4. Time-series clustering of gene expression in irradiated and bystander fibroblasts: an application of FBPA clustering

    Directory of Open Access Journals (Sweden)

    Markatou Marianthi

    2011-01-01

    Background: The radiation bystander effect is an important component of the overall biological response of tissues and organisms to ionizing radiation, but the signaling mechanisms between irradiated and non-irradiated bystander cells are not fully understood. In this study, we measured a time-series of gene expression after α-particle irradiation and applied the Feature Based Partitioning around medoids Algorithm (FBPA), a new clustering method suitable for sparse time series, to identify signaling modules that act in concert in the response to direct irradiation and bystander signaling. We compared our results with those of an alternate clustering method, Short Time series Expression Miner (STEM). Results: While computational evaluations of both clustering results were similar, FBPA provided more biological insight. After irradiation, gene clusters were enriched for signal transduction, cell cycle/cell death and inflammation/immunity processes; but only FBPA separated clusters by function. In bystanders, gene clusters were enriched for cell communication/motility, signal transduction and inflammation processes; but biological functions did not separate as clearly with either clustering method as they did in irradiated samples. Network analysis confirmed p53 and NF-κB transcription factor-regulated gene clusters in irradiated and bystander cells and suggested novel regulators, such as KDM5B/JARID1B (lysine (K)-specific demethylase 5B) and HDACs (histone deacetylases), which could epigenetically coordinate gene expression after irradiation. Conclusions: In this study, we have shown that a new time series clustering method, FBPA, can provide new leads to the mechanisms regulating the dynamic cellular response to radiation. The findings implicate epigenetic control of gene expression in addition to transcription factor networks.

  5. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Background: Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results: In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the expression levels of individual genes are treated as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets.
Conclusions The results demonstrate that our WDCM produces clusters

  6. Heterologous expression of pikromycin biosynthetic gene cluster using Streptomyces artificial chromosome system.

    Science.gov (United States)

    Pyeon, Hye-Rim; Nah, Hee-Ju; Kang, Seung-Hoon; Choi, Si-Sun; Kim, Eung-Soo

    2017-05-31

    Heterologous expression of biosynthetic gene clusters of natural microbial products has become an essential strategy for titer improvement and pathway engineering of various potentially-valuable natural products. A Streptomyces artificial chromosomal conjugation vector, pSBAC, was previously successfully applied for precise cloning and tandem integration of a large polyketide tautomycetin (TMC) biosynthetic gene cluster (Nah et al. in Microb Cell Fact 14(1):1, 2015), implying that this strategy could be employed to develop a custom overexpression scheme of natural product pathway clusters present in actinomycetes. To validate the pSBAC system as a generally-applicable heterologous overexpression system for a large-sized polyketide biosynthetic gene cluster in Streptomyces, another model polyketide compound, the pikromycin biosynthetic gene cluster, was precisely cloned and heterologously expressed using the pSBAC system. A unique HindIII restriction site was precisely inserted at one of the border regions of the pikromycin biosynthetic gene cluster within the chromosome of Streptomyces venezuelae, followed by site-specific recombination of pSBAC into the flanking region of the pikromycin gene cluster. Unlike the previous cloning process, one HindIII site integration step was skipped through pSBAC modification. pPik001, a pSBAC containing the pikromycin biosynthetic gene cluster, was directly introduced into two heterologous hosts, Streptomyces lividans and Streptomyces coelicolor, resulting in the production of 10-deoxymethynolide, a major pikromycin derivative. When two entire pikromycin biosynthetic gene clusters were tandemly introduced into the S. lividans chromosome, overproduction of 10-deoxymethynolide and the presence of pikromycin, which was previously not detected, were both confirmed. Moreover, comparative qRT-PCR results confirmed that the transcription of pikromycin biosynthetic genes was significantly upregulated in S. lividans containing tandem

  7. Clustering approaches to identifying gene expression patterns from DNA microarray data.

    Science.gov (United States)

    Do, Jin Hwan; Choi, Dong-Kug

    2008-04-30

    The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as the desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
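
To make the role of the user-selectable parameters concrete, here is a minimal K-means sketch (Lloyd's algorithm) in which the number of clusters and the initial values are passed in explicitly as a list of starting centroids. It is an illustration only: the function names and the toy expression profiles are assumptions made here, and squared Euclidean distance is just one of the possible dissimilarity measures the review alludes to.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, initial_centroids, iterations=20):
    """Lloyd's algorithm: alternate cluster assignment and centroid update.

    The number of clusters is len(initial_centroids); both it and the
    starting positions are chosen by the user.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in profiles
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy profiles (assumption): two genes with low and two with high expression.
profiles = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
labels, centroids = kmeans(profiles, [[0.0, 0.0], [5.0, 5.0]])
print(labels, centroids)
```

Rerunning such a routine with a different number of starting centroids, different starting positions, or another distance function is a simple way to check how stable a given clustering is.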

  8. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step toward revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of EOperon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
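
The comparison of neighboring genes with randomly selected gene pairs can be sketched as follows. This is a simplified illustration, assuming that expression profiles are listed in chromosomal order and that every profile varies across samples; the helper names (`pearson`, `neighbor_correlations`, `random_pair_correlations`) and the toy data are hypothetical, and the FDR-based significance step described in the abstract is not reproduced.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equally long expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def neighbor_correlations(profiles):
    """Correlation of each gene with the next gene in chromosomal order."""
    return [pearson(profiles[i], profiles[i + 1]) for i in range(len(profiles) - 1)]

def random_pair_correlations(profiles, n_pairs, seed=0):
    """Background distribution: correlations of randomly drawn gene pairs."""
    rng = random.Random(seed)
    result = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(profiles)), 2)
        result.append(pearson(profiles[i], profiles[j]))
    return result

# Toy data (assumption): three genes in chromosomal order, four samples each.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
print(neighbor_correlations(profiles))
print(random_pair_correlations(profiles, 5))
```

Runs of adjacent genes whose correlations sit well above the random-pair background would be the candidates for operon-like clusters.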

  9. Prognostic Significance of Promoter DNA Hypermethylation of cysteine dioxygenase 1 (CDO1) Gene in Primary Breast Cancer.

    Directory of Open Access Journals (Sweden)

    Naoko Minatani

    Using pharmacological unmasking microarray, we identified promoter DNA methylation of the cysteine dioxygenase 1 (CDO1) gene in human cancer. In this study, we assessed the clinicopathological significance of CDO1 methylation in primary breast cancer (BC) with no prior chemotherapy. CDO1 DNA methylation was quantified by TaqMan methylation specific PCR (Q-MSP) in 7 BC cell lines and 172 primary BC patients with no prior chemotherapy. Promoter DNA of the CDO1 gene was hypermethylated in 6 of the 7 BC cell lines (all except SK-BR3), and CDO1 gene expression was silenced at the mRNA level in all 7 BC cell lines. Quantification of CDO1 methylation was developed using Q-MSP, and assessed in primary BC. Among the clinicopathologic factors, CDO1 methylation level was not statistically significantly associated with any prognostic factors. The log-rank plot analysis showed that the higher the methylation the tumors harbored, the poorer the prognosis the patients exhibited. Using the median value of 58.0 as the cut-off, disease specific survival in BC patients with CDO1 hypermethylation showed significantly poorer prognosis than in those with hypomethylation (p = 0.004). A multivariate Cox proportional hazards model identified CDO1 hypermethylation as a prognostic factor, as well as Ki-67 and hormone receptor status. Most intriguingly, CDO1 hypermethylation was of robust prognostic relevance in triple negative BC (p = 0.007). Promoter DNA methylation of the CDO1 gene was a robust prognostic indicator in primary BC patients with no prior chemotherapy. The prognostic relevance of CDO1 promoter DNA methylation deserves attention in triple negative BC.

  10. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

    Directory of Open Access Journals (Sweden)

    Olszewski Kellen L

    2007-07-01

    Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related.
It is particularly attractive due to its simplicity, its success in the
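
The mutual-nearest-neighbor requirement described in this abstract can be sketched as a small graph construction. The code below is an illustration under assumptions made here: Euclidean distance between profiles, a fixed neighborhood size `k`, and hypothetical function names. It only builds the mutual-neighbor edges; the clique-finding and merging steps of the published NNN algorithm are not reproduced.

```python
import math

def k_nearest(profiles, i, k):
    """Indices of the k profiles closest to profile i (Euclidean distance)."""
    others = [j for j in range(len(profiles)) if j != i]
    others.sort(key=lambda j: math.dist(profiles[i], profiles[j]))
    return set(others[:k])

def mutual_nn_edges(profiles, k):
    """Edges (i, j), i < j, where i and j each list the other among their k nearest."""
    neighbors = [k_nearest(profiles, i, k) for i in range(len(profiles))]
    return {
        (i, j)
        for i in range(len(profiles))
        for j in neighbors[i]
        if i < j and i in neighbors[j]
    }

# Toy one-dimensional profiles (assumption): two tight pairs and one outlier.
profiles = [[0.0], [1.0], [10.0], [11.0], [100.0]]
print(mutual_nn_edges(profiles, k=1))
```

In this toy example the last gene has a nearest neighbor, but that neighbor prefers another partner, so the gene receives no edge and stays unclustered, which is the behavior the abstract attributes to the mutuality requirement.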

  11. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy

    2005-05-10

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  12. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  13. Hox gene cluster of the ascidian, Halocynthia roretzi, reveals multiple ancient steps of cluster disintegration during ascidian evolution.

    Science.gov (United States)

    Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi

    2017-01-01

    Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We investigated the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr), to investigate whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci

  14. Calcitonin gene-related peptide antagonism and cluster headache

    DEFF Research Database (Denmark)

    Ashina, Håkan; Newman, Lawrence; Ashina, Sait

    2017-01-01

    Calcitonin gene-related peptide (CGRP) is a key signaling molecule involved in migraine pathophysiology. Efficacy of CGRP monoclonal antibodies and antagonists in migraine treatment has fueled an increasing interest in the prospect of treating cluster headache (CH) with CGRP antagonism. The exact...... role of CGRP and its mechanism of action in CH have not been fully clarified. A search for original studies and randomized controlled trials (RCTs) published in English was performed in PubMed and in ClinicalTrials.gov . The search term used was "cluster headache and calcitonin gene related peptide......" and "primary headaches and calcitonin gene related peptide." Reference lists of identified articles were also searched for additional relevant papers. Human experimental studies have reported elevated plasma CGRP levels during both spontaneous and glyceryl trinitrate-induced cluster attacks. CGRP may play...

  15. A phylogenomic gene cluster resource: The phylogenetically inferred groups (PhIGs) database

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Boore, Jeffrey L.

    2005-08-25

    We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.

  16. Prognostic and predictive value of VHL gene alteration in renal cell carcinoma: a meta-analysis and review.

    Science.gov (United States)

    Kim, Bum Jun; Kim, Jung Han; Kim, Hyeong Su; Zang, Dae Young

    2017-02-21

    The von Hippel-Lindau (VHL) gene is often inactivated in sporadic renal cell carcinoma (RCC) by mutation or promoter hypermethylation. The prognostic or predictive value of VHL gene alteration is not well established. We conducted this meta-analysis to evaluate the association between the VHL alteration and clinical outcomes in patients with RCC. We searched PUBMED, MEDLINE and EMBASE for articles including the following terms in their titles, abstracts, or keywords: 'kidney or renal', 'carcinoma or cancer or neoplasm or malignancy', 'von Hippel-Lindau or VHL', 'alteration or mutation or methylation', and 'prognostic or predictive'. Six studies fulfilled the inclusion criteria, and a total of 663 patients with clear cell RCC were included in the study: 244 patients who received anti-vascular endothelial growth factor (VEGF) therapy in the predictive value analysis and 419 in the prognostic value analysis. Out of 663 patients, 410 (61.8%) had VHL alteration. The meta-analysis showed no association between the VHL gene alteration and overall response rate (relative risk = 1.47 [95% CI, 0.81-2.67], P = 0.20) or progression free survival (hazard ratio = 1.02 [95% CI, 0.72-1.44], P = 0.91) in patients with RCC who received VEGF-targeted therapy. There was also no correlation between the VHL alteration and overall survival (HR = 0.80 [95% CI, 0.56-1.14], P = 0.21). In conclusion, this meta-analysis indicates that VHL gene alteration has no prognostic or predictive value in patients with clear cell RCC.
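
The figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is a standard back-calculation and not the authors' method: it assumes that each ratio (relative risk or hazard ratio) is approximately normal on the log scale, so the width of the 95% confidence interval gives the standard error and hence an approximate two-sided p-value. The function name `p_value_from_ci` is hypothetical.

```python
import math

def p_value_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Approximate two-sided p-value for a ratio from its 95% confidence interval.

    Works on the log scale: the interval width is 2 * z_crit standard errors.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(estimate) / se
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

# Sample sizes quoted in the abstract: the two analysis groups and the altered cases.
total = 244 + 419
print(total, round(410 / total * 100, 1))

# Effect sizes quoted in the abstract.
print(p_value_from_ci(1.47, 0.81, 2.67))  # overall response rate
print(p_value_from_ci(1.02, 0.72, 1.44))  # progression free survival
print(p_value_from_ci(0.80, 0.56, 1.14))  # overall survival
```

Because the published values are rounded, the recovered p-values can only be expected to agree with the reported ones to about two decimal places.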

  17. An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information

    Directory of Open Access Journals (Sweden)

    Ao Li

    2009-04-01

    Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However, direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations.

  18. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo

    2014-01-01

    To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members of a miRNA group always had varying expression levels, and some even showed larger expression divergence. Despite the dynamic expression and individual differences, these miRNAs always showed consistent or similar deregulation patterns. The consistent deregulation may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families have important biological roles, and their specific distribution and expression further enrich and ensure a flexible and robust regulatory network.

  19. Lampreys, the jawless vertebrates, contain only two ParaHox gene clusters.

    Science.gov (United States)

    Zhang, Huixian; Ravi, Vydianathan; Tay, Boon-Hui; Tohari, Sumanty; Pillai, Nisha E; Prasad, Aravind; Lin, Qiang; Brenner, Sydney; Venkatesh, Byrappa

    2017-08-22

    ParaHox genes (Gsx, Pdx, and Cdx) are an ancient family of developmental genes closely related to the Hox genes. They play critical roles in the patterning of brain and gut. The basal chordate, amphioxus, contains a single ParaHox cluster comprising one member of each family, whereas nonteleost jawed vertebrates contain four ParaHox genomic loci with six or seven ParaHox genes. Teleosts, which have experienced an additional whole-genome duplication, contain six ParaHox genomic loci with six ParaHox genes. Jawless vertebrates, represented by lampreys and hagfish, are the most ancient group of vertebrates and are crucial for understanding the origin and evolution of vertebrate gene families. We have previously shown that lampreys contain six Hox gene loci. Here we report that lampreys contain only two ParaHox gene clusters (designated as α- and β-clusters) bearing five ParaHox genes (Gsxα, Pdxα, Cdxα, Gsxβ, and Cdxβ). The order and orientation of the three genes in the α-cluster are identical to that of the single cluster in amphioxus. However, the orientation of Gsxβ in the β-cluster is inverted. Interestingly, Gsxβ is expressed in the eye, unlike its homologs in jawed vertebrates, which are expressed mainly in the brain. The lamprey Pdxα is expressed in the pancreas similar to jawed vertebrate Pdx genes, indicating that the pancreatic expression of Pdx was acquired before the divergence of jawless and jawed vertebrate lineages. It is likely that the lamprey Pdxα plays a crucial role in pancreas specification and insulin production similar to the Pdx of jawed vertebrates.

  20. Comparative analysis of clustering methods for gene expression time course data

    Directory of Open Access Journals (Sweden)

    Ivan G. Costa

    2004-01-01

    This work performs a data-driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by comparing the partitions obtained in these experiments with gene annotation, such as protein function and series classification.

  1. Evaluation of gene-expression clustering via mutual information distance measure

    Directory of Open Access Journals (Sweden)

    Maimon Oded

    2007-03-01

    Background: The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the use of the well-known Euclidean distance and Pearson correlation coefficient. Results: Relying on several public gene expression datasets, we evaluate the homogeneity and separation scores of different clustering solutions. It was found that the use of the MI measure yields a more significant differentiation among erroneous clustering solutions. The proposed measure was also used to analyze the performance of several known clustering algorithms. A comparative study of these algorithms reveals that their "best solutions" are ranked almost oppositely when using different distance measures, despite the correspondence found between these measures when analysing the averaged scores of groups of solutions. Conclusion: In view of the results, further attention should be paid to the selection of a proper distance measure for analyzing the clustering of gene expression data.
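    To illustrate why the choice of measure matters, the sketch below compares Euclidean distance, Pearson correlation and a simple mutual information estimate on two toy expression profiles. It is a minimal illustration under stated assumptions: the equal-width binning, the choice of three bins and the toy profiles are all hypothetical, and the abstract does not say which MI estimator the authors used.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Pearson correlation coefficient (assumes neither profile is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def discretize(values, bins=3):
    """Map values to equal-width bin indices 0..bins-1 (assumed binning scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=3):
    """Plug-in MI estimate in bits from the joint histogram of binned profiles."""
    bx, by = discretize(x, bins), discretize(y, bins)
    n = len(bx)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )

# Hypothetical profiles: gene_b depends on gene_a, but not linearly.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [(v - 3.5) ** 2 for v in gene_a]

print("Euclidean:", euclidean(gene_a, gene_b))
print("Pearson:  ", pearson(gene_a, gene_b))
print("MI (bits):", mutual_information(gene_a, gene_b))
```

    In this toy case the Pearson coefficient is essentially zero because the dependence is symmetric rather than linear, while the MI estimate is clearly positive, so the two measures would rank this gene pair very differently.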

  2. Minimum Information about a Biosynthetic Gene cluster : commentary

    NARCIS (Netherlands)

    Medema, Marnix H; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, John B; Blin, Kai; de Bruijn, Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R Cameron; Cruz-Morales, Pablo; Duddela, Srikanth; Dusterhus, Stephanie; Edwards, Daniel J; Fewer, David P; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S; Helfrich, Eric J N; Hillwig, Matthew L; Ishida, Keishi; Jones, Adam C; Jones, Carla S; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kotter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V; Mantovani, Simone M; Monroe, Emily A; Moore, Marcus; Moss, Nathan; Nutzmann, Hans-Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F Jerry; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K; Balibar, Carl J; Balskus, Emily P; Barona-Gomez, Francisco; Bechthold, Andreas; Bode, Helge B; Borriss, Rainer; Brady, Sean F; Brakhage, Axel A; Caffrey, Patrick; Cheng, Yi-Qiang; Clardy, Jon; Cox, Russell J; De Mot, Rene; Donadio, Stefano; Donia, Mohamed S; van der Donk, Wilfred A; Dorrestein, Pieter C; Doyle, Sean; Driessen, Arnold J M; Ehling-Schulz, Monika; Entian, Karl-Dieter; Fischbach, Michael A; Gerwick, Lena; Gerwick, William H; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Hofte, Monica; Jensen, Susan E; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L; Keller, Nancy P; Kormanec, Jan; Kuipers, Oscar P; Kuzuyama, Tomohisa; Kyrpides, Nikos C; Kwon, Hyung-Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Mendez, Carmen; Metsa-Ketela, Mikko; Micklefield, Jason; Mitchell, Douglas A; Moore, Bradley S; Moreira, Leonilde M; Muller, Rolf; Neilan, Brett A; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; Osbourn, Anne; 
Osburne, Marcia S; Ostash, Bohdan; Payne, Shelley M; Pernodet, Jean-Luc; Petricek, Miroslav; Piel, Jorn; Ploux, Olivier; Raaijmakers, Jos M; Salas, Jose A; Schmitt, Esther K; Scott, Barry; Seipke, Ryan F; Shen, Ben; Sherman, David H; Sivonen, Kaarina; Smanski, Michael J; Sosio, Margherita; Stegmann, Evi; Sussmuth, Roderich D; Tahlan, Kapil; Thomas, Christopher M; Tang, Yi; Truman, Andrew W; Viaud, Muriel; Walton, Jonathan D; Walsh, Christopher T; Weber, Tilmann; van Wezel, Gilles P; Wilkinson, Barrie; Willey, Joanne M; Wohlleben, Wolfgang; Wright, Gerard D; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B; Breitling, Rainer; Takano, Eriko; Glockner, Frank Oliver

    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit.

  3. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  4. Characterization of the largest effector gene cluster of Ustilago maydis.

    Directory of Open Access Journals (Sweden)

    Thomas Brefort

    2014-07-01

    In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function.

  5. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that offers a powerful analytical capacity for gene set enrichment and sample clustering. IGSA calculates gene set expression scores for each sample and takes an accumulating clustering strategy to let the samples gather into the set according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets to assess the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering and different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a significant gene set profile for each sample.

  6. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  7. [Expression of BAG3 Gene in Acute Myeloid Leukemia and Its Prognostic Value].

    Science.gov (United States)

    Zhu, Hua-Yuan; Fu, Yuan; Wu, Wei; Xu, Jia-Dai; Chen, Ting-Mei; Qiao, Chun; Li, Jian-Yong; Liu, Peng

    2015-08-01

    To investigate the expression of the BAG3 gene in acute myeloid leukemia (AML) and its prognostic value. Real-time quantitative RT-PCR was used to detect the expression of BAG3 mRNA in 88 previously untreated AML patients. The correlation of BAG3 expression level with clinical characteristics and known prognostic markers of AML was analyzed. In 88 patients with AML, the expression of BAG3 mRNA in NPM1 mutated AML patients was obviously lower than that in NPM1 unmutated patients (P = 0.018). The expression level of BAG3 mRNA was not related to clinical parameters, such as age, sex, FAB subtype, WBC count and extramedullary presentation, or to prognostic factors including cytogenetics, FLT3-ITD, c-kit and CEBPα mutation status (P > 0.05). The expression level of BAG3 had no obvious effect on complete remission (CR) of patients in first treatment. The expression level of BAG3 in non-M3 patients was higher than that in relapsed patients (P = 0.036). The expression level of BAG3 had no effect on overall survival (OS) of patients. The expression level of BAG3 does not correlate with known prognostic markers of AML; only the expression level of BAG3 in NPM1 mutated patients is lower than that in NPM1 unmutated patients. The expression level of BAG3 has no effect on OS of AML patients, and BAG3 cannot be defined as a prognostic marker in AML.

  8. Comparison of two schemes for automatic keyword extraction from MEDLINE for functional gene clustering.

    Science.gov (United States)

    Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray

    2004-01-01

    One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.
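    As a rough illustration of the TFIDF weighting scheme mentioned above, the sketch below weights keywords per gene using the common textbook form (relative term frequency times the log of inverse document frequency). The abstract does not give the authors' exact formula, background set, stop list or stemmer, so the function name, the formula variant and the toy keyword lists are all assumptions.

```python
import math
from collections import Counter

def tfidf_weights(keywords_by_gene):
    """Return {gene: {keyword: weight}} using tf * log(N / df).

    tf is the keyword's relative frequency within one gene's keyword list,
    df is the number of genes whose list contains the keyword,
    N is the number of genes. This is one textbook variant, assumed here.
    """
    n_genes = len(keywords_by_gene)
    doc_freq = Counter()
    for words in keywords_by_gene.values():
        doc_freq.update(set(words))  # count each keyword once per gene

    weights = {}
    for gene, words in keywords_by_gene.items():
        counts = Counter(words)
        weights[gene] = {
            word: (count / len(words)) * math.log(n_genes / doc_freq[word])
            for word, count in counts.items()
        }
    return weights

# Hypothetical keyword lists, standing in for terms extracted from abstracts.
keywords_by_gene = {
    "geneA": ["kinase", "signal", "kinase", "protein"],
    "geneB": ["receptor", "signal", "protein"],
    "geneC": ["transcription", "protein", "protein"],
}

weights = tfidf_weights(keywords_by_gene)
for gene, gene_weights in weights.items():
    ranked = sorted(gene_weights.items(), key=lambda kv: kv[1], reverse=True)
    print(gene, ranked)
```

    A keyword that occurs for every gene ("protein" here) gets weight zero, while a keyword concentrated in one gene's literature ("kinase" for geneA) ranks highest, which is the property that makes such weights usable as feature vectors for clustering.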

  9. GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.

    Science.gov (United States)

    Schulz, Tizian; Stoye, Jens; Doerr, Daniel

    2018-05-08

    Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.

  10. Clusters of Antibiotic Resistance Genes Enriched Together Stay Together in Swine Agriculture.

    Science.gov (United States)

    Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M

    2016-04-12

    Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance genes if

  11. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  12. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
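    The adjusted Rand index used above to score clusterings against known classes can be computed directly from pair counts. The sketch below is a minimal standard-library implementation of the usual chance-corrected formula; the function name and the toy label vectors are illustrative and are not taken from the paper.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples.

    1.0 means identical partitions (up to relabeling); values near 0 are
    what random assignment would give on average.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: partitions trivially agree

    # Pairs of samples placed together in both partitions, in the known
    # classes only, and in the predicted clusters only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every sample in its own cluster
    return (sum_cells - expected) / (max_index - expected)

# Hypothetical example: four samples, two known classes.
known_classes = [0, 0, 1, 1]
print(adjusted_rand_index(known_classes, [1, 1, 0, 0]))  # same partition, relabeled
print(adjusted_rand_index(known_classes, [0, 0, 1, 2]))  # second class split in two
```

    Because the index is invariant to how clusters are numbered, it suits comparisons like the one in the abstract, where many pipelines produce cluster labels that have no fixed correspondence to the known classes.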

  13. Quantitative multiplex quantum dot in-situ hybridisation based gene expression profiling in tissue microarrays identifies prognostic genes in acute myeloid leukaemia

    Energy Technology Data Exchange (ETDEWEB)

    Tholouli, Eleni [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); MacDermott, Sarah [The Medical School, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Hoyland, Judith [School of Biomedicine, Faculty of Medical and Human Sciences, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Yin, John Liu [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); Byers, Richard, E-mail: richard.byers@cmft.nhs.uk [School of Cancer and Enabling Sciences, Faculty of Medical and Human Sciences, The University of Manchester, Stopford Building, Oxford Road, M13 9PT Manchester (United Kingdom)

    2012-08-24

    Highlights: ► Development of a quantitative high throughput in situ expression profiling method. ► Application to a tissue microarray of 242 AML bone marrow samples. ► Identification of HOXA4, HOXA9, Meis1 and DNMT3A as prognostic markers in AML. Abstract: Measurement and validation of microarray gene signatures in routine clinical samples is problematic and a rate limiting step in translational research. In order to facilitate measurement of microarray identified gene signatures in routine clinical tissue a novel method combining quantum dot based oligonucleotide in situ hybridisation (QD-ISH) and post-hybridisation spectral image analysis was used for multiplex in-situ transcript detection in archival bone marrow trephine samples from patients with acute myeloid leukaemia (AML). Tissue-microarrays were prepared into which white cell pellets were spiked as a standard. Tissue microarrays were made using routinely processed bone marrow trephines from 242 patients with AML. QD-ISH was performed for six candidate prognostic genes using triplex QD-ISH for DNMT1, DNMT3A, DNMT3B, and for HOXA4, HOXA9, Meis1. Scrambled oligonucleotides were used to correct for background staining followed by normalisation of expression against the expression values for the white cell pellet standard. Survival analysis demonstrated that low expression of HOXA4 was associated with poorer overall survival (p = 0.009), whilst high expression of HOXA9 (p < 0.0001), Meis1 (p = 0.005) and DNMT3A (p = 0.04) were associated with early treatment failure. These results demonstrate application of a standardised, quantitative multiplex QD-ISH method for identification of prognostic markers in formalin-fixed paraffin-embedded clinical samples, facilitating measurement of gene expression signatures in routine clinical samples.

  14. A prognostic gene signature for metastasis-free survival of triple negative breast cancer patients.

    Science.gov (United States)

    Lee, Unjin; Frankenberger, Casey; Yun, Jieun; Bevilacqua, Elena; Caldas, Carlos; Chin, Suet-Feung; Rueda, Oscar M; Reinitz, John; Rosner, Marsha Rich

    2013-01-01

    Although triple negative breast cancers (TNBC) are the most aggressive subtype of breast cancer, they currently lack targeted therapies. Because this classification still includes a heterogeneous collection of tumors, new tools to classify TNBCs are urgently required in order to improve our prognostic capability for high risk patients and predict response to therapy. We previously defined a gene expression signature, RKIP Pathway Metastasis Signature (RPMS), based upon a metastasis-suppressive signaling pathway initiated by Raf Kinase Inhibitory Protein (RKIP). We have now generated a new BACH1 Pathway Metastasis gene signature (BPMS) that utilizes targets of the metastasis regulator BACH1. Specifically, we substituted experimentally validated target genes to generate a new BACH1 metagene, developed an approach to optimize patient tumor stratification, and reduced the number of signature genes to 30. The BPMS significantly and selectively stratified metastasis-free survival in basal-like and, in particular, TNBC patients. In addition, the BPMS further stratified patients identified as having a good or poor prognosis by other signatures including the Mammaprint® and Oncotype® clinical tests. The BPMS is thus complementary to existing signatures and is a prognostic tool for high risk ER-HER2- patients. We also demonstrate the potential clinical applicability of the BPMS as a single sample predictor. Together, these results reveal the potential of this pathway-based BPMS gene signature to identify high risk TNBC patients that can respond effectively to targeted therapy, and highlight BPMS genes as novel drug targets for therapeutic development.

  15. A prognostic gene signature for metastasis-free survival of triple negative breast cancer patients.

    Directory of Open Access Journals (Sweden)

    Unjin Lee

    Although triple negative breast cancers (TNBC) are the most aggressive subtype of breast cancer, they currently lack targeted therapies. Because this classification still includes a heterogeneous collection of tumors, new tools to classify TNBCs are urgently required in order to improve our prognostic capability for high risk patients and predict response to therapy. We previously defined a gene expression signature, RKIP Pathway Metastasis Signature (RPMS), based upon a metastasis-suppressive signaling pathway initiated by Raf Kinase Inhibitory Protein (RKIP). We have now generated a new BACH1 Pathway Metastasis gene signature (BPMS) that utilizes targets of the metastasis regulator BACH1. Specifically, we substituted experimentally validated target genes to generate a new BACH1 metagene, developed an approach to optimize patient tumor stratification, and reduced the number of signature genes to 30. The BPMS significantly and selectively stratified metastasis-free survival in basal-like and, in particular, TNBC patients. In addition, the BPMS further stratified patients identified as having a good or poor prognosis by other signatures including the Mammaprint® and Oncotype® clinical tests. The BPMS is thus complementary to existing signatures and is a prognostic tool for high risk ER-HER2- patients. We also demonstrate the potential clinical applicability of the BPMS as a single sample predictor. Together, these results reveal the potential of this pathway-based BPMS gene signature to identify high risk TNBC patients that can respond effectively to targeted therapy, and highlight BPMS genes as novel drug targets for therapeutic development.

  16. Identification of Subtype-Specific Prognostic Genes for Early-Stage Lung Adenocarcinoma and Squamous Cell Carcinoma Patients Using an Embedded Feature Selection Algorithm.

    Directory of Open Access Journals (Sweden)

    Suyan Tian

    The existence of fundamental differences between lung adenocarcinoma (AC) and squamous cell carcinoma (SCC) in their underlying mechanisms motivated us to postulate that specific genes might exist relevant to prognosis of each histology subtype. To test this research hypothesis, we previously proposed a simple Cox-regression model based feature selection algorithm and successfully identified some subtype-specific prognostic genes when applying this method to real-world data. In this article, we continue our effort on identification of subtype-specific prognostic genes for AC and SCC, and propose a novel embedded feature selection method by extending the Threshold Gradient Descent Regularization (TGDR) algorithm and minimizing a corresponding negative partial likelihood function. Using real-world datasets and simulated ones, we show these two proposed methods have comparable performance whereas the new proposal is superior in terms of model parsimony. Our analysis provides some evidence on the existence of such subtype-specific prognostic genes; more investigation is warranted.

  17. Inactivation of human α-globin gene expression by a de novo deletion located upstream of the α-globin gene cluster

    International Nuclear Information System (INIS)

    Liebhaber, S.A.; Weiss, I.; Cash, F.E.; Griese, E.U.; Horst, J.; Ayyub, H.; Higgs, D.R.

    1990-01-01

    Synthesis of normal human hemoglobin A, α2β2, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5' to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located ∼30 kilobases 5' from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes.

  18. Dominant control region of the human β-like globin gene cluster

    NARCIS (Netherlands)

    Blom van Assendelft, Margaretha van

    1989-01-01

    The structure and regulation of the human β-like globin gene cluster have been studied extensively. Genetic disorders connected with this gene cluster are responsible for human diseases associated with high levels of morbidity and mortality, such as β-thalassaemia and sickle cell anaemia. The work

  19. Expression of multi-drug resistance-related genes MDR3 and MRP as prognostic factors in clinical liver cancer patients.

    Science.gov (United States)

    Yu, Zheng; Peng, Sun; Hong-Ming, Pan; Kai-Feng, Wang

    2012-01-01

    To investigate the expression of multi-drug resistance-related genes, MDR3 and MRP, in clinical specimens of primary liver cancer and their potential as prognostic factors in liver cancer patients. A total of 26 patients with primary liver cancer were enrolled. The expression of MDR3 and MRP genes was measured by real-time PCR and the association between gene expression and the prognosis of patients was analyzed by the Kaplan-Meier method and Cox regression model. This study showed that increases in MDR3 gene expression were identified in cholangiocellular carcinoma, cirrhosis and HBsAg-positive patients, while MRP expression increased in hepatocellular carcinoma, non-cirrhosis and HBsAg-negative patients. Moreover, conjugated bilirubin and total bile acid in the serum were significantly reduced in patients with high MRP expression compared to patients with low expression. The overall survival tended to be longer in patients with high MDR3 and MRP expression compared to the control group. MRP might be an independent prognostic factor in patients with liver cancer by Cox regression analysis. MDR3 and MRP may play important roles in liver cancer patients as prognostic factors and their underlying mechanisms in liver cancer are worthy of further investigation.
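
    The Kaplan-Meier method named in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: the function name and the toy follow-up data are hypothetical, and the code only shows the product-limit estimate S(t) = Π (1 - d_i / n_i) over the distinct event times.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # events plus censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t={t}: S(t)={s:.3f}")
```

    Censored patients contribute to the number at risk until they leave the study, but never trigger a drop in the curve; that is the feature that distinguishes this estimate from a simple proportion of survivors.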

  20. Identification, characterization and metagenome analysis of oocyte-specific genes organized in clusters in the mouse genome

    Directory of Open Access Journals (Sweden)

    Vaiman Daniel

    2005-05-01

    Background: Genes specifically expressed in the oocyte play key roles in oogenesis, ovarian folliculogenesis, fertilization and/or early embryonic development. In an attempt to identify novel oocyte-specific genes in the mouse, we have used an in silico subtraction methodology, and we have focused our attention on genes that are organized in genomic clusters. Results: In the present work, five clusters have been studied: a cluster of thirteen genes characterized by an F-box domain localized on chromosome 9, a cluster of six genes related to T-cell leukaemia/lymphoma protein 1 (Tcl1) on chromosome 12, a cluster composed of a SPErm-associated glutamate (E)-Rich (Speer) protein expressed in the oocyte in the vicinity of four unknown genes specifically expressed in the testis on chromosome 14, a cluster composed of the oocyte secreted protein-1 (Oosp-1) gene and two Oosp-related genes on chromosome 19, all three being characterized by a partial N-terminal zona pellucida-like domain, and another small cluster of two genes on chromosome 19 as well, composed of a TWIK-Related spinal cord K+ channel encoding-gene, and an unknown gene predicted in silico to be testis-specific. The specificity of expression was confirmed by RT-PCR and in situ hybridization for eight and five of them, respectively. Finally, we showed by comparing all of the isolated and clustered oocyte-specific genes identified so far in the mouse genome, that the oocyte-specific clusters are significantly closer to telomeres than isolated oocyte-specific genes are. Conclusion: We have studied five clusters of genes specifically expressed in female, some of them being also expressed in male germ-cells. Moreover, contrary to non-clustered oocyte-specific genes, those that are organized in clusters tend to map near chromosome ends, suggesting that this specific near-telomere position of oocyte-clusters in rodents could constitute an evolutionary advantage.

  1. A robust prognostic gene expression signature for early stage lung adenocarcinoma

    DEFF Research Database (Denmark)

    Krzystanek, Marcin; Moldvay, Judit; Szüts, David

    2016-01-01

    Stage I lung adenocarcinoma is usually not treated with adjuvant chemotherapy; however, around half of these patients do not survive 5 years. Therefore, a reliable prognostic biomarker for early stage patients would be critical to identify those most likely to benefit from early additional treatm...... not given adjuvant therapy. Seven genes consistently obtained statistical significance in Cox regression for overall survival. The combined signature has a weighted mean hazard ratio of 3.2 in all cohorts and 3.0 (C.I. 1.3-7.4, p ...

  2. Hessian regularization based non-negative matrix factorization for gene expression data clustering.

    Science.gov (United States)

    Liu, Xiao; Shi, Jun; Wang, Congzhi

    2015-01-01

    Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze gene expression data. Data representation plays an important role in clustering analysis. Non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method, whose natural properties make it more extrapolating, especially for small sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for subsequent clustering. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and original NMF, which suggests the potential application of HR-NMF for gene expression data.
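
    As a rough illustration of the baseline this abstract builds on, the sketch below implements plain NMF with the standard multiplicative updates for the squared Frobenius error, then clusters samples by their largest coefficient. It deliberately omits the Hessian (and Laplacian) regularization term, so it is not HR-NMF; the helper names and the toy matrix are hypothetical, and a real analysis would normally use a numerical library rather than nested lists.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frob_error(X, W, H):
    """Squared Frobenius norm of X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Unregularized NMF: X (genes x samples) ~ W (genes x rank) @ H (rank x samples)."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


# Hypothetical toy expression matrix: 4 genes (rows) x 4 samples (columns).
X = [[5, 4, 0, 0],
     [4, 5, 0, 1],
     [0, 0, 5, 4],
     [1, 0, 4, 5]]
W, H = nmf(X, rank=2)
# Assign each sample to the factor with the largest coefficient in its column of H.
labels = [max(range(len(H)), key=lambda k: H[k][j]) for j in range(len(X[0]))]
print(labels, round(frob_error(X, W, H), 4))
```

    The updates keep every entry non-negative because they only multiply by ratios of non-negative quantities; a regularized variant such as the one the abstract proposes would add extra terms to the numerator and denominator of the update for the sample-side factor.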

  3. Conservation of gene linkage in dispersed vertebrate NK homeobox clusters.

    Science.gov (United States)

    Wotton, Karl R; Weierud, Frida K; Juárez-Morales, José L; Alvares, Lúcia E; Dietrich, Susanne; Lewis, Katharine E

    2009-10-01

    Nk homeobox genes are important regulators of many different developmental processes including muscle, heart, central nervous system and sensory organ development. They are thought to have arisen as part of the ANTP megacluster, which also gave rise to Hox and ParaHox genes, and at least some NK genes remain tightly linked in all animals examined so far. The protostome-deuterostome ancestor probably contained a cluster of nine Nk genes: (Msx)-(Nk4/tinman)-(Nk3/bagpipe)-(Lbx/ladybird)-(Tlx/c15)-(Nk7)-(Nk6/hgtx)-(Nk1/slouch)-(Nk5/Hmx). Of these genes, only NKX2.6-NKX3.1, LBX1-TLX1 and LBX2-TLX2 remain tightly linked in humans. However, it is currently unclear whether this is unique to the human genome as we do not know which of these Nk genes are clustered in other vertebrates. This makes it difficult to assess whether the remaining linkages are due to selective pressures or because chance rearrangements have "missed" certain genes. In this paper, we identify all of the paralogs of these ancestrally clustered NK genes in several distinct vertebrates. We demonstrate that tight linkages of Lbx1-Tlx1, Lbx2-Tlx2 and Nkx3.1-Nkx2.6 have been widely maintained in both the ray-finned and lobe-finned fish lineages. Moreover, the recently duplicated Hmx2-Hmx3 genes are also tightly linked. Finally, we show that Lbx1-Tlx1 and Hmx2-Hmx3 are flanked by highly conserved noncoding elements, suggesting that shared regulatory regions may have resulted in evolutionary pressure to maintain these linkages. Consistent with this, these pairs of genes have overlapping expression domains. In contrast, Lbx2-Tlx2 and Nkx3.1-Nkx2.6, which do not seem to be coexpressed, are also not associated with conserved noncoding sequences, suggesting that an alternative mechanism may be responsible for the continued clustering of these genes.

  4. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  5. Variation in the fumonisin biosynthetic gene cluster in fumonisin-producing and nonproducing black aspergilli.

    Science.gov (United States)

    Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio

    2014-12-01

    The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli.

  6. HOXA genes cluster: clinical implications of the smallest deletion

    OpenAIRE

    Pezzani, Lidia; Milani, Donatella; Manzoni, Francesca; Baccarin, Marco; Silipigni, Rosamaria; Guerneri, Silvana; Esposito, Susanna

    2015-01-01

    Background HOXA genes cluster plays a fundamental role in embryologic development. Deletion of the entire cluster is known to cause a clinically recognizable syndrome with mild developmental delay, characteristic facies, small feet with unusually short and big halluces, abnormal thumbs, and urogenital malformations. The clinical manifestations may vary with different ranges of deletions of HOXA cluster and flanking regions. Case presentation We report a girl with the smallest deletion reporte...

  7. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations

    Science.gov (United States)

    2014-01-01

    Introduction: Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. Methods: An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). Results: The original training cohort reached a statistically significant difference (p risk groups. Conclusions: Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments. PMID:24996446

  8. EVALUATION OF THE PROGNOSTIC VALUE OF nm23 GENE EXPRESSION IN BREAST CANCER

    Institute of Scientific and Technical Information of China (English)

    刘红; 毛慧生; 傅西林; 方志沂; 冯玉梅; 范宇; 李树玲

    2002-01-01

    Objective: To investigate the expression of nm23 gene and evaluate its prognostic value in breast cancer. Methods: nm23 expressions were detected in 101 breast cancer patients (group 1) by immunohistochemistry. RT-PCR and immunohistochemistry were used to measure expressions of nm23 gene in another 68 patients with breast cancer (group 2). Results: nm23 gene expression in group 1 was inversely associated with distant metastasis and lymph node metastasis (P<0.05). In 44 patients with negative lymph node, 9 cases progressed to distant metastasis, 7 of them (77.8%) showed low expression of nm23 gene (P<0.05). In 57 patients with positive lymph node, 24 out of 29 patients who had no distant metastasis (82.8%) expressed nm23 gene at high level (P<0.05). Meanwhile, there were 6 patients with distant metastasis in group 2, all of them expressed nm23 gene mRNA at low level. Conclusion: The results showed that nm23 gene might play an independent role in predicting prognosis of breast cancer.

  9. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2008-04-01

    Background: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all computation that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.

  10. Gene profiling and circulating tumor cells as biomarker to prognostic of patients with locoregional breast cancer.

    Science.gov (United States)

    Kuniyoshi, Renata K; Gehrke, Flávia de Sousa; Alves, Beatriz C A; Vilas-Bôas, Viviane; Coló, Anna E; Sousa, Naiara; Nunes, João; Fonseca, Fernando L A; Del Giglio, Auro

    2015-09-01

    The gene profile of primary tumors, as well as the identification of circulating tumor cells (CTCs), can provide important prognostic and predictive information. In this study, our objective was to perform tumor gene profiling (TGP) in combination with CTC characterization in women with nonmetastatic breast cancer. Biological samples (from peripheral blood and tumors) from 167 patients diagnosed with stage I, II, and III mammary carcinoma, who were also referred for adjuvant/neoadjuvant chemotherapy, were assessed for the following parameters: (a) the presence of CTCs identified by the expression of CK-19 and c-erbB-2 in the peripheral blood mononuclear cell (PBMC) fraction by quantitative reverse transcription PCR (RT-PCR) and (b) the TGP, which was determined by analyzing the expression of 21 genes in paraffin-embedded tissue samples by quantitative multiplex RT-PCR with the Plexor® system. We observed a statistically significant correlation between the progression-free interval (PFI) and the clinical stage (p = 0.000701), the TGP score (p = 0.006538), and the presence of hormone receptors in the tumor (p = 0.0432). We observed no correlation between the PFI and the presence or absence of CK-19 or HER2 expression in the PBMC fraction prior to the start of treatment or in the two following readouts. Multivariate analysis revealed that only the TGP score significantly correlated with the PFI (p = 0.029247). The TGP is an important prognostic variable for patients with locoregional breast cancer. The presence of CTCs adds no prognostic value to the information already provided by the TGP.

  11. Validation of the prognostic gene portfolio, ClinicoMolecular Triad Classification, using an independent prospective breast cancer cohort and external patient populations.

    Science.gov (United States)

    Wang, Dong-Yu; Done, Susan J; Mc Cready, David R; Leong, Wey L

    2014-07-04

    Using genome-wide expression profiles of a prospective training cohort of breast cancer patients, ClinicoMolecular Triad Classification (CMTC) was recently developed to classify breast cancers into three clinically relevant groups to aid treatment decisions. CMTC was found to be both prognostic and predictive in a large external breast cancer cohort in that study. This study serves to validate the reproducibility of CMTC and its prognostic value using independent patient cohorts. An independent internal cohort (n = 284) and a new external cohort (n = 2,181) were used to validate the association of CMTC between clinicopathological factors, 12 known gene signatures, two molecular subtype classifiers, and 19 oncogenic signalling pathway activities, and to reproduce the abilities of CMTC to predict clinical outcomes of breast cancer. In addition, we also updated the outcome data of the original training cohort (n = 147). The original training cohort reached a statistically significant difference (p value of the triad classification was reproduced in the second independent internal cohort and the new external validation cohort. CMTC achieved even higher prognostic significance when all available patients were analyzed (n = 4,851). Oncogenic pathways Myc, E2F1, Ras and β-catenin were again implicated in the high-risk groups. Both prospective internal cohorts and the independent external cohorts reproduced the triad classification of CMTC and its prognostic significance. CMTC is an independent prognostic predictor, and it outperformed 12 other known prognostic gene signatures, molecular subtype classifications, and all other standard prognostic clinicopathological factors. Our results support further development of CMTC portfolio into a guide for personalized breast cancer treatments.

  12. Clustering gene expression regulators: new approach to disease subtyping.

    Directory of Open Access Journals (Sweden)

    Mikhail Pyatnitskiy

    One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and downstream genes connected by relations extracted from a global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are used for the final patient grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with a complementary level of understanding of the pathway-level biology behind a disease by identification of significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms) that was not revealed using standard expression profile clustering. In another experiment we were able to suggest the clusters of regulators responsible for colorectal carcinoma vs adenoma discrimination and to identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. The obtained clusters of regulators make it possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for an individual patient.

  13. Gene network inherent in genomic big data improves the accuracy of prognostic prediction for cancer patients.

    Science.gov (United States)

    Kim, Yun Hak; Jeong, Dae Cheon; Pak, Kyoungjune; Goh, Tae Sik; Lee, Chi-Seung; Han, Myoung-Eun; Kim, Ji-Young; Liangwen, Liu; Kim, Chi Dae; Jang, Jeon Yeob; Cha, Wonjae; Oh, Sae-Ock

    2017-09-29

    Accurate prediction of prognosis is critical for therapeutic decisions regarding cancer patients. Many previously developed prognostic scoring systems have limitations in reflecting recent progress in the field of cancer biology such as microarray, next-generation sequencing, and signaling pathways. To develop a new prognostic scoring system for cancer patients, we used mRNA expression and clinical data in various independent breast cancer cohorts (n=1214) from the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) and Gene Expression Omnibus (GEO). A new prognostic score that reflects gene network inherent in genomic big data was calculated using Network-Regularized high-dimensional Cox-regression (Net-score). We compared its discriminatory power with those of two previously used statistical methods: stepwise variable selection via univariate Cox regression (Uni-score) and Cox regression via Elastic net (Enet-score). The Net scoring system showed better discriminatory power in prediction of disease-specific survival (DSS) than other statistical methods (p=0 in METABRIC training cohort, p=0.000331, 4.58e-06 in two METABRIC validation cohorts) when accuracy was examined by log-rank test. Notably, comparison of C-index and AUC values in receiver operating characteristic analysis at 5 years showed fewer differences between training and validation cohorts with the Net scoring system than other statistical methods, suggesting minimal overfitting. The Net-based scoring system also successfully predicted prognosis in various independent GEO cohorts with high discriminatory power. In conclusion, the Net-based scoring system showed better discriminative power than previous statistical methods in prognostic prediction for breast cancer patients. This new system will mark a new era in prognosis prediction for cancer patients.
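
    The C-index mentioned in this abstract can be sketched as follows. This is a simplified, assumed form of Harrell's concordance index, not the authors' pipeline: a pair of patients counts as comparable only when the one with the shorter follow-up had an observed event, pairs with tied times are skipped, and the function name and toy scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher risk score fails first.

    times       -- follow-up time for each patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; higher means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair must be anchored on an observed event
        for j in range(n):
            if times[i] < times[j]:  # patient i failed while j was still followed
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, two of them censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 0, 1, 0]
toy_scores = [0.9, 0.2, 0.1, 0.5]
print(concordance_index(toy_times, toy_events, toy_scores))
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a small gap between the training and validation C-index is what the abstract reads as a sign of limited overfitting.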

  14. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    Energy Technology Data Exchange (ETDEWEB)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Because functional regions tend to be highly conserved, comparisons of DNA sequences among evolutionarily distant genomes make it possible to identify functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  15. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  16. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic. PMID:25706180

  17. ICGE: an R package for detecting relevant clusters and atypical units in gene expression

    Directory of Open Access Journals (Sweden)

    Irigoien Itziar

    2012-02-01

    Full Text Available Abstract Background: Gene expression technologies have opened up new ways to diagnose and treat cancer and other diseases. Clustering algorithms are a useful approach with which to analyze genome expression data. They attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. An important problem associated with gene classification is to discern whether the clustering process can find a relevant partition, as well as to identify new gene classes. There are two key aspects to classification: the estimation of the number of clusters, and the decision as to whether a new unit (gene, tumor sample, ...) belongs to one of these previously identified clusters or to a new group. Results: ICGE is a user-friendly R package which provides many functions related to this problem: identify the number of clusters using mixed variables, usually found by applied biomedical researchers; detect whether the data have a cluster structure; identify whether a new unit belongs to one of the pre-identified clusters or to a novel group, and classify new units into the corresponding cluster. The functions in the ICGE package are accompanied by help files and easy examples to facilitate its use. Conclusions: We demonstrate the utility of ICGE by analyzing simulated and real data sets. The results show that ICGE could be very useful to a broad research community.

  18. A Link-Based Cluster Ensemble Approach For Improved Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    P. Balaji

    2015-01-01

    Full Text Available Abstract Selecting the most suitable clustering algorithm for a given set of gene expression data is difficult, because both the number of available methods and the number of gene expression measurements are very large. At present, many researchers prefer hierarchical clustering in its various forms, but this is no longer considered optimal. Cluster ensemble research can address this problem by automatically merging multiple data partitions from a wide range of different clusterings of any dimension to improve both the quality and the robustness of the clustering result. However, many existing ensemble approaches use an association matrix to condense sample-cluster and co-occurrence statistics, so relations within the ensemble are captured only at a coarse level, while those existing among clusters are completely neglected. Finding these missing associations can greatly expand the capability of such ensemble methodologies for microarray data clustering. We propose a general K-means cluster ensemble approach for clustering general categorical data into the required number of partitions.
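
    The association matrix mentioned in this abstract can be illustrated with a pairwise co-association matrix, a common baseline in cluster ensembles. The sketch below is a minimal example under that assumption; it shows only the sample-level co-occurrence statistics and does not implement the link-based refinement among clusters that the abstract argues for. The function name is hypothetical.

```python
def co_association(partitions):
    """Fraction of base clusterings that place each pair of samples together.

    partitions : list of label lists, one per base clustering, all of the
                 same length (one label per sample).
    Returns an n x n matrix (list of lists) with values in [0, 1].
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                # Samples i and j co-occur if this clustering gives them
                # the same label; label values themselves are arbitrary.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]
```

    A consensus partition is then typically obtained by clustering this matrix as a similarity matrix; which algorithm to use for that step is left open here.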

  19. Activation and clustering of a Plasmodium falciparum var gene are affected by subtelomeric sequences.

    Science.gov (United States)

    Duffy, Michael F; Tang, Jingyi; Sumardy, Fransisca; Nguyen, Hanh H T; Selvarajah, Shamista A; Josling, Gabrielle A; Day, Karen P; Petter, Michaela; Brown, Graham V

    2017-01-01

    The Plasmodium falciparum var multigene family encodes the cytoadhesive, variant antigen PfEMP1. P. falciparum antigenic variation and cytoadhesion specificity are controlled by epigenetic switching between the single, or few, simultaneously expressed var genes. Most var genes are maintained in perinuclear clusters of heterochromatic telomeres. The active var gene(s) occupy a single, perinuclear var expression site. It is unresolved whether the var expression site forms in situ at a telomeric cluster or whether it is an extant compartment to which single chromosomes travel, thus controlling var switching. Here we show that transcription of a var gene did not require decreased colocalisation with clusters of telomeres, supporting var expression site formation in situ. However following recombination within adjacent subtelomeric sequences, the same var gene was persistently activated and did colocalise less with telomeric clusters. Thus, participation in stable, heterochromatic, telomere clusters and var switching are independent but are both affected by subtelomeric sequences. The var expression site colocalised with the euchromatic mark H3K27ac to a greater extent than it did with heterochromatic H3K9me3. H3K27ac was enriched within the active var gene promoter even when the var gene was transiently repressed in mature parasites and thus H3K27ac may contribute to var gene epigenetic memory. © 2016 Federation of European Biochemical Societies.

  20. A simple but highly effective approach to evaluate the prognostic performance of gene expression signatures.

    Directory of Open Access Journals (Sweden)

    Maud H W Starmans

    Full Text Available BACKGROUND: Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures. PRINCIPAL FINDINGS: A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis by calculating the area under the curve (AUC) in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has an average 10% chance of reaching significance when assessed in a single dataset, but can range from 1% to ∼40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number. CONCLUSIONS: We have shown that the use of an arbitrary cut-off value for evaluation of signature significance is not suitable for this type of research; instead, the cut-off should be defined for each dataset separately. Our method can be used to establish and evaluate signature performance of any derived gene signature in a dataset by comparing its performance to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.
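
    The idea of benchmarking a signature against randomly drawn gene sets can be sketched as follows. This is a minimal illustration, not the published method: scoring each patient by the mean expression of the signature genes is an assumption, as are all function and parameter names.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def random_signature_aucs(expr, labels, size, n_random=1000, seed=0):
    """AUCs obtained by random gene sets of a given size.

    expr   : list of patients, each a list of gene expression values
    labels : binary outcome per patient (1 = poor outcome)
    size   : number of genes per random signature
    """
    rng = random.Random(seed)
    n_genes = len(expr[0])
    aucs = []
    for _ in range(n_random):
        genes = rng.sample(range(n_genes), size)
        # Assumed scoring rule: mean expression over the signature genes.
        scores = [sum(row[g] for g in genes) / size for row in expr]
        aucs.append(auc(scores, labels))
    return aucs


def empirical_p(observed_auc, null_aucs):
    """Share of random signatures doing at least as well (add-one corrected)."""
    hits = sum(a >= observed_auc for a in null_aucs)
    return (1 + hits) / (1 + len(null_aucs))
```

    Because the null distribution is recomputed for each dataset, the significance threshold is dataset-specific, which matches the abstract's argument against a single arbitrary cut-off.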

  1. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii.

    OpenAIRE

    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R

    1989-01-01

    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include ...

  2. Evolution of the C-Type Lectin-Like Receptor Genes of the DECTIN-1 Cluster in the NK Gene Complex

    Directory of Open Access Journals (Sweden)

    Susanne Sattler

    2012-01-01

    Full Text Available Pattern recognition receptors are crucial in initiating and shaping innate and adaptive immune responses and often belong to families of structurally and evolutionarily related proteins. The human C-type lectin-like receptors encoded in the DECTIN-1 cluster within the NK gene complex contain prominent receptors with pattern recognition function, such as DECTIN-1 and LOX-1. All members of this cluster share significant homology and are considered to have arisen from subsequent gene duplications. Recent developments in sequencing and the availability of comprehensive sequence data comprising many species showed that the receptors of the DECTIN-1 cluster are not only homologous to each other but also highly conserved between species. Even in Caenorhabditis elegans, genes displaying homology to the mammalian C-type lectin-like receptors have been detected. In this paper, we conduct a comprehensive phylogenetic survey and give an up-to-date overview of the currently available data on the evolutionary emergence of the DECTIN-1 cluster genes.

  3. Usefulness of BCOR gene mutation as a prognostic factor in acute myeloid leukemia with intermediate cytogenetic prognosis.

    Science.gov (United States)

    Terada, Kazuki; Yamaguchi, Hiroki; Ueki, Toshimitsu; Usuki, Kensuke; Kobayashi, Yutaka; Tajika, Kenji; Gomi, Seiji; Kurosawa, Saiko; Saito, Riho; Furuta, Yutaka; Miyadera, Keiki; Tokura, Taichiro; Marumo, Atushi; Omori, Ikuko; Sakaguchi, Masahiro; Fujiwara, Yusuke; Yui, Shunsuke; Ryotokuji, Takeshi; Arai, Kunihito; Kitano, Tomoaki; Wakita, Satoshi; Fukuda, Takahiro; Inokuchi, Koiti

    2018-04-16

    BCOR gene is a transcription regulatory factor that plays an essential role in normal hematopoiesis. The wider introduction of next-generation sequencing technology has led to reports in recent years of mutations in the BCOR gene in acute myeloid leukemia (AML), but the related clinical characteristics and prognosis are not sufficiently understood. We investigated the clinical characteristics and prognosis of 377 de novo AML cases with BCOR or BCORL1 mutation. BCOR or BCORL1 gene mutations were found in 28 cases (7.4%). Among cases aged 65 years or below that were also FLT3-ITD-negative and in the intermediate cytogenetic prognosis group, BCOR or BCORL1 gene mutations were observed in 11% of cases (12 of 111 cases), and this group had significantly lower 5-year overall survival (OS) (13.6% vs. 55.0%, P=0.0021) and relapse-free survival (RFS) (14.3% vs. 44.5%, P=0.0168) compared to cases without BCOR or BCORL1 gene mutations. Multivariate analysis demonstrated that BCOR mutations were an independent unfavorable prognostic factor (P=0.0038, P=0.0463) for both OS and RFS. In cases of AML that are FLT3-ITD-negative, aged 65 years or below, and in the intermediate cytogenetic prognosis group, which are considered to have relatively favorable prognosis, BCOR gene mutations appear to be an important prognostic factor. This article is protected by copyright. All rights reserved. © 2018 Wiley Periodicals, Inc.

  4. A strategy for full interrogation of prognostic gene expression patterns: exploring the biology of diffuse large B cell lymphoma.

    Directory of Open Access Journals (Sweden)

    Lisa M Rimsza

    Full Text Available Gene expression profiling yields quantitative data on gene expression used to create prognostic models that accurately predict patient outcome in diffuse large B cell lymphoma (DLBCL). Often, data are analyzed with genes classified by whether they fall above or below the median expression level. We sought to determine whether examining multiple cut-points might be a more powerful technique to investigate the association of gene expression with outcome. We explored gene expression profiling data using variable cut-point analysis for 36 genes with reported prognostic value in DLBCL. We plotted two-group survival logrank test statistics against corresponding cut-points of the gene expression levels and smooth estimates of the hazard ratio of death versus gene expression levels. To facilitate comparisons we also standardized the expression of each of the genes by the fraction of patients that would be identified by any cut-point. A multiple comparison adjusted permutation p-value identified 3 different patterns of significance: (1) genes with significant cut-points below the median, whose loss is associated with poor outcome (e.g. HLA-DR); (2) genes with significant cut-points above the median, whose over-expression is associated with poor outcome (e.g. CCND2); and (3) genes with significant cut-points on either side of the median (e.g. extracellular molecules such as FN1). Variable cut-point analysis with permutation p-value calculation can be used to identify significant genes that would not otherwise be identified with median cut-points and may suggest biological patterns of gene effects.
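
    A variable cut-point scan of the kind described here can be sketched in plain Python. The example below computes a standard two-group logrank chi-square at each candidate cut-point and a maximum-statistic permutation p-value to account for testing many cut-points. It is an illustrative sketch under stated assumptions: the function names, the `min_group` parameter, and the choice of a max-statistic permutation scheme are not taken from the paper.

```python
import random


def logrank_chi2(times, events, in_group):
    """Two-group logrank chi-square statistic (1 degree of freedom)."""
    o_minus_e = 0.0
    var = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                 # at risk, all
        n1 = sum(1 for ti, g in zip(times, in_group) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, in_group)
                 if ti == t and e and g)
        o_minus_e += d1 - d * n1 / n          # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / var if var > 0 else 0.0


def cutpoint_scan(expression, times, events, min_group=2):
    """Logrank statistic for 'expression > cut' at every usable cut-point."""
    results = []
    for cut in sorted(set(expression)):
        high = [x > cut for x in expression]
        n_high = sum(high)
        # Skip cut-points that leave either group too small (assumed rule).
        if n_high < min_group or len(high) - n_high < min_group:
            continue
        results.append((cut, logrank_chi2(times, events, high)))
    return results


def max_stat_permutation_p(expression, times, events, n_perm=200, seed=0):
    """Permutation p-value for the best cut-point, adjusted for the scan."""
    rng = random.Random(seed)
    scan = cutpoint_scan(expression, times, events)
    observed = max((s for _, s in scan), default=0.0)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between expression and outcome
        perm = cutpoint_scan(shuffled, times, events)
        hits += max((s for _, s in perm), default=0.0) >= observed
    return (1 + hits) / (1 + n_perm)
```

    Plotting the statistics returned by `cutpoint_scan` against the cut-points gives the kind of profile the abstract uses to distinguish genes whose significant cut-points lie below, above, or on both sides of the median.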

  5. GenClust: A genetic algorithm for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Raimondi Alessandra

    2005-12-01

    Full Text Available Abstract Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. Conclusion: Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology.

  6. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The inherent high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. Among the different feature selection approaches, however, most suffer from several problems, such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with a more stringent higher positive and lower negative threshold condition reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.

  7. The complete coenzyme B12 biosynthesis gene cluster of Lactobacillus reuteri CRL 1098

    NARCIS (Netherlands)

    Santos, dos F.; Vera, J.L.; Heijden, van der R.; Valdez, G.F.; Vos, de W.M.; Sesma, F.; Hugenholtz, J.

    2008-01-01

    The coenzyme B12 production pathway in Lactobacillus reuteri has been deduced using a combination of genetic, biochemical and bioinformatics approaches. The coenzyme B12 gene cluster of Lb. reuteri CRL1098 has the unique feature of clustering together the cbi, cob and hem genes. It consists of 29

  8. Functional heterogeneity of cancer-associated fibroblasts from human colon tumors shows specific prognostic gene expression signature.

    Science.gov (United States)

    Herrera, Mercedes; Islam, Abul B M M K; Herrera, Alberto; Martín, Paloma; García, Vanesa; Silva, Javier; Garcia, Jose M; Salas, Clara; Casal, Ignacio; de Herreros, Antonio García; Bonilla, Félix; Peña, Cristina

    2013-11-01

    Cancer-associated fibroblasts (CAF) actively participate in reciprocal communication with tumor cells and with other cell types in the microenvironment, contributing to a tumor-permissive neighborhood and promoting tumor progression. The aim of this study is the characterization of how CAFs from primary human colon tumors promote migration of colon cancer cells. Primary CAF cultures from 15 primary human colon tumors were established. Their enrichment in CAFs was evaluated by the expression of various epithelial and myofibroblast specific markers. Coculture assays of primary CAFs with different colon tumor cells were performed to evaluate promigratory CAF-derived effects on cancer cells. Gene expression profiles were developed to further investigate CAF characteristics. Coculture assays showed significant differences in fibroblast-derived paracrine promigratory effects on cancer cells. Moreover, the association between CAFs' promigratory effects on cancer cells and classic fibroblast activation or stemness markers was observed. CAF gene expression profiles were analyzed by microarray to identify deregulated genes in different promigratory CAFs. The gene expression signature, derived from the most protumorogenic CAFs, was identified. Interestingly, this "CAF signature" showed a remarkable prognostic value for the clinical outcome of patients with colon cancer. Moreover, this prognostic value was validated in an independent series of 142 patients with colon cancer, by quantitative real-time PCR (qRT-PCR), with a set of four genes included in the "CAF signature." In summary, these studies show for the first time the heterogeneity of primary CAFs' effect on colon cancer cell migration. A CAF gene expression signature able to classify patients with colon cancer into high- and low-risk groups was identified.

  9. Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce.

    Science.gov (United States)

    Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W

    1998-08-01

    The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.

  10. Two Gene Clusters Coordinate Galactose and Lactose Metabolism in Streptococcus gordonii

    Science.gov (United States)

    Zeng, Lin; Martino, Nicole C.

    2012-01-01

    Streptococcus gordonii is an early colonizer of the human oral cavity and an abundant constituent of oral biofilms. Two tandemly arranged gene clusters, designated lac and gal, were identified in the S. gordonii DL1 genome, which encode genes of the tagatose pathway (lacABCD) and sugar phosphotransferase system (PTS) enzyme II permeases. Genes encoding a predicted phospho-β-galactosidase (LacG), a DeoR family transcriptional regulator (LacR), and a transcriptional antiterminator (LacT) were also present in the clusters. Growth and PTS assays supported that the permease designated EIILac transports lactose and galactose, whereas EIIGal transports galactose. The expression of the gene for EIIGal was markedly upregulated in cells growing on galactose. Using promoter-cat fusions, a role for LacR in the regulation of the expressions of both gene clusters was demonstrated, and the gal cluster was also shown to be sensitive to repression by CcpA. The deletion of lacT caused an inability to grow on lactose, apparently because of its role in the regulation of the expression of the genes for EIILac, but had little effect on galactose utilization. S. gordonii maintained a selective advantage over Streptococcus mutans in a mixed-species competition assay, associated with its possession of a high-affinity galactose PTS, although S. mutans could persist better at low pHs. Collectively, these results support the concept that the galactose and lactose systems of S. gordonii are subject to complex regulation and that a high-affinity galactose PTS may be advantageous when S. gordonii is competing against the caries pathogen S. mutans in oral biofilms. PMID:22660715

  11. Regulatory role of tetR gene in a novel gene cluster of Acidovorax avenae subsp. avenae RS-1 under oxidative stress

    Directory of Open Access Journals (Sweden)

    He Liu

    2014-10-01

    Full Text Available Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay (EMSA) demonstrated that TetR regulator bound directly to the promoter of this gene cluster. Consistently, the results of quantitative real-time PCR also showed alterations in expression of associated genes. Moreover, the proteins affected by TetR under oxidative stress were revealed by comparing proteomic profiles of wild-type and mutant strains via 1D SDS-PAGE and LC-MS/MS analyses. Taken together, our results demonstrated that tetR gene in this novel gene cluster contributed to cell survival under oxidative stress, and TetR protein played an important regulatory role in growth kinetics, biofilm-forming capability, SOD and catalase activity, and oxide detoxicating ability.

  12. Regulatory role of tetR gene in a novel gene cluster of Acidovorax avenae subsp. avenae RS-1 under oxidative stress.

    Science.gov (United States)

    Liu, He; Yang, Chun-Lan; Ge, Meng-Yu; Ibrahim, Muhammad; Li, Bin; Zhao, Wen-Jun; Chen, Gong-You; Zhu, Bo; Xie, Guan-Lin

    2014-01-01

    Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay demonstrated that TetR regulator bound directly to the promoter of this gene cluster. Consistently, the results of quantitative real-time PCR also showed alterations in expression of associated genes. Moreover, the proteins affected by TetR under oxidative stress were revealed by comparing proteomic profiles of wild-type and mutant strains via 1D SDS-PAGE and LC-MS/MS analyses. Taken together, our results demonstrated that tetR gene in this novel gene cluster contributed to cell survival under oxidative stress, and TetR protein played an important regulatory role in growth kinetics, biofilm-forming capability, superoxide dismutase and catalase activity, and oxide detoxicating ability.

  13. QTL global meta-analysis: are trait determining genes clustered?

    Directory of Open Access Journals (Sweden)

    Adelson David L

    2009-04-01

    Full Text Available Abstract Background: A key open question in biology is whether genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL), where a QTL region could contain a number of genes that contribute to the trait being measured. Results: We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO), with relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over-represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over-represented in QTL regions. Conclusion: Our analysis provides evidence of over-represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that, from a biological standpoint, makes sense with respect to QTL category type.

  14. MADIBA: A web server toolkit for biological interpretation of Plasmodium and plant gene clusters

    Directory of Open Access Journals (Sweden)

    Louw Abraham I

    2008-02-01

    Background: Microarray technology makes it possible to identify changes in the gene expression of an organism under various conditions. Data mining is thus essential for deducing significant biological information such as the identification of new biological mechanisms or putative drug targets. While many algorithms and software packages have been developed for analysing gene expression, the extraction of relevant information from experimental data is still a substantial challenge, requiring significant time and skill. Description: MADIBA (MicroArray Data Interface for Biological Annotation) facilitates the assignment of biological meaning to gene expression clusters by automating the post-processing stage. A relational database has been designed to store the data from gene to pathway for Plasmodium, rice and Arabidopsis. Tools within the web interface allow rapid analyses for the identification of the Gene Ontology terms relevant to each cluster; visualising the metabolic pathways where the genes are implicated, their genomic localisations, putative common transcriptional regulatory elements in the upstream sequences, and an analysis specific to the organism being studied. Conclusion: MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. Functionality of MADIBA was validated by analysing a number of gene clusters from several published experiments: expression profiling of the Plasmodium life cycle, and salt stress treatments of Arabidopsis and rice. In most of the cases, the same conclusions found by the authors were quickly and easily obtained after analysing the gene clusters with MADIBA.

  15. Statistical indicators of collective behavior and functional clusters in gene networks of yeast

    Science.gov (United States)

    Živković, J.; Tadić, B.; Wick, N.; Thurner, S.

    2006-03-01

    We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
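
    The network construction described above (correlation-based links, giant component, minimum spanning tree) can be sketched in a few lines. This is an illustrative sketch rather than the authors' procedure: the use of Pearson correlation, the distance 1 - |r| for the tree, the function names and the toy profiles are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def find(parent, node):
    """Union-find root lookup with path halving."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def giant_component_size(profiles, threshold):
    """Size of the largest connected component when genes are linked if |r| >= threshold."""
    parent = {gene: gene for gene in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            parent[find(parent, a)] = find(parent, b)
    sizes = {}
    for gene in profiles:
        root = find(parent, gene)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


def minimum_spanning_tree(profiles):
    """Kruskal's algorithm on the complete graph with edge weight 1 - |r|."""
    edges = sorted(
        (1 - abs(pearson(profiles[a], profiles[b])), a, b)
        for a, b in combinations(profiles, 2)
    )
    parent = {gene: gene for gene in profiles}
    tree = []
    for _, a, b in edges:
        root_a, root_b = find(parent, a), find(parent, b)
        if root_a != root_b:
            parent[root_a] = root_b
            tree.append((a, b))
    return tree


# Toy expression time series (invented values).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, -1, 1, -1],
    "g4": [-1, 1, -1, 1],
}
sizes_by_threshold = {t: giant_component_size(profiles, t) for t in (0.9, 0.4)}
tree = minimum_spanning_tree(profiles)
```

    Lowering the threshold and watching the largest component grow is one simple way to look for the percolation-like transition the abstract mentions.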

  16. Genome-wide identification of physically clustered genes suggests chromatin-level co-regulation in male reproductive development in Arabidopsis thaliana.

    Science.gov (United States)

    Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F; Shaw, Peter; Nakayama, Naomi; Sundström, Jens F; Emanuelsson, Olof

    2017-04-07

    Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
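
    The abstract mentions an in-silico tool for finding physical clusters of co-regulated genes but gives no algorithmic detail. As a rough illustration of the idea only, the sketch below scans genes in chromosomal order and reports runs of responsive genes separated by few non-responsive neighbours. The function name and the `min_size` and `max_gap` parameters are hypothetical and are not taken from the published tool.

```python
def find_physical_clusters(ordered_genes, responsive, min_size=3, max_gap=1):
    """Group responsive genes that lie close together along a chromosome.

    ordered_genes -- gene identifiers in chromosomal order
    responsive    -- set of genes called co-regulated (e.g. up-regulated after induction)
    min_size      -- minimum number of responsive genes in a reported cluster (assumed)
    max_gap       -- maximum run of non-responsive genes tolerated inside a cluster (assumed)
    """
    clusters = []
    current = []
    gap = 0
    for gene in ordered_genes:
        if gene in responsive:
            current.append(gene)
            gap = 0
        else:
            gap += 1
            if gap > max_gap:
                # The run is broken: keep it only if it is large enough.
                if len(current) >= min_size:
                    clusters.append(current)
                current = []
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Toy chromosome with ten genes (invented identifiers).
chromosome = [f"g{i}" for i in range(1, 11)]
clusters = find_physical_clusters(chromosome, {"g2", "g3", "g5", "g9"})
```

    A real analysis would also need a null model, for example shuffling the responsive labels along the chromosome, to judge whether the observed clusters exceed chance expectation.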

  17. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

    Directory of Open Access Journals (Sweden)

    Cooper James B

    2010-03-01

    Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results: We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions: By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.

  18. Clustering gene expression data based on predicted differential effects of GV interaction.

    Science.gov (United States)

    Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu

    2005-02-01

    Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce bias interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.
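
    The core idea above is to separate main effects from the gene-by-variety (GV) interaction and to cluster on the interaction part. The abstract's mixed-model AUP predictor is not reproduced here. The sketch below instead assumes the simplest fixed-effects analogue, a two-way layout with one observation per cell, where the interaction estimate is the cell value minus the gene mean minus the variety mean plus the grand mean. The function name and the data are hypothetical.

```python
def interaction_effects(expression):
    """Two-way interaction estimates for a genes x varieties matrix.

    expression[i][j] is the (log) expression of gene i in variety j.
    Returns a matrix of the same shape holding
    y_ij - gene_mean_i - variety_mean_j + grand_mean.
    """
    n_genes = len(expression)
    n_varieties = len(expression[0])
    grand_mean = sum(sum(row) for row in expression) / (n_genes * n_varieties)
    gene_means = [sum(row) / n_varieties for row in expression]
    variety_means = [
        sum(expression[i][j] for i in range(n_genes)) / n_genes
        for j in range(n_varieties)
    ]
    return [
        [
            expression[i][j] - gene_means[i] - variety_means[j] + grand_mean
            for j in range(n_varieties)
        ]
        for i in range(n_genes)
    ]


# Toy matrix: 2 genes x 2 varieties (invented values).
effects = interaction_effects([[1.0, 2.0], [3.0, 6.0]])
```

    Each row of the returned matrix could then serve as the input vector for a clustering method, in place of the raw log ratios.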

  19. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

    Directory of Open Access Journals (Sweden)

    Westerdahl Ann-Charlotte

    2010-06-01

    Background: Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results: Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions: This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper

  20. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury.

    Science.gov (United States)

    Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole

    2010-06-09

    Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be

  1. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    Science.gov (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  2. TMEM88, CCL14 and CLEC3B as prognostic biomarkers for prognosis and palindromia of human hepatocellular carcinoma.

    Science.gov (United States)

    Zhang, Xin; Wan, Jin-Xiang; Ke, Zun-Ping; Wang, Feng; Chai, Hai-Xia; Liu, Jia-Qiang

    2017-07-01

    Hepatocellular carcinoma is one of the deadliest and most prevalent cancers, with increasing incidence worldwide. Elucidating driver genes for prognosis and palindromia (recurrence) of hepatocellular carcinoma helps guide clinical decisions for patients. In this study, high-throughput RNA sequencing data (Illumina HiSeq platform) for hepatocellular carcinoma were downloaded from The Cancer Genome Atlas, comprising 330 primary hepatocellular carcinoma patient samples. Stable key genes with differential expression were identified, and Kaplan-Meier survival analysis was performed on them using the Cox proportional hazards test in R. Driver genes influencing the prognosis of this disease were determined using clustering analysis. Functional analysis of driver genes was performed by literature search and Gene Set Enrichment Analysis. Finally, the selected driver genes were verified using the external dataset GSE40873. A total of 5781 stable key genes were identified, including 156 genes definitely related to the prognosis of hepatocellular carcinoma. Based on the significant key genes, samples were grouped into five clusters, which were further integrated into high- and low-risk classes based on clinical features. TMEM88, CCL14, and CLEC3B were selected as driver genes that clustered high-/low-risk patients successfully (generally, p = 0.0005124445). Finally, survival analysis of the high-/low-risk samples from the external database showed a significant difference, with a p value of 0.0198. In conclusion, the TMEM88, CCL14, and CLEC3B genes were stable and usable for predicting the survival and palindromia time of hepatocellular carcinoma. These genes could function as potential prognostic genes that help improve patients' outcomes and survival.
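
    Kaplan-Meier analysis, used above to compare high- and low-risk groups, estimates survival as a product over event times of (1 - deaths / number at risk). The study reports using R; the following is only an illustrative Python sketch of the estimator itself, with a hypothetical function name and invented follow-up data, and it does not include the group comparison or the Cox model.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (death or recurrence) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at this time point.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Toy cohort: four patients, the second one censored (invented data).
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

    Running the estimator separately on the high-risk and low-risk groups would give the two curves that a log-rank or Cox test then compares.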

  3. Motif-Independent De Novo Detection of Secondary Metabolite Gene Clusters – Towards Identification of Novel Secondary Metabolisms from Filamentous Fungi -

    Directory of Open Access Journals (Sweden)

    Myco Umemura

    2015-05-01

    Secondary metabolites are produced mostly by clustered genes that are essential to their biosynthesis. The transcriptional expression of these genes is often cooperatively regulated by a transcription factor located inside or close to a cluster. Most of the secondary metabolism biosynthesis (SMB) gene clusters identified to date contain so-called core genes with distinctive sequence features, such as polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Recent efforts in sequencing fungal genomes have revealed far more SMB gene clusters than expected based on the number of core genes in the genomes. Several bioinformatics tools have been developed to survey SMB gene clusters using the sequence motif information of the core genes, including SMURF and antiSMASH. More recently, accompanying the development of sequencing techniques that yield large-scale genomic and transcriptomic data, motif-independent prediction methods for SMB gene clusters, including MIDDAS-M, have been developed. Most of these methods detect clusters in which the genes are cooperatively regulated at the transcriptional level, thus allowing the identification of novel SMB gene clusters regardless of the presence of the core genes. Another type of method, MIPS-CG, uses a characteristic of SMB genes, namely that they are highly enriched in non-syntenic blocks (NSBs), enabling prediction even without transcriptome data, although the results have not been evaluated in detail. Considering that a large portion of SMB gene clusters might be sufficiently expressed only under limited, uncommon conditions, it seems that prediction of SMB gene clusters by bioinformatics followed by experimental validation is the only way to efficiently uncover hidden SMB gene clusters. Here, we describe and discuss possible novel approaches for the determination of SMB gene clusters that have not been identified using conventional methods.

  4. The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster

    Science.gov (United States)

    Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.

    2013-01-01

    The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role

  5. A genome-wide analysis of nonribosomal peptide synthetase gene clusters and their peptides in a Planktothrix rubescens strain

    Directory of Open Access Journals (Sweden)

    Nederbragt Alexander J

    2009-08-01

    Background: Cyanobacteria often produce several different oligopeptides, with unknown biological functions, by nonribosomal peptide synthetases (NRPS). Although some cyanobacterial NRPS gene cluster types are well described, the entire NRPS genomic content within a single cyanobacterial strain has never been investigated. Here we have combined a genome-wide analysis using massive parallel pyrosequencing ("454") and mass spectrometry screening of oligopeptides produced in the strain Planktothrix rubescens NIVA CYA 98 in order to identify all putative gene clusters for oligopeptides. Results: Thirteen types of oligopeptides were uncovered by mass spectrometry (MS) analyses. Microcystin, cyanopeptolin and aeruginosin synthetases, highly similar to already characterized NRPS, were present in the genome. Two novel NRPS gene clusters were associated with production of anabaenopeptins and microginins, respectively. Sequence-depth of the genome and real-time PCR data revealed three copies of the microginin gene cluster. Since NRPS gene cluster candidates for microviridin and oscillatorin synthesis could not be found, putative (gene-encoded) precursor peptide sequences to microviridin and oscillatorin were found in the genes mdnA and oscA, respectively. The genes flanking the microviridin and oscillatorin precursor genes encode putative modifying enzymes of the precursor oligopeptides. We therefore propose ribosomal pathways involving modifications and cyclisation for microviridin and oscillatorin. The microviridin, anabaenopeptin and cyanopeptolin gene clusters are situated in close proximity to each other, constituting an oligopeptide island. Conclusion: Altogether seven nonribosomal peptide synthetase (NRPS) gene clusters and two gene clusters putatively encoding ribosomal oligopeptide biosynthetic pathways were revealed. Our results demonstrate that whole genome shotgun sequencing combined with MS-directed determination of oligopeptides successfully

  6. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data

    Directory of Open Access Journals (Sweden)

    Scherer Stephen W

    2011-05-01

    Background: Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. Results: We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. Conclusions: The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  7. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...

  8. Gene structure and expression characteristic of a novel odorant receptor gene cluster in the parasitoid wasp Microplitis mediator (Hymenoptera: Braconidae).

    Science.gov (United States)

    Wang, S-N; Shan, S; Zheng, Y; Peng, Y; Lu, Z-Y; Yang, Y-Q; Li, R-J; Zhang, Y-J; Guo, Y-Y

    2017-08-01

    Odorant receptors (ORs) expressed in the antennae of parasitoid wasps are responsible for detection of various lipophilic airborne molecules. In the present study, 107 novel OR genes were identified from Microplitis mediator antennal transcriptome data. Phylogenetic analysis of the set of OR genes from M. mediator and Microplitis demolitor revealed that M. mediator OR (MmedOR) genes can be classified into different subfamilies, and the majority of MmedORs in each subfamily shared high sequence identities and clear orthologous relationships to M. demolitor ORs. Within a subfamily, six MmedOR genes, MmedOR98, 124, 125, 126, 131 and 155, shared a similar gene structure and were tightly linked in the genome. To evaluate whether the clustered MmedOR genes share common regulatory features, the transcription profile and expression characteristics of the six closely related OR genes were investigated in M. mediator. Rapid amplification of cDNA ends-PCR experiments revealed that the OR genes within the cluster were transcribed as single mRNAs, and a bicistronic mRNA for two adjacent genes (MmedOR124 and MmedOR98) was also detected in female antennae by reverse transcription PCR. In situ hybridization experiments indicated that each OR gene within the cluster was expressed in a different number of cells. Moreover, there was no co-expression of the two highly related OR genes, MmedOR124 and MmedOR98, which appeared to be individually expressed in a distinct population of neurons. Overall, there were distinct expression profiles of closely related MmedOR genes from the same cluster in M. mediator. These data provide a basic understanding of the olfactory coding in parasitoid wasps. © 2017 The Royal Entomological Society.

  9. Variation in sequence and location of the fumonisin mycotoxin biosynthetic gene cluster in Fusarium

    NARCIS (Netherlands)

    Proctor, R.H.; Hove, van F.; Susca, A.; Stea, A.; Busman, M.; Lee, van der T.A.J.; Waalwijk, C.; Moretti, A.

    2010-01-01

    In Fusarium, the ability to produce fumonisins is governed by a 17-gene fumonisin biosynthetic gene (FUM) cluster. Here, we examined the cluster in F. oxysporum strain O-1890 and nine other species selected to represent a wide range of the genetic diversity within the Gibberella fujikuroi species complex (GFSC).

  10. Molecular profiling identifies prognostic markers of stage IA lung adenocarcinoma.

    Science.gov (United States)

    Zhang, Jie; Shao, Jinchen; Zhu, Lei; Zhao, Ruiying; Xing, Jie; Wang, Jun; Guo, Xiaohui; Tu, Shichun; Han, Baohui; Yu, Keke

    2017-09-26

    We previously showed that different pathologic subtypes were associated with different prognostic values in patients with stage IA lung adenocarcinoma (AC). We hypothesize that differential gene expression profiles of different subtypes may be valuable factors for prognosis in stage IA lung adenocarcinoma. We performed microarray gene expression profiling on tumor tissues micro-dissected from patients with acinar and solid predominant subtypes of stage IA lung adenocarcinoma. These patients had undergone a lobectomy and mediastinal lymph node dissection at the Shanghai Chest Hospital, Shanghai, China in 2012. No patient had preoperative treatment. We performed Gene Set Enrichment Analysis (GSEA) to look for gene expression signatures associated with tumor subtypes. The histologic subtypes of all patients were classified according to the 2015 WHO lung adenocarcinoma classification. We found that patients with the solid predominant subtype are enriched for genes involved in RNA polymerase activity as well as inactivation of the p53 pathway. Further, we identified a list of genes that may serve as prognostic markers for stage IA lung adenocarcinoma. Validation in the TCGA database shows that these genes are correlated with survival, suggesting that they are novel prognostic factors for stage IA lung adenocarcinoma. In conclusion, we have uncovered novel prognostic factors for stage IA lung adenocarcinoma using gene expression profiling in combination with histopathology subtyping.

  11. Differential expression patterns of housekeeping genes increase diagnostic and prognostic value in lung cancer

    Directory of Open Access Journals (Sweden)

    Yu-Chun Chang

    2018-05-01

    Background: Using DNA microarrays, we previously identified 451 genes expressed in 19 different human tissues. Although ubiquitously expressed, the variable expression patterns of these “housekeeping genes” (HKGs) could separate one normal human tissue type from another. Current focus on identifying “specific disease markers” is problematic as single gene expression in a given sample represents the specific cellular states of the sample at the time of collection. In this study, we examine the diagnostic and prognostic potential of the variable expressions of HKGs in lung cancers. Methods: Microarray and RNA-seq data for normal lungs, lung adenocarcinomas (AD), squamous cell carcinomas of the lung (SQCLC), and small cell carcinomas of the lung (SCLC) were collected from online databases. Using 374 of 451 HKGs, differentially expressed genes between pairs of sample types were determined via two-sided, homoscedastic t-test. Principal component analysis and hierarchical clustering classified normal lung and lung cancer subtypes according to relative gene expression variations. We used uni- and multivariate Cox regressions to identify significant predictors of overall survival in AD patients. Classifying genes were selected using a set of training samples and then validated using an independent test set. Gene Ontology was examined by PANTHER. Results: This study showed that the differential expression patterns of 242, 245, and 99 HKGs were able to distinguish normal lung from AD, SCLC, and SQCLC, respectively. From these, 70 HKGs were common across the three lung cancer subtypes. These HKGs have low expression variation compared to current lung cancer markers (e.g., EGFR, KRAS) and were involved in the most common biological processes (e.g., metabolism, stress response). In addition, the expression pattern of 106 HKGs alone was a significant classifier of AD versus SQCLC. We further highlighted that a panel of 13 HKGs was an independent predictor of

  12. Two Horizontally Transferred Xenobiotic Resistance Gene Clusters Associated with Detoxification of Benzoxazolinones by Fusarium Species

    Science.gov (United States)

    Glenn, Anthony E.; Davis, C. Britton; Gao, Minglu; Gold, Scott E.; Mitchell, Trevor R.; Proctor, Robert H.; Stewart, Jane E.; Snook, Maurice E.

    2016-01-01

    Microbes encounter a broad spectrum of antimicrobial compounds in their environments and often possess metabolic strategies to detoxify such xenobiotics. We have previously shown that Fusarium verticillioides, a fungal pathogen of maize known for its production of fumonisin mycotoxins, possesses two unlinked loci, FDB1 and FDB2, necessary for detoxification of antimicrobial compounds produced by maize, including the γ-lactam 2-benzoxazolinone (BOA). In support of these earlier studies, microarray analysis of F. verticillioides exposed to BOA identified the induction of multiple genes at FDB1 and FDB2, indicating the loci consist of gene clusters. One of the FDB1 cluster genes encoded a protein having domain homology to the metallo-β-lactamase (MBL) superfamily. Deletion of this gene (MBL1) rendered F. verticillioides incapable of metabolizing BOA and thus unable to grow on BOA-amended media. Deletion of other FDB1 cluster genes, in particular AMD1 and DLH1, did not affect BOA degradation. Phylogenetic analyses and topology testing of the FDB1 and FDB2 cluster genes suggested two horizontal transfer events among fungi, one being transfer of FDB1 from Fusarium to Colletotrichum, and the second being transfer of the FDB2 cluster from Fusarium to Aspergillus. Together, the results suggest that plant-derived xenobiotics have exerted evolutionary pressure on these fungi, leading to horizontal transfer of genes that enhance fitness or virulence. PMID:26808652

  13. Sequencing, physical organization and kinetic expression of the patulin biosynthetic gene cluster from Penicillium expansum

    International Nuclear Information System (INIS)

    Tannous, J.; El Khoury, R.; El Khoury, A.; Lteif, R.; Snini, S.; Lippi, Y.; Oswald, I.; Olivier, P.; Atoui, A.

    2014-01-01

    Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well-characterized, but its genetic bases remain largely unknown, with only a few characterized genes in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60–70% identity with orthologous genes grouped differently within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The kinetics of patulin cluster gene expression was studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products.

  14. PTK 7 is a transforming gene and prognostic marker for breast cancer and nodal metastasis involvement.

    Directory of Open Access Journals (Sweden)

    Silvia Gärtner

    Protein Tyrosine Kinase 7 (PTK7) is upregulated in several human cancers; however, its clinical implication in breast cancer (BC) and lymph node (LN) is still unclear. In order to investigate the function of PTK7 in mediating BC cell motility and invasivity, PTK7 expression in BC cell lines was determined. PTK7 signaling in highly invasive breast cancer cells was inhibited by a dominant-negative PTK7 mutant, an antibody against the extracellular domain of PTK7, and siRNA knockdown of PTK7. This resulted in decreased motility and invasivity of BC cells. We further examined PTK7 expression in BC and LN tissue of 128 BC patients by RT-PCR and its correlation with BC related genes like HER2, HER3, PAI1, MMP1, K19, and CD44. Expression profiling in BC cell lines and primary tumors showed association of PTK7 with ER/PR/HER2-negative (TNBC-triple negative BC) cancer. Oncomine data analysis confirmed this observation and classified PTK7 in a cluster with genes associated with aggressive behavior of primary BC. Furthermore, PTK7 expression was significantly different with respect to tumor size (ANOVA, p = 0.033) in BC and nodal involvement (ANOVA, p = 0.007) in LN. PTK7 expression in metastatic LN was related to shorter DFS (Cox Regression, p = 0.041). Our observations confirmed the transforming potential of PTK7, as well as its involvement in motility and invasivity of BC cells. PTK7 is highly expressed in TNBC cell lines. It represents a novel prognostic marker for BC patients and has potential therapeutic significance.

  15. PTK 7 is a transforming gene and prognostic marker for breast cancer and nodal metastasis involvement.

    Science.gov (United States)

    Gärtner, Silvia; Gunesch, Angela; Knyazeva, Tatiana; Wolf, Petra; Högel, Bernhard; Eiermann, Wolfgang; Ullrich, Axel; Knyazev, Pjotr; Ataseven, Beyhan

    2014-01-01

    Protein Tyrosine Kinase 7 (PTK7) is upregulated in several human cancers; however, its clinical implication in breast cancer (BC) and lymph node (LN) is still unclear. In order to investigate the function of PTK7 in mediating BC cell motility and invasivity, PTK7 expression in BC cell lines was determined. PTK7 signaling in highly invasive breast cancer cells was inhibited by a dominant-negative PTK7 mutant, an antibody against the extracellular domain of PTK7, and siRNA knockdown of PTK7. This resulted in decreased motility and invasivity of BC cells. We further examined PTK7 expression in BC and LN tissue of 128 BC patients by RT-PCR and its correlation with BC related genes like HER2, HER3, PAI1, MMP1, K19, and CD44. Expression profiling in BC cell lines and primary tumors showed association of PTK7 with ER/PR/HER2-negative (TNBC-triple negative BC) cancer. Oncomine data analysis confirmed this observation and classified PTK7 in a cluster with genes associated with aggressive behavior of primary BC. Furthermore, PTK7 expression was significantly different with respect to tumor size (ANOVA, p = 0.033) in BC and nodal involvement (ANOVA, p = 0.007) in LN. PTK7 expression in metastatic LN was related to shorter DFS (Cox Regression, p = 0.041). Our observations confirmed the transforming potential of PTK7, as well as its involvement in motility and invasivity of BC cells. PTK7 is highly expressed in TNBC cell lines. It represents a novel prognostic marker for BC patients and has potential therapeutic significance.

  16. Horizontal transfer of a nitrate assimilation gene cluster and ecological transitions in fungi: a phylogenetic study.

    Directory of Open Access Journals (Sweden)

    Jason C Slot

    High affinity nitrate assimilation genes in fungi occur in a cluster (fHANT-AC) that can be coordinately regulated. The clustered genes include nrt2, which codes for a high affinity nitrate transporter; euknr, which codes for nitrate reductase; and NAD(P)H-nir, which codes for nitrite reductase. Homologs of genes in the fHANT-AC occur in other eukaryotes and prokaryotes, but they have only been found clustered in the oomycete Phytophthora (heterokonts). We performed independent and concatenated phylogenetic analyses of homologs of all three genes in the fHANT-AC. Phylogenetic analyses limited to fungal sequences suggest that the fHANT-AC has been transferred horizontally from a basidiomycete (mushrooms and smuts) to an ancestor of the ascomycetous mold Trichoderma reesei. Phylogenetic analyses of sequences from diverse eukaryotes and eubacteria, and cluster structure, are consistent with a hypothesis that the fHANT-AC was assembled in a lineage leading to the oomycetes and was subsequently transferred to the Dikarya (Ascomycota+Basidiomycota), which is a derived fungal clade that includes the vast majority of terrestrial fungi. We propose that the acquisition of high affinity nitrate assimilation contributed to the success of Dikarya on land by allowing exploitation of nitrate in aerobic soils, and the subsequent transfer of a complete assimilation cluster improved the fitness of T. reesei in a new niche. Horizontal transmission of this cluster of functionally integrated genes supports the "selfish operon" hypothesis for maintenance of gene clusters.

  17. Comparison of Expression of Secondary Metabolite Biosynthesis Cluster Genes in Aspergillus flavus, A. parasiticus, and A. oryzae

    OpenAIRE

    Ehrlich, Kenneth C.; Mack, Brian M.

    2014-01-01

    Fifty-six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help ...

  18. Increasing Power by Sharing Information from Genetic Background and Treatment in Clustering of Gene Expression Time Series

    OpenAIRE

    Sura Zaki Alrashid; Muhammad Arifur Rahman; Nabeel H Al-Aaraji; Neil D Lawrence; Paul R Heath

    2018-01-01

    Clustering of gene expression time series gives insight into which genes may be co-regulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different conditions or genetic background. This paper develops a new clustering method that allows each cluster to be parameterised according to whether the behaviour of the genes across conditions is correlated or anti-correlated. By specifying correlati...

  19. Identification of the Regulator Gene Responsible for the Acetone-Responsive Expression of the Binuclear Iron Monooxygenase Gene Cluster in Mycobacteria

    Science.gov (United States)

    Furuya, Toshiki; Hirose, Satomi; Semba, Hisashi; Kino, Kuniki

    2011-01-01

    The mimABCD gene cluster encodes the binuclear iron monooxygenase that oxidizes propane and phenol in Mycobacterium smegmatis strain MC2 155 and Mycobacterium goodii strain 12523. Interestingly, expression of the mimABCD gene cluster is induced by acetone. In this study, we investigated the regulator gene responsible for this acetone-responsive expression. In the genome sequence of M. smegmatis strain MC2 155, the mimABCD gene cluster is preceded by a gene designated mimR, which is divergently transcribed. Sequence analysis revealed that MimR exhibits amino acid similarity with the NtrC family of transcriptional activators, including AcxR and AcoR, which are involved in acetone and acetoin metabolism, respectively. Unexpectedly, many homologs of the mimR gene were also found in the sequenced genomes of actinomycetes. A plasmid carrying a transcriptional fusion of the intergenic region between the mimR and mimA genes with a promoterless green fluorescent protein (GFP) gene was constructed and introduced into M. smegmatis strain MC2 155. Using a GFP reporter system, we confirmed by deletion and complementation analyses that the mimR gene product is the positive regulator of the mimABCD gene cluster expression that is responsive to acetone. M. goodii strain 12523 also utilized the same regulatory system as M. smegmatis strain MC2 155. Although transcriptional activators of the NtrC family generally control transcription using the σ54 factor, a gene encoding the σ54 factor was absent from the genome sequence of M. smegmatis strain MC2 155. These results suggest the presence of a novel regulatory system in actinomycetes, including mycobacteria. PMID:21856847

  20. Increasing Power by Sharing Information from Genetic Background and Treatment in Clustering of Gene Expression Time Series

    Directory of Open Access Journals (Sweden)

    Sura Zaki Alrashid

    2018-02-01

    Clustering of gene expression time series gives insight into which genes may be co-regulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different conditions or genetic background. This paper develops a new clustering method that allows each cluster to be parameterised according to whether the behaviour of the genes across conditions is correlated or anti-correlated. By specifying correlation between such genes, more information is gained within the cluster about how the genes interrelate. Amyotrophic lateral sclerosis (ALS) is an irreversible neurodegenerative disorder that kills the motor neurons and results in death within 2 to 3 years from symptom onset. The speed of progression is heterogeneous across patients, with significant variability. The SOD1G93A transgenic mice from different backgrounds (129Sv and C57) showed consistent phenotypic differences for disease progression. A hierarchy of Gaussian processes is used to model condition-specific and gene-specific temporal co-variances. This study identified significant gene expression profiles and clusters of associated or co-regulated genes from four groups of data (SOD1G93A and Ntg from 129Sv and C57 backgrounds). Our study shows the effectiveness of sharing information between replicates and different model conditions when modelling gene expression time series. Further gene enrichment score analysis and ontology pathway analysis of some specified clusters for a particular group may lead toward identifying features underlying the differential speed of disease progression.

  1. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of form knowledge for pan-ethnic-group products depends primarily on a designer's subjective experience, without user participation. Most studies focus on detecting the perceptual demands of consumers for the target product category. Here, a form gene clustering method for pan-ethnic-group products based on emotional semantics is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer-aided product form clustering technology. A case study of form gene clustering for typical pan-ethnic-group products indicates that the method is feasible. This paper opens up a new direction for the future development of product form design, improving the agility of the product design process in the era of Industry 4.0.

  2. Independent replication of a melanoma subtype gene signature and evaluation of its prognostic value and biological correlates in a population cohort.

    Science.gov (United States)

    Nsengimana, Jérémie; Laye, Jon; Filia, Anastasia; Walker, Christy; Jewell, Rosalyn; Van den Oord, Joost J; Wolter, Pascal; Patel, Poulam; Sucker, Antje; Schadendorf, Dirk; Jönsson, Göran B; Bishop, D Timothy; Newton-Bishop, Julia

    2015-05-10

    Development and validation of robust molecular biomarkers has so far been limited in melanoma research. In this paper we used a large population-based cohort to replicate two published gene signatures for melanoma classification. We assessed the signatures' prognostic value and explored their biological significance by correlating them with factors known to be associated with survival (vitamin D) or etiological routes (nevi, sun sensitivity and telomere length). Genome-wide microarray gene expressions were profiled in 300 archived tumors (224 primaries, 76 secondaries). The two gene signatures classified up to 96% of our samples and showed strong correlation with melanoma specific survival (P=3 x 10(-4)), Breslow thickness (P=5 x 10(-10)), ulceration (P=9 x 10(-8)) and mitotic rate (P=3 x 10(-7)), adding prognostic value over AJCC stage (adjusted hazard ratio 1.79, 95%CI 1.13-2.83), as previously reported. Furthermore, molecular subtypes were associated with season-adjusted serum vitamin D at diagnosis (P=0.04) and genetically predicted telomere length (P=0.03). Specifically, molecular high-grade tumors were more frequent in patients with lower vitamin D levels whereas high immune tumors came from patients with predicted shorter telomeres. Our data confirm the utility of molecular biomarkers in melanoma prognostic estimation using tiny archived specimens and shed light on biological mechanisms likely to impact on cancer initiation and progression.

  3. Prognostic impact of carboxylesterase 1 gene variants in patients with congestive heart failure treated with angiotensin-converting enzyme inhibitors

    DEFF Research Database (Denmark)

    Nelveg-Kristensen, Karl E.; Madsen, Majbritt B.; Torp-Pedersen, Christian

    2016-01-01

    OBJECTIVE: Most angiotensin-converting enzyme inhibitors (ACEIs) are prodrugs activated by carboxylesterase 1 (CES1). We investigated the prognostic importance of CES1 gene (CES1) copy number variation and the rs3815583 single-nucleotide polymorphism in CES1 among ACEI-treated patients with conge...

  4. Comparison of expression of secondary metabolite biosynthesis cluster genes in Aspergillus flavus, A. parasiticus, and A. oryzae.

    Science.gov (United States)

    Ehrlich, Kenneth C; Mack, Brian M

    2014-06-23

    Fifty-six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.

  5. A recently transferred cluster of bacterial genes in Trichomonas vaginalis - lateral gene transfer and the fate of acquired genes

    Science.gov (United States)

    2014-01-01

    Background Lateral Gene Transfer (LGT) has recently gained recognition as an important contributor to some eukaryote proteomes, but the mechanisms of acquisition and fixation in eukaryotic genomes are still uncertain. A previously defined norm for LGTs in microbial eukaryotes states that the majority are genes involved in metabolism, the LGTs are typically localized one by one, surrounded by vertically inherited genes on the chromosome, and phylogenetics shows that a broad collection of bacterial lineages have contributed to the transferome. Results A unique 34 kbp long fragment with 27 clustered genes (TvLF) of prokaryote origin was identified in the sequenced genome of the protozoan parasite Trichomonas vaginalis. Using a PCR based approach we confirmed the presence of the orthologous fragment in four additional T. vaginalis strains. Detailed sequence analyses unambiguously suggest that TvLF is the result of one single, recent LGT event. The proposed donor is a close relative to the firmicute bacterium Peptoniphilus harei. High nucleotide sequence similarity between T. vaginalis strains, as well as to P. harei, and the absence of homologs in other Trichomonas species, suggests that the transfer event took place after the radiation of the genus Trichomonas. Some genes have undergone pseudogenization and degradation, indicating that they may not be retained in the future. Functional annotations reveal that genes involved in informational processes are particularly prone to degradation. Conclusions We conclude that, although the majority of eukaryote LGTs are single gene occurrences, they may be acquired in clusters of several genes that are subsequently cleansed of evolutionarily less advantageous genes. PMID:24898731

  6. Identification of new genes in a cell envelope-cell division gene cluster of Escherichia coli: cell envelope gene murG.

    Science.gov (United States)

    Salmond, G P; Lutkenhaus, J F; Donachie, W D

    1980-01-01

    We report the identification, cloning, and mapping of a new cell envelope gene, murG. This lies in a group of five genes of similar phenotype (in the order murE murF murG murC ddl) all concerned with peptidoglycan biosynthesis. This group is in a larger cluster of at least 10 genes, all of which are involved in some way with cell envelope growth. PMID:6998962

  7. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    Science.gov (United States)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  8. Ensemble attribute profile clustering: discovering and characterizing groups of genes with similar patterns of biological features

    Directory of Open Access Journals (Sweden)

    Bissell MJ

    2006-03-01

    Abstract Background Ensemble attribute profile clustering is a novel, text-based strategy for analyzing a user-defined list of genes and/or proteins. The strategy exploits annotation data present in gene-centered corpora and utilizes ideas from statistical information retrieval to discover and characterize properties shared by subsets of the list. The practical utility of this method is demonstrated by employing it in a retrospective study of two non-overlapping sets of genes defined by a published investigation as markers for normal human breast luminal epithelial cells and myoepithelial cells. Results Each genetic locus was characterized using a finite set of biological properties and represented as a vector of features indicating attributes associated with the locus (a gene attribute profile). In this study, the vector space models for a pre-defined list of genes were constructed from the Gene Ontology (GO) terms and the Conserved Domain Database (CDD) protein domain terms assigned to the loci by the gene-centered corpus LocusLink. This data set of GO- and CDD-based gene attribute profiles, vectors of binary random variables, was used to estimate multiple finite mixture models, and each ensuing model was utilized to partition the profiles into clusters. The resultant partitionings were combined using a unanimous voting scheme to produce consensus clusters, sets of profiles that co-occurred consistently in the same cluster. Attributes that were important in defining the genes assigned to a consensus cluster were identified. The clusters and their attributes were inspected to ascertain the GO and CDD terms most associated with subsets of genes and, in conjunction with external knowledge such as chromosomal location, used to gain functional insights into human breast biology. The 52 luminal epithelial cell markers and 89 myoepithelial cell markers are disjoint sets of genes. Ensemble attribute profile clustering-based analysis indicated that both lists
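    The "unanimous voting scheme" described in the record above, in which profiles form a consensus cluster only if they land in the same cluster in every partitioning, can be sketched in a few lines. This is a minimal illustration under that reading of the abstract, not the authors' implementation; the function name, the gene identifiers, and the cluster labels are all hypothetical, and how the original method treats singleton groups is not stated in the abstract.

```python
from collections import defaultdict


def consensus_clusters(partitions):
    """Group items that share a cluster in every partition.

    partitions: list of dicts, each mapping item -> cluster label.
    Two items end up together only if all partitions agree (unanimous vote).
    """
    if not partitions:
        return []
    items = sorted(partitions[0])
    groups = defaultdict(list)
    for item in items:
        # The tuple of labels across all partitions is the item's "signature";
        # identical signatures mean the items always co-occurred.
        signature = tuple(partition[item] for partition in partitions)
        groups[signature].append(item)
    return sorted(groups.values())


# Two made-up partitionings of four hypothetical genes.
partition_one = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
partition_two = {"g1": "a", "g2": "a", "g3": "a", "g4": "b"}
print(consensus_clusters([partition_one, partition_two]))
# [['g1', 'g2'], ['g3'], ['g4']]
```

    Unanimity is a strict rule: a single dissenting partition splits a pair, so consensus clusters tend to shrink as more models are added to the ensemble.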

  9. Evolution and Diversity of Biosynthetic Gene Clusters in Fusarium

    Directory of Open Access Journals (Sweden)

    Koen Hoogendoorn

    2018-06-01

    Plant pathogenic fungi in the Fusarium genus cause severe damage to crops, resulting in great financial losses and health hazards. Specialized metabolites synthesized by these fungi are known to play key roles in the infection process, and to provide survival advantages inside and outside the host. However, systematic studies of the evolution of specialized metabolite-coding potential across Fusarium have been scarce. Here, we apply a combination of bioinformatic approaches to identify biosynthetic gene clusters (BGCs) across publicly available genomes from Fusarium, to group them into annotated families and to study gain/loss events of BGC families throughout the history of the genus. Comparison with MIBiG reference BGCs allowed assignment of 29 gene cluster families (GCFs) to pathways responsible for the production of known compounds, while for 57 GCFs, the molecular products remain unknown. Comparative analysis of BGC repertoires using ancestral state reconstruction raised several new hypotheses on how BGCs contribute to Fusarium pathogenicity or host specificity, sometimes surprisingly so: for example, a gene cluster for the biosynthesis of hexadehydro-astechrome was identified in the genome of the biocontrol strain Fusarium oxysporum Fo47, while being absent in that of the tomato pathogen F. oxysporum f.sp. lycopersici. Several BGCs were also identified on supernumerary chromosomes; heterologous expression of genes for three terpene synthases encoded on the Fusarium poae supernumerary chromosome and subsequent GC/MS analysis showed that these genes are functional and encode enzymes that each are able to synthesize koraiol; this observed functional redundancy supports the hypothesis that localization of copies of BGCs on supernumerary chromosomes provides freedom for evolutionary innovations to occur, while the original function remains conserved. Altogether, this systematic overview of biosynthetic diversity in Fusarium paves the way for

  10. Genomic organization of the rat alpha 2u-globulin gene cluster.

    Science.gov (United States)

    McFadyen, D A; Addison, W; Locke, J

    1999-05-01

    The alpha 2u-globulins are a group of similar proteins, belonging to the lipocalin superfamily of proteins, that are synthesized in a subset of secretory tissues in rats. The many alpha 2u-globulin isoforms are encoded by a multigene family that exhibits extensive homology. Despite a high degree of sequence identity, individual family members show diverse expression patterns involving complex hormonal, tissue-specific, and developmental regulation. Analysis suggests that there are approximately 20 alpha 2u-globulin genes in the rat genome. We have used fluorescence in situ hybridization (FISH) to show that the alpha 2u-globulin genes are clustered at a single site on rat Chromosome (Chr) 5 (5q22-24). Southern blots of rat genomic DNA separated by pulsed field gel electrophoresis indicated that the alpha 2u-globulin genes are contained on two NruI fragments with a total size of 880 kbp. Analysis of three P1 clones containing alpha 2u-globulin genes indicated that the alpha 2u-globulin genes are tandemly arranged in a head-to-tail fashion. The organization of the alpha 2u-globulin genes in the rat as a tandem array of single genes differs from the homologous major urinary protein genes in the mouse, which are organized as tandem arrays of divergently oriented gene pairs. The structure of these gene clusters may have consequences for the proposed function, as a pheromone transporter, for the protein products encoded by these genes.

  11. Prognostic Gene Expression Profiles in Breast Cancer

    DEFF Research Database (Denmark)

    Sørensen, Kristina Pilekær

    Each year approximately 4,800 Danish women are diagnosed with breast cancer. Several clinical and pathological factors are used as prognostic and predictive markers to categorize the patients into groups of high or low risk. Around 90% of all patients are allocated to the high risk group...... clinical courses, and they may be useful as novel prognostic biomarkers in breast cancer. The aim of the present project was to predict the development of metastasis in lymph node negative breast cancer patients by RNA profiling. We collected and analyzed 82 primary breast tumors from patients who...... and the time of event. Previous findings have shown that high expression of the lncRNA HOTAIR is correlated with poor survival in breast cancer. We validated this finding by demonstrating that high HOTAIR expression in our primary tumors was significantly associated with worse prognosis independent...

  12. APOBEC3A is an oral cancer prognostic biomarker in Taiwanese carriers of an APOBEC deletion polymorphism.

    Science.gov (United States)

    Chen, Ting-Wen; Lee, Chi-Ching; Liu, Hsuan; Wu, Chi-Sheng; Pickering, Curtis R; Huang, Po-Jung; Wang, Jing; Chang, Ian Yi-Feng; Yeh, Yuan-Ming; Chen, Chih-De; Li, Hsin-Pai; Luo, Ji-Dung; Tan, Bertrand Chin-Ming; Chan, Timothy En Haw; Hsueh, Chuen; Chu, Lichieh Julie; Chen, Yi-Ting; Zhang, Bing; Yang, Chia-Yu; Wu, Chih-Ching; Hsu, Chia-Wei; See, Lai-Chu; Tang, Petrus; Yu, Jau-Song; Liao, Wei-Chao; Chiang, Wei-Fan; Rodriguez, Henry; Myers, Jeffrey N; Chang, Kai-Ping; Chang, Yu-Sun

    2017-09-06

    Oral squamous cell carcinoma is a prominent cancer worldwide, particularly in Taiwan. By integrating omics analyses in 50 matched samples, we uncover in Taiwanese patients a predominant mutation signature associated with cytidine deaminase APOBEC, which correlates with the upregulation of APOBEC3A expression in the APOBEC3 gene cluster at 22q13. APOBEC3A expression is significantly higher in tumors carrying APOBEC3B-deletion allele(s). High-level APOBEC3A expression is associated with better overall survival, especially among patients carrying APOBEC3B-deletion alleles, as examined in a second cohort (n = 188; p = 0.004). The frequency of APOBEC3B-deletion alleles is ~50% in 143 genotyped oral squamous cell carcinoma-Taiwan samples (27 A3B-/- : 89 A3B+/- : 27 A3B+/+), compared to the 5.8% found in 314 OSCC-TCGA samples. We thus report a frequent APOBEC mutational profile, which relates to an APOBEC3B-deletion germline polymorphism in Taiwanese oral squamous cell carcinoma that impacts expression of APOBEC3A, and is shown to be of clinical prognostic relevance. Our finding might be recapitulated by genomic studies in other cancer types. Oral squamous cell carcinoma is a prevalent malignancy in Taiwan. Here, the authors show that OSCC in Taiwanese patients shows a frequent deletion polymorphism in the cytidine deaminase gene cluster APOBEC3, resulting in increased expression of A3A, which is shown to be of clinical prognostic relevance.
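    The "~50%" deletion-allele frequency quoted in the record above follows directly from the genotype counts it reports (27 homozygous deletion, 89 heterozygous, 27 homozygous wild type among 143 samples). The short sketch below reproduces that arithmetic; the function and parameter names are hypothetical and chosen only for illustration.

```python
def deletion_allele_frequency(hom_del, het, hom_wt):
    """Frequency of the deletion allele among all sampled chromosomes.

    Hypothetical helper: each individual carries two alleles, so homozygous
    deletion carriers contribute two deletion alleles and heterozygotes one.
    """
    n_individuals = hom_del + het + hom_wt
    if n_individuals == 0:
        raise ValueError("no genotyped samples")
    deletion_alleles = 2 * hom_del + het
    return deletion_alleles / (2 * n_individuals)


# Genotype counts as quoted in the abstract (143 genotyped samples).
freq = deletion_allele_frequency(hom_del=27, het=89, hom_wt=27)
print(f"{freq:.3f}")  # 0.500
```

    With these counts, 143 of 286 alleles carry the deletion, which is exactly one half and matches the stated ~50%.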

  13. Gene Expression of the EGF System-a Prognostic Model in Non-Small Cell Lung Cancer Patients Without Activating EGFR Mutations

    DEFF Research Database (Denmark)

    Sandfeld-Paulsen, Birgitte; Folkersen, Birgitte Holst; Rasmussen, Torben Riis

    2016-01-01

    OBJECTIVES: Contradictory results have been reported for the expression of the epidermal growth factor receptor (EGFR) as a prognostic marker in non-small cell lung cancer (NSCLC). The complexity of the EGF system, with four interacting receptors and more than a dozen activating ligands, is a likely explanation. The aim of this study is to demonstrate that the combined network of receptors and ligands from the EGF system is a prognostic marker. MATERIAL AND METHODS: Gene expression of the receptors EGFR, HER2, HER3, HER4, and the ligands AREG, HB-EGF, EPI, TGF-α, and EGF was measured... ...17-6.47], P model that takes the complexity of the EGF system into account and shows that this model is a strong prognostic marker in NSCLC patients.

  14. Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in Xanthomonas.

    Directory of Open Access Journals (Sweden)

    Hong Lu

    Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small

  15. Prognostic value of HER2 gene amplification detected by chromogenic in situ hybridization (CISH) in metastatic breast cancer.

    Science.gov (United States)

    Todorović-Raković, Natasa; Jovanović, Danica; Nesković-Konstantinović, Zora; Nikolić-Vukosavljević, Dragica

    2007-06-01

    After many years of research, the clinical value of HER2 (human epidermal growth factor receptor 2) remains unclear. Perhaps the main reason is the variability of testing methods, which produce conflicting results. Few studies address the prognostic value of CISH, especially in metastatic breast cancer (MBC), where risk evaluation is based on different parameters than for primary breast cancer. The aim of this study was to compare the prognostic relevance of HER2 status in MBC tested by two different methods, i.e. immunohistochemistry (IHC) and chromogenic in situ hybridization (CISH). HER2 status of the same group of 107 MBC patients was determined by IHC (protein overexpression) and by CISH (gene amplification). HER2 results obtained by IHC and CISH showed significant correlation, despite some discrepancies. Despite this correlation, the two methods differed in prognostic value during the course of metastatic disease. There was a significant difference in progression-free interval (PFI) between HER2 non-amplified and HER2 amplified cases determined by CISH in the postmenopausal and node-positive subgroups, but no significant difference for IHC-stratified MBC patients. CISH seems to be an accurate method, and more informative than IHC, regarding the prognostic value of HER2 in metastatic breast cancer.

  16. MeSH key terms for validation and annotation of gene expression clusters

    Energy Technology Data Exchange (ETDEWEB)

    Rechtsteiner, A. (Andreas); Rocha, L. M. (Luis Mateus)

    2004-01-01

    Integration of different sources of information is a great challenge for the analysis of gene expression data, and for the field of Functional Genomics in general. As the availability of numerical data from high-throughput methods increases, so does the need for technologies that assist in the validation and evaluation of the biological significance of results extracted from these data. In mRNA assaying with microarrays, for example, numerical analysis often attempts to identify clusters of co-expressed genes. The important task of finding the biological significance of the results and validating them has so far mostly fallen to the biological expert, who had to perform this task manually. One of the most promising avenues to develop automated and integrative technology for such tasks lies in the application of modern Information Retrieval (IR) and Knowledge Management (KM) algorithms to databases with biomedical publications and data. Examples of databases available for the field are bibliographic databases containing scientific publications (e.g. MEDLINE/PUBMED), databases containing sequence data (e.g. GenBank) and databases of semantic annotations (e.g. the Gene Ontology Consortium and Medical Subject Headings (MeSH)). We present here an approach that uses the MeSH terms and their concept hierarchies to validate and obtain functional information for gene expression clusters. The controlled and hierarchical MeSH vocabulary is used by the National Library of Medicine (NLM) to index all the articles cited in MEDLINE. Such indexing with a controlled vocabulary eliminates some of the ambiguity due to polysemy (terms that have multiple meanings) and synonymy (multiple terms that have similar meaning) that would be encountered if terms were extracted directly from the articles, owing to differing article contexts or author preferences and background. Further, the hierarchical organization of the MeSH terms can illustrate the conceptual/functional relationships of genes

  17. A remarkably stable TipE gene cluster: evolution of insect Para sodium channel auxiliary subunits

    Directory of Open Access Journals (Sweden)

    Li Jia

    2011-11-01

    Background: First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optional expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results: We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage

  18. Clustering Gene Expression Time Series with Coregionalization: Speed propagation of ALS

    OpenAIRE

    Rahman, Muhammad Arifur; Heath, Paul R.; Lawrence, Neil D.

    2018-01-01

    Clustering of gene expression time series gives insight into which genes may be coregulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different model conditions or genetic background. Amyotrophic lateral sclerosis (ALS), an irreversible diverse neurodegenerative disorder showed consistent phenotypic differences and the disease progression is heterogeneous with significant variability. Thi...

  19. Structure-related clustering of gene expression fingerprints of THP-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

    Science.gov (United States)

    Wan, B; Yarbrough, J W; Schultz, T W

    2008-01-01

    This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours, and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes, such as cell cycle, metabolism, and protein binding, and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.

  20. Measurement of circulating transcripts and gene cluster analysis predicts and defines therapeutic efficacy of peptide receptor radionuclide therapy (PRRT) in neuroendocrine tumors

    International Nuclear Information System (INIS)

    Bodei, L.; Kidd, M.; Modlin, I.M.; Severi, S.; Nicolini, S.; Paganelli, G.; Drozdov, I.; Kwekkeboom, D.J.; Krenning, E.P.; Baum, R.P.

    2016-01-01

    Peptide receptor radionuclide therapy (PRRT) is an effective method for treating neuroendocrine tumors (NETs). It is limited, however, in the prediction of individual tumor response and the precise and early identification of changes in tumor size. Currently, response prediction is based on somatostatin receptor expression and efficacy by morphological imaging and/or chromogranin A (CgA) measurement. The aim of this study was to assess the accuracy of circulating NET transcripts as a measure of PRRT efficacy, and moreover to identify prognostic gene clusters in pretreatment blood that could be interpolated with relevant clinical features in order to define a biological index for the tumor and a predictive quotient for PRRT efficacy. NET patients (n = 54), M:F 37:17, median age 66, bronchial: n = 13, GEP-NET: n = 35, CUP: n = 6 were treated with ¹⁷⁷Lu-based PRRT (cumulative activity: 6.5-27.8 GBq, median 18.5). At baseline: 47/54 low-grade (G1/G2; bronchial typical/atypical), 31/49 ¹⁸FDG positive and 39/54 progressive. Disease status was assessed by RECIST1.1. Transcripts were measured by real-time quantitative reverse transcription PCR (qRT-PCR) and multianalyte algorithmic analysis (NETest); CgA by enzyme-linked immunosorbent assay (ELISA). Gene cluster (GC) derivations: regulatory network, protein:protein interactome analyses. Statistical analyses: chi-square, non-parametric measurements, multiple regression, receiver operating characteristic and Kaplan-Meier survival. The disease control rate was 72 %. Median PFS was not achieved (follow-up: 1-33 months, median: 16). Only grading was associated with response (p < 0.01). At baseline, 94 % of patients were NETest-positive, while CgA was elevated in 59 %. NETest accurately (89 %, χ² = 27.4; p = 1.2 × 10⁻⁷) correlated with treatment response, while CgA was 24 % accurate. Gene cluster expression (growth-factor signalome and metabolome) had an AUC of 0.74 ± 0.08 (z-statistic = 2.92, p < 0.004) for predicting

  1. Measurement of circulating transcripts and gene cluster analysis predicts and defines therapeutic efficacy of peptide receptor radionuclide therapy (PRRT) in neuroendocrine tumors

    Energy Technology Data Exchange (ETDEWEB)

    Bodei, L. [European Institute of Oncology, Division of Nuclear Medicine, Milan (Italy); LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Kidd, M. [Wren Laboratories, Branford, CT (United States); Modlin, I.M. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Yale School of Medicine, New Haven, CT (United States); Severi, S.; Nicolini, S.; Paganelli, G. [Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS, Nuclear Medicine and Radiometabolic Units, Meldola (Italy); Drozdov, I. [Bering Limited, London (United Kingdom); Kwekkeboom, D.J.; Krenning, E.P. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Erasmus Medical Center, Nuclear Medicine Department, Rotterdam (Netherlands); Baum, R.P. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Zentralklinik Bad Berka, Theranostics Center for Molecular Radiotherapy and Imaging, Bad Berka (Germany)

    2016-05-15

    Peptide receptor radionuclide therapy (PRRT) is an effective method for treating neuroendocrine tumors (NETs). It is limited, however, in the prediction of individual tumor response and the precise and early identification of changes in tumor size. Currently, response prediction is based on somatostatin receptor expression and efficacy by morphological imaging and/or chromogranin A (CgA) measurement. The aim of this study was to assess the accuracy of circulating NET transcripts as a measure of PRRT efficacy, and moreover to identify prognostic gene clusters in pretreatment blood that could be interpolated with relevant clinical features in order to define a biological index for the tumor and a predictive quotient for PRRT efficacy. NET patients (n = 54), M:F 37:17, median age 66, bronchial: n = 13, GEP-NET: n = 35, CUP: n = 6 were treated with ¹⁷⁷Lu-based PRRT (cumulative activity: 6.5-27.8 GBq, median 18.5). At baseline: 47/54 low-grade (G1/G2; bronchial typical/atypical), 31/49 ¹⁸FDG positive and 39/54 progressive. Disease status was assessed by RECIST1.1. Transcripts were measured by real-time quantitative reverse transcription PCR (qRT-PCR) and multianalyte algorithmic analysis (NETest); CgA by enzyme-linked immunosorbent assay (ELISA). Gene cluster (GC) derivations: regulatory network, protein:protein interactome analyses. Statistical analyses: chi-square, non-parametric measurements, multiple regression, receiver operating characteristic and Kaplan-Meier survival. The disease control rate was 72 %. Median PFS was not achieved (follow-up: 1-33 months, median: 16). Only grading was associated with response (p < 0.01). At baseline, 94 % of patients were NETest-positive, while CgA was elevated in 59 %. NETest accurately (89 %, χ² = 27.4; p = 1.2 × 10⁻⁷) correlated with treatment response, while CgA was 24 % accurate. Gene cluster expression (growth-factor signalome and metabolome) had an AUC of 0.74 ± 0.08 (z-statistic = 2.92, p < 0

  2. Patterns of genetic diversity and differentiation in resistance gene clusters of two hybridizing European Populus species

    OpenAIRE

    Casey, Céline; Stölting, Kai N.; Barbará, Thelma; González-Martínez, Santiago C.; Lexer, Christian

    2015-01-01

    Resistance genes (R-genes) are essential for long-lived organisms such as forest trees, which are exposed to diverse herbivores and pathogens. In short-lived model species, R-genes have been shown to be involved in species isolation. Here, we studied more than 400 trees from two natural hybrid zones of the European Populus species Populus alba and Populus tremula for microsatellite markers located in three R-gene clusters, including one cluster situated in the incipient sex chromosome region....

  3. PROGNOSTIC VALUE OF BRAIN AND ACUTE LEUKEMIA CYTOPLASMIC GENE EXPRESSION IN EGYPTIAN CHILDREN WITH ACUTE MYELOID LEUKEMIA

    Directory of Open Access Journals (Sweden)

    Adel Abd Elhaleim Hagag

    2015-04-01

    Background: Acute myeloid leukemia (AML) accounts for 25%-35% of acute leukemia in children. The BAALC (Brain and Acute Leukemia, Cytoplasmic) gene is a recently identified gene on chromosome 8q22.3 that has prognostic significance in AML. The aim of this work was to study the impact of BAALC gene expression on the prognosis of AML in Egyptian children. Patients and methods: This study was conducted on 40 patients with newly diagnosed AML who underwent the following: full history taking, clinical examination, laboratory investigations (complete blood count, LDH, bone marrow aspiration, cytochemistry and immunophenotyping), and assessment of the BAALC gene by real-time PCR in bone marrow aspirate mononuclear cells before the start of chemotherapy. Results: BAALC gene expression was positive in 24 cases (60%) and negative in 16 cases (40%). Patients with positive BAALC gene expression included 10 who achieved complete remission, 8 who died and 6 who relapsed, while patients with negative expression included 12 who achieved complete remission, 1 who relapsed and 3 who died. There was a significant association between BAALC gene expression and the FAB classification of AML patients, as positive BAALC expression was seen predominantly in FAB subtypes M1 and M2, whereas negative BAALC expression was found more often in M3 and M4 (8 cases with M1, 12 with M2, 1 with M3 and 3 with M4 in the BAALC-positive group versus 2 cases with M1, 3 with M2, 4 with M3 and 7 with M4 in the BAALC-negative group). Regarding age, sex, splenomegaly, lymphadenopathy, pallor, purpura, platelet count, WBC count, and percentage of blast cells in BM, the present study showed no significant association with BAALC. Conclusion: BAALC expression is an important prognostic factor in AML
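
    The outcome counts reported in this abstract can be tabulated to check that they sum to the stated group sizes and to compare complete-remission rates between the two groups. This is a sketch of the arithmetic only, using the counts as quoted; the variable and function names are illustrative, and no significance test from the study is reproduced here.

```python
# Outcome counts per BAALC expression group, as quoted in the abstract.
outcomes = {
    "BAALC-positive": {"complete remission": 10, "died": 8, "relapsed": 6},
    "BAALC-negative": {"complete remission": 12, "died": 3, "relapsed": 1},
}


def group_size(group):
    """Total number of patients in one expression group."""
    return sum(group.values())


def remission_rate(group):
    """Fraction of the group that achieved complete remission."""
    return group["complete remission"] / group_size(group)


for name, group in outcomes.items():
    print(f"{name}: n = {group_size(group)}, remission = {remission_rate(group):.1%}")
# BAALC-positive: n = 24, remission = 41.7%
# BAALC-negative: n = 16, remission = 75.0%
```

    The group sizes (24 and 16) match the 60% and 40% split of the 40 patients reported in the abstract.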

  4. Mouse Nkrp1-Clr gene cluster sequence and expression analyses reveal conservation of tissue-specific MHC-independent immunosurveillance.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    The Nkrp1 (Klrb1)-Clr (Clec2) genes encode a receptor-ligand system utilized by NK cells as an MHC-independent immunosurveillance strategy for innate immune responses. The related Ly49 family of MHC-I receptors displays extreme allelic polymorphism and haplotype plasticity. In contrast, previous BAC-mapping and aCGH studies in the mouse suggest the neighboring and related Nkrp1-Clr cluster is evolutionarily stable. To definitively compare the relative evolutionary rate of Nkrp1-Clr vs. Ly49 gene clusters, the Nkrp1-Clr gene clusters from two Ly49 haplotype-disparate inbred mouse strains, BALB/c and 129S6, were sequenced. Both Nkrp1-Clr gene cluster sequences are highly similar to the C57BL/6 reference sequence, displaying the same gene numbers and order, complete pseudogenes, and gene fragments. The Nkrp1-Clr clusters contain a strikingly dissimilar proportion of repetitive elements compared to the Ly49 clusters, suggesting that certain elements may be partly responsible for the highly disparate Ly49 vs. Nkrp1 evolutionary rate. Focused allelic polymorphisms were found within the Nkrp1b/d (Klrb1b), Nkrp1c (Klrb1c), and Clr-c (Clec2f) genes, suggestive of possible immune selection. Cell-type specific transcription of Nkrp1-Clr genes in a large panel of tissues/organs was determined. Clr-b (Clec2d) and Clr-g (Clec2i) showed wide expression, while other Clr genes showed more tissue-specific expression patterns. In situ hybridization revealed specific expression of various members of the Clr family in leukocytes/hematopoietic cells of immune organs, various tissue-restricted epithelial cells (including intestinal, kidney tubular, lung, and corneal progenitor epithelial cells), as well as myocytes. In summary, the Nkrp1-Clr gene cluster appears to evolve more slowly relative to the related Ly49 cluster, and likely regulates innate immunosurveillance in a tissue-specific manner.

  5. Comprehensive annotation of secondary metabolite biosynthetic genes and gene clusters of Aspergillus nidulans, A. fumigatus, A. niger and A. oryzae

    Science.gov (United States)

    2013-01-01

    Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571

  6. Persistence of TEL-AML1 fusion gene as minimal residual disease has no additive prognostic value in CD 10 positive B-acute lymphoblastic leukemia: a FISH study

    Directory of Open Access Journals (Sweden)

    Ezz-Eldin Azza M

    2008-10-01

    Objectives: We analyzed t(12;21)(p13:q22) to evaluate the frequency and prognostic significance of the TEL-AML1 fusion gene in patients with childhood CD 10 positive B-ALL by fluorescence in situ hybridization (FISH). We also monitored the prognostic value of this gene as minimal residual disease (MRD). Methods: All bone marrow samples of eighty patients diagnosed with CD 10 positive B-ALL at the South Egypt Cancer Institute were evaluated by FISH for t(12;21) in newly diagnosed cases and, after morphological complete remission, as MRD. We determined the prognostic significance of TEL-AML1 fusion as represented by disease course and survival. Results: The TEL-AML1 fusion gene was positive in 37.5% of newly diagnosed patients. There was a significant correlation between the TEL-AML1 fusion gene, both at diagnosis (r = 0.5, P = 0.003) and as MRD (r = 0.4, P = 0.01), and a favorable course. In Kaplan-Meier analysis, the presence of TEL-AML1 fusion at diagnosis was associated with a better probability of overall survival (OS); mean survival time was 47 ± 1 months, in contrast to 28 ± 5 months in its absence (P = 0.006). The persistence of TEL-AML1 fusion as MRD was not significantly associated with a better probability of OS; the mean survival time was 42 ± 2 months in the presence of MRD and 40 ± 1 months in its absence. Thus, persistence of TEL-AML1 fusion as MRD had no additive prognostic value over its measurement at diagnosis in terms of predicting the probability of OS. Conclusion: For most patients, the presence of the TEL-AML1 fusion gene at diagnosis suggests a favorable prognosis. The present study suggests that persistence of TEL-AML1 fusion as MRD has no additive prognostic value.

  7. Identifying prognostic features by bottom-up approach and correlating to drug repositioning.

    Directory of Open Access Journals (Sweden)

    Wei Li

    Traditionally, a top-down method has been used to identify prognostic features in cancer research: genes differentially expressed in cancer versus normal tissue are identified and then tested for survival prediction power. The problem is that prognostic features identified from one set of patient samples can rarely be transferred to other datasets. We apply a bottom-up approach in this study: survival-correlated or clinical-stage-correlated genes are selected first and then prioritized by their network topology, so that a small set of features can be used as a prognostic signature. Gene expression profiles of a cohort of 221 hepatocellular carcinoma (HCC) patients were used as a training set; the 'bottom-up' approach was applied to discover gene-expression signatures associated with survival in both tumor and adjacent non-tumor tissues and was compared with the 'top-down' approach. The results were validated in a second cohort of 82 patients, which was used as a testing set. Two sets of gene signatures, separately identified in tumor and adjacent non-tumor tissues by the bottom-up approach, were developed in the training cohort. These two signatures were associated with overall survival times of HCC patients, the robustness of each was validated in the testing set, and the predictive performance of each was better than that of previously reported gene expression signatures. Moreover, genes in these two prognostic signatures gave some indications for drug repositioning in HCC: some approved drugs targeting these markers have alternative indications in hepatocellular carcinoma. Using the bottom-up approach, we have developed two prognostic gene signatures with a limited number of genes that are associated with overall survival times of patients with HCC. Furthermore, prognostic markers in these two signatures have the potential to be therapeutic targets.

  8. Fine Mapping of Two Wheat Powdery Mildew Resistance Genes Located at the Pm1 Cluster

    Directory of Open Access Journals (Sweden)

    Junchao Liang

    2016-07-01

    Powdery mildew caused by (DC. f. sp. ( is a globally devastating foliar disease of wheat ( L.. More than a dozen genes against this disease, identified from wheat germplasms of different ploidy levels, have been mapped to the region surrounding the locus on the long arm of chromosome 7A, which forms a resistance (-gene cluster. and from einkorn wheat ( L. were two of the genes belonging to this cluster. This study was initiated to fine map these two genes toward map-based cloning. Comparative genomics study showed that macrocolinearity exists between L. chromosome 1 (Bd1) and the – region, which allowed us to develop markers based on the wheat sequences orthologous to genes contained in the Bd1 region. With these and other newly developed and published markers, high-resolution maps were constructed for both and using large F populations. Moreover, a physical map of was constructed through chromosome walking with bacterial artificial chromosome (BAC) clones and comparative mapping. Eventually, and were restricted to a 0.12- and 0.86-cM interval, respectively. Based on the closely linked common markers, , , and (another powdery mildew resistance gene in the cluster) were not allelic to one another. Severe recombination suppression and disruption of synteny were noted in the region encompassing . These results provided useful information for map-based cloning of the genes in the cluster and interpretation of their evolution.

  9. Gene expression data clustering and its application in differential analysis of leukemia

    Directory of Open Access Journals (Sweden)

    M. Vahedi

    2008-02-01

    Introduction: The DNA microarray technique is one of the most important areas of bioinformatics; by making it possible to monitor thousands of expressed genes, it has recently led to the creation of giant databases of gene expression data. Statistical analysis of such databases includes normalization, clustering, classification, etc. Materials and Methods: Golub et al. (1999) collected leukemia data using the oligonucleotide method. The data are available on the internet. In this paper, we analyzed these gene expression data. They were clustered by several methods, including multi-dimensional scaling and hierarchical and non-hierarchical clustering. The data set included 20 Acute Lymphoblastic Leukemia (ALL) patients and 14 Acute Myeloid Leukemia (AML) patients. The results of two clustering methods were compared with the real grouping (ALL & AML). R software was used for data analysis. Results: Specificity and sensitivity of divisive hierarchical clustering in diagnosing ALL patients were 75% and 92%, respectively. Specificity and sensitivity of partitioning around medoids in diagnosing ALL patients were 90% and 93%, respectively. These results showed good performance of both clustering methods. Notably, according to the clustering results, one of the samples was placed in the ALL group although it was in the AML group by clinical test. Conclusion: Given the concordance of the results with the real grouping of the data, these methods can be used in cases where accurate information on the real grouping of the data is not available. Moreover, clustering results might distinguish subgroups of data that would need to be checked for concordance with clinical outcomes, laboratory results and so on.
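
    Sensitivity and specificity of a two-cluster solution, as reported above, are obtained by comparing each sample's cluster assignment with its known diagnosis. The sketch below shows that comparison in Python rather than the R used in the study. The ten labels are hypothetical toy data, not the Golub samples, and the function name is an assumption made for illustration.

```python
def sensitivity_specificity(true_labels, cluster_labels, positive="ALL"):
    """Score cluster assignments against known diagnoses.

    Sensitivity is the fraction of truly positive samples placed in the
    positive cluster; specificity is the fraction of truly negative
    samples placed outside it.
    """
    tp = fn = tn = fp = 0
    for truth, assigned in zip(true_labels, cluster_labels):
        if truth == positive:
            if assigned == positive:
                tp += 1
            else:
                fn += 1
        else:
            if assigned == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 5 ALL and 5 AML samples, one misassigned in each group.
true_labels = ["ALL"] * 5 + ["AML"] * 5
cluster_labels = ["ALL", "ALL", "ALL", "ALL", "AML",
                  "AML", "AML", "AML", "AML", "ALL"]

sens, spec = sensitivity_specificity(true_labels, cluster_labels)
print(sens, spec)  # 0.8 0.8
```

    In practice each cluster first has to be mapped to a diagnosis (for example by majority label) before this comparison is made; the toy data assume that mapping has already been done.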

  10. A novel polyketide biosynthesis gene cluster is involved in fruiting body morphogenesis in the filamentous fungi Sordaria macrospora and Neurospora crassa.

    Science.gov (United States)

    Nowrousian, Minou

    2009-04-01

    During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.

  11. Predicting stabilizing treatment outcomes for complex posttraumatic stress disorder and dissociative identity disorder: an expertise-based prognostic model.

    Science.gov (United States)

    Baars, Erik W; van der Hart, Onno; Nijenhuis, Ellert R S; Chu, James A; Glas, Gerrit; Draijer, Nel

    2011-01-01

    The purpose of this study was to develop an expertise-based prognostic model for the treatment of complex posttraumatic stress disorder (PTSD) and dissociative identity disorder (DID). We developed a survey in 2 rounds: In the first round we surveyed 42 experienced therapists (22 DID and 20 complex PTSD therapists), and in the second round we surveyed a subset of 22 of the 42 therapists (13 DID and 9 complex PTSD therapists). First, we drew on therapists' knowledge of prognostic factors for stabilization-oriented treatment of complex PTSD and DID. Second, therapists prioritized a list of prognostic factors by estimating the size of each variable's prognostic effect; we clustered these factors according to content and named the clusters. Next, concept mapping methodology and statistical analyses (including principal components analyses) were used to transform individual judgments into weighted group judgments for clusters of items. A prognostic model, based on consensually determined estimates of effect sizes, of 8 clusters containing 51 factors for both complex PTSD and DID was formed. It includes the clusters lack of motivation, lack of healthy relationships, lack of healthy therapeutic relationships, lack of other internal and external resources, serious Axis I comorbidity, serious Axis II comorbidity, poor attachment, and self-destruction. In addition, a set of 5 DID-specific items was constructed. The model is supportive of the current phase-oriented treatment model, emphasizing the strengthening of the therapeutic relationship and the patient's resources in the initial stabilization phase. Further research is needed to test the model's statistical and clinical validity.

  12. The ergot alkaloid gene cluster: Functional analyses and evolutionary aspects

    Czech Academy of Sciences Publication Activity Database

    Lorenz, N.; Haarmann, T.; Pažoutová, Sylvie; Jung, M.; Tudzynski, P.

    2009-01-01

    Vol. 70, No. 15-16 (2009), pp. 1822-1832 ISSN 0031-9422 Institutional research plan: CEZ:AV0Z50200510 Keywords: Claviceps purpurea * Ergot fungus * Ergot alkaloid gene cluster Subject RIV: EE - Microbiology, Virology Impact factor: 3.104, year: 2009

  13. [Prognostic factors of early breast cancer].

    Science.gov (United States)

    Almagro, Elena; González, Cynthia S; Espinosa, Enrique

    2016-02-19

    Decisions about the administration of adjuvant therapy for early breast cancer depend on the evaluation of prognostic factors. Lymph node status, tumor size and grade of differentiation are classical variables in this regard, and can be complemented by hormonal receptor status and HER2 expression. These factors can be combined into prognostic indexes to better estimate the risk of relapse or death. Other factors are less important. Gene profiles have emerged in recent years to identify low-risk patients who can forgo adjuvant chemotherapy. A number of profiles are available and can be used in selected cases. In the future, gene profiling will be used to select patients for treatment with new targeted therapies. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.

  14. Leveraging long sequencing reads to investigate R-gene clustering and variation in sugar beet

    Science.gov (United States)

    Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....

  15. Sequencing and transcriptional analysis of the Streptococcus thermophilus histamine biosynthesis gene cluster: factors that affect differential hdcA expression

    DEFF Research Database (Denmark)

    Calles-Enríquez, Marina; Hjort, Benjamin Benn; Andersen, Pia Skov

    2010-01-01

    to produce histamine. The hdc clusters of S. thermophilus CHCC1524 and CHCC6483 were sequenced, and the factors that affect histamine biosynthesis and histidine-decarboxylating gene (hdcA) expression were studied. The hdc cluster began with the hdcA gene, was followed by a transporter (hdcP), and ended...... with the hdcB gene, which is of unknown function. The three genes were orientated in the same direction. The genetic organization of the hdc cluster showed a unique organization among the lactic acid bacterial group and resembled those of Staphylococcus and Clostridium species, thus indicating possible...... acquisition through a horizontal transfer mechanism. Transcriptional analysis of the hdc cluster revealed the existence of a polycistronic mRNA covering the three genes. The histidine-decarboxylating gene (hdcA) of S. thermophilus demonstrated maximum expression during the stationary growth phase, with high...

  16. A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq.

    Science.gov (United States)

    Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling

    2015-03-01

    Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by the first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
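    The general idea of clustering count profiles with a Poisson mixture fitted by Expectation-Maximization can be sketched as below. This is a deliberately simplified illustration, not the authors' model: it treats time points as independent Poisson variables, so it omits the first-order autoregressive dependence and the model-selection step described in the abstract. The function name, the initialisation rule and the toy counts are all assumptions.

```python
import math


def poisson_mixture_em(counts, k, n_iter=50, eps=1e-9):
    """Cluster per-gene count profiles with a k-component Poisson mixture.

    counts: list of equal-length lists of read counts (one list per gene).
    Returns (weights, rates, labels).
    """
    n, t = len(counts), len(counts[0])
    # Assumed initialisation: seed each cluster from genes at evenly spaced
    # positions in the input (add 1 so that no initial rate is zero).
    seeds = [j * n // k for j in range(k)]
    rates = [[counts[g][i] + 1.0 for i in range(t)] for g in seeds]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene,
        # computed in log space and normalised with the log-sum-exp trick.
        resp = []
        for x in counts:
            logp = [
                math.log(weights[j])
                + sum(
                    x[i] * math.log(rates[j][i]) - rates[j][i] - math.lgamma(x[i] + 1)
                    for i in range(t)
                )
                for j in range(k)
            ]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: update mixing weights and per-time-point Poisson rates.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = max(nj / n, eps)
            rates[j] = [
                max(sum(r[j] * x[i] for r, x in zip(resp, counts)) / max(nj, eps), eps)
                for i in range(t)
            ]
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, rates, labels


# Toy data (made up): three genes rising over time, three genes falling.
toy_counts = [[1, 5, 20], [2, 6, 18], [0, 4, 22], [20, 5, 1], [19, 6, 2], [21, 4, 0]]
weights, rates, labels = poisson_mixture_em(toy_counts, k=2)
print(labels)
```

    Each fitted rate vector describes the expected count of a cluster at each time point, which is the simplest counterpart of the time-dependent Poisson parameters the abstract mentions.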

  17. Strategies to regulate transcription factor-mediated gene positioning and interchromosomal clustering at the nuclear periphery.

    Science.gov (United States)

    Randise-Hinchliff, Carlo; Coukos, Robert; Sood, Varun; Sumner, Michael Chas; Zdraljevic, Stefan; Meldi Sholl, Lauren; Garvey Brickner, Donna; Ahmed, Sara; Watchmaker, Lauren; Brickner, Jason H

    2016-03-14

    In budding yeast, targeting of active genes to the nuclear pore complex (NPC) and interchromosomal clustering is mediated by transcription factor (TF) binding sites in the gene promoters. For example, the binding sites for the TFs Put3, Ste12, and Gcn4 are necessary and sufficient to promote positioning at the nuclear periphery and interchromosomal clustering. However, in all three cases, gene positioning and interchromosomal clustering are regulated. Under uninducing conditions, local recruitment of the Rpd3(L) histone deacetylase by transcriptional repressors blocks Put3 DNA binding. This is a general function of yeast repressors: 16 of 21 repressors blocked Put3-mediated subnuclear positioning; 11 of these required Rpd3. In contrast, Ste12-mediated gene positioning is regulated independently of DNA binding by mitogen-activated protein kinase phosphorylation of the Dig2 inhibitor, and Gcn4-dependent targeting is up-regulated by increasing Gcn4 protein levels. These different regulatory strategies provide either qualitative switch-like control or quantitative control of gene positioning over different time scales. © 2016 Randise-Hinchliff et al.

  18. Motif-independent prediction of a secondary metabolism gene cluster using comparative genomics: application to sequenced genomes of Aspergillus and ten other filamentous fungal species.

    Science.gov (United States)

    Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki

    2014-08-01

    Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  19. Do two machine-learning based prognostic signatures for breast cancer capture the same biological processes?

    Science.gov (United States)

    Drier, Yotam; Domany, Eytan

    2011-03-14

    The fact that there is very little if any overlap between the genes of different prognostic signatures for early-discovery breast cancer is well documented. The reasons for this apparent discrepancy have been explained by the limits of simple machine-learning identification and ranking techniques, and the biological relevance and meaning of the prognostic gene lists were questioned. Subsequently, proponents of the prognostic gene lists claimed that different lists do capture similar underlying biological processes and pathways. The present study places under scrutiny the validity of this claim, for two important gene lists that are at the focus of current large-scale validation efforts. We performed careful enrichment analysis, controlling the effects of multiple testing in a manner which takes into account the nested dependent structure of gene ontologies. In contradiction to several previous publications, we find that the only biological process or pathway for which statistically significant concordance can be claimed is cell proliferation, a process whose relevance and prognostic value was well known long before gene expression profiling. We found that the claims reported by others, of wider concordance between the biological processes captured by the two prognostic signatures studied, either lacked statistical rigor or were in fact based on addressing some other question.
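    As a rough illustration of what an enrichment analysis with multiple-testing control involves, the sketch below computes a hypergeometric upper-tail p-value for the overlap between a gene list and one annotation term, then applies a Bonferroni correction. Bonferroni is used here only because it is simple; it is not the authors' procedure, which accounts for the nested structure of gene ontologies. Function names and the toy numbers are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(population, term_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from a population in which term_size genes carry the annotation."""
    total = comb(population, list_size)
    upper = min(term_size, list_size)
    favourable = sum(
        comb(term_size, i) * comb(population - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy numbers (made up): 20 genes in total, 5 annotated to a term,
# a 4-gene signature of which 3 carry the annotation.
p_term = hypergeom_enrichment_p(population=20, term_size=5, list_size=4, overlap=3)
adjusted = bonferroni([p_term, 0.5])
print(p_term, adjusted)
```

    With many overlapping terms, a flat correction of this kind ignores dependence between tests, which is the issue the abstract says the authors address.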

  20. Do two machine-learning based prognostic signatures for breast cancer capture the same biological processes?

    Directory of Open Access Journals (Sweden)

    Yotam Drier

    2011-03-01

    The fact that there is very little if any overlap between the genes of different prognostic signatures for early-discovery breast cancer is well documented. The reasons for this apparent discrepancy have been explained by the limits of simple machine-learning identification and ranking techniques, and the biological relevance and meaning of the prognostic gene lists were questioned. Subsequently, proponents of the prognostic gene lists claimed that different lists do capture similar underlying biological processes and pathways. The present study places under scrutiny the validity of this claim, for two important gene lists that are at the focus of current large-scale validation efforts. We performed careful enrichment analysis, controlling the effects of multiple testing in a manner which takes into account the nested dependent structure of gene ontologies. In contradiction to several previous publications, we find that the only biological process or pathway for which statistically significant concordance can be claimed is cell proliferation, a process whose relevance and prognostic value was well known long before gene expression profiling. We found that the claims reported by others, of wider concordance between the biological processes captured by the two prognostic signatures studied, either lacked statistical rigor or were in fact based on addressing some other question.

  1. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    Directory of Open Access Journals (Sweden)

    Lee Yun-Shien

    2008-03-01

    Background: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting of the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results: This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purposes) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion: We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties but also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

  2. Spatial expression of Hox cluster genes in the ontogeny of a sea urchin

    Science.gov (United States)

    Arenas-Mena, C.; Cameron, A. R.; Davidson, E. H.

    2000-01-01

    The Hox cluster of the sea urchin Strongylocentrotus purpuratus contains ten genes in a 500 kb span of the genome. Only two of these genes are expressed during embryogenesis, while all of eight genes tested are expressed during development of the adult body plan in the larval stage. We report the spatial expression during larval development of the five 'posterior' genes of the cluster: SpHox7, SpHox8, SpHox9/10, SpHox11/13a and SpHox11/13b. The five genes exhibit a dynamic, largely mesodermal program of expression. Only SpHox7 displays extensive expression within the pentameral rudiment itself. A spatially sequential and colinear arrangement of expression domains is found in the somatocoels, the paired posterior mesodermal structures that will become the adult perivisceral coeloms. No such sequential expression pattern is observed in endodermal, epidermal or neural tissues of either the larva or the presumptive juvenile sea urchin. The spatial expression patterns of the Hox genes illuminate the evolutionary process by which the pentameral echinoderm body plan emerged from a bilateral ancestor.

  3. Some statistical properties of gene expression clustering for array data

    DEFF Research Database (Denmark)

    Abreu, G C G; Pinheiro, A; Drummond, R D

    2010-01-01

    DNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap re-sampling to evaluate the statistical error of the nodes provided by SOM-based clustering. Comparisons between SOM and parametric clustering are presented...... for simulated as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM, that improves the false discovery ratio of differentially expressed genes. Code in Matlab is freely available, as well as some supplementary material, at the following address: https...

  4. Genomic and expression analysis of the vanG-like gene cluster of Clostridium difficile.

    Science.gov (United States)

    Peltier, Johann; Courtin, Pascal; El Meouche, Imane; Catel-Ferreira, Manuella; Chapot-Chartier, Marie-Pierre; Lemée, Ludovic; Pons, Jean-Louis

    2013-07-01

    Primary antibiotic treatment of Clostridium difficile intestinal diseases requires metronidazole or vancomycin therapy. A cluster of genes homologous to enterococcal glycopeptide resistance vanG genes was found in the genome of C. difficile 630, although this strain remains sensitive to vancomycin. This vanG-like gene cluster was found to consist of five ORFs: the regulatory region consisting of vanR and vanS and the effector region consisting of vanG, vanXY and vanT. We found that 57 out of 83 C. difficile strains, representative of the main lineages of the species, harbour this vanG-like cluster. The cluster is expressed as an operon and, when present, is found at the same genomic location in all strains. The vanG, vanXY and vanT homologues in C. difficile 630 are co-transcribed and expressed to a low level throughout the growth phases in the absence of vancomycin. Conversely, the expression of these genes is strongly induced in the presence of subinhibitory concentrations of vancomycin, indicating that the vanG-like operon is functional at the transcriptional level in C. difficile. Hydrophilic interaction liquid chromatography (HILIC-HPLC) and MS analysis of cytoplasmic peptidoglycan precursors of C. difficile 630 grown without vancomycin revealed the exclusive presence of a UDP-MurNAc-pentapeptide with an alanine at the C terminus. UDP-MurNAc-pentapeptide [d-Ala] was also the only peptidoglycan precursor detected in C. difficile grown in the presence of vancomycin, corroborating the lack of vancomycin resistance. Peptidoglycan structures of a vanG-like mutant strain and of a strain lacking the vanG-like cluster did not differ from the C. difficile 630 strain, indicating that the vanG-like cluster also has no impact on cell-wall composition.

  5. Burkholderia thailandensis harbors two identical rhl gene clusters responsible for the biosynthesis of rhamnolipids

    Directory of Open Access Journals (Sweden)

    Woods Donald E

    2009-12-01

    Background: Rhamnolipids are surface active molecules composed of rhamnose and β-hydroxydecanoic acid. These biosurfactants are produced mainly by Pseudomonas aeruginosa and have been thoroughly investigated since their early discovery. Recently, they have attracted renewed attention because of their involvement in various multicellular behaviors. Despite this high interest, only very few studies have focused on the production of rhamnolipids by Burkholderia species. Results: Orthologs of rhlA, rhlB and rhlC, which are responsible for the biosynthesis of rhamnolipids in P. aeruginosa, have been found in the non-infectious Burkholderia thailandensis, as well as in the genetically similar important pathogen B. pseudomallei. In contrast to P. aeruginosa, both Burkholderia species contain these three genes necessary for rhamnolipid production within a single gene cluster. Furthermore, two identical, paralogous copies of this gene cluster are found on the second chromosome of these bacteria. Both Burkholderia spp. produce rhamnolipids containing 3-hydroxy fatty acid moieties with longer side chains than those described for P. aeruginosa. Additionally, the rhamnolipids produced by B. thailandensis contain a much larger proportion of dirhamnolipids versus monorhamnolipids when compared to P. aeruginosa. The rhamnolipids produced by B. thailandensis reduce the surface tension of water to 42 mN/m while displaying a critical micelle concentration value of 225 mg/L. Separate mutations in both rhlA alleles, which are responsible for the synthesis of the rhamnolipid precursor 3-(3-hydroxyalkanoyloxy)alkanoic acid, prove that both copies of the rhl gene cluster are functional, but one contributes more to the total production than the other. Finally, a double ΔrhlA mutant that is completely devoid of rhamnolipid production is incapable of swarming motility, showing that both gene clusters contribute to this phenotype. Conclusions: Collectively, these

  6. Histone and ribosomal RNA repetitive gene clusters of the boll weevil are linked in a tandem array.

    Science.gov (United States)

    Roehrdanz, R; Heilmann, L; Senechal, P; Sears, S; Evenson, P

    2010-08-01

    Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and the clusters are tandemly repeated. Ribosomal DNA contains a cluster of the rRNA sequences 18S, 5.8S and 28S. The rRNA genes are separated by the spacers ITS1, ITS2 and IGS. This cluster is also tandemly repeated. We found that the ribosomal RNA repeat unit of at least two species of Anthonomine weevils, Anthonomus grandis and Anthonomus texanus (Coleoptera: Curculionidae), is interspersed with a block containing the histone gene quintet. The histone genes are situated between the rRNA 18S and 28S genes in what is known as the intergenic spacer region (IGS). The complete reiterated Anthonomus grandis histone-ribosomal sequence is 16,248 bp.

  7. Molecular population genetics of the β-esterase gene cluster of ...

    Indian Academy of Sciences (India)

    We suggest that the demographic history (bottleneck and admixture of genetically differentiated populations) is the major factor shaping the pattern of nucleotide polymorphism in the β-esterase gene cluster. However there are some 'footprints' of directional and balancing selection shaping specific distribution of nucleotide ...

  8. VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria.

    Science.gov (United States)

    Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu

    2017-01-10

    VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, and of the genetic contexts related to the transfer of these traits, in newly sequenced pathogenic bacterial genomes. The backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. With the integration of the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile also provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or a varied set of bacterial genomes. VRprofile might help meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  9. Genome-wide association study identifies the SERPINB gene cluster as a susceptibility locus for food allergy.

    Science.gov (United States)

    Marenholz, Ingo; Grosche, Sarah; Kalb, Birgit; Rüschendorf, Franz; Blümchen, Katharina; Schlags, Rupert; Harandi, Neda; Price, Mareike; Hansen, Gesine; Seidenberg, Jürgen; Röblitz, Holger; Yürek, Songül; Tschirner, Sebastian; Hong, Xiumei; Wang, Xiaobin; Homuth, Georg; Schmidt, Carsten O; Nöthen, Markus M; Hübner, Norbert; Niggemann, Bodo; Beyer, Kirsten; Lee, Young-Ae

    2017-10-20

    Genetic factors and mechanisms underlying food allergy are largely unknown. Due to heterogeneity of symptoms a reliable diagnosis is often difficult to make. Here, we report a genome-wide association study on food allergy diagnosed by oral food challenge in 497 cases and 2387 controls. We identify five loci at genome-wide significance, the clade B serpin (SERPINB) gene cluster at 18q21.3, the cytokine gene cluster at 5q31.1, the filaggrin gene, the C11orf30/LRRC32 locus, and the human leukocyte antigen (HLA) region. Stratifying the results for the causative food demonstrates that association of the HLA locus is peanut allergy-specific whereas the other four loci increase the risk for any food allergy. Variants in the SERPINB gene cluster are associated with SERPINB10 expression in leukocytes. Moreover, SERPINB genes are highly expressed in the esophagus. All identified loci are involved in immunological regulation or epithelial barrier function, emphasizing the role of both mechanisms in food allergy.

  10. CAR gene cluster and transcript levels of carotenogenic genes in Rhodotorula mucilaginosa.

    Science.gov (United States)

    Landolfo, Sara; Ianiri, Giuseppe; Camiolo, Salvatore; Porceddu, Andrea; Mulas, Giuliana; Chessa, Rossella; Zara, Giacomo; Mannazzu, Ilaria

    2018-01-01

    A molecular approach was applied to the study of the carotenoid biosynthetic pathway of Rhodotorula mucilaginosa. At first, functional annotation of the genome of R. mucilaginosa C2.5t1 was carried out and gene ontology categories were assigned to 4033 predicted proteins. Then, a set of genes involved in different steps of carotenogenesis was identified and those coding for phytoene desaturase, phytoene synthase/lycopene cyclase and carotenoid dioxygenase (CAR genes) proved to be clustered within a region of ~10 kb. Quantitative PCR of the genes involved in carotenoid biosynthesis showed that genes coding for 3-hydroxy-3-methylglutaryl-CoA reductase and mevalonate kinase are induced during exponential phase while no clear trend of induction was observed for phytoene synthase/lycopene cyclase and phytoene dehydrogenase encoding genes. Thus, in R. mucilaginosa the induction of genes involved in the early steps of carotenoid biosynthesis is transient and accompanies the onset of carotenoid production, while that of CAR genes does not correlate with the amount of carotenoids produced. The transcript levels of genes coding for carotenoid dioxygenase, superoxide dismutase and catalase A increased during the accumulation of carotenoids, thus suggesting the activation of a mechanism aimed at the protection of cell structures from oxidative stress during carotenoid biosynthesis. The data presented herein, besides being suitable for the elucidation of the mechanisms that underlie carotenoid biosynthesis, will contribute to boosting the biotechnological potential of this yeast by improving the outcome of further research efforts aimed at also exploring other features of interest.

  11. Genetic clusters and sex-biased gene flow in a unicolonial Formica ant

    Directory of Open Access Journals (Sweden)

    Chapuisat Michel

    2009-03-01

    Background: Animal societies are diverse, ranging from small family-based groups to extraordinarily large social networks in which many unrelated individuals interact. At the extreme of this continuum, some ant species form unicolonial populations in which workers and queens can move among multiple interconnected nests without eliciting aggression. Although unicoloniality has been mostly studied in invasive ants, it also occurs in some native non-invasive species. Unicoloniality is commonly associated with very high queen number, which may result in levels of relatedness among nestmates being so low as to raise the question of the maintenance of altruism by kin selection in such systems. However, the actual relatedness among cooperating individuals critically depends on effective dispersal and the ensuing pattern of genetic structuring. In order to better understand the evolution of unicoloniality in native non-invasive ants, we investigated the fine-scale population genetic structure and gene flow in three unicolonial populations of the wood ant F. paralugubris. Results: The analysis of geo-referenced microsatellite genotypes and mitochondrial haplotypes revealed the presence of cryptic clusters of genetically-differentiated nests in the three populations of F. paralugubris. Because of this spatial genetic heterogeneity, members of the same clusters were moderately but significantly related. The comparison of nuclear (microsatellite) and mitochondrial differentiation indicated that effective gene flow was male-biased in all populations. Conclusion: The three unicolonial populations exhibited male-biased and mostly local gene flow. The high number of queens per nest, exchanges among neighbouring nests and restricted long-distance gene flow resulted in large clusters of genetically similar nests. The positive relatedness among clustermates suggests that kin selection may still contribute to the maintenance of altruism in unicolonial

  12. Regulatory role of tetR gene in a novel gene cluster of Acidovorax avenae subsp. avenae RS-1 under oxidative stress

    OpenAIRE

    Liu, He; Yang, Chun-Lan; Ge, Meng-Yu; Ibrahim, Muhammad; Li, Bin; Zhao, Wen-Jun; Chen, Gong-You; Zhu, Bo; Xie, Guan-Lin

    2014-01-01

    Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay demonstrated that TetR regulator bound directly to the promoter of ...

  13. Using logistic regression to improve the prognostic value of microarray gene expression data sets: application to early-stage squamous cell carcinoma of the lung and triple negative breast carcinoma.

    Science.gov (United States)

    Mount, David W; Putnam, Charles W; Centouri, Sara M; Manziello, Ann M; Pandey, Ritu; Garland, Linda L; Martinez, Jesse D

    2014-06-10

    Numerous microarray-based prognostic gene expression signatures of primary neoplasms have been published but often with little concurrence between studies, thus limiting their clinical utility. We describe a methodology using logistic regression, which circumvents limitations of conventional Kaplan-Meier analysis. We applied this approach to a thrice-analyzed and published squamous cell carcinoma (SQCC) of the lung data set, with the objective of identifying gene expressions predictive of early death versus long survival in early-stage disease. A similar analysis was applied to a data set of triple negative breast carcinoma cases, which present similar clinical challenges. Important to our approach is the selection of homogeneous patient groups for comparison. In the lung study, we selected two groups (including only stages I and II), equal in size, of earliest deaths and longest survivors. Genes varying at least four-fold were tested by logistic regression for accuracy of prediction (area under a ROC plot). The gene list was refined by applying two sliding-window analyses and by validations using a leave-one-out approach and model building with validation subsets. In the breast study, a similar logistic regression analysis was used after selecting appropriate cases for comparison. A total of 8594 variable genes were tested for accuracy in predicting earliest deaths versus longest survivors in SQCC. After applying the two sliding-window and the leave-one-out analyses, 24 prognostic genes were identified; most of them were B-cell related. When the same data set of stage I and II cases was analyzed using a conventional Kaplan-Meier (KM) approach, we identified fewer immune-related genes among the most statistically significant hits; when stage III cases were included, most of the prognostic genes were missed. Interestingly, logistic regression analysis of the breast cancer data set identified many immune-related genes predictive of clinical outcome.
Stratification of
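
    The per-gene screening step described in this abstract (keep genes that vary at least four-fold between the two outcome groups, then score each by the area under a ROC plot of a logistic regression) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the gradient-descent fit, and the toy expression values are all hypothetical, and the sliding-window and leave-one-out refinements are omitted.

```python
import numpy as np

def fold_change_filter(expr, labels, min_fold=4.0):
    """Return indices of genes whose group means differ at least min_fold-fold.
    expr: (n_samples, n_genes) positive expression values; labels: 0/1 array."""
    m1 = expr[labels == 1].mean(axis=0)
    m0 = expr[labels == 0].mean(axis=0)
    ratio = np.maximum(m1, m0) / np.minimum(m1, m0)
    return np.where(ratio >= min_fold)[0]

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))

def gene_auc(x, labels, steps=2000, lr=0.1):
    """Fit a one-gene logistic regression by gradient descent (illustrative
    optimizer choice) and return the AUC of its fitted probabilities."""
    z = (x - x.mean()) / (x.std() + 1e-12)  # standardize for stable steps
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * z + b)))
        w -= lr * np.mean((p - labels) * z)
        b -= lr * np.mean(p - labels)
    p = 1.0 / (1.0 + np.exp(-(w * z + b)))
    return roc_auc(p, labels)

# Toy data (hypothetical): 8 samples x 2 genes; label 1 = one outcome group.
expr = np.array([[1.0, 5.0], [1.2, 6.0], [0.9, 5.0], [1.1, 6.0],
                 [8.0, 5.0], [9.0, 6.0], [7.5, 5.0], [10.0, 6.0]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

kept = fold_change_filter(expr, labels)            # only gene 0 passes
aucs = {int(g): gene_auc(expr[:, g], labels) for g in kept}
```

    Comparing two equal-sized extreme groups turns the survival question into a binary classification, which is why a ROC area can serve as the per-gene accuracy measure here.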

  14. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages

    Science.gov (United States)

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...

  15. Deletion and Gene Expression Analyses Define the Paxilline Biosynthetic Gene Cluster in Penicillium paxilli

    Directory of Open Access Journals (Sweden)

    Emily J. Parker

    2013-08-01

    The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus, of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses was used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ, that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and, based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B fall into two disparate clades that are not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis.

  16. Bacillus sp. CDB3 isolated from cattle dip-sites possesses two ars gene clusters

    Institute of Scientific and Technical Information of China (English)

    Somanath Bhat; Xi Luo; Zhiqiang Xu; Lixia Liu; Ren Zhang

    2011-01-01

    Contamination of soil and water by arsenic is a global problem. In Australia, the dipping of cattle in arsenic-containing solution to control cattle ticks in the last century has left many sites heavily contaminated with arsenic and other toxicants. We had previously isolated five soil bacterial strains (CDB1-5) highly resistant to arsenic. To understand the resistance mechanism, molecular studies have been carried out. Two chromosome-encoded arsenic resistance (ars) gene clusters have been cloned from CDB3 (Bacillus sp.). They both function in Escherichia coli, and cluster 1 exerts a much higher resistance to the toxic metalloid. Cluster 2 is smaller, possessing four open reading frames (ORFs), arsRorf2BC, similar to that identified in Bacillus subtilis Skin element. Among the eight ORFs in cluster 1, five are analogs of common ars genes found in other bacteria, although organized in a unique order, arsRBCDA instead of arsRDABC. Three other putative genes are located directly downstream and designated arsTIP based on the homologies of their theoretical translation sequences to thioredoxin reductases, iron-sulphur cluster proteins and protein phosphatases, respectively. The latter two are novel to any known ars operons. The arsD gene from a Bacillus species was cloned for the first time, and the predicted protein differs from the well-studied E. coli ArsD by lacking two pairs of C-terminal cysteine residues. Its functional involvement in arsenic resistance has been confirmed by a deletion experiment. There is also an inverted repeat in the intergenic region between arsC and arsD, implying some unknown transcription regulation.

  17. Genome-Wide Analysis of Secondary Metabolite Gene Clusters in Ophiostoma ulmi and Ophiostoma novo-ulmi Reveals a Fujikurin-Like Gene Cluster with a Putative Role in Infection

    Directory of Open Access Journals (Sweden)

    Nicolau Sbaraini

    2017-06-01

    The emergence of new microbial pathogens can result in destructive outbreaks, since their hosts have limited resistance and pathogens may be excessively aggressive. Described as the major ecological incident of the twentieth century, Dutch elm disease, caused by ascomycete fungi from the Ophiostoma genus, has caused a significant decline in elm tree populations (Ulmus sp.) in North America and Europe. Genome sequencing of the two main causative agents of Dutch elm disease (Ophiostoma ulmi and Ophiostoma novo-ulmi), along with closely related species with different lifestyles, allows for unique comparisons to be made to identify how pathogens and virulence determinants have emerged. Among several established virulence determinants, secondary metabolites (SMs) have been suggested to play significant roles during phytopathogen infection. Interestingly, the secondary metabolism of Dutch elm pathogens remains almost unexplored, and little is known about how SM biosynthetic genes are organized in these species. To better understand the metabolic potential of O. ulmi and O. novo-ulmi, we performed a deep survey and description of SM biosynthetic gene clusters (BGCs) in these species and assessed their conservation among eight species from the Ophiostomataceae family. Among 19 identified BGCs, a fujikurin-like gene cluster (OpPKS8) was unique to Dutch elm pathogens. Phylogenetic analysis revealed that orthologs for this gene cluster are widespread among phytopathogens and plant-associated fungi, suggesting that OpPKS8 may have been horizontally acquired by the Ophiostoma genus. Moreover, the detailed identification of several BGCs paves the way for future in-depth research and supports the potential impact of secondary metabolism on the lifestyle of the Ophiostoma genus.

  18. Prognostic Impact of 21-Gene Recurrence Score in Patients With Stage IV Breast Cancer: TBCRC 013.

    Science.gov (United States)

    King, Tari A; Lyman, Jaclyn P; Gonen, Mithat; Voci, Amy; De Brot, Marina; Boafo, Camilla; Sing, Amy Pratt; Hwang, E Shelley; Alvarado, Michael D; Liu, Minetta C; Boughey, Judy C; McGuire, Kandace P; Van Poznak, Catherine H; Jacobs, Lisa K; Meszoely, Ingrid M; Krontiras, Helen; Babiera, Gildy V; Norton, Larry; Morrow, Monica; Hudis, Clifford A

    2016-07-10

    The objective of this study was to determine whether the 21-gene Recurrence Score (RS) provides clinically meaningful information in patients with de novo stage IV breast cancer enrolled in the Translational Breast Cancer Research Consortium (TBCRC) 013. TBCRC 013 was a multicenter prospective registry that evaluated the role of surgery of the primary tumor in patients with de novo stage IV breast cancer. From July 2009 to April 2012, 127 patients from 14 sites were enrolled; 109 (86%) patients had pretreatment primary tumor samples suitable for 21-gene RS analysis. Clinical variables, time to first progression (TTP), and 2-year overall survival (OS) were correlated with the 21-gene RS by using log-rank, Kaplan-Meier, and Cox regression. Median patient age was 52 years (21 to 79 years); the majority had hormone receptor-positive/human epidermal growth factor receptor 2 (HER2)-negative (72 [66%]) or hormone receptor-positive/HER2-positive (20 [18%]) breast cancer. At a median follow-up of 29 months, median TTP was 20 months (95% CI, 16 to 26 months), and median survival was 49 months (95% CI, 40 months to not reached). An RS was generated for 101 (93%) primary tumor samples: 22 (23%) low risk (< 18), 29 (28%) intermediate risk (18 to 30); and 50 (49%) high risk (≥ 31). For all patients, RS was associated with TTP (P = .01) and 2-year OS (P = .04). In multivariable Cox regression models among 69 patients with estrogen receptor (ER)-positive/HER2-negative cancer, RS was independently prognostic for TTP (hazard ratio, 1.40; 95% CI, 1.05 to 1.86; P = .02) and 2-year OS (hazard ratio, 1.83; 95% CI, 1.14 to 2.95; P = .013). The 21-gene RS is independently prognostic for both TTP and 2-year OS in ER-positive/HER2-negative de novo stage IV breast cancer. Prospective validation is needed to determine the potential role for this assay in the clinical management of this patient subset. © 2016 by American Society of Clinical Oncology.

  19. Comparison of loline alkaloid gene clusters across fungal endophytes: predicting the co-regulatory sequence motifs and the evolutionary history.

    Science.gov (United States)

    Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H

    2007-10-01

    LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.

  20. Identification and analysis of the paulomycin biosynthetic gene cluster and titer improvement of the paulomycins in Streptomyces paulus NRRL 8115.

    Directory of Open Access Journals (Sweden)

    Jine Li

    The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations.

  1. Evaluation of prognostic models developed using standardised image features from different PET automated segmentation methods.

    Science.gov (United States)

    Parkinson, Craig; Foley, Kieran; Whybra, Philip; Hills, Robert; Roberts, Ashley; Marshall, Chris; Staffurth, John; Spezi, Emiliano

    2018-04-11

    Prognosis in oesophageal cancer (OC) is poor. The 5-year overall survival (OS) rate is approximately 15%. Personalised medicine is hoped to increase the 5- and 10-year OS rates. Quantitative analysis of PET is gaining substantial interest in prognostic research but requires the accurate definition of the metabolic tumour volume. This study compares prognostic models developed in the same patient cohort using individual PET segmentation algorithms and assesses the impact on patient risk stratification. Consecutive patients (n = 427) with biopsy-proven OC were included in final analysis. All patients were staged with PET/CT between September 2010 and July 2016. Nine automatic PET segmentation methods were studied. All tumour contours were subjectively analysed for accuracy, and segmentation methods with segmentation methods studied, clustering means (KM2), general clustering means (GCM3), adaptive thresholding (AT) and watershed thresholding (WT) methods were included for analysis. Known clinical prognostic factors (age, treatment and staging) were significant in all of the developed prognostic models. AT and KM2 segmentation methods developed identical prognostic models. Patient risk stratification was dependent on the segmentation method used to develop the prognostic model with up to 73 patients (17.1%) changing risk stratification group. Prognostic models incorporating quantitative image features are dependent on the method used to delineate the primary tumour. This has a subsequent effect on risk stratification, with patients changing groups depending on the image segmentation method used.

  2. Chassis organism from Corynebacterium glutamicum--a top-down approach to identify and delete irrelevant gene clusters.

    Science.gov (United States)

    Unthan, Simon; Baumgart, Meike; Radek, Andreas; Herbst, Marius; Siebert, Daniel; Brühl, Natalie; Bartsch, Anna; Bott, Michael; Wiechert, Wolfgang; Marin, Kay; Hans, Stephan; Krämer, Reinhard; Seibold, Gerd; Frunzke, Julia; Kalinowski, Jörn; Rückert, Christian; Wendisch, Volker F; Noack, Stephan

    2015-02-01

    For synthetic biology applications, a robust structural basis is required, which can be constructed either from scratch or in a top-down approach starting from any existing organism. In this study, we initiated the top-down construction of a chassis organism from Corynebacterium glutamicum ATCC 13032, aiming for the relevant gene set to maintain its fast growth on defined medium. We evaluated each native gene for its essentiality considering expression levels, phylogenetic conservation, and knockout data. Based on this classification, we determined 41 gene clusters ranging from 3.7 to 49.7 kbp as target sites for deletion. Of these, 36 deletions were successful, and 10 genome-reduced strains showed impaired growth rates, indicating that the deletions hit genes that are relevant to maintaining biological fitness at wild-type level. In contrast, 26 deleted clusters were found to include exclusively irrelevant genes for growth on defined medium. A combinatory deletion of all irrelevant gene clusters would, in a prophage-free strain, decrease the size of the native genome by about 722 kbp (22%) to 2561 kbp. Finally, five combinatory deletions of irrelevant gene clusters were investigated. The study introduces the novel concept of relevant genes and demonstrates general strategies to construct a chassis suitable for biotechnological application. © 2014 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution-Non-Commercial-NoDerivs Licence, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
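
    The counts and sizes quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated above; the variable names are illustrative, and the native genome size is derived here from the two quoted values rather than taken from the abstract.

```python
# Figures quoted in the abstract.
targeted_clusters = 41
successful_deletions = 36
impaired_strains = 10
irrelevant_clusters = 26
deleted_kbp = 722     # combined size of all irrelevant clusters
reduced_kbp = 2561    # genome size after the combinatory deletion

# Successful deletions split into growth-impaired and irrelevant clusters.
accounted_for = impaired_strains + irrelevant_clusters

# Native genome size implied by the two quoted sizes (derived, not quoted).
native_kbp = reduced_kbp + deleted_kbp
reduction_pct = 100.0 * deleted_kbp / native_kbp

print(f"{accounted_for} of {successful_deletions} successful deletions accounted for")
print(f"implied native genome: {native_kbp} kbp, reduction: {reduction_pct:.1f}%")
```

    The implied reduction rounds to the 22% given in the abstract, so the quoted sizes and percentage are mutually consistent.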

  3. Gravitation field algorithm and its application in gene cluster

    Directory of Open Access Journals (Sweden)

    Zheng Ming

    2010-09-01

    Background: Searching for optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel searching optimization algorithm called the Gravitation Field Algorithm (GFA), which is derived from the famous astronomy theory, the Solar Nebular Disk Model (SNDM) of planetary formation. GFA simulates the gravitation field and outperforms GA and SA in some multimodal function optimization problems. GFA can also be used for unimodal functions. GFA clusters the dataset from the Gene Expression Omnibus well. Conclusions: The mathematical proof demonstrates that GFA could converge to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.

  4. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  5. Identification and functional analysis of gene cluster involvement in biosynthesis of the cyclic lipopeptide antibiotic pelgipeptin produced by Paenibacillus elgii

    Directory of Open Access Journals (Sweden)

    Qian Chao-Dong

    2012-09-01

    Background: Pelgipeptin, a potent antibacterial and antifungal agent, is a non-ribosomally synthesised lipopeptide antibiotic. This compound consists of a β-hydroxy fatty acid and nine amino acids. To date, there is no information about its biosynthetic pathway. Results: A potential pelgipeptin synthetase gene cluster (plp) was identified from Paenibacillus elgii B69 through genome analysis. The gene cluster spans 40.8 kb with eight open reading frames. Among the genes in this cluster, three large genes, plpD, plpE, and plpF, were shown to encode non-ribosomal peptide synthetases (NRPSs), with one, seven, and one module(s), respectively. Bioinformatic analysis of the substrate specificity of all nine adenylation domains indicated that the sequence of the NRPS modules is collinear with the order of amino acids in pelgipeptin. Additional biochemical analysis of four recombinant adenylation domains (PlpD A1, PlpE A1, PlpE A3, and PlpF A1) provided further evidence that the plp gene cluster is involved in pelgipeptin biosynthesis. Conclusions: In this study, a gene cluster (plp) responsible for the biosynthesis of pelgipeptin was identified from the genome sequence of Paenibacillus elgii B69. The identification of the plp gene cluster provides an opportunity to develop novel lipopeptide antibiotics by genetic engineering.

  6. Molecular comparison of the structural proteins encoding gene clusters of two related Lactobacillus delbrueckii bacteriophages.

    Science.gov (United States)

    Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T

    1993-01-01

    Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043

  7. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
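
    The two quantities this abstract mentions, allele frequencies and pairwise linkage disequilibrium, can be illustrated with a small sketch. It assumes phased haplotypes coded 0/1 per SNP and uses the r² statistic; the abstract does not say which LD measure or software the authors used, so both choices, along with the function names and toy data, are assumptions made for illustration.

```python
import numpy as np

def allele_frequency(alleles):
    """Frequency of the allele coded 1 across phased haplotypes."""
    return float(np.mean(alleles))

def ld_r2(a, b):
    """Pairwise LD as r^2 between two biallelic SNPs.
    a, b: 0/1 arrays over the same phased haplotypes."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pa, pb = a.mean(), b.mean()
    d = (a * b).mean() - pa * pb          # disequilibrium coefficient D
    denom = pa * (1 - pa) * pb * (1 - pb)
    return float(d * d / denom) if denom > 0 else float("nan")

# Toy haplotypes (hypothetical): rows are haplotypes, columns are SNPs.
haps = np.array([[0, 0, 0],
                 [0, 0, 1],
                 [1, 1, 0],
                 [1, 1, 1]])

freqs = [allele_frequency(haps[:, j]) for j in range(haps.shape[1])]
r2_01 = ld_r2(haps[:, 0], haps[:, 1])   # SNPs 0 and 1 always co-occur
r2_02 = ld_r2(haps[:, 0], haps[:, 2])   # SNPs 0 and 2 are independent
```

    Runs of adjacent SNPs with high pairwise r² are what an LD map groups into blocks such as those listed above.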

  8. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production

    Science.gov (United States)

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O'Dwyer, Karen; Spence, David W.; Foster, Gary D.

    2016-05-01

    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.

  9. Molecular evolution of the nif gene cluster carrying nifI1 and nifI2 genes in the Gram-positive phototrophic bacterium Heliobacterium chlorum.

    Science.gov (United States)

    Enkh-Amgalan, Jigjiddorj; Kawasaki, Hiroko; Seki, Tatsuji

    2006-01-01

    A major nif cluster was detected in the strictly anaerobic, Gram-positive phototrophic bacterium Heliobacterium chlorum. The cluster consisted of 11 genes arranged within a 10 kb region in the order nifI1, nifI2, nifH, nifD, nifK, nifE, nifN, nifX, fdx, nifB and nifV. The phylogenetic position of Hbt. chlorum was the same in the NifH, NifD, NifK, NifE and NifN trees; Hbt. chlorum formed a cluster with Desulfitobacterium hafniense, the closest neighbour of heliobacteria based on the 16S rRNA phylogeny, and two species of the genus Geobacter belonging to the Deltaproteobacteria. Two nifI genes, known to occur in the nif clusters of methanogenic archaea between nifH and nifD, were found upstream of the nifH gene of Hbt. chlorum. The organization of the nif operon and the phylogeny of individual and concatenated gene products showed that the Hbt. chlorum nif operon carrying nifI genes upstream of the nifH gene was an intermediate between the nif operon with nifI downstream of nifH (group II and III of the nitrogenase classification) and the nif operon lacking nifI (group I). Thus, the phylogenetic position of Hbt. chlorum nitrogenase may reflect an evolutionary stage of a divergence of the two nitrogenase groups, with group I consisting of the aerobic diazotrophs and group II consisting of strictly anaerobic prokaryotes.

  10. Association of paraoxonase gene cluster polymorphisms with ALS in France, Quebec, and Sweden.

    Science.gov (United States)

    Valdmanis, P N; Kabashi, E; Dyck, A; Hince, P; Lee, J; Dion, P; D'Amour, M; Souchon, F; Bouchard, J-P; Salachas, F; Meininger, V; Andersen, P M; Camu, W; Dupré, N; Rouleau, G A

    2008-08-12

    The paraoxonase gene cluster on chromosome 7 comprising the PON1-3 genes is an attractive candidate for association in amyotrophic lateral sclerosis (ALS) given the role of paraoxonase genes during the response to oxidative stress and their contribution to the enzymatic breakdown of nerve toxins. Oxidative stress is considered one of the mechanisms involved in ALS pathogenesis. Evidence for this includes the fact that mutations of SOD1, which normally reduces the production of toxic superoxide anion, account for 12% to 23% of familial cases in ALS. In addition, PON variants were shown to be associated with susceptibility to ALS in several North American and European populations. We extended this analysis to examine 20 single nucleotide polymorphisms (SNPs) across the PON gene cluster in a set of patients from France (480 cases, 475 controls), Quebec (159 cases, 95 controls), and Sweden (558 cases, 506 controls). Although individual SNPs were not considered associated on their own, a haplotype of SNPs at the C-terminal portion of PON2 that includes the PON2 C311S amino acid change was significant in the French (p value 0.0075) and Quebec (p value 0.026) populations as well as all three populations combined (p value 1.69 × 10^-6). Stratification of the samples showed that this variation was pertinent to ALS susceptibility as a whole, and not to a particular subset of patients. These findings contribute to the increasing weight of evidence that genetic variants in the paraoxonase gene cluster are associated with amyotrophic lateral sclerosis.

  11. Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values.

    Science.gov (United States)

    Bhattacharya, Anindya; De, Rajat K

    2010-08-01

    Distance-based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail for certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than that produced by some others. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. As in DCCA, we use the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and with a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using C and Visual Basic languages, and can be executed on the Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed.
Two word files (included in the zip file) need to
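
    The criterion stated in the abstract above (each gene should sit in the cluster with which it has the highest average correlation) can be illustrated with a minimal sketch. This is not the authors' ACCA implementation; the function names, the starting partition, and the choice to count a gene's self-correlation within its current cluster are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat profile: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def average_correlation_refine(profiles, labels, max_iter=100):
    """Hypothetical helper: move each gene to the cluster with which it has
    the highest average correlation, until no label changes."""
    labels = list(labels)
    n = len(profiles)
    corr = [[pearson(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]

    def avg_corr(i, c):
        # Assumption: a gene counts as a member of its own current cluster.
        members = [j for j in range(n) if labels[j] == c]
        return sum(corr[i][j] for j in members) / len(members)

    for _ in range(max_iter):
        changed = False
        for i in range(n):
            best_label, best_avg = labels[i], avg_corr(i, labels[i])
            for c in sorted(set(labels)):
                a = avg_corr(i, c)
                if a > best_avg:  # move only on a strict improvement
                    best_label, best_avg = c, a
            if best_label != labels[i]:
                labels[i] = best_label
                changed = True
        if not changed:
            break
    return labels


# Two rising and two falling expression profiles; gene 2 starts misplaced.
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 2]]
refined = average_correlation_refine(profiles, [0, 0, 0, 1])
```

    Because correlation ignores scale, the rising profiles group together even though their magnitudes differ, which is the case a purely distance based method would tend to miss.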

  12. Genetic interrelations in the actinomycin biosynthetic gene clusters of Streptomyces antibioticus IMRU 3720 and Streptomyces chrysomallus ATCC11523, producers of actinomycin X and actinomycin C

    Science.gov (United States)

    Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich

    2017-01-01

    Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S

  13. Correlation-based iterative clustering methods for time course data: The identification of temporal gene response modules for influenza infection in humans

    Directory of Open Access Journals (Sweden)

    Michelle Carey

    2016-10-01

    Many pragmatic clustering methods have been developed to group data vectors or objects into clusters so that the objects in one cluster are very similar and objects in different clusters are distinct based on some similarity measure. The availability of time course data has motivated researchers to develop methods, such as mixture and mixed-effects modelling approaches, that incorporate the temporal information contained in the shape of the trajectory of the data. However, there is still a need for the development of time-course clustering methods that can adequately deal with inhomogeneous clusters (some clusters are quite large and others are quite small). Here we propose two such methods, hierarchical clustering (IHC) and iterative pairwise-correlation clustering (IPC). We evaluate and compare the proposed methods to the Markov Cluster Algorithm (MCL) and the generalised mixed-effects model (GMM) using simulation studies and an application to a time course gene expression data set from a study containing human subjects who were challenged by a live influenza virus. We identify four types of temporal gene response modules to influenza infection in humans, i.e., single-gene modules (SGM), small-size modules (SSM), medium-size modules (MSM) and large-size modules (LSM). The LSM contain genes that perform various fundamental biological functions that are consistent across subjects. The SSM and SGM contain genes that perform either different or similar biological functions that have complex temporal responses to the virus and are unique to each subject. We show that the temporal response of the genes in the LSM have either simple patterns with a single peak or trough a consequence of the transient stimuli sustained or state-transitioning patterns pertaining to developmental cues and that these modules can differentiate the severity of disease outcomes. Additionally, the size of gene response modules follows a power-law distribution with a consistent

  14. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2004-01-01

    An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experiment data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noises, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchic clustering method, the k-mean clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).
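
    The idea of clustering around each local maximum of a magnitude property can be sketched as follows. This is an illustrative reading of the abstract, not the published LMC code: the function name, the fixed neighbourhood `radius`, and the index-based tie-breaking rule are all assumptions.

```python
def local_maximum_clusters(points, magnitude, radius):
    """Hypothetical sketch: label every point with the index of the local
    maximum of `magnitude` reached by repeatedly stepping to the
    highest-magnitude neighbour within `radius`."""
    n = len(points)

    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    def key(i):
        # Ties in magnitude are broken by index so the ordering is strict
        # and following parents can never loop.
        return (magnitude[i], -i)

    # parent[i] is the best neighbour of i, or i itself at a local maximum.
    parent = []
    for i in range(n):
        best = i
        for j in range(n):
            if dist(points[i], points[j]) <= radius and key(j) > key(best):
                best = j
        parent.append(best)

    def root(i):
        while parent[i] != i:
            i = parent[i]
        return i

    return [root(i) for i in range(n)]


# Two groups on a line; the magnitude peaks at index 1 and at index 4.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
magnitude = [1, 3, 2, 2, 5, 1]
lmc_labels = local_maximum_clusters(points, magnitude, radius=1.5)
```

    The number of clusters is not fixed in advance here; it follows from how many local maxima the chosen magnitude property has at the chosen radius.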

  15. Genetic interrelations in the actinomycin biosynthetic gene clusters of Streptomyces antibioticus IMRU 3720 and Streptomyces chrysomallus ATCC11523, producers of actinomycin X and actinomycin C

    Directory of Open Access Journals (Sweden)

    Crnovčić I

    2017-04-01

    Ivana Crnovčić,1 Christian Rückert,2 Siamak Semsary,1 Manuel Lang,1 Jörn Kalinowski,2 Ullrich Keller1 1Institut für Chemie, Technische Universität Berlin, Berlin-Charlottenburg, 2Technology Platform Genomics, Center for Biotechnology, Bielefeld University, Bielefeld, Germany Abstract: Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a highly similar framework as in the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached additional extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the similarities of translation products from the extra arm were all higher to the corresponding translation products of orthologue genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from previous fusion between two one-armed acm gene clusters each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore are non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm

  16. Heterologous Reconstitution of the Intact Geodin Gene Cluster in Aspergillus nidulans through a Simple and Versatile PCR Based Approach

    DEFF Research Database (Denmark)

    Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere

    2013-01-01

    was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were...... of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus...... successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor...

  17. Comprehensive cluster analysis with Transitivity Clustering.

    Science.gov (United States)

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.

  18. Characterization of the biosynthetic gene cluster for cryptic phthoxazolin A in Streptomyces avermitilis.

    Directory of Open Access Journals (Sweden)

    Dian Anggraini Suroto

    Phthoxazolin A, an oxazole-containing polyketide, has a broad spectrum of anti-oomycete activity and herbicidal activity. We recently identified phthoxazolin A as a cryptic metabolite of Streptomyces avermitilis that produces the important anthelmintic agent avermectin. Even though genome data of S. avermitilis is publicly available, no plausible biosynthetic gene cluster for phthoxazolin A is apparent in the sequence data. Here, we identified and characterized the phthoxazolin A (ptx) biosynthetic gene cluster through genome sequencing, comparative genomic analysis, and gene disruption. Sequence analysis uncovered that the putative ptx biosynthetic genes are laid on an extra genomic region that is not found in the public database, and 8 open reading frames in the extra genomic region could be assigned roles in the biosynthesis of the oxazole ring, triene polyketide and carbamoyl moieties. Disruption of the ptxA gene encoding a discrete acyltransferase resulted in a complete loss of phthoxazolin A production, confirming that the trans-AT type I PKS system is responsible for the phthoxazolin A biosynthesis. Based on the predicted functional domains in the ptx assembly line, we propose the biosynthetic pathway of phthoxazolin A.

  19. Transcriptional interference networks coordinate the expression of functionally related genes clustered in the same genomic loci.

    Science.gov (United States)

    Boldogköi, Zsolt

    2012-01-01

    The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.

  20. Transcriptional interference networks coordinate the expression of functionally-related genes clustered in the same genomic loci

    Directory of Open Access Journals (Sweden)

    Zsolt Boldogköi

    2012-07-01

    The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organisation, transcription, various post-transcriptional processes and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighbouring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally-linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly-arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely-oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronised cascade of gene expression in functionally-linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular

  1. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMR revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expressions of these genes were significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  2. Whole-exome sequencing of muscle-invasive bladder cancer identifies recurrent mutations of UNC5C and prognostic importance of DNA repair gene mutations on survival.

    Science.gov (United States)

    Yap, Kai Lee; Kiyotani, Kazuma; Tamura, Kenji; Antic, Tatjana; Jang, Miran; Montoya, Magdeline; Campanile, Alexa; Yew, Poh Yin; Ganshert, Cory; Fujioka, Tomoaki; Steinberg, Gary D; O'Donnell, Peter H; Nakamura, Yusuke

    2014-12-15

    Because of suboptimal outcomes in muscle-invasive bladder cancer even with multimodality therapy, determination of potential genetic drivers offers the possibility of improving therapeutic approaches and discovering novel prognostic indicators. Using pTN staging, we case-matched 81 patients with resected ≥pT2 bladder cancers for whom perioperative chemotherapy use and disease recurrence status were known. Whole-exome sequencing was conducted in 43 cases to identify recurrent somatic mutations and targeted sequencing of 10 genes selected from the initial screening in an additional 38 cases was completed. Mutational profiles along with clinicopathologic information were correlated with recurrence-free survival (RFS) in the patients. We identified recurrent novel somatic mutations in the gene UNC5C (9.9%), in addition to TP53 (40.7%), KDM6A (21.0%), and TSC1 (12.3%). Patients who were carriers of somatic mutations in DNA repair genes (one or more of ATM, ERCC2, FANCD2, PALB2, BRCA1, or BRCA2) had a higher overall number of somatic mutations (P = 0.011). Importantly, after a median follow-up of 40.4 months, carriers of somatic mutations (n = 25) in any of these six DNA repair genes had significantly enhanced RFS compared with noncarriers [median, 32.4 vs. 14.8 months; hazard ratio of 0.46, 95% confidence interval (CI), 0.22-0.98; P = 0.0435], after adjustment for pathologic pTN staging and independent of adjuvant chemotherapy usage. Better prognostic outcomes of individuals carrying somatic mutations in DNA repair genes suggest these mutations as favorable prognostic events in muscle-invasive bladder cancer. Additional mechanistic investigation into the previously undiscovered role of UNC5C in bladder cancer is warranted. ©2014 American Association for Cancer Research.

  3. Gene clusters for insecticidal loline alkaloids in the grass-endophytic fungus Neotyphodium uncinatum.

    Science.gov (United States)

    Spiering, Martin J; Moon, Christina D; Wilkinson, Heather H; Schardl, Christopher L

    2005-03-01

    Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same order and orientation. Also identified was lolF-2, but its possible linkage with either cluster was undetermined. Most lol genes were regulated in N. uncinatum and N. coenophialum, and all were expressed concomitantly with loline-alkaloid biosynthesis. A lolC-2 RNA-interference (RNAi) construct was introduced into N. uncinatum, and in two independent transformants, RNAi significantly decreased lolC expression (P lol-gene products indicate that the pathway has evolved from various different primary and secondary biosynthesis pathways.

  4. Structure and gene cluster of the O-antigen of Escherichia coli O54.

    Science.gov (United States)

    Naumenko, Olesya I; Guo, Xi; Senchenkova, Sof'ya N; Geng, Peng; Perepelov, Andrei V; Shashkov, Alexander S; Liu, Bin; Knirel, Yuriy A

    2018-06-15

    Mild acid hydrolysis of the lipopolysaccharide of Escherichia coli O54 afforded an O-polysaccharide, which was studied by sugar analysis, solvolysis with anhydrous trifluoroacetic acid, and 1H and 13C NMR spectroscopy. Solvolysis cleaved predominantly the linkage of β-d-Ribf and, to a lesser extent, that of β-d-GlcpNAc, whereas the other linkages, including the linkage of α-l-Rhap, were stable under selected conditions (40 °C, 5 h). The following structure of the O-polysaccharide was established: →4)-α-d-GalpA-(1 → 2)-α-l-Rhap-(1 → 2)-β-d-Ribf-(1 → 4)-β-d-Galp-(1 → 3)-β-d-GlcpNAc-(1→ The O-antigen gene cluster of E. coli O54 was analyzed and found to be consistent in general with the O-polysaccharide structure established but there were two exceptions: i) in the cluster, there were genes for phosphoserine phosphatase and serine transferase, which have no apparent role in the O-polysaccharide synthesis, and ii) no ribofuranosyltransferase gene was present in the cluster. Both uncommon features are shared by some other enteric bacteria. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

    Science.gov (United States)

    Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E

    2017-04-12

    Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
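
    The general idea of refining an existing module partition with k-means iterations can be sketched as below. This is not the km2gcn package and does not reproduce its method: the sketch is in Python rather than R, and it assumes mean profiles as centroids and squared Euclidean distance as stand-ins for whatever module summary and similarity the package actually uses. The function name is hypothetical.

```python
def kmeans_refine(profiles, labels, max_iter=30):
    """Hypothetical sketch: starting from an initial partition, repeat the
    two k-means steps (recompute centroids, reassign each gene to the
    nearest centroid) until the labels stop changing."""
    labels = list(labels)
    for _ in range(max_iter):
        # Step 1: centroid of each module = mean profile of its genes.
        groups = {}
        for profile, label in zip(profiles, labels):
            groups.setdefault(label, []).append(profile)
        centroids = {
            label: [sum(col) / len(members) for col in zip(*members)]
            for label, members in groups.items()
        }

        # Step 2: reassign each gene to the closest centroid.
        def nearest(profile):
            return min(
                sorted(centroids),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(profile, centroids[c])),
            )

        new_labels = [nearest(p) for p in profiles]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# The last gene starts in module 1 but lies next to the genes of module 0.
profiles = [[0, 0], [0, 1], [10, 10], [10, 11], [1, 0]]
km_labels = kmeans_refine(profiles, [0, 0, 1, 1, 1])
```

    Seeding k-means with an existing partition, rather than random centroids, keeps the number of modules fixed and only moves genes that sit closer to another module's centre.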

  6. An eleven gene molecular signature for extra-capsular spread in oral squamous cell carcinoma serves as a prognosticator of outcome in patients without nodal metastases.

    Science.gov (United States)

    Wang, Weining; Lim, Weng Khong; Leong, Hui Sun; Chong, Fui Teen; Lim, Tony K H; Tan, Daniel S W; Teh, Bin Tean; Iyer, N Gopalakrishna

    2015-04-01

    Extracapsular spread (ECS) is an important prognostic factor for oral squamous cell carcinoma (OSCC) and is used to guide management. In this study, we aimed to identify an expression profile signature for ECS in node-positive OSCC using data derived from two different sources: a cohort of OSCC patients from our institution (National Cancer Centre Singapore) and The Cancer Genome Atlas (TCGA) head and neck squamous cell carcinoma (HNSCC) cohort. We also sought to determine if this signature could serve as a prognostic factor in node negative cancers. Patients with a histological diagnosis of OSCC were identified from an institutional database and fresh tumor samples were retrieved. RNA was extracted and gene expression profiling was performed using the Affymetrix GeneChip Human Genome U133 Plus 2.0 microarray platform. RNA sequence data and corresponding clinical data for the TCGA HNSCC cohort were downloaded from the TCGA Data Portal. All data analyses were conducted using R package and SPSS. We identified an 11 gene signature (GGH, MTFR1, CDKN3, PSRC1, SMIM3, CA9, IRX4, CPA3, ZSCAN16, CBX7 and ZFP3) which was robust in segregating tumors by ECS status. In node negative patients, patients harboring this ECS signature had a significantly worse overall survival (p=0.04). An eleven gene signature for ECS was derived. Our results also suggest that this signature is prognostic in a separate subset of patients with no nodal metastasis. Further validation of this signature on other datasets and immunohistochemical studies are required to establish utility of this signature in stratifying early stage OSCC patients. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Characterization of the fumonisin B2 biosynthetic gene cluster in Aspergillus niger and A. awamori.

    Science.gov (United States)

    Aspergillus niger and A. awamori strains isolated from grapes cultivated in Mediterranean basin were examined for fumonisin B2 (FB2) production and presence/absence of sequences within the fumonisin biosynthetic gene (fum) cluster. Presence of 13 regions in the fum cluster was evaluated by PCR assay...

  8. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Science.gov (United States)

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
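
    Three of the transformations named in the abstract above can be sketched for a single gene's counts as follows. The exact variants compared in the study are not given in the abstract, so the pseudocount of 1, the sample standard deviation, and the rank scaling by n + 1 are assumptions, and the function names are illustrative.

```python
from math import log2, sqrt


def log_transform(counts):
    """log2(count + 1); the pseudocount of 1 is an assumption."""
    return [log2(c + 1) for c in counts]


def standardize(values):
    """Centre to mean 0 and scale to sample standard deviation 1
    (expects at least two values)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if sd == 0:
        return [0.0] * n  # constant covariate carries no information
    return [(v - mean) / sd for v in values]


def rank_transform(values):
    """Replace values by their ranks (ties get the average rank),
    scaled into the open interval (0, 1) by dividing by n + 1."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return [r / (n + 1) for r in ranks]


counts = [10, 1000, 10, 5]  # one extreme value, one tie
ranked = rank_transform(counts)
logged = log_transform([0, 1, 3, 7])
scaled = standardize([1, 2, 3])
```

    In the rank-based version the extreme count of 1000 ends up only one rank step above its neighbours, which illustrates why such transformations are insensitive to the skewness and outliers the abstract mentions.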

  9. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Directory of Open Access Journals (Sweden)

    Isabella Zwiener

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance-dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.

  10. Two different secondary metabolism gene clusters occupied the same ancestral locus in fungal dermatophytes of the arthrodermataceae.

    Science.gov (United States)

    Zhang, Han; Rokas, Antonis; Slot, Jason C

    2012-01-01

    Dermatophyte fungi of the family Arthrodermataceae (Eurotiomycetes) colonize keratinized tissue, such as skin, frequently causing superficial mycoses in humans and other mammals, reptiles, and birds. Competition with native microflora likely underlies the propensity of these dermatophytes to produce a diversity of antibiotics and compounds for scavenging iron, which is extremely scarce, as well as the presence of an unusually large number of putative secondary metabolism gene clusters, most of which contain non-ribosomal peptide synthetases (NRPS), in their genomes. To better understand the historical origins and diversification of NRPS-containing gene clusters we examined the evolution of a variable locus (VL) that exists in one of three alternative conformations among the genomes of seven dermatophyte species. The first conformation of the VL (termed VLA) contains only 539 base pairs of sequence and lacks protein-coding genes, whereas the other two conformations (termed VLB and VLC) span 36 Kb and 27 Kb and contain 12 and 10 genes, respectively. Interestingly, both VLB and VLC appear to contain distinct secondary metabolism gene clusters; VLB contains a NRPS gene as well as four porphyrin metabolism genes never found to be physically linked in the genomes of 128 other fungal species, whereas VLC also contains a NRPS gene as well as several others typically found associated with secondary metabolism gene clusters. Phylogenetic evidence suggests that the VL locus was present in the ancestor of all seven species achieving its present distribution through subsequent differential losses or retentions of specific conformations. We propose that the existence of variable loci, similar to the one we studied, in fungal genomes could potentially explain the dramatic differences in secondary metabolic diversity between closely related species of filamentous fungi, and contribute to host adaptation and the generation of metabolic diversity.

  11. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    Science.gov (United States)

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for a standard biological process measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process has not been efficiently addressed. To monitor higher rates of expression between genes, a hierarchical clustering model is proposed in which the biological association between genes is measured simultaneously using a proximity measure based on an improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm applies average linkage methods to rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern-growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations of the GL-PCPHC model are conducted with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.
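
    The abstract above does not spell out the "improved" Pearson measure or the Seed Augment step, so the following is only a minimal sketch of the generic idea it builds on: average-linkage hierarchical clustering with distance 1 - r, where r is the ordinary Pearson correlation. The function names and the toy expression profiles are hypothetical and are not taken from the paper.

```python
from math import sqrt

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length profiles.

    Assumes neither profile is constant (a constant profile would divide by zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering using distance 1 - r and average linkage."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(dist[p] for p in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical expression profiles (made-up numbers, four conditions each).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.2, 7.9],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.1, 6.0, 3.9, 2.0],
}
result = average_linkage(toy, 2)
```

    On these toy profiles the two rising genes end up in one cluster and the two falling genes in the other, because correlation-based distance groups genes by the shape of their profile rather than by its magnitude.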

  12. Genetic homogeneity of Clostridium botulinum type A1 strains with unique toxin gene clusters.

    Science.gov (United States)

    Raphael, Brian H; Luquez, Carolina; McCroskey, Loretta M; Joseph, Lavin A; Jacobson, Mark J; Johnson, Eric A; Maslanka, Susan E; Andreadis, Joanne D

    2008-07-01

    A group of five clonally related Clostridium botulinum type A strains isolated from different sources over a period of nearly 40 years harbored several conserved genetic properties. These strains contained a variant bont/A1 with five nucleotide polymorphisms compared to the gene in C. botulinum strain ATCC 3502. The strains also had a common toxin gene cluster composition (ha-/orfX+) similar to that associated with bont/A in type A strains containing an unexpressed bont/B [termed A(B) strains]. However, bont/B was not identified in the strains examined. Comparative genomic hybridization demonstrated identical genomic content among the strains relative to C. botulinum strain ATCC 3502. In addition, microarray data demonstrated the absence of several genes flanking the toxin gene cluster among the ha-/orfX+ A1 strains, suggesting the presence of genomic rearrangements with respect to this region compared to the C. botulinum ATCC 3502 strain. All five strains were shown to have identical flaA variable region nucleotide sequences. The pulsed-field gel electrophoresis patterns of the strains were indistinguishable when digested with SmaI, and a shift in the size of at least one band was observed in a single strain when digested with XhoI. These results demonstrate surprising genomic homogeneity among a cluster of unique C. botulinum type A strains of diverse origin.

  13. No prognostic value added by vitamin D pathway SNPs to current prognostic system for melanoma survival.

    Directory of Open Access Journals (Sweden)

    Li Luo

    The prognostic improvement attributed to genetic markers over the current prognostic system has not been well studied for melanoma. The goal of this study is to evaluate the added prognostic value of Vitamin D Pathway (VitD) SNPs to currently known clinical and demographic factors such as age, sex, Breslow thickness, mitosis and ulceration (CDF). We utilized two large independent well-characterized melanoma studies, the Genes, Environment, and Melanoma (GEM) and MD Anderson studies, and performed variable selection of VitD pathway SNPs and CDF using the Random Survival Forest (RSF) method in addition to Cox proportional hazards models. Harrell's C-index was used to compare the predictive performance of the models. The population-based GEM study enrolled 3,578 incident cases of cutaneous melanoma (CM), and the hospital-based MD Anderson study consisted of 1,804 CM patients. Including both VitD SNPs and CDF yielded a C-index of 0.85, which provided a slight but not significant improvement over CDF alone (C-index = 0.83) in the GEM study. Similar results were observed in the independent MD Anderson study (C-index = 0.84 and 0.83, respectively). The Cox model identified no significant associations after adjusting for multiplicity. Our results do not support clinically significant prognostic improvements attributable to VitD pathway SNPs over the current prognostic system for melanoma survival.
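
    Harrell's C-index, used above to compare models, is the fraction of comparable patient pairs in which the patient with the higher predicted risk has the earlier observed event. The sketch below is a simplified illustration that ignores tied survival times; the function name and the toy data are hypothetical and unrelated to the GEM or MD Anderson cohorts.

```python
def harrell_c(times, events, risks):
    """Simplified Harrell's C-index (pairs with tied times are skipped).

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable

# Hypothetical toy data: four subjects, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c(times, events, risks)  # 4 of 5 comparable pairs agree -> 0.8
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a change from 0.83 to 0.85 is read as a small gain.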

  14. Human major histocompatibility complex contains a minimum of 19 genes between the complement cluster and HLA-B

    International Nuclear Information System (INIS)

    Spies, T.; Bresnahan, M.; Strominger, J.L.

    1989-01-01

    A 600-kilobase (kb) DNA segment from the human major histocompatibility complex (MHC) class III region was isolated by extension of a previous 435-kb chromosome walk. The contiguous series of cloned overlapping cosmids contains the entire 555-kb interval between C2 in the complement gene cluster and HLA-B. This region is known to encode the tumor necrosis factors (TNFs) α and β, B144, and the major heat shock protein HSP70. Moreover, a cluster of genes, BAT1-BAT5 (HLA-B-associated transcripts), has been localized in the vicinity of the genes for TNFα and TNFβ. An additional four genes were identified by isolation of corresponding cDNA clones with cosmid DNA probes. These genes for BAT6-BAT9 were mapped near the gene for C2 within a 120-kb region that includes an HSP70 gene pair. These results, together with complementary data from a similar recent study, indicated the presence of a minimum of 19 genes within the C2-HLA-B interval of the MHC class III region. Although the functional properties of most of these genes are yet unknown, they may be involved in some aspects of immunity. This idea is supported by the genetic mapping of the hematopoietic histocompatibility locus-1 (Hh-1) in recombinant mice between TNFα and H-2S, which is homologous to the complement gene cluster in humans.

  15. SNPs in genes implicated in radiation response are associated with radiotoxicity and evoke roles as predictive and prognostic biomarkers

    International Nuclear Information System (INIS)

    Alsbeih, Ghazi; El-Sebaie, Medhat; Al-Harbi, Najla; Al-Hadyan, Khaled; Shoukri, Mohamed; Al-Rajhi, Nasser

    2013-01-01

    Biomarkers are needed to individualize cancer radiation treatment. Therefore, we have investigated the association between various risk factors, including single nucleotide polymorphisms (SNPs) in candidate genes, and late complications of radiotherapy in our nasopharyngeal cancer patients. A cohort of 155 patients was included. Normal tissue fibrosis was scored using the RTOG/EORTC grading system. A total of 45 SNPs in 11 candidate genes (ATM, XRCC1, XRCC3, XRCC4, XRCC5, PRKDC, LIG4, TP53, HDM2, CDKN1A, TGFB1) were genotyped by direct genomic DNA sequencing. Patients with severe fibrosis (cases, G3-4, n = 48) were compared to controls (G0-2, n = 107). Univariate analysis showed significant association (P < 0.05) with radiation complications for 6 SNPs (ATM G/A rs1801516, HDM2 promoter T/G rs2279744 and T/A rs1196333, XRCC1 G/A rs25487, XRCC5 T/C rs1051677 and TGFB1 C/T rs1800469). In addition, Kaplan-Meier analyses highlighted a significant association between genotypes and length of patients' follow-up after radiotherapy. Multivariate logistic regression further sustained these results, suggesting predictive and prognostic roles of SNPs. Univariate and multivariate analyses suggest that radiation toxicity in radiotherapy patients is associated with certain SNPs, in genes including the HDM2 promoter studied for the first time. These results support the use of SNPs as genetic predictive markers for clinical radiosensitivity and evoke a prognostic role for length of patients' follow-up after radiotherapy.

  16. Genomic characterization of a new endophytic Streptomyces kebangsaanensis identifies biosynthetic pathway gene clusters for novel phenazine antibiotic production

    Directory of Open Access Journals (Sweden)

    Juwairiah Remali

    2017-11-01

    Background: Streptomyces are well known for their capability to produce many bioactive secondary metabolites with medical and industrial importance. Here we report a novel bioactive phenazine compound, 6-((2-hydroxy-4-methoxyphenoxy)carbonyl)phenazine-1-carboxylic acid (HCPCA), extracted from Streptomyces kebangsaanensis, an endophyte isolated from the ethnomedicinal Portulaca oleracea. Methods: The HCPCA chemical structure was determined using nuclear magnetic resonance spectroscopy. We conducted whole genome sequencing for the identification of the gene cluster(s) believed to be responsible for phenazine biosynthesis in order to map its corresponding pathway, in addition to bioinformatics analysis to assess the potential of S. kebangsaanensis in producing other useful secondary metabolites. Results: The S. kebangsaanensis genome comprises an 8,328,719 bp linear chromosome with high GC content (71.35%) consisting of 12 rRNA operons, 81 tRNA, and 7,558 protein coding genes. We identified 24 gene clusters involved in polyketide, nonribosomal peptide, terpene, bacteriocin, and siderophore biosynthesis, as well as a gene cluster predicted to be responsible for phenazine biosynthesis. Discussion: The HCPCA phenazine structure was hypothesized to derive from the combination of two biosynthetic pathways, phenazine-1,6-dicarboxylic acid and 4-methoxybenzene-1,2-diol, originated from the shikimic acid pathway. The identification of a biosynthesis pathway gene cluster for phenazine antibiotics might facilitate future genetic engineering design of new synthetic phenazine antibiotics. Additionally, these findings confirm the potential of S. kebangsaanensis for producing various antibiotics and secondary metabolites.

  17. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor

    DEFF Research Database (Denmark)

    Hansen, Torben Frøstrup; Spindler, Karen-Lise Garm; Andersen, Rikke Fredslund

    2010-01-01

    Abstract: New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study...... using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible...... findings in a second and independent cohort. Haplotype combinations call for further investigation. Keywords: colorectal neoplasm; single nucleotide polymorphisms; haplotypes; vascular endothelial growth factor A; survival...

  18. Heterologous reconstitution of the intact geodin gene cluster in Aspergillus nidulans through a simple and versatile PCR based approach.

    Directory of Open Access Journals (Sweden)

    Morten Thrane Nielsen

    Fungal natural products are a rich resource for bioactive molecules. To fully exploit this potential it is necessary to link genes to metabolites. Genetic information for numerous putative biosynthetic pathways has become available in recent years through genome sequencing. However, the lack of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR-based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two-step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species, enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor responsible for activation of the geodin gene cluster and ATEG_08460 (gedL) encodes a halogenase that catalyzes conversion of sulochrin to dihydrogeodin. We expect that our approach for transferring intact biosynthetic pathways to a fungus with a well-developed genetic toolbox will be instrumental in characterizing the many exciting pathways for secondary metabolite production that are currently being uncovered by the fungal genome sequencing projects.

  19. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  20. Prognostic molecular markers in early breast cancer

    International Nuclear Information System (INIS)

    Esteva, Francisco J; Hortobagyi, Gabriel N

    2004-01-01

    A multitude of molecules involved in breast cancer biology have been studied as potential prognostic markers. In the present review we discuss the role of established molecular markers, as well as potential applications of emerging new technologies. Those molecules used routinely to make treatment decisions in patients with early-stage breast cancer include markers of proliferation (e.g. Ki-67), hormone receptors, and the human epidermal growth factor receptor 2. Tumor markers shown to have prognostic value but not used routinely include cyclin D1 and cyclin E, urokinase-like plasminogen activator/plasminogen activator inhibitor, and cathepsin D. The level of evidence for other molecular markers is lower, in part because most studies were retrospective and not adequately powered, making their findings unsuitable for choosing treatments for individual patients. Gene microarrays have been successfully used to classify breast cancers into subtypes with specific gene expression profiles and to evaluate prognosis. RT-PCR has also been used to evaluate expression of multiple genes in archival tissue. Proteomics technologies are in development.

  1. Contribution of the Pmra Promoter to Expression of Genes in the Escherichia coli mra Cluster of Cell Envelope Biosynthesis and Cell Division Genes

    Science.gov (United States)

    Mengin-Lecreulx, Dominique; Ayala, Juan; Bouhss, Ahmed; van Heijenoort, Jean; Parquet, Claudine; Hara, Hiroshi

    1998-01-01

    Recently, a promoter for the essential gene ftsI, which encodes penicillin-binding protein 3 of Escherichia coli, was precisely localized 1.9 kb upstream from this gene, at the beginning of the mra cluster of cell division and cell envelope biosynthesis genes (H. Hara, S. Yasuda, K. Horiuchi, and J. T. Park, J. Bacteriol. 179:5802–5811, 1997). Disruption of this promoter (Pmra) on the chromosome and its replacement by the lac promoter (Pmra::Plac) led to isopropyl-β-d-thiogalactopyranoside (IPTG)-dependent cells that lysed in the absence of inducer, a defect which was complemented only when the whole region from Pmra to ftsW, the fifth gene downstream from ftsI, was provided in trans on a plasmid. In the present work, the levels of various proteins involved in peptidoglycan synthesis and cell division were precisely determined in cells in which Pmra::Plac promoter expression was repressed or fully induced. It was confirmed that the Pmra promoter is required for expression of the first nine genes of the mra cluster: mraZ (orfC), mraW (orfB), ftsL (mraR), ftsI, murE, murF, mraY, murD, and ftsW. Interestingly, three- to sixfold-decreased levels of MurG and MurC enzymes were observed in uninduced Pmra::Plac cells. This was correlated with an accumulation of the nucleotide precursors UDP–N-acetylglucosamine and UDP–N-acetylmuramic acid, substrates of these enzymes, and with a depletion of the pool of UDP–N-acetylmuramyl pentapeptide, resulting in decreased cell wall peptidoglycan synthesis. Moreover, the expression of ftsZ, the penultimate gene from this cluster, was significantly reduced when Pmra expression was repressed. It was concluded that the transcription of the genes located downstream from ftsW in the mra cluster, from murG to ftsZ, is also mainly (but not exclusively) dependent on the Pmra promoter. PMID:9721276

  2. Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering.

    Science.gov (United States)

    Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan

    2017-03-01

    Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2016, The International Biometric Society.

  3. Systematic profiling of alternative splicing signature reveals prognostic predictor for ovarian cancer.

    Science.gov (United States)

    Zhu, Junyong; Chen, Zuhua; Yong, Lei

    2018-02-01

    The majority of genes are alternatively spliced and growing evidence suggests that alternative splicing is modified in cancer and is associated with cancer progression. Systematic analysis of alternative splicing signature in ovarian cancer is lacking and greatly needed. We profiled genome-wide alternative splicing events in 408 ovarian serous cystadenocarcinoma (OV) patients in TCGA. Seven types of alternative splicing events were curated and prognostic analyses were performed with predictive models and splicing network built for OV patients. Among 48,049 mRNA splicing events in 10,582 genes, we detected 2,611 alternative splicing events in 2,036 genes which were significantly associated with overall survival of OV patients. Exon skip events were the most powerful prognostic factors among the seven types. The area under the curve of the receiver-operator characteristic curve for the prognostic predictor, which was built with the top significant alternative splicing events, was 0.937 at 2,000 days of overall survival, indicating powerful efficiency in distinguishing patient outcome. Interestingly, the splicing correlation network suggested obvious trends in the role of splicing factors in OV. In summary, we built powerful prognostic predictors for OV patients and uncovered interesting splicing networks which could be underlying mechanisms. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Improved prediction of breast cancer outcome by identifying heterogeneous biomarkers.

    Science.gov (United States)

    Choi, Jonghwan; Park, Sanghyun; Yoon, Youngmi; Ahn, Jaegyoon

    2017-11-15

    Identification of genes that can be used to predict prognosis in patients with cancer is important in that it can lead to improved therapy, and can also promote our understanding of tumor progression on the molecular level. One of the common but fundamental problems that render identification of prognostic genes and prediction of cancer outcomes difficult is the heterogeneity of patient samples. To reduce the effect of sample heterogeneity, we clustered data samples using the K-means algorithm and applied modified PageRank to functional interaction (FI) networks weighted using gene expression values of samples in each cluster. Hub genes among the resulting prioritized genes were selected as biomarkers to predict the prognosis of samples. This process outperformed traditional feature selection methods as well as several network-based prognostic gene selection methods when applied to Random Forest. We were able to find many cluster-specific prognostic genes for each dataset. Functional study showed that distinct biological processes were enriched in each cluster, which seems to reflect different aspects of tumor progression or oncogenesis among distinct patient groups. Taken together, these results provide support for the hypothesis that our approach can effectively identify heterogeneous prognostic genes, and these are complementary to each other, improving prediction accuracy. Availability: https://github.com/mathcom/CPR. Contact: jgahn@inu.ac.kr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
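
    The abstract does not describe how PageRank was modified, so the sketch below shows only the standard weighted PageRank power iteration that such a gene-prioritization step would start from. The function name, the gene labels and the edge weights are hypothetical; in the setting described above the weights would be derived from expression values within one sample cluster.

```python
def weighted_pagerank(edges, damping=0.85, iters=100):
    """Standard PageRank by power iteration on an undirected weighted graph.

    edges : list of (node_a, node_b, weight) with positive weights.
    Every node appears in at least one edge, so there are no dangling nodes.
    """
    nodes = sorted({n for a, b, _ in edges for n in (a, b)})
    weight = {n: {} for n in nodes}
    for a, b, w in edges:
        weight[a][b] = w
        weight[b][a] = w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Teleportation term shared equally by all nodes.
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for a in nodes:
            total = sum(weight[a].values())
            # Each node passes its rank to neighbours in proportion to edge weight.
            for b, w in weight[a].items():
                new[b] += damping * rank[a] * w / total
        rank = new
    return rank

# Hypothetical interaction network: g1 is a hub linked to three other genes.
toy_edges = [("g1", "g2", 1.0), ("g1", "g3", 1.0), ("g1", "g4", 1.0)]
scores = weighted_pagerank(toy_edges)
top_gene = max(scores, key=scores.get)
```

    In this toy network the hub g1 receives the highest score, which mirrors the idea of selecting hub genes among the prioritized genes as candidate biomarkers.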

  5. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes.

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min A; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C; Ivanova, Natalia N

    2017-01-04

    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. Genomic organization, tissue distribution and functional characterization of the rat Pate gene cluster.

    Directory of Open Access Journals (Sweden)

    Angireddy Rajesh

    The cysteine rich prostate and testis expressed (Pate) proteins identified to date are thought to resemble the three fingered protein/urokinase-type plasminogen activator receptor proteins. In this study, for the first time, we report the identification, cloning and characterization of the rat Pate gene cluster and also determine the expression pattern. The rat Pate genes are clustered on chromosome 8 and their predicted proteins retained the ten cysteine signature characteristic of the TFP/Ly-6 protein family. The three dimensional protein structure of PATE and PATE-F was found to be similar to that of the toxin bucandin. Though Pate gene expression is thought to be prostate and testis specific, we observed that rat Pate genes are also expressed in seminal vesicle and epididymis and in tissues beyond the male reproductive tract. In developing rats (20-60 day old), expression of Pate genes seems to be androgen dependent in the epididymis and testis. In the adult rat, androgen ablation resulted in down regulation of the majority of Pate genes in the epididymides. PATE and PATE-F proteins were found to be expressed abundantly in the male reproductive tract of rats and on the sperm. Recombinant PATE protein exhibited potent antibacterial activity, whereas PATE-F did not exhibit any antibacterial activity. Pate expression was induced in the epididymides when challenged with LPS. Based on our results, we conclude that rat PATE proteins may contribute to the reproductive and defense functions.

  7. Linkage of the Nit1C gene cluster to bacterial cyanide assimilation as a nitrogen source.

    Science.gov (United States)

    Jones, Lauren B; Ghosh, Pallab; Lee, Jung-Hyun; Chou, Chia-Ni; Kunz, Daniel A

    2018-05-21

    A genetic linkage between a conserved gene cluster (Nit1C) and the ability of bacteria to utilize cyanide as the sole nitrogen source was demonstrated for nine different bacterial species. These included three strains whose cyanide nutritional ability has formerly been documented (Pseudomonas fluorescens Pf11764, Pseudomonas putida BCN3 and Klebsiella pneumoniae BCN33), and six not previously known to have this ability [Burkholderia (Paraburkholderia) xenovorans LB400, Paraburkholderia phymatum STM815, Paraburkholderia phytofirmans PsJN, Cupriavidus (Ralstonia) eutropha H16, Gluconoacetobacter diazotrophicus PA1 5 and Methylobacterium extorquens AM1]. For all bacteria, growth on or exposure to cyanide led to the induction of the canonical nitrilase (NitC) linked to the gene cluster, and in the case of Pf11764 in particular, transcript levels of cluster genes (nitBCDEFGH) were raised, and a nitC knock-out mutant failed to grow. Further studies demonstrated that the highly conserved nitB gene product was also significantly elevated. Collectively, these findings provide strong evidence for a genetic linkage between Nit1C and bacterial growth on cyanide, supporting use of the term cyanotrophy in describing what may represent a new nutritional paradigm in microbiology. A broader search of Nit1C genes in presently available genomes revealed its presence in 270 different bacteria, all contained within the domain Bacteria, including Gram-positive Firmicutes and Actinobacteria, and Gram-negative Proteobacteria and Cyanobacteria. Absence of the cluster in the Archaea is congruent with events that may have led to the inception of Nit1C occurring coincidentally with the first appearance of cyanogenic species on Earth, dating back 400-500 million years.

  8. The Serratia gene cluster encoding biosynthesis of the red antibiotic, prodigiosin, shows species- and strain-dependent genome context variation

    DEFF Research Database (Denmark)

    Harris, Abigail K P; Williamson, Neil R; Slater, Holly

    2004-01-01

    The prodigiosin biosynthesis gene cluster (pig cluster) from two strains of Serratia (S. marcescens ATCC 274 and Serratia sp. ATCC 39006) has been cloned, sequenced and expressed in heterologous hosts. Sequence analysis of the respective pig clusters revealed 14 ORFs in S. marcescens ATCC 274...... and 15 ORFs in Serratia sp. ATCC 39006. In each Serratia species, predicted gene products showed similarity to polyketide synthases (PKSs), non-ribosomal peptide synthases (NRPSs) and the Red proteins of Streptomyces coelicolor A3(2). Comparisons between the two Serratia pig clusters and the red cluster...... from Str. coelicolor A3(2) revealed some important differences. A modified scheme for the biosynthesis of prodigiosin, based on the pathway recently suggested for the synthesis of undecylprodigiosin, is proposed. The distribution of the pig cluster within several Serratia sp. isolates is demonstrated...

  9. A functional bikaverin biosynthesis gene cluster in rare strains of Botrytis cinerea is positively controlled by VELVET.

    Directory of Open Access Journals (Sweden)

    Julia Schumacher

    The gene cluster responsible for the biosynthesis of the red polyketidic pigment bikaverin has only been characterized in Fusarium spp. so far. Recently, a highly homologous but incomplete and nonfunctional bikaverin cluster has been found in the genome of the unrelated phytopathogenic fungus Botrytis cinerea. In this study, we provided evidence that rare B. cinerea strains such as 1750 have a complete and functional cluster comprising the six genes orthologous to Fusarium fujikuroi ffbik1-ffbik6 and do produce bikaverin. Phylogenetic analysis confirmed that the whole cluster was acquired from Fusarium through a horizontal gene transfer (HGT). In the bikaverin-nonproducing strain B05.10, the genes encoding bikaverin biosynthesis enzymes are nonfunctional due to deleterious mutations (bcbik2-3) or missing (bcbik1), but interestingly, the genes encoding the regulatory proteins BcBIK4 and BcBIK5 do not harbor deleterious mutations, which suggests that they may still be functional. Heterologous complementation of the F. fujikuroi Δffbik4 mutant confirmed that bcbik4 of strain B05.10 is indeed fully functional. Deletion of bcvel1 in the pink strain 1750 resulted in loss of bikaverin and overproduction of melanin, indicating that the VELVET protein BcVEL1 regulates the biosynthesis of the two pigments in an opposite manner. Although strain 1750 itself expresses a truncated BcVEL1 protein (100 instead of 575 aa) that is nonfunctional with regard to sclerotia formation, virulence and oxalic acid formation, it is sufficient to regulate pigment biosynthesis (bikaverin and melanin) and fenhexamid HydR2 type of resistance. Finally, a genetic cross between strain 1750 and a bikaverin-nonproducing strain sensitive to fenhexamid revealed that the functional bikaverin cluster is genetically linked to the HydR2 locus.

  10. A highly divergent gene cluster in honey bees encodes a novel silk family

    OpenAIRE

    Sutherland, Tara D.; Campbell, Peter M.; Weisman, Sarah; Trueman, Holly E.; Sriskantha, Alagacone; Wanjura, Wolfgang J.; Haritos, Victoria S.

    2006-01-01

    The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1–4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-r...

  11. Functional characterization of KanP, a methyltransferase from the kanamycin biosynthetic gene cluster of Streptomyces kanamyceticus.

    Science.gov (United States)

    Nepal, Keshav Kumar; Yoo, Jin Cheol; Sohng, Jae Kyung

    2010-09-20

    KanP, a putative methyltransferase, is located in the kanamycin biosynthetic gene cluster of Streptomyces kanamyceticus ATCC12853. Amino acid sequence analysis of KanP revealed the presence of S-adenosyl-L-methionine binding motifs, which are present in other O-methyltransferases. The kanP gene was expressed in Escherichia coli BL21 (DE3) to generate the E. coli KANP recombinant strain. The conversion of external quercetin to methylated quercetin in the culture extract of E. coli KANP proved the function of kanP as S-adenosyl-L-methionine-dependent methyltransferase. This is the first report concerning the identification of an O-methyltransferase gene from the kanamycin gene cluster. The resistant activity assay and RT-PCR analysis demonstrated the leeway for obtaining methylated kanamycin derivatives from the wild-type strain of kanamycin producer. 2009 Elsevier GmbH. All rights reserved.

  12. Clustering gene expression time series data using an infinite Gaussian process mixture model.

    Science.gov (United States)

    McDowell, Ian C; Manandhar, Dinesh; Vockley, Christopher M; Schmid, Amy K; Reddy, Timothy E; Engelhardt, Barbara E

    2018-01-01

    Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step to analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.
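
    As a rough illustration of the Dirichlet process component described above, the following sketch computes the Chinese restaurant process probabilities with which a new gene would join each existing cluster or open a new one. The function name `crp_assignment_probs` and the concentration parameter `alpha` are assumptions made for this example. The sketch covers only the cluster-size prior; it omits the Gaussian process likelihood and the inference procedure, and it is not taken from the DPGP software.

```python
from typing import List


def crp_assignment_probs(cluster_sizes: List[int], alpha: float) -> List[float]:
    """Prior assignment probabilities under a Chinese restaurant process.

    Returns one probability per existing cluster (proportional to its size),
    followed by the probability of opening a new cluster (proportional to alpha).
    `alpha` is a hypothetical concentration parameter chosen by the caller.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(cluster_sizes) + alpha
    probs = [size / total for size in cluster_sizes]
    probs.append(alpha / total)  # last entry: a brand-new cluster
    return probs


if __name__ == "__main__":
    # Two existing clusters holding 3 genes and 1 gene, with alpha = 1.
    print(crp_assignment_probs([3, 1], alpha=1.0))
```

    Because the chance of opening a new cluster never reaches zero, the number of clusters does not have to be fixed in advance, which is the property the abstract refers to as modeling cluster number jointly with the data.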

  13. Clustering gene expression time series data using an infinite Gaussian process mixture model.

    Directory of Open Access Journals (Sweden)

    Ian C McDowell

    2018-01-01

    Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step to analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.

  14. Characterization and detection of a widely distributed gene cluster that predicts anaerobic choline utilization by human gut bacteria.

    Science.gov (United States)

    Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A; Marks, Jonathan A; Haiser, Henry J; Turnbaugh, Peter J; Balskus, Emily P

    2015-04-14

    Elucidation of the molecular mechanisms underlying the human gut microbiota's effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. Anaerobic choline utilization is a bacterial metabolic activity that occurs in the human gut and is linked to multiple diseases. 
    While bacterial genes responsible for...

  15. The human TREM gene cluster at 6p21.1 encodes both activating and inhibitory single IgV domain receptors and includes NKp44.

    Science.gov (United States)

    Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John

    2003-02-01

    We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.

  16. Analysis of genetic association using hierarchical clustering and cluster validation indices.

    Science.gov (United States)

    Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

    2017-10-01

    It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criteria of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
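
    The general idea of generating candidate gene groups by hierarchical clustering and ranking them with a validation index can be sketched as follows. Everything specific here is an assumption made for illustration: the Pearson correlation distance, average linkage, the silhouette-like `group_index`, and the toy expression profiles are stand-ins, not the indices or clustering variants used by the authors.

```python
from itertools import combinations
from statistics import mean


def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return 1.0 - cov / (sx * sy)


def candidate_groups(profiles):
    """Average-linkage agglomeration; every merged cluster is a candidate group."""
    names = list(profiles)
    d = {frozenset(p): pearson_distance(profiles[p[0]], profiles[p[1]])
         for p in combinations(names, 2)}

    def linkage(c1, c2):
        return mean(d[frozenset((a, b))] for a in c1 for b in c2)

    clusters = [frozenset([n]) for n in names]
    candidates = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        merged = c1 | c2
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
        candidates.append(merged)
    return candidates, d


def group_index(group, all_names, d):
    """Hypothetical silhouette-like index: near 1 for a tight, well-separated group."""
    within = mean(d[frozenset(p)] for p in combinations(sorted(group), 2))
    outside = [n for n in all_names if n not in group]
    between = mean(d[frozenset((a, b))] for a in group for b in outside)
    return (between - within) / max(between, within)


def best_group(profiles):
    """Rank all proper candidate groups by the index and return the best one."""
    candidates, d = candidate_groups(profiles)
    proper = [g for g in candidates if len(g) < len(profiles)]
    return max(proper, key=lambda g: group_index(g, list(profiles), d))


if __name__ == "__main__":
    toy = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],
           "g3": [4, 3, 2, 2], "g4": [1, 3, 1, 3]}
    print(sorted(best_group(toy)))
```

    Scoring each group on its own, rather than scoring a whole partition, matches the abstract's framing of the index as a measure for individual gene groups.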

  17. Archaeal Clusters of Orthologous Genes (arCOGs): An Update and Application for Analysis of Shared Features between Thermococcales, Methanococcales, and Methanobacteriales

    OpenAIRE

    Makarova, Kira; Wolf, Yuri; Koonin, Eugene

    2015-01-01

    With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for...

  18. Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

    Science.gov (United States)

    Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

    2012-01-01

    The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.

  19. Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria

    Directory of Open Access Journals (Sweden)

    Muscariello Lidia

    2006-05-01

    Background: Genomes of gram-positive bacteria encode many putative cell-surface proteins, the majority of which have no known function. From the rapidly increasing number of available genome sequences it has become apparent that many cell-surface proteins are conserved, and frequently encoded in gene clusters or operons, suggesting common functions, and interactions of multiple components. Results: A novel gene cluster encoding exclusively cell-surface proteins was identified, which is conserved in a subgroup of gram-positive bacteria. Each gene cluster generally has one copy of four new gene families called cscA, cscB, cscC and cscD. Clusters encoding these cell-surface proteins were found only in complete genomes of Lactobacillus plantarum, Lactobacillus sakei, Enterococcus faecalis, Listeria innocua, Listeria monocytogenes, Lactococcus lactis ssp lactis and Bacillus cereus and in incomplete genomes of L. lactis ssp cremoris, Lactobacillus casei, Enterococcus faecium, Pediococcus pentosaceus, Lactobacillus brevis, Oenococcus oeni, Leuconostoc mesenteroides, and Bacillus thuringiensis. These genes are present neither in the genomes of streptococci, staphylococci and clostridia, nor in the Lactobacillus acidophilus group, suggesting a niche-specific distribution, possibly relating to association with plants. All encoded proteins have a signal peptide for secretion by the Sec-dependent pathway, while some have cell-surface anchors, novel WxL domains, and putative domains for sugar binding and degradation. Transcriptome analysis in L. plantarum shows that the cscA-D genes are co-expressed, supporting their operon organization. Many gene clusters are significantly up-regulated in a glucose-grown, ccpA-mutant derivative of L. plantarum, suggesting catabolite control. This is supported by the presence of predicted CRE-sites upstream or inside the up-regulated cscA-D gene clusters. Conclusion: We propose that the CscA, CscB, CscC and Csc

  20. Genetic diversity of K-antigen gene clusters of Escherichia coli and their molecular typing using a suspension array.

    Science.gov (United States)

    Yang, Shuang; Xi, Daoyi; Jing, Fuyi; Kong, Deju; Wu, Junli; Feng, Lu; Cao, Boyang; Wang, Lei

    2018-04-01

    Capsular polysaccharides (CPSs), or K-antigens, are the major surface antigens of Escherichia coli. More than 80 serologically unique K-antigens are classified into 4 groups (Groups 1-4) of capsules. Groups 1 and 4 contain the Wzy-dependent polymerization pathway and the gene clusters are in the order galF to gnd; Groups 2 and 3 contain the ABC-transporter-dependent pathway and the gene clusters consist of 3 regions, regions 1, 2 and 3. Little is known about the variations among the gene clusters. In this study, 9 serotypes of K-antigen gene clusters (K2ab, K11, K20, K24, K38, K84, K92, K96, and K102) were sequenced and correlated with their CPS chemical structures. On the basis of sequence data, a K-antigen-specific suspension array that detects 10 distinct CPSs, including the above 9 CPSs plus K30, was developed. This is the first report to catalog the genetic features of E. coli K-antigen variations and to develop a suspension array for their molecular typing. The method has a number of advantages over traditional bacteriophage and serum agglutination methods and lays the foundation for straightforward identification and detection of additional K-antigens in the future.

  1. An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

    Science.gov (United States)

    Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

    2017-12-01

    Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
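
    The key idea of choosing initial centroids from dense, well-separated regions can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' algorithm: the neighbor-count density, the `radius` and `min_sep` parameters, and the function name `density_seeds` are all hypothetical choices made for this example.

```python
from math import dist


def density_seeds(points, k, radius, min_sep):
    """Deterministically pick up to k initial centroids.

    Density of a point = number of other points within `radius` (an assumed
    definition). Points are visited from densest to sparsest, ties broken by
    input order, and a point is accepted only if it lies at least `min_sep`
    away from every centroid accepted so far.
    """
    n = len(points)
    density = [
        sum(1 for j in range(n) if j != i and dist(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: (-density[i], i))
    chosen = []
    for i in order:
        if all(dist(points[i], points[c]) >= min_sep for c in chosen):
            chosen.append(i)
            if len(chosen) == k:
                break
    # Fewer than k seeds are returned if the separation constraint is too strict.
    return [points[i] for i in chosen]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
    print(density_seeds(data, k=2, radius=1.5, min_sep=5.0))
```

    Since no random number generator is involved, repeated runs on the same data return the same seeds, which is what removes the run-to-run variation that the abstract attributes to random initialization.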

  2. Validation of the 18-gene classifier as a prognostic biomarker of distant metastasis in breast cancer.

    Directory of Open Access Journals (Sweden)

    Skye Hung-Chun Cheng

    We validated an 18-gene classifier (GC), initially developed to predict local/regional recurrence after mastectomy, in estimating distant metastasis risk. The 18-gene scoring algorithm defines scores as: <21, low risk; ≥21, high risk. Six hundred eighty-three patients with primary operable breast cancer and fresh frozen tumor tissues available were included. The primary outcome was the 5-year probability of freedom from distant metastasis (DMFP). Two external datasets were used to test the predictive accuracy of 18-GC. The 5-year rates of DMFP for patients classified as low-risk (n = 146, 21.7%) and high-risk (n = 537, 78.6%) were 96.2% (95% CI, 91.1%-98.8%) and 80.9% (74.6%-81.9%), respectively (median follow-up interval, 71.8 months). The 5-year rates of DMFP of the low-risk group in stage I (n = 62, 35.6%), stage II (n = 66, 20.1%), and stage III (n = 18, 10.3%) were 100%, 94.2% (78.5%-98.5%), and 90.9% (50.8%-98.7%), respectively. Multivariate analysis revealed that 18-GC is an independent prognostic factor of distant metastasis (adjusted hazard ratio, 5.1; 95% CI, 1.8-14.1; p = 0.0017) for scores of ≥21. External validation showed that the 5-year rate of DMFP in the low- and high-risk patients was 94.1% (82.9%-100%) and 80.3% (70.7%-89.9%, p = 0.06) in a Singapore dataset, and 89.5% (81.9%-94.1%) and 73.6% (67.2%-79.0%, p = 0.0039) in the GEO-GSE20685 dataset, respectively. In conclusion, 18-GC is a viable prognostic biomarker for breast cancer to estimate distant metastasis risk.
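
    The score cutoff stated in the abstract (<21, low risk; ≥21, high risk) amounts to a simple threshold rule, sketched below. The function name `classify_18gc` is hypothetical, and the sketch assumes that an 18-GC score has already been computed; how the score is derived from the 18 genes is not described in this record and is not reproduced here.

```python
def classify_18gc(score: float, cutoff: float = 21.0) -> str:
    """Map a precomputed 18-GC score to a risk group.

    The cutoff of 21 comes from the abstract; the score itself is assumed
    to be supplied by the caller.
    """
    return "high risk" if score >= cutoff else "low risk"


if __name__ == "__main__":
    for s in (12.0, 20.9, 21.0, 35.5):
        print(s, classify_18gc(s))
```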

  3. Characterization of the Second LysR-Type Regulator in the Biphenyl-Catabolic Gene Cluster of Pseudomonas pseudoalcaligenes KF707

    OpenAIRE

    Watanabe, Takahito; Fujihara, Hidehiko; Furukawa, Kensuke

    2003-01-01

    Pseudomonas pseudoalcaligenes KF707 possesses a biphenyl-catabolic (bph) gene cluster consisting of bphR1A1A2-(orf3)-bphA3A4BCX0X1X2X3D. The bphR1 (formerly orf0) gene product, which belongs to the GntR family, is a positive regulator for itself and bphX0X1X2X3D. Further analysis in this study revealed that a second regulator belonging to the LysR family (designated bphR2) is involved in the regulation of the bph genes in KF707. The bphR2 gene was not located near the bph gene cluster, and it...

  4. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters

    NARCIS (Netherlands)

    Cimermancic, P.; Medema, Marnix; Claesen, J.; Kurika, K.; Wieland Brown, L.C.; Mavrommatis, K.; Pati, A.; Godfrey, P.A.; Koehrsen, M.; Clardy, J.; Birren, B. W.; Takano, Eriko; Sali, A.; Linington, R.G.; Fischbach, M.A.

    2014-01-01

    Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the

  5. Identification of cell proliferation, immune response and cell migration as critical pathways in a prognostic signature for HER2+:ERα- breast cancer.

    Directory of Open Access Journals (Sweden)

    Jeffrey C Liu

    Multi-gene prognostic signatures derived from primary tumor biopsies can guide clinicians in designing an appropriate course of treatment. Identifying genes and pathways most essential to a signature performance may facilitate clinical application, provide insights into cancer progression, and uncover potentially new therapeutic targets. We previously developed a 17-gene prognostic signature (HTICS) for HER2+:ERα- breast cancer patients, using genes that are differentially expressed in tumor initiating cells (TICs) versus non-TICs from MMTV-Her2/neu mammary tumors. Here we probed the pathways and genes that underlie the prognostic power of HTICS. We used Leave-One-Out, Data Combination Test, Gene Set Enrichment Analysis (GSEA), Correlation and Substitution analyses together with Receiver Operating Characteristic (ROC) and Kaplan-Meier survival analysis to identify critical biological pathways within HTICS. Publicly available cohorts with gene expression and clinical outcome were used to assess prognosis. NanoString technology was used to detect gene expression in formalin-fixed paraffin embedded (FFPE) tissues. We show that three major biological pathways: cell proliferation, immune response, and cell migration, drive the prognostic power of HTICS, which is further tuned by Homeostatic and Glycan metabolic signalling. A 6-gene minimal Core that retained a significant prognostic power, albeit less than HTICS, also comprised the proliferation/immune/migration pathways. Finally, we developed NanoString probes that could detect expression of HTICS genes and their substitutions in FFPE samples. Our results demonstrate that the prognostic power of a signature is driven by the biological processes it monitors, identify cell proliferation, immune response and cell migration as critical pathways for HER2+:ERα- cancer progression, and define substitutes and Core genes that should facilitate clinical application of HTICS.

  6. Functional dissection of HOXD cluster genes in regulation of neuroblastoma cell proliferation and differentiation.

    Directory of Open Access Journals (Sweden)

    Yunhong Zha

    Retinoic acid (RA) can induce growth arrest and neuronal differentiation of neuroblastoma cells and has been used in the clinic for treatment of neuroblastoma. It has been reported that RA induces the expression of several HOXD genes in human neuroblastoma cell lines, but their roles in RA action are largely unknown. The HOXD cluster contains nine genes (HOXD1, HOXD3, HOXD4, and HOXD8-13) that are positioned sequentially from 3' to 5', with HOXD1 at the 3' end and HOXD13 at the 5' end. Here we show that all HOXD genes are induced by RA in the human neuroblastoma BE(2)-C cells, with the genes located at the 3' end being activated generally earlier than those positioned more 5' within the cluster. Individual induction of HOXD8, HOXD9, HOXD10 or HOXD12 is sufficient to induce both growth arrest and neuronal differentiation, which is associated with downregulation of cell cycle-promoting genes and upregulation of neuronal differentiation genes. However, induction of other HOXD genes either has no effect (HOXD1) or has partial effects (HOXD3, HOXD4, HOXD11 and HOXD13) on BE(2)-C cell proliferation or differentiation. We further show that knockdown of HOXD8 expression, but not that of HOXD9 expression, significantly inhibits the differentiation-inducing activity of RA. HOXD8 directly activates the transcription of HOXC9, a key effector of RA action in neuroblastoma cells. These findings highlight the distinct functions of HOXD genes in RA induction of neuroblastoma cell differentiation.

  7. The Cremeomycin Biosynthetic Gene Cluster Encodes a Pathway for Diazo Formation.

    Science.gov (United States)

    Waldman, Abraham J; Pechersky, Yakov; Wang, Peng; Wang, Jennifer X; Balskus, Emily P

    2015-10-12

    Diazo groups are found in a range of natural products that possess potent biological activities. Despite longstanding interest in these metabolites, diazo group biosynthesis is not well understood, in part because of difficulties in identifying specific genes linked to diazo formation. Here we describe the discovery of the gene cluster that produces the o-diazoquinone natural product cremeomycin and its heterologous expression in Streptomyces lividans. We used stable isotope feeding experiments and in vitro characterization of biosynthetic enzymes to decipher the order of events in this pathway and establish that diazo construction involves late-stage N-N bond formation. This work represents the first successful production of a diazo-containing metabolite in a heterologous host, experimentally linking a set of genes with diazo formation. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. An original SERPINA3 gene cluster: Elucidation of genomic organization and gene expression in the Bos taurus 21q24 region

    Directory of Open Access Journals (Sweden)

    Ouali Ahmed

    2008-04-01

    Background: The superfamily of serine proteinase inhibitors (serpins) is involved in numerous fundamental biological processes such as inflammation, blood coagulation and apoptosis. Our interest is focused on the SERPINA3 sub-family. The major human plasma protease inhibitor, α1-antichymotrypsin, encoded by the SERPINA3 gene, is homologous to genes organized in clusters in several mammalian species. However, although there is a similar genic organization with a high degree of sequence conservation, the reactive-centre-loop domains, which are responsible for the protease specificity, show significant divergences. Results: We provide additional information by analyzing the situation of SERPINA3 in the bovine genome. A cluster of eight genes and one pseudogene sharing a high degree of identity and the same structural organization was characterized. Bovine SERPINA3 genes were localized by radiation hybrid mapping on 21q24 and only spanned over 235 kilobases. For all these genes, we propose a new nomenclature from SERPINA3-1 to SERPINA3-8. They share approximately 70% of identity with the human SERPINA3 homologue. In the cluster, we described an original sub-group of six members with an unexpectedly high degree of conservation for the reactive-centre-loop domain, suggesting a similar peptidase inhibitory pattern. Preliminary expression analyses of these bovSERPINA3s showed different tissue-specific patterns and diverse states of glycosylation and phosphorylation. Finally, in the context of phylogenetic analyses, we improved our knowledge on mammalian SERPINAs evolution. Conclusion: Our experimental results update data of the bovine genome sequencing, substantially increase the bovSERPINA3 sub-family and enrich the phylogenetic tree of serpins. We provide new opportunities for future investigations to approach the biological functions of this unusual subset of serine proteinase inhibitors.

  9. The entire β-globin gene cluster is deleted in a form of γδβ-thalassemia.

    NARCIS (Netherlands)

    E.R. Fearon; H.H.Jr. Kazazian; P.G. Waber (Pamela); J.I. Lee (Joseph); S.E. Antonarakis; S.H. Orkin (Stuart); E.F. Vanin; P.S. Henthorn; F.G. Grosveld (Frank); A.F. Scott; G.R. Buchanan

    1983-01-01

    We have used restriction endonuclease mapping to study a deletion involving the beta-globin gene cluster in a Mexican-American family with gamma delta beta-thalassemia. Analysis of DNA polymorphisms demonstrated deletion of the beta-globin gene from the affected chromosome. Using a DNA

  10. Aromatic Polyketide GTRI-02 is a Previously Unidentified Product of the act Gene Cluster in Streptomyces coelicolor A3(2).

    Science.gov (United States)

    Wu, Changsheng; Ichinose, Koji; Choi, Young Hae; van Wezel, Gilles P

    2017-07-18

    The biosynthesis of aromatic polyketides derived from type II polyketide synthases (PKSs) is complex, and it is not uncommon that highly similar gene clusters give rise to diverse structural architectures. The act biosynthetic gene cluster (BGC) of the model actinomycete Streptomyces coelicolor A3(2) is an archetypal type II PKS. Here we show that the act BGC also specifies the aromatic polyketide GTRI-02 (1) and propose a mechanism for the biogenesis of its 3,4-dihydronaphthalen-1(2H)-one backbone. Polyketide 1 was also produced by Streptomyces sp. MBT76 after activation of the act-like qin gene cluster by overexpression of the pathway-specific activator. Mining of this strain also identified dehydroxy-GTRI-02 (2), which most likely originated from dehydration of 1 during the isolation process. This work shows that even extensively studied model gene clusters such as act of S. coelicolor can still produce new chemistry, offering new perspectives for drug discovery. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  11. Development of a gene cloning system in a fast-growing and moderately thermophilic Streptomyces species and heterologous expression of Streptomyces antibiotic biosynthetic gene clusters

    Science.gov (United States)

    2011-01-01

    Background: Streptomyces species are a major source of antibiotics. They usually grow slowly at their optimal temperature, and fermentation of industrial strains on a large scale often takes a long time, consuming more energy and materials than some other bacterial industrial strains (e.g., E. coli and Bacillus). Most thermophilic Streptomyces species grow fast, but no gene cloning systems have been developed in such strains. Results: We report here the isolation of 41 fast-growing (about twice the rate of S. coelicolor), moderately thermophilic (growing at both 30°C and 50°C) Streptomyces strains, detection of one linear and three circular plasmids in them, and sequencing of a 6996-bp plasmid, pTSC1, from one of them. pTSC1-derived pCWH1 could replicate in both thermophilic and mesophilic Streptomyces strains. On the other hand, several Streptomyces replicons function in thermophilic Streptomyces species. By examining ten well-sporulating strains, we found two promising cloning hosts, 2C and 4F. A gene cloning system was established by using the two strains. The actinorhodin and anthramycin biosynthetic gene clusters from mesophilic S. coelicolor A3(2) and thermophilic S. refuineus were heterologously expressed in one of the hosts. Conclusions: We have developed a gene cloning and expression system in a fast-growing and moderately thermophilic Streptomyces species. Although just a few plasmids and one antibiotic biosynthetic gene cluster from mesophilic Streptomyces were successfully expressed in thermophilic Streptomyces species, we expect that by utilizing thermophilic Streptomyces-specific promoters, more genes and especially antibiotic gene clusters of mesophilic Streptomyces should be heterologously expressed. PMID:22032628

  12. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
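
    The notion of a cluster 'centroid' as the sequence from which the distances to all other sequences are minimized can be sketched as a medoid lookup on a precomputed distance matrix. The function name `cluster_centroid`, the tie-breaking rule, and the toy matrix are assumptions made for this example; the sketch does not reproduce the linear mapping algorithm or the kNN classifier mentioned in the abstract.

```python
from typing import List, Sequence


def cluster_centroid(dist_matrix: Sequence[Sequence[float]], members: List[int]) -> int:
    """Return the index of the member with the smallest total distance
    to all other members of the same cluster (ties broken by lowest index)."""
    if not members:
        raise ValueError("cluster must contain at least one sequence")
    return min(members, key=lambda i: (sum(dist_matrix[i][j] for j in members), i))


if __name__ == "__main__":
    # Hypothetical pairwise distances between three sequences.
    toy = [
        [0.0, 1.0, 4.0],
        [1.0, 0.0, 2.0],
        [4.0, 2.0, 0.0],
    ]
    print(cluster_centroid(toy, [0, 1, 2]))
```

    Using an actual member of the cluster, rather than an averaged point, fits sequence data, where only pairwise distances are available and the representative has to be a real sequence.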

  13. Array-based gene expression, CGH and tissue data defines a 12q24 gain in neuroblastic tumors with prognostic implication

    Directory of Open Access Journals (Sweden)

    Kilpinen Sami

    2010-05-01

    Background: Neuroblastoma has successfully served as a model system for the identification of neuroectoderm-derived oncogenes. However, in spite of various efforts, only a few clinically useful prognostic markers have been found. Here, we present a framework, which integrates DNA, RNA and tissue data to identify and prioritize genetic events that represent clinically relevant new therapeutic targets and prognostic biomarkers for neuroblastoma. Methods: A single-gene resolution aCGH profiling was integrated with microarray-based gene expression profiling data to distinguish genetic copy number alterations that were strongly associated with transcriptional changes in two neuroblastoma cell lines. FISH analysis using a hotspot tumor tissue microarray of 37 paraffin-embedded neuroblastoma samples and in silico data mining for gene expression information obtained from previously published studies including up to 445 healthy nervous system samples and 123 neuroblastoma samples were used to evaluate the clinical significance and transcriptional consequences of the detected alterations and to identify subsequently activated gene(s). Results: In addition to the anticipated high-level amplification and subsequent overexpression of MYCN, MEIS1, CDK4 and MDM2 oncogenes, the aCGH analysis revealed numerous other genetic alterations, including microamplifications at 2p and 12q24.11. Most interestingly, we identified and investigated the clinical relevance of a previously poorly characterized amplicon at 12q24.31. FISH analysis showed low-level gain of 12q24.31 in 14 of 33 (42%) neuroblastomas. Patients with the low-level gain had an intermediate prognosis in comparison to patients with MYCN amplification (poor prognosis) and to those with no MYCN amplification or 12q24.31 gain (good prognosis) (P = 0.001). Using the in silico data mining approach, we identified elevated expression of five genes located at the 12q24.31 amplicon in neuroblastoma (DIABLO, ZCCHC

  14. Prognostic DNA Methylation Markers for Prostate Cancer

    Directory of Open Access Journals (Sweden)

    Siri H. Strand

    2014-09-01

    Prostate cancer (PC) is the most commonly diagnosed neoplasm and the third most common cause of cancer-related death amongst men in the Western world. PC is a clinically highly heterogeneous disease, and distinction between aggressive and indolent disease is a major challenge for the management of PC. Currently, no biomarkers or prognostic tools are able to accurately predict tumor progression at the time of diagnosis. Thus, improved biomarkers for PC prognosis are urgently needed. This review focuses on the prognostic potential of DNA methylation biomarkers for PC. Epigenetic changes are hallmarks of PC and associated with malignant initiation as well as tumor progression. Moreover, DNA methylation is the most frequently studied epigenetic alteration in PC, and the prognostic potential of DNA methylation markers for PC has been demonstrated in multiple studies. The most promising methylation marker candidates identified so far include PITX2, C1orf114 (CCDC181) and the GABRE~miR-452~miR-224 locus, in addition to the three-gene signature AOX1/C1orf114/HAPLN3. Several other biomarker candidates have also been investigated, but with less stringent clinical validation and/or conflicting evidence regarding their possible prognostic value available at this time. Here, we review the current evidence for the prognostic potential of DNA methylation markers in PC.

  15. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect-specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  16. Glycosulfatase-Encoding Gene Cluster in Bifidobacterium breve UCC2003.

    Science.gov (United States)

    Egan, Muireann; Jiang, Hao; O'Connell Motherway, Mary; Oscarson, Stefan; van Sinderen, Douwe

    2016-11-15

    Bifidobacteria constitute a specific group of commensal bacteria typically found in the gastrointestinal tract (GIT) of humans and other mammals. Bifidobacterium breve strains are numerically prevalent among the gut microbiota of many healthy breastfed infants. In the present study, we investigated glycosulfatase activity in a bacterial isolate from a nursling stool sample, B. breve UCC2003. Two putative sulfatases were identified on the genome of B. breve UCC2003. The sulfated monosaccharide N-acetylglucosamine-6-sulfate (GlcNAc-6-S) was shown to support the growth of B. breve UCC2003, while N-acetylglucosamine-3-sulfate, N-acetylgalactosamine-3-sulfate, and N-acetylgalactosamine-6-sulfate did not support appreciable growth. By using a combination of transcriptomic and functional genomic approaches, a gene cluster designated ats2 was shown to be specifically required for GlcNAc-6-S metabolism. Transcription of the ats2 cluster is regulated by a repressor open reading frame kinase (ROK) family transcriptional repressor. This study represents the first description of glycosulfatase activity within the Bifidobacterium genus. Bifidobacteria are saccharolytic organisms naturally found in the digestive tract of mammals and insects. Bifidobacterium breve strains utilize a variety of plant- and host-derived carbohydrates that allow them to be present as prominent members of the infant gut microbiota as well as being present in the gastrointestinal tract of adults. In this study, we introduce a previously unexplored area of carbohydrate metabolism in bifidobacteria, namely, the metabolism of sulfated carbohydrates. B. breve UCC2003 was shown to metabolize N-acetylglucosamine-6-sulfate (GlcNAc-6-S) through one of two sulfatase-encoding gene clusters identified on its genome. GlcNAc-6-S can be found in terminal or branched positions of mucin oligosaccharides, the glycoprotein component of the mucous layer that covers the digestive tract. The results of this study provide

  17. The medaka novel immune-type receptor (NITR) gene clusters reveal an extraordinary degree of divergence in variable domains

    Directory of Open Access Journals (Sweden)

    Litman Gary W

    2008-06-01

    Background: Novel immune-type receptor (NITR) genes are members of diversified multigene families that are found in bony fish and encode type I transmembrane proteins containing one or two extracellular immunoglobulin (Ig) domains. The majority of NITRs can be classified as inhibitory receptors that possess cytoplasmic immunoreceptor tyrosine-based inhibition motifs (ITIMs). A much smaller number of NITRs can be classified as activating receptors by the lack of cytoplasmic ITIMs and presence of a positively charged residue within their transmembrane domain, which permits partnering with an activating adaptor protein. Results: Forty-four NITR genes in medaka (Oryzias latipes) are located in three gene clusters on chromosomes 10, 18 and 21 and can be organized into 24 families including inhibitory and activating forms. The particularly large dataset acquired in medaka makes direct comparison possible to another complete dataset acquired in zebrafish in which NITRs are localized in two clusters on different chromosomes. The two largest medaka NITR gene clusters share conserved synteny with the two zebrafish NITR gene clusters. Shared synteny between NITRs and CD8A/CD8B is limited but consistent with a potential common ancestry. Conclusion: Comprehensive phylogenetic analyses between the complete datasets of NITRs from medaka and zebrafish indicate multiple species-specific expansions of different families of NITRs. The patterns of sequence variation among gene family members are consistent with recent birth-and-death events. Similar effects have been observed with mammalian immunoglobulin (Ig), T cell antigen receptor (TCR) and killer cell immunoglobulin-like receptor (KIR) genes. NITRs likely diverged along an independent pathway from that of the somatically rearranging antigen binding receptors but have undergone parallel evolution of V family diversity.

  18. The prognostic value of ERCC1 and RRM1 gene expression in completely resected non-small cell lung cancer: tumor recurrence and overall survival

    International Nuclear Information System (INIS)

    Tantraworasin, Apichat; Saeteng, Somcharoen; Lertprasertsuke, Nirush; Arayawudhikul, Nuttapon; Kasemsarn, Choosak; Patumanond, Jayanton

    2013-01-01

    The roles of excision repair cross-complementing group 1 gene (ERCC1) expression and ribonucleotide reductase subunit M1 gene (RRM1) expression in completely resected non-small cell lung cancer (NSCLC) are still debatable. Previous studies have shown that both genes affected the overall survival and outcomes of patients who received platinum-based chemotherapy; however, some studies did not show this correlation. The aim of this study was to evaluate the prognostic values of ERCC1 and RRM1 gene expression in predicting tumor recurrence and overall survival in patients with completely resected NSCLC who received adjuvant chemotherapy and in those who did not. A retrospective cohort study was conducted in 247 patients with completely resected NSCLC. All patients had been treated with anatomic resection (lobectomy or pneumonectomy) with systematic mediastinal lymphadenectomy between January 2002 and December 2011 at Chiang Mai University Hospital, Chiang Mai, Thailand. They were divided into two groups: recurrence and no recurrence. Protein expression of ERCC1 and RRM1 was determined by immunohistochemistry. Correlations between clinicopathologic variables, including ERCC1 and RRM1 expression and tumor recurrence, were analyzed. Univariate and multivariate Cox proportional hazards regression analysis stratified by nodal involvement, tumor staging, intratumoral blood vessel invasion, intratumoral lymphatic invasion, and tumor necrosis was used to identify the prognostic roles of ERCC1 and RRM1. ERCC1 and RRM1 expression did not demonstrate prognostic value for tumor recurrence and overall survival in patients with completely resected NSCLC. 
In patients who did not receive adjuvant chemotherapy treatment, those with high ERCC1 and high RRM1 expression seemed to have greater potential for tumor recurrence and shorter overall survival than did those who had low ERCC1 and low RRM1 (hazard ratio [HR] =1.7, 95% confidence interval [CI] =0.6–4.3, P=0.292 and HR =1.6, 95% CI

  19. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Kautsar, Satria A.; Suarez Duran, Hernando G.; Blin, Kai

    2017-01-01

    exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery. The plantiSMASH web server, precalculated results...

  20. A CLUSTERING OF DJA STOCKS - THE APPLICATION IN FINANCE OF A METHOD FIRST USED IN GENE TRAJECTORY STUDY

    Directory of Open Access Journals (Sweden)

    Silaghi Gheorghe Cosmin

    2009-05-01

    Previously we employed the Gene Trajectory Clustering methodology to search for different associations of the stocks composing the DJA index, with the aim of finding different, logical clusters, supported by economic reasons, preferably different than the

  1. Genes involved in degradation of para-nitrophenol are differentially arranged in form of non-contiguous gene clusters in Burkholderia sp. strain SJ98.

    Directory of Open Access Journals (Sweden)

    Surendra Vikram

    Biodegradation of para-Nitrophenol (PNP) proceeds via two distinct pathways, having 1,2,3-benzenetriol (BT) and hydroquinone (HQ) as their respective terminal aromatic intermediates. Genes involved in these pathways have already been studied in different PNP degrading bacteria. Burkholderia sp. strain SJ98 degrades PNP via both pathways. Earlier, we sequenced and analyzed a ~41 kb fragment from the genomic library of strain SJ98. This DNA fragment was found to harbor all the lower pathway genes; however, genes responsible for the initial transformation of PNP could not be identified within this fragment. Now, we have sequenced and annotated the whole genome of strain SJ98 and found two ORFs (viz., pnpA and pnpB) showing maximum identity at amino acid level with p-nitrophenol 4-monooxygenase (PnpM) and p-benzoquinone reductase (BqR). Unlike the other PNP gene clusters reported earlier in different bacteria, these two ORFs in the SJ98 genome are physically separated from the other genes of the PNP degradation pathway. In order to ascertain the identity of ORFs pnpA and pnpB, we have performed in-vitro assays using recombinant proteins heterologously expressed and purified to homogeneity. Purified PnpA was found to be a functional PnpM and transformed PNP into benzoquinone (BQ), while PnpB was found to be a functional BqR which catalyzed the transformation of BQ into hydroquinone (HQ). Noticeably, PnpM from strain SJ98 could also transform a number of PNP analogues. Based on the above observations, we propose that the genes for PNP degradation in strain SJ98 are arranged differentially in the form of non-contiguous gene clusters. This is the first report of such an arrangement for gene clusters involved in PNP degradation. Therefore, we propose that PNP degradation in strain SJ98 could be an important model system for further studies on differential evolution of PNP degradation functions.

  2. Biological Prognostic Markers in Chronic Lymphocytic Leukemia

    Directory of Open Access Journals (Sweden)

    Vladimíra Vroblová

    2009-01-01

    Chronic lymphocytic leukemia (CLL) is the most frequent leukemic disease of adults in the Western world. It is remarkable for an extraordinary heterogeneity of clinical course, with overall survival ranging from several months to more than 15 years. Classical staging systems by Rai and Binet, while readily available and useful for initial assessment of prognosis, are not able to determine individual patient’s ongoing clinical course of CLL at the time of diagnosis, especially in early stages. Therefore, newer biological prognostic parameters are currently being clinically evaluated. Mutational status of variable region of immunoglobulin heavy chain genes (IgVH), cytogenetic aberrations, and both intracellular ZAP-70 and surface CD38 expression are recognized as parameters with established prognostic value. Molecules regulating the process of angiogenesis are also considered as promising markers. The purpose of this review is to summarize in detail the specific role of these prognostic factors in chronic lymphocytic leukemia.

  3. Gene cluster analysis for the biosynthesis of elgicins, novel lantibiotics produced by Paenibacillus elgii B69

    Directory of Open Access Journals (Sweden)

    Teng Yi

    2012-03-01

    Background: The recent increase in bacterial resistance to antibiotics has promoted the exploration of novel antibacterial materials. As a result, many researchers are undertaking work to identify new lantibiotics because of their potent antimicrobial activities. The objective of this study was to provide details of a lantibiotic-like gene cluster in Paenibacillus elgii B69 and to produce the antibacterial substances coded by this gene cluster based on culture screening. Results: Analysis of the P. elgii B69 genome sequence revealed the presence of a lantibiotic-like gene cluster composed of five open reading frames (elgT1, elgC, elgT2, elgB, and elgA). Screening of culture extracts for active substances possessing the predicted properties of the encoded product led to the isolation of four novel peptides (elgicins AI, AII, B, and C) with a broad inhibitory spectrum. The molecular weights of these peptides were 4536, 4593, 4706, and 4820 Da, respectively. The N-terminal sequence of elgicin B was Leu-Gly-Asp-Tyr, which corresponded to the partial sequence of the peptide ElgA encoded by elgA. Edman degradation suggested that the product elgicin B is derived from ElgA. By correlating the results of electrospray ionization-mass spectrometry analyses of elgicins AI, AII, and C, these peptides are deduced to have originated from the same precursor, ElgA. Conclusions: A novel lantibiotic-like gene cluster was shown to be present in P. elgii B69. Four new lantibiotics with a broad inhibitory spectrum were isolated, and these appear to be promising antibacterial agents.

  4. Gene expression patterns of oxidative phosphorylation complex I subunits are organized in clusters.

    Directory of Open Access Journals (Sweden)

    Yael Garbian

    After the radiation of eukaryotes, the NUO operon, controlling the transcription of the NADH dehydrogenase complex of the oxidative phosphorylation system (OXPHOS complex I), was broken down and genes encoding this protein complex were dispersed across the nuclear genome. Seven genes, however, were retained in the genome of the mitochondrion, the ancient symbiote of eukaryotes. This division, in combination with the three-fold increase in subunit number from bacteria (N = approximately 14) to man (N = 45), renders the transcription regulation of OXPHOS complex I a challenge. Recently, bioinformatics analysis of the promoter regions of all OXPHOS genes in mammals supported patterns of co-regulation, suggesting that natural selection favored a mechanism facilitating the transcriptional regulatory control of genes encoding subunits of these large protein complexes. Here, using real time PCR of mitochondrial (mtDNA)- and nuclear DNA (nDNA)-encoded transcripts in a panel of 13 different human tissues, we show that the expression pattern of OXPHOS complex I genes is regulated in several clusters. Firstly, all mtDNA-encoded complex I subunits (N = 7) share a similar expression pattern, distinct from all tested nDNA-encoded subunits (N = 10). Secondly, two sub-clusters of nDNA-encoded transcripts with significantly different expression patterns were observed. Thirdly, the expression patterns of two nDNA-encoded genes, NDUFA4 and NDUFA5, notably diverged from the rest of the nDNA-encoded subunits, suggesting a certain degree of tissue specificity. Finally, the expression pattern of the mtDNA-encoded ND4L gene diverged from the rest of the tested mtDNA-encoded transcripts that are regulated by the same promoter, consistent with post-transcriptional regulation. These findings suggest, for the first time, that the regulation of complex I subunit expression in humans is complex rather than reflecting global co-regulation.

  5. Genetic recombination as a major cause of mutagenesis in the human globin gene clusters.

    Science.gov (United States)

    Borg, Joseph; Georgitsi, Marianthi; Aleporou-Marinou, Vassiliki; Kollia, Panagoula; Patrinos, George P

    2009-12-01

    Homologous recombination is a frequent phenomenon in multigene families and as such it occurs several times in both the alpha- and beta-like globin gene families. On numerous occasions, genetic recombination has been implicated as a major mechanism that drives mutagenesis in the human globin gene clusters, either in the form of unequal crossover or gene conversion. Unequal crossover results in the increase or decrease of the human globin gene copies, accompanied in the majority of cases by minor phenotypic consequences, while gene conversion contributes either to maintaining sequence homogeneity or generating sequence diversity. The role of genetic recombination, particularly gene conversion, in the evolution of the human globin gene families has been discussed elsewhere. Here, we summarize our current knowledge and review existing experimental evidence outlining the role of genetic recombination in the mutagenic process in the human globin gene families.

  6. Prognostic value of FGFR gene amplification in patients with different types of cancer: a systematic review and meta-analysis.

    Directory of Open Access Journals (Sweden)

    Jinjia Chang

    BACKGROUND: Fibroblast growth factor receptor (FGFR) gene amplification has been reported in different types of cancer. We performed an up-to-date meta-analysis to further characterize the prognostic value of FGFR gene amplification in patients with cancer. METHODS: A search of several databases, including MEDLINE (PubMed), EMBASE, Web of Science, and China National Knowledge Infrastructure, was conducted to identify studies examining the association between FGFR gene amplification and cancer. A total of 24 studies met the inclusion criteria, and overall incidence rates, hazard ratios (HRs), overall survival, disease-free survival, and 95% confidence intervals (CIs) were calculated employing fixed- or random-effects models depending on the heterogeneity of the included studies. RESULTS: In the meta-analysis of 24 studies, the prevalence of FGFR gene amplification was FGFR1: 0.11 (95% CI: 0.08-0.13) and FGFR2: 0.04 (95% CI: 0.02-0.06). Overall survival was significantly worse among patients with FGFR gene amplification: FGFR1 [HR 1.57 (95% CI: 1.23-1.99); p = 0.0002] and FGFR2 [HR 2.27 (95% CI: 1.73-3.00); p<0.00001]. CONCLUSIONS: Current evidence supports the conclusion that the outcomes of patients with FGFR gene amplified cancers are worse than those of patients with non-FGFR gene amplified cancers.
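
    As a reading aid for the pooled estimates quoted in this record, the sketch below shows one common way to recover a standard error and an approximate p-value from a reported hazard ratio and its 95% confidence interval. It assumes the interval is symmetric on the log scale and uses a normal approximation; this is an assumption of the sketch and not necessarily the method used by the authors. The function names are hypothetical.

```python
import math

def log_hr_and_se(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log hazard ratio and its standard error from a 95% CI.

    Assumes the CI is symmetric on the log scale (an assumption of this sketch).
    """
    log_hr = math.log(hr)
    # A 95% CI on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_hr, se

def two_sided_p(hr, ci_low, ci_high):
    """Approximate two-sided p-value for H0: HR = 1 (normal approximation)."""
    log_hr, se = log_hr_and_se(hr, ci_low, ci_high)
    z = abs(log_hr) / se
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2))

# Pooled overall-survival estimates as quoted in the abstract.
p_fgfr1 = two_sided_p(1.57, 1.23, 1.99)
p_fgfr2 = two_sided_p(2.27, 1.73, 3.00)
print(f"FGFR1: p ~ {p_fgfr1:.4f}; FGFR2: p ~ {p_fgfr2:.2e}")
```

    Under these assumptions the recovered p-values should fall in the same range as those reported for FGFR1 and FGFR2, which is a quick consistency check a reader can apply to any HR and CI pair.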

  7. IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites.

    Science.gov (United States)

    Hadjithomas, Michalis; Chen, I-Min Amy; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Szeto, Ernest; Huang, Jinghua; Reddy, T B K; Cimermančič, Peter; Fischbach, Michael A; Ivanova, Natalia N; Markowitz, Victor M; Kyrpides, Nikos C; Pati, Amrita

    2015-07-14

    In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of "big" genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits. As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to

  8. Prognostic factors and outcome in anorexia nervosa: a follow-up study.

    Science.gov (United States)

    Errichiello, Luca; Iodice, Davide; Bruzzese, Dario; Gherghi, Marco; Senatore, Ignazio

    2016-03-01

    Anorexia nervosa is an eating disorder characterized by food restriction, irrational fear of gaining weight and consequent weight loss. High mortality rates have been reported, mostly due to suicide and malnutrition. Reported rates of good outcome vary widely, between 18 and 42%. We aimed to assess outcome and prognostic factors of a large group of patients with anorexia nervosa. Moreover, we aimed to identify clusters of prognostic factors related to specific outcomes. We retrospectively reviewed data of 100 patients diagnosed with anorexia nervosa previously hospitalized in a tertiary level structure. Then we performed follow-up structured telephone interviews. Four patients had died, while 34% were clinically recovered. In univariate analysis, short duration of inpatient treatment (p = 0.003), short duration of disorder (p = 0.001), early age at first inpatient treatment (p = 0.025) and preserved insight (p = 0.029) were significantly associated with clinical recovery at follow-up. In multiple logistic regression analysis, duration of first inpatient treatment, duration of disorder and preserved insight maintained their association with outcome. Moreover, multiple correspondence analysis and cluster analysis allowed us to identify different typologies of patients with specific features. Notably, group 1 was characterized by two or more inpatient treatments, BMI ≤ 14, absence of insight, history of long-term inpatient treatments, first inpatient treatment ≥30 days, while group 4 was characterized by preserved insight, BMI ≥ 16, first inpatient treatment ≤14 days, no more than one inpatient treatment, no psychotropic drugs intake, duration of illness ≤4 years. We confirmed the association between short duration of inpatient treatment, short duration of disorder, early age at first inpatient treatment, preserved insight and clinical recovery. We also differentiated patients with anorexia nervosa in well-defined outcome groups according to specific clusters of

  9. Assessment of the Prognostic and Treatment-Predictive Performance of the Combined HOXB13:IL17BR-MGI Gene Expression Signature in the Trans-ATAC Cohort

    Science.gov (United States)

    2013-12-01

    Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004; 351: 2817–26. 7...Barlow WE, Shak S, et al, for The Breast Cancer Intergroup of North America. Prognostic and predictive value of the 21-gene recurrence score assay in

  10. Deletion of a regulatory gene within the cpk gene cluster reveals novel antibacterial activity in Streptomyces coelicolor A3(2)

    NARCIS (Netherlands)

    Gottelt, Marco; Kol, Stefan; Gomez-Escribano, Juan Pablo; Bibb, Mervyn; Takano, Eriko

    Genome sequencing of Streptomyces coelicolor A3(2) revealed an uncharacterized type I polyketide synthase gene cluster (cpk). Here we describe the discovery of a novel antibacterial activity (abCPK) and a yellow-pigmented secondary metabolite (yCPK) after deleting a presumed pathway-specific

  11. Sequencing and Transcriptional Analysis of the Biosynthesis Gene Cluster of Putrescine-Producing Lactococcus lactis

    Science.gov (United States)

    Ladero, Victor; Rattray, Fergal P.; Mayo, Baltasar; Martín, María Cruz; Fernández, María; Alvarez, Miguel A.

    2011-01-01

    Lactococcus lactis is a prokaryotic microorganism with great importance as a culture starter and has become the model species among the lactic acid bacteria. The long and safe history of use of L. lactis in dairy fermentations has resulted in the classification of this species as GRAS (Generally Regarded As Safe) or QPS (Qualified Presumption of Safety). However, our group has identified several strains of L. lactis subsp. lactis and L. lactis subsp. cremoris that are able to produce putrescine from agmatine via the agmatine deiminase (AGDI) pathway. Putrescine is a biogenic amine that confers undesirable flavor characteristics and may even have toxic effects. The AGDI cluster of L. lactis is composed of a putative regulatory gene, aguR, followed by the genes (aguB, aguD, aguA, and aguC) encoding the catabolic enzymes. These genes are transcribed as an operon that is induced in the presence of agmatine. In some strains, an insertion (IS) element interrupts the transcription of the cluster, which results in a non-putrescine-producing phenotype. Based on this knowledge, a PCR-based test was developed in order to differentiate nonproducing L. lactis strains from those with a functional AGDI cluster. The analysis of the AGDI cluster and its flanking regions revealed that the capacity to produce putrescine via the AGDI pathway could be a specific characteristic that was lost during the adaptation to the milk environment by a process of reductive genome evolution. PMID:21803900

  12. NFκB-mediated activation of the cellular FUT3, 5 and 6 gene cluster by herpes simplex virus type 1.

    Science.gov (United States)

    Nordén, Rickard; Samuelsson, Ebba; Nyström, Kristina

    2017-11-01

    Herpes simplex virus type 1 has the ability to induce expression of a human gene cluster located on chromosome 19 upon infection. This gene cluster contains three fucosyltransferases (encoded by FUT3, FUT5 and FUT6) with the ability to add a fucose to an N-acetylglucosamine residue. Little is known regarding the transcriptional activation of these three genes in human cells. Intriguingly, herpes simplex virus type 1 activates all three genes simultaneously during infection, a situation not observed in uninfected tissue, pointing towards a virus-specific mechanism for transcriptional activation. The aim of this study was to define the underlying mechanism for the herpes simplex virus type 1 activation of FUT3, FUT5 and FUT6 transcription. The transcriptional activation of the FUT-gene cluster on chromosome 19 in fibroblasts was specific, not involving adjacent genes. Moreover, inhibition of NFκB signaling through panepoxydone treatment significantly decreased the induction of FUT3, FUT5 and FUT6 transcriptional activation, as did siRNA targeting of p65, in herpes simplex virus type 1 infected fibroblasts. NFκB and p65 signaling appears to play an important role in the regulation of FUT3, FUT5 and FUT6 transcriptional activation by herpes simplex virus type 1, although additional, unidentified, viral factors might account for part of the mechanism, as direct interferon-mediated stimulation of NFκB was not sufficient to induce the fucosyltransferase encoding gene cluster in uninfected cells. © The Author 2017. Published by Oxford University Press. All rights reserved.

  13. De novo deletion of HOXB gene cluster in a patient with failure to thrive, developmental delay, gastroesophageal reflux and bronchiectasis.

    Science.gov (United States)

    Pajusalu, Sander; Reimand, Tiia; Uibo, Oivi; Vasar, Maire; Talvik, Inga; Zilina, Olga; Tammur, Pille; Õunap, Katrin

    2015-01-01

    We report a female patient with a complex phenotype consisting of failure to thrive, developmental delay, congenital bronchiectasis, gastroesophageal reflux and bilateral inguinal hernias. Chromosomal microarray analysis revealed a 230 kilobase deletion in chromosomal region 17q21.32 (arr[hg19] 17q21.32(46 550 362-46 784 039)×1) encompassing only 9 genes - HOXB1 to HOXB9. The deletion was not found in her mother or father. This is the first report of a patient with a HOXB gene cluster deletion involving only HOXB1 to HOXB9 genes. By comparing our case to five previously reported patients with larger chromosomal aberrations involving the HOXB gene cluster, we can suppose that HOXB gene cluster deletions are responsible for growth retardation, developmental delay, and specific facial dysmorphic features. Also, we suppose that bilateral inguinal hernias, tracheo-esophageal abnormalities, and lung malformations represent features with incomplete penetrance. Interestingly, previously published knock-out mice with targeted heterozygous deletion comparable to our patient did not show phenotypic alterations. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  14. Gene expression profiling of canine osteosarcoma reveals genes associated with short and long survival times

    Directory of Open Access Journals (Sweden)

    Rao Nagesha AS

    2009-09-01

    Full Text Available Abstract Background: Gene expression profiling of spontaneous tumors in the dog offers a unique translational opportunity to identify prognostic biomarkers and signaling pathways that are common to both canine and human. Osteosarcoma (OS) accounts for approximately 80% of all malignant bone tumors in the dog. Canine OS are highly comparable with their human counterpart with respect to histology, high metastatic rate and poor long-term survival. This study investigates the prognostic gene profile among thirty-two primary canine OS using canine-specific cDNA microarrays representing 20,313 genes to identify genes and cellular signaling pathways associated with survival. This, the first report of its kind in dogs with OS, also demonstrates the advantages of cross-species comparison with human OS. Results: The 32 tumors were classified into two prognostic groups based on survival time (ST). They were defined as short survivors (dogs with poor prognosis: surviving fewer than 6 months) and long survivors (dogs with better prognosis: surviving 6 months or longer). Fifty-one transcripts were found to be differentially expressed, with common upregulation of these genes in the short survivors. The overexpressed genes in short survivors are associated with possible roles in proliferation, drug resistance or metastasis. Several deregulated pathways identified in the present study, including Wnt signaling, Integrin signaling and Chemokine/cytokine signaling, are comparable to the pathway analysis conducted on human OS gene profiles, emphasizing the value of the dog as an excellent model for humans. Conclusion: A molecular-based method for discrimination of outcome for short and long survivors is useful for future prognostic stratification at initial diagnosis, where genes and pathways associated with cell cycle/proliferation, drug resistance and metastasis could be potential targets for diagnosis and therapy. The similarities between human and canine OS makes the

  15. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2007-11-01

    Full Text Available Abstract Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile

  16. ColoFinder: a prognostic 9-gene signature improves prognosis for 871 stage II and III colorectal cancer patients

    Directory of Open Access Journals (Sweden)

    Mingguang Shi

    2016-03-01

    Full Text Available Colorectal cancer (CRC) is a heterogeneous disease with a high mortality rate and is still lacking an effective treatment. Our goal is to develop a robust prognosis model for predicting the prognosis in CRC patients. In this study, 871 stage II and III CRC samples were collected from six gene expression profiling datasets. ColoFinder was developed using a 9-gene signature based Random Survival Forest (RSF) prognosis model. The 9-gene signature recurrence score was derived with a 5-fold cross validation to test the association with relapse-free survival, and an AUC of 0.87 was obtained in GSE39582 (95% CI [0.83–0.91]). The low-risk group had a significantly better relapse-free survival (HR, 14.8; 95% CI [8.17–26.8]; P < 0.001) than the high-risk group. We also found that the 9-gene signature recurrence score contributed more information about recurrence than standard clinical and pathological variables in univariate and multivariate Cox analyses when applied to GSE17536 (p = 0.03 and p = 0.01, respectively). Furthermore, ColoFinder improved the predictive ability and better stratified the risk subgroups when applied to CRC gene expression datasets GSE14333, GSE17537, GSE12945 and GSE24551. In summary, ColoFinder significantly improves the risk assessment in stage II and III CRC patients. The 9-gene prognostic classifier informs patient prognosis and treatment response.
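    The AUC quoted in this record summarizes how well a risk score separates patients who relapse from those who do not. As a minimal sketch only (this is not ColoFinder's code, and the scores and labels below are invented for illustration), the AUC can be computed as the fraction of relapse/non-relapse pairs in which the relapsing patient received the higher score:

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 = event (e.g. relapse), 0 = no event.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical recurrence scores for six patients (not data from the study).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
value = auc(scores, labels)
```

    Here eight of the nine relapse/non-relapse pairs are ordered correctly, so the sketch yields 8/9. A time-dependent AUC for censored survival data, as would be needed for relapse-free survival, requires additional handling that this sketch omits.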

  17. Prognostic value of alcohol dehydrogenase mRNA expression in gastric cancer.

    Science.gov (United States)

    Guo, Erna; Wei, Haotang; Liao, Xiwen; Xu, Yang; Li, Shu; Zeng, Xiaoyun

    2018-04-01

    Previous studies have reported that alcohol dehydrogenase (ADH) isoenzymes possess diagnostic value in gastric cancer (GC). However, the prognostic value of ADH isoenzymes in GC remains unclear. The aim of the present study was to identify the prognostic value of ADH genes in patients with GC. The prognostic value of ADH genes was investigated in patients with GC using the Kaplan-Meier plotter tool. Kaplan-Meier plots were used to assess the difference between groups of patients with GC with different prognoses. Hazard ratios (HR) and 95% confidence intervals (CI) were used to assess the relative risk of GC survival. Overall, 593 patients with GC and 7 ADH genes were included in the survival analysis. High expression of ADH 1A (class 1), α polypeptide (ADH1A; log-rank P=0.043; HR=0.79; 95% CI: 0.64-0.99), ADH 1B (class 1), β polypeptide (ADH1B; log-rank P=1.9×10^-5; HR=0.65; 95% CI: 0.53-0.79) and ADH 5 (class III), χ polypeptide (ADH5; log-rank P=0.0011; HR=0.73; 95% CI: 0.6-0.88) resulted in a significantly decreased risk of mortality in all patients with GC compared with patients with low expression of those genes. Furthermore, protective effects may additionally be observed in patients with intestinal-type GC with high expression of ADH1B (log-rank P=0.031; HR=0.64; 95% CI: 0.43-0.96) and patients with diffuse-type GC with high expression of ADH1A (log-rank P=0.014; HR=0.51; 95% CI: 0.3-0.88), ADH1B (log-rank P=0.04; HR=0.53; 95% CI: 0.29-0.98), ADH 4 (class II), π polypeptide (log-rank P=0.033; HR=0.58; 95% CI: 0.35-0.96) and ADH 6 (class V) (log-rank P=0.037; HR=0.59; 95% CI: 0.35-0.97) resulting in a significantly decreased risk of mortality compared with patients with low expression of those genes. In contrast, patients with diffuse-type GC with high expression of ADH5 (log-rank P=0.044; HR=1.66; 95% CI: 1.01-2.74) were significantly correlated with a poor prognosis. The results of the present study suggest that ADH1A and ADH1B may be potential
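    The survival comparisons in this record rest on Kaplan-Meier curves. The following is a minimal sketch of the product-limit estimator itself, assuming right-censored data; it is not the Kaplan-Meier plotter tool used in the study, and the follow-up times below are invented for illustration:

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each observed death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= leaving
    return curve


# Hypothetical follow-up in months for five patients (not study data).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

    With these inputs the estimated survival drops to about 0.8, 0.6 and 0.3 at months 2, 3 and 5. Comparing two such curves (for example high versus low expression) would additionally need a log-rank test, which is not shown.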

  18. Recurrent adenylation domain replacement in the microcystin synthetase gene cluster

    Directory of Open Access Journals (Sweden)

    Laakso Kati

    2007-10-01

    Full Text Available Abstract Background: Microcystins are small cyclic heptapeptide toxins produced by a range of distantly related cyanobacteria. Microcystins are synthesized on large NRPS-PKS enzyme complexes. Many structural variants of microcystins are produced simultaneously. A recombination event between the first module of mcyB (mcyB1) and mcyC in the microcystin synthetase gene cluster is linked to the simultaneous production of microcystin variants in strains of the genus Microcystis. Results: Here we undertook a phylogenetic study to investigate the order and timing of recombination between the mcyB1 and mcyC genes in a diverse selection of microcystin producing cyanobacteria. Our results provide support for complex evolutionary processes taking place at the mcyB1 and mcyC adenylation domains which recognize and activate the amino acids found at X and Z positions. We find evidence for recent recombination between mcyB1 and mcyC in strains of the genera Anabaena, Microcystis, and Hapalosiphon. We also find clear evidence for independent adenylation domain conversion of mcyB1 by unrelated peptide synthetase modules in strains of the genera Nostoc and Microcystis. The recombination events replace only the adenylation domain in each case and the condensation domains of mcyB1 and mcyC are not transferred together with the adenylation domain. Our findings demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster. Conclusion: Recombination is thought to be one of the main mechanisms driving the diversification of NRPSs. However, there is very little information on how recombination takes place in nature. This study demonstrates that functional peptide synthetases are created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.

  19. Discovery of a Novel Immune Gene Signature with Profound Prognostic Value in Colorectal Cancer: A Model of Cooperativity Disorientation Created in the Process from Development to Cancer.

    Directory of Open Access Journals (Sweden)

    Ning An

    Full Text Available Immune response-related genes play a major role in colorectal carcinogenesis by mediating inflammation or immune-surveillance evasion. Although remarkable progress has been made to investigate the underlying mechanism, the understanding of the complicated carcinogenesis process was enormously hindered by large-scale tumor heterogeneity. Development and carcinogenesis share striking similarities in their cellular behavior and underlying molecular mechanisms. The association between embryonic development and carcinogenesis makes embryonic development a viable reference model for studying cancer, thereby circumventing the potentially misleading complexity of tumor heterogeneity. Here we proposed that the immune genes responsible for intra-immune cooperativity disorientation (defined in this study as disruption of developmental expression correlation patterns during carcinogenesis) probably contain an untapped prognostic resource of colorectal cancer. In this study, we determined the mRNA expression profile of 137 human biopsy samples, including samples from different stages of human colonic development, colorectal precancerous progression and colorectal cancer samples, among which 60 were also used to generate miRNA expression profiles. We originally established a Spearman correlation transition model to quantify the cooperativity disorientation associated with the transition from normal to precancerous to cancer tissue, in conjunction with a miRNA-mRNA regulatory network and a machine learning algorithm to identify genes with prognostic value. Finally, a 12-gene signature was extracted, whose prognostic value was evaluated using Kaplan-Meier survival analysis in five independent datasets. Using the log-rank test, the 12-gene signature was closely related to overall survival in four datasets (GSE17536, n = 177, p = 0.0054; GSE17537, n = 55, p = 0.0039; GSE39582, n = 562, p = 0.13; GSE39084, n = 70, p = 0.11), and significantly associated with disease

  20. Clinical and Prognostic Profiles of Cardiomyopathies Caused by Mutations in the Troponin T Gene.

    Science.gov (United States)

    Ripoll-Vera, Tomás; Gámez, José María; Govea, Nancy; Gómez, Yolanda; Núñez, Juana; Socías, Lorenzo; Escandell, Ángela; Rosell, Jorge

    2016-02-01

    Mutations in the troponin T gene (TNNT2) have been associated in small studies with the development of hypertrophic cardiomyopathy characterized by a high risk of sudden death and mild hypertrophy. We describe the clinical course of patients carrying mutations in this gene. We analyzed the clinical characteristics and prognosis of patients with mutations in the TNNT2 gene who were seen in an inherited cardiac disease unit. Of 180 families with genetically studied cardiomyopathies, 21 families (11.7%) were identified as having mutations in TNNT2: 10 families had Arg92Gln, 5 had Arg286His, 3 had Arg278Cys, 1 had Arg92Trp, 1 had Arg94His, and 1 had Ile221Thr. Thirty-three additional genetic carriers were identified through family assessment. The study included 54 genetic carriers: 56% were male, and the mean age was 41 ± 17 years. There were 33 cases of hypertrophic cardiomyopathy, 9 of dilated cardiomyopathy, and 1 of noncompaction cardiomyopathy, and maximal myocardial thickness was 18.5 ± 6 mm. Ventricular dysfunction was present in 30% of individuals and a history of sudden death in 62%. During follow-up, 4 patients died and 14 (33%) received a defibrillator (8 probands, 6 relatives). Mean survival was 54 years. Carriers of Arg92Gln had early disease development, high penetrance, a high risk of sudden death, a high rate of defibrillator implantation, and a high frequency of mixed phenotype. Mutations in the TNNT2 gene were more common in this series than in previous studies. The clinical and prognostic profiles depended on the mutation present. Carriers of the Arg92Gln mutation developed hypertrophic or dilated cardiomyopathy and had a significantly worse prognosis than those with other mutations in TNNT2 or other sarcomeric genes. Copyright © 2015 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.

  1. Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters.

    Science.gov (United States)

    Hensman, James; Lawrence, Neil D; Rattray, Magnus

    2013-08-20

    Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
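    One common way to write a hierarchical Gaussian process for replicated time series is to let every observation share a gene-level kernel and to add a second kernel only between observations from the same replicate. The sketch below assumes that additive structure and a squared-exponential kernel; it is an illustration, not the authors' released code, and the time points and hyperparameters are invented. Because the covariance is defined for arbitrary pairs of time points, replicates do not need a common or evenly spaced sampling grid.

```python
import math


def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential (RBF) kernel between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)


def hierarchical_cov(obs, gene_kern, rep_kern):
    """Covariance matrix for observations given as (time, replicate_id) pairs.

    Every pair of observations shares the gene-level kernel; pairs taken from
    the same replicate additionally share the replicate-level kernel.
    """
    n = len(obs)
    K = [[0.0] * n for _ in range(n)]
    for i, (ti, ri) in enumerate(obs):
        for j, (tj, rj) in enumerate(obs):
            K[i][j] = rbf(ti, tj, **gene_kern)
            if ri == rj:
                K[i][j] += rbf(ti, tj, **rep_kern)
    return K


# Two replicates sampled at different, unevenly spaced times (hypothetical).
observations = [(0.0, "rep1"), (1.5, "rep1"), (0.5, "rep2"), (4.0, "rep2")]
gene_kern = {"variance": 1.0, "lengthscale": 2.0}  # assumed hyperparameters
rep_kern = {"variance": 0.2, "lengthscale": 1.0}   # assumed hyperparameters
K = hierarchical_cov(observations, gene_kern, rep_kern)
```

    The resulting matrix can be passed to any standard GP regression routine; a further level (for example a cluster-level kernel shared by all genes in a cluster) could be added in the same additive fashion.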

  2. New gene cluster from the thermophile Bacillus fordii MH602 in the conversion of DL-5-substituted hydantoins to L-amino acids.

    Science.gov (United States)

    Mei, Yan-Zhen; Wan, Yong-Min; He, Bing-Fang; Ying, Han-Jie; Ouyang, Ping-Kai

    2009-12-01

    The thermophile Bacillus fordii MH602 was screened for stereospecific hydrolysis of DL-5-substituted hydantoins to L-alpha-amino acids. Because the reaction proceeds at higher temperature, it benefits from enhanced substrate solubility and from racemization of DL-5-substituted hydantoins during the conversion. This paper is the first report of a hydantoin metabolism gene cluster from a thermophile. The genes involved in hydantoin utilization (hyu) were isolated on an 8.2 kb DNA fragment by Restriction Site-dependent PCR, and six ORFs were identified by DNA sequence analysis. The hyu gene cluster contained four genes with novel cluster organization characteristics: the hydantoinase gene hyuH, putative transport protein hyuP, hyperprotein hyuHP, and L-carbamoylase gene hyuC. The hyuH and hyuC genes were heterologously expressed in E. coli. The results indicated that hyuH and hyuC are involved in the conversion of DL-5-substituted hydantoins to an N-carbamyl intermediate that is subsequently converted to L-alpha-amino acids. The hydantoinase and carbamoylase from B. fordii MH602 showed highest identities of 71% and 39%, respectively, with previously reported hydantoinases and carbamoylases. The novel cluster organization characteristics and the differences in the key enzymes between the thermophile B. fordii MH602 and mesophiles were presumed to be related to the evolutionary origins of the metabolism concerned.

  3. In silico analysis highlights the frequency and diversity of type 1 lantibiotic gene clusters in genome sequenced bacteria

    LENUS (Irish Health Repository)

    Marsh, Alan J

    2010-11-30

    Abstract Background Lantibiotics are lanthionine-containing, post-translationally modified antimicrobial peptides. These peptides have significant, but largely untapped, potential as preservatives and chemotherapeutic agents. Type 1 lantibiotics are those in which lanthionine residues are introduced into the structural peptide (LanA) through the activity of separate lanthionine dehydratase (LanB) and lanthionine synthetase (LanC) enzymes. Here we take advantage of the conserved nature of LanC enzymes to devise an in silico approach to identify potential lantibiotic-encoding gene clusters in genome sequenced bacteria. Results In total 49 novel type 1 lantibiotic clusters were identified which unexpectedly were associated with species, genera and even phyla of bacteria which have not previously been associated with lantibiotic production. Conclusions Multiple type 1 lantibiotic gene clusters were identified at a frequency that suggests that these antimicrobials are much more widespread than previously thought. These clusters represent a rich repository which can yield a large number of valuable novel antimicrobials and biosynthetic enzymes.

  4. The Prognostic Role of Androgen Receptor in Patients with Early-Stage Breast Cancer: A Meta-analysis of Clinical and Gene Expression Data.

    Science.gov (United States)

    Bozovic-Spasojevic, Ivana; Zardavas, Dimitrios; Brohée, Sylvain; Ameye, Lieveke; Fumagalli, Debora; Ades, Felipe; de Azambuja, Evandro; Bareche, Yacine; Piccart, Martine; Paesmans, Marianne; Sotiriou, Christos

    2017-06-01

    Purpose: Androgen receptor (AR) expression has been observed in about 70% of patients with breast cancer, but its prognostic role remains uncertain. Experimental Design: To assess the prognostic role of AR expression in early-stage breast cancer, we performed a meta-analysis of studies that evaluated the impact of AR at the protein and gene expression level on disease-free survival (DFS) and/or overall survival (OS). Eligible studies were identified by systematic review of electronic databases using the MeSH-terms "breast neoplasm" and "androgen receptor" and were selected after a qualitative assessment based on the REMARK criteria. A pooled gene expression analysis of 35 publicly available microarray data sets was also performed from patients with early-stage breast cancer with available gene expression and clinical outcome data. Results: Twenty-two of 33 eligible studies for the clinical meta-analysis, including 10,004 patients, were considered as evaluable for the current study after the qualitative assessment. AR positivity defined by IHC was associated with improved DFS in all patients with breast cancer [multivariate (M) analysis, HR 0.46; 95% confidence interval (CI) 0.37-0.58, P expression analysis. High AR mRNA levels were found to confer positive prognosis overall in terms of DFS (HR 0.82; 95% CI 0.72-0.92; P = 0.0007) and OS (HR 0.84; 95% CI, 0.75-0.94; P = 0.02) only in univariate analysis. Conclusions: Our analysis, conducted among more than 17,000 women with early-stage breast cancer included in clinical and gene expression analysis, demonstrates that AR positivity is associated with favorable clinical outcome. Clin Cancer Res; 23(11); 2702-12. ©2016 AACR . ©2016 American Association for Cancer Research.

  5. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    KAUST Repository

    Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan

    2015-01-01

    validating this direct cloning plug-and-play approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation

  6. Evolutionary history of the phl gene cluster in the plant-associated bacterium Pseudomonas fluorescens

    NARCIS (Netherlands)

    Moynihan, J.A.; Morrissey, J.P.; Coppoolse, E.; Stiekema, W.J.; O'Gara, F.; Boyd, E.F.

    2009-01-01

    Pseudomonas fluorescens is of agricultural and economic importance as a biological control agent largely because of its plant-association and production of secondary metabolites, in particular 2,4-diacetylphloroglucinol (2,4-DAPG). This polyketide, which is encoded by the eight-gene phl cluster,

  7. Gene co-expression analysis identifies gene clusters associated with isotropic and polarized growth in Aspergillus fumigatus conidia.

    Science.gov (United States)

    Baltussen, Tim J H; Coolen, Jordy P M; Zoll, Jan; Verweij, Paul E; Melchers, Willem J G

    2018-04-26

    Aspergillus fumigatus is a saprophytic fungus that extensively produces conidia. These microscopic asexually reproductive structures are small enough to reach the lungs. Germination of conidia followed by hyphal growth inside human lungs is a key step in the establishment of infection in immunocompromised patients. RNA-Seq was used to analyze the transcriptome of dormant and germinating A. fumigatus conidia. Construction of a gene co-expression network revealed four gene clusters (modules) correlated with a growth phase (dormant, isotropic growth, polarized growth). Transcript levels of genes encoding secondary metabolites were high in dormant conidia. During isotropic growth, transcript levels of genes involved in cell wall modifications increased. Two modules encoding growth and cell cycle/DNA processing were associated with polarized growth. In addition, the co-expression network was used to identify highly connected intermodular hub genes. These genes may have a pivotal role in the respective module and could therefore be compelling therapeutic targets. Generally, cell wall remodeling is an important process during isotropic and polarized growth, characterized by an increase of transcripts coding for hyphal growth and cell cycle/DNA processing when polarized growth is initiated. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
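    A gene co-expression network links genes whose expression profiles rise and fall together, and highly connected genes are candidate hubs. The sketch below is a deliberately simplified stand-in for the analysis described in this record: it assumes a hard Pearson-correlation threshold and counts connections per gene, whereas the study's actual network construction and module detection are not specified here. Gene names, expression values and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_degrees(expr, threshold=0.9):
    """Count, for each gene, how many other genes it is co-expressed with.

    Two genes are connected when |Pearson r| >= threshold (an assumed cutoff).
    """
    genes = list(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                degree[g] += 1
                degree[h] += 1
    return degree


# Hypothetical expression across four time points (not data from the study).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [1.2, 2.1, 2.9, 4.2],
    "geneD": [5.0, 1.0, 4.0, 2.0],
}
degrees = coexpression_degrees(expr)
```

    In this toy example geneA, geneB and geneC form one tightly correlated group with two connections each, while geneD is unconnected. Ranking genes by degree within and between modules is one simple way to shortlist hub candidates.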

  8. The Fdb3 transcription factor of the Fusarium Detoxification of Benzoxazolinone gene cluster is required for MBOA but not BOA degradation in Fusarium pseudograminearum.

    Science.gov (United States)

    Kettle, Andrew J; Carere, Jason; Batley, Jacqueline; Manners, John M; Kazan, Kemal; Gardiner, Donald M

    2016-03-01

    A number of cereals produce the benzoxazolinone class of phytoalexins. Fusarium species pathogenic towards these hosts can typically degrade these compounds via an aminophenol intermediate, and the ability to do so is encoded by a group of genes found in the Fusarium Detoxification of Benzoxazolinone (FDB) cluster. A zinc finger transcription factor encoded by one of the FDB cluster genes (FDB3) has been proposed to regulate the expression of other genes in the cluster and hence is potentially involved in benzoxazolinone degradation. Herein we show that Fdb3 is essential for the ability of Fusarium pseudograminearum to efficiently detoxify the predominant wheat benzoxazolinone, 6-methoxy-benzoxazolin-2-one (MBOA), but not benzoxazolin-2-one (BOA). Furthermore, additional genes thought to be part of the FDB gene cluster, based upon transcriptional response to benzoxazolinones, are regulated by Fdb3. However, deletion mutants for these latter genes remain capable of benzoxazolinone degradation, suggesting that they are not essential for this process. Crown Copyright © 2016. Published by Elsevier Inc. All rights reserved.

  9. Heterogeneic dynamics of the structures of multiple gene clusters in two pathogenetically different lines originating from the same phytoplasma.

    Science.gov (United States)

    Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou

    2008-04-01

    Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.

  10. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    Science.gov (United States)

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms: HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model) and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App Store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  11. Mechanisms of transcriptional regulation and prognostic significance of activated leukocyte cell adhesion molecule in cancer

    Directory of Open Access Journals (Sweden)

    Chen Hairu

    2010-10-01

    Full Text Available Abstract Background: Activated leukocyte cell adhesion molecule (ALCAM) is implicated in the prognosis of multiple cancers, with low-level expression associated with metastasis and early death in breast cancer. Despite this significance, mechanisms that regulate ALCAM gene expression and ALCAM's role in adhesion of pre-metastatic circulating tumor cells have not been defined. We studied ALCAM expression in 20 tumor cell lines by real-time PCR, western blot and immunochemistry. Epigenetic alterations of the ALCAM promoter were assessed using methylation-specific PCR and bisulfite sequencing. ALCAM's role in adhesion of tumor cells to the vascular wall was studied in isolated perfused lungs. Results: A common site for transcription initiation of the ALCAM gene was identified and the ALCAM promoter sequenced. The promoter contains multiple cis-active elements including a functional p65 NF-κB motif, and it harbors an extensive array of CpG residues highly methylated exclusively in ALCAM-negative tumor cells. These CpG residues were modestly demethylated after 5-aza-2-deoxycytidine treatment. Restoration of high-level ALCAM expression using an ALCAM cDNA increased clustering of MDA-MB-435 tumor cells perfused through the pulmonary vasculature of ventilated rat lungs. Anti-ALCAM antibodies reduced the number of intravascular tumor cell clusters. Conclusion: Our data suggest that loss of ALCAM expression, due in part to DNA methylation of extensive segments of the promoter, significantly impairs the ability of circulating tumor cells to adhere to each other, and may therefore promote metastasis. These findings offer insight into the mechanisms for down-regulation of ALCAM gene expression in tumor cells, and for the positive prognostic value of high-level ALCAM in breast cancer.

  12. A nonparametric Bayesian approach for clustering bisulfate-based DNA methylation profiles.

    Science.gov (United States)

    Zhang, Lin; Meng, Jia; Liu, Hui; Huang, Yufei

    2012-01-01

    DNA methylation occurs in the context of a CpG dinucleotide. It is an important epigenetic modification, which can be inherited through cell division. The two major types of methylation include hypomethylation and hypermethylation. Unique methylation patterns have been shown to exist in diseases including various types of cancer. DNA methylation analysis promises to become a powerful tool in cancer diagnosis, treatment and prognostication. Large-scale methylation arrays are now available for studying methylation genome-wide. The Illumina methylation platform simultaneously measures cytosine methylation at more than 1500 CpG sites associated with over 800 cancer-related genes. Cluster analysis is often used to identify DNA methylation subgroups for prognosis and diagnosis. However, due to the unique non-Gaussian characteristics, traditional clustering methods may not be appropriate for DNA methylation data, and the determination of the optimal cluster number is still problematic. A Dirichlet process beta mixture model (DPBMM) is proposed that models DNA methylation levels as an infinite mixture of beta distributions. The model allows automatic learning of the relevant parameters such as the cluster mixing proportion, the parameters of the beta distribution for each cluster, and especially the number of potential clusters. Since the model is high dimensional and analytically intractable, we propose a Gibbs sampling "no-gaps" solution for computing the posterior distributions, and hence the estimates of the parameters. The proposed algorithm was tested on simulated data as well as methylation data from 55 glioblastoma multiforme (GBM) brain tissue samples. To reduce the computational burden due to the high data dimensionality, a dimension reduction method is adopted. The two GBM clusters yielded by DPBMM are based on data of different number of loci (P-value < 0.1), while hierarchical clustering cannot yield statistically significant clusters.

  13. The gsdf gene locus harbors evolutionary conserved and clustered genes preferentially expressed in fish previtellogenic oocytes.

    Science.gov (United States)

    Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques

    2011-02-01

    display a different cellular localization compared to that of the gsdf gene, indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1). Copyright © 2010 Elsevier B.V. All rights reserved.

  14. Biosynthesis of Akaeolide and Lorneic Acids and Annotation of Type I Polyketide Synthase Gene Clusters in the Genome of Streptomyces sp. NPS554

    Directory of Open Access Journals (Sweden)

    Tao Zhou

    2015-01-01

    Full Text Available. The incorporation pattern of biosynthetic precursors into two structurally unique polyketides, akaeolide and lorneic acid A, was elucidated by feeding experiments with 13C-labeled precursors. In addition, draft genome sequencing of the producer, Streptomyces sp. NPS554, was performed and the biosynthetic gene clusters for these polyketides were identified. The putative gene clusters contain all the polyketide synthase (PKS) domains necessary for assembly of the carbon skeletons. Combined with the 13C-labeling results, gene function prediction enabled us to propose biosynthetic pathways involving unusual carbon-carbon bond formation reactions. Genome analysis also indicated the presence of at least ten orphan type I PKS gene clusters that might be responsible for the production of new polyketides.

  15. Prognostic signature and clonality pattern of recurrently mutated genes in inactive chronic lymphocytic leukemia

    International Nuclear Information System (INIS)

    Hurtado, A M; Chen-Liang, T-H; Przychodzen, B; Hamedi, C; Muñoz-Ballester, J; Dienes, B; García-Malo, M D; Antón, A I; Arriba, F de; Teruel-Montoya, R; Ortuño, F J; Vicente, V; Maciejewski, J P; Jerez, A

    2015-01-01

    An increasing number of patients are being diagnosed with asymptomatic early-stage chronic lymphocytic leukemia (CLL), with no treatment indication at baseline. We applied a high-throughput deep-targeted analysis, designed especially for broad coverage of the TP53 and ATM genes, in 180 patients with inactive disease at diagnosis, to test the independent prognostic value of CLL somatic recurrent mutations. We found that 40/180 patients harbored at least one acquired variant, with ATM (n=17, 9.4%), NOTCH1 (n=14, 7.7%), TP53 (n=14, 7.7%) and SF3B1 (n=10, 5.5%) as the most prevalent mutated genes. Harboring one 'sub-Sanger' TP53 mutation conferred an independent 3.5-fold increase in the probability of needing treatment. Those patients with a double-hit ATM lesion (mutation+11q deletion) had the shortest median time to first treatment (17 months). We found that a genomic variable (TP53 mutations, most of them under the sensitivity of conventional techniques), a cell phenotypic factor (CD38-positive expression) and a classical marker (β2-microglobulin) remained the only independent predictors of outcome. The high-throughput determination of TP53 status, particularly in this set of patients frequently lacking high-risk chromosomal aberrations, emerges as a key step, not only for prediction modeling, but also for exploring mutation-specific therapeutic approaches and minimal residual disease monitoring.

  16. Expression-based clustering of CAZyme-encoding genes of Aspergillus niger.

    Science.gov (United States)

    Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P

    2017-11-23

    The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to the presence of plant biomass related carbon sources were analyzed of a wild-type strain N402 that was grown on a large range of carbon sources and of the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX that were grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes, which goes beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified, based on their expression profile. Notably, in several cases the expression profile puts questions on the function assignment of uncharacterized genes that was based on homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of enzymes encoded by these non-characterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and a therefore even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight in the complex regulatory systems that drive the conversion of plant biomass by fungi. In

  17. Transcriptional organization of the DNA region controlling expression of the K99 gene cluster.

    Science.gov (United States)

    Roosendaal, B; Damoiseaux, J; Jordi, W; de Graaf, F K

    1989-01-01

    The transcriptional organization of the K99 gene cluster was investigated in two ways. First, the DNA region, containing the transcriptional signals was analyzed using a transcription vector system with Escherichia coli galactokinase (GalK) as assayable marker and second, an in vitro transcription system was employed. A detailed analysis of the transcription signals revealed that a strong promoter PA and a moderate promoter PB are located upstream of fanA and fanB, respectively. No promoter activity was detected in the intercistronic region between fanB and fanC. Factor-dependent terminators of transcription were detected and are probably located in the intercistronic region between fanA and fanB (T1), and between fanB and fanC (T2). A third terminator (T3) was observed between fanC and fanD and has an efficiency of 90%. Analysis of the regulatory region in an in vitro transcription system confirmed the location of the respective transcription signals. A model for the transcriptional organization of the K99 cluster is presented. Indications were obtained that the trans-acting regulatory polypeptides FanA and FanB both function as anti-terminators. A model for the regulation of expression of the K99 gene cluster is postulated.

  18. Acinetobacter baumannii K27 and K44 capsular polysaccharides have the same K unit but different structures due to the presence of distinct wzy genes in otherwise closely related K gene clusters.

    Science.gov (United States)

    Shashkov, Alexander S; Kenyon, Johanna J; Senchenkova, Sof'ya N; Shneider, Mikhail M; Popova, Anastasiya V; Arbatsky, Nikolay P; Miroshnikov, Konstantin A; Volozhantsev, Nikolay V; Hall, Ruth M; Knirel, Yuriy A

    2016-05-01

    Capsular polysaccharides (CPSs) from Acinetobacter baumannii isolates 1432, 4190 and NIPH 70, which have related gene content at the K locus, were examined, and the chemical structures established using 2D 1H and 13C NMR spectroscopy. The three isolates produce the same pentasaccharide repeat unit, which consists of 5-N-acetyl-7-N-[(S)-3-hydroxybutanoyl] (major) or 5,7-di-N-acetyl (minor) derivatives of 5,7-diamino-3,5,7,9-tetradeoxy-D-glycero-D-galacto-non-2-ulosonic (legionaminic) acid (Leg5Ac7R), D-galactose, N-acetyl-D-galactosamine and N-acetyl-D-glucosamine. However, the linkage between repeat units in NIPH 70 was different to that in 1432 and 4190, and this significantly alters the CPS structure. The KL27 gene cluster in 4190 and KL44 gene cluster in NIPH 70 are organized identically and contain lga genes for Leg5Ac7R synthesis, genes for the synthesis of the common sugars, as well as an itrA2 initiating transferase gene and four glycosyltransferase genes. They share high-level nucleotide sequence identity for corresponding genes, but differ in the wzy gene encoding the Wzy polymerase. The Wzy proteins, which have different lengths and share no similarity, would form the unrelated linkages in the K27 and K44 structures. The linkages formed by the four shared glycosyltransferases were predicted by comparison with gene clusters that synthesize related structures. These findings unambiguously identify the linkages formed by WzyK27 and WzyK44, and show that the presence of different wzy genes in otherwise closely related K gene clusters changes the structure of the CPS. This may affect its capacity as a protective barrier for A. baumannii. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Ancient expansion of the hox cluster in lepidoptera generated four homeobox genes implicated in extra-embryonic tissue formation.

    Directory of Open Access Journals (Sweden)

    Laura Ferguson

    2014-10-01

    Full Text Available. Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of the large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth, which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks.

  20. Prognostic value and molecular correlates of a CT image-based quantitative pleural contact index in early stage NSCLC

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Juheon; Cui, Yi; Li, Bailiang; Wu, Jia; Gensheimer, Michael F. [Stanford University School of Medicine, Department of Radiation Oncology, Stanford, CA (United States); Sun, Xiaoli [First Affiliated Hospital of Zhejiang University, Radiotherapy Department, Hangzhou, Zhejiang (China); Li, Dengwang [Stanford University School of Medicine, Department of Radiation Oncology, Stanford, CA (United States); Shandong Normal University, Shandong Province Key Laboratory of Medical Physics and Image Processing Technology, Institute of Biomedical Sciences, School of Physics and Electronics, Jinan Shi (China); Loo, Billy W.; Li, Ruijiang [Stanford University School of Medicine, Department of Radiation Oncology, Stanford, CA (United States); Stanford University School of Medicine, Stanford Cancer Institute, Stanford, CA (United States); Diehn, Maximilian [Stanford University School of Medicine, Department of Radiation Oncology, Stanford, CA (United States); Stanford University School of Medicine, Stanford Cancer Institute, Stanford, CA (United States); Stanford University School of Medicine, Institute for Stem Cell Biology and Regenerative Medicine, Stanford, CA (United States)

    2018-02-15

    To evaluate the prognostic value and molecular basis of a CT-derived pleural contact index (PCI) in early stage non-small cell lung cancer (NSCLC). We retrospectively analysed seven NSCLC cohorts. A quantitative PCI was defined on CT as the length of tumour-pleura interface normalised by tumour diameter. We evaluated the prognostic value of PCI in a discovery cohort (n = 117) and tested it in an external cohort (n = 88) of stage I NSCLC. Additionally, we identified the molecular correlates and built a gene expression-based surrogate of PCI using another cohort of 89 patients. To further evaluate the prognostic relevance, we used four datasets totalling 775 stage I patients with publicly available gene expression data and linked survival information. At a cutoff of 0.8, PCI stratified patients for overall survival in both imaging cohorts (log-rank p = 0.0076, 0.0304). Extracellular matrix (ECM) remodelling was enriched among genes associated with PCI (p = 0.0003). The genomic surrogate of PCI remained an independent predictor of overall survival in the gene expression cohorts (hazard ratio: 1.46, p = 0.0007) adjusting for age, gender, and tumour stage. CT-derived pleural contact index is associated with ECM remodelling and may serve as a noninvasive prognostic marker in early stage NSCLC. (orig.)
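
    The PCI definition in the record above (length of the tumour-pleura interface divided by tumour diameter, dichotomised at 0.8) can be sketched as follows. This is only an illustration: the function names, the millimetre units, the group labels and the choice to count a value exactly at the cutoff as "at or above" are assumptions, not details taken from the abstract.

```python
def pleural_contact_index(interface_length_mm: float, tumour_diameter_mm: float) -> float:
    """PCI = tumour-pleura interface length normalised by tumour diameter.

    Units are assumed to be millimetres for both inputs (hypothetical choice);
    any consistent length unit gives the same dimensionless ratio.
    """
    if tumour_diameter_mm <= 0:
        raise ValueError("tumour diameter must be positive")
    if interface_length_mm < 0:
        raise ValueError("interface length cannot be negative")
    return interface_length_mm / tumour_diameter_mm


def cutoff_group(pci: float, cutoff: float = 0.8) -> str:
    """Split patients at the cutoff reported in the abstract (0.8).

    Treating pci == cutoff as the upper group is an assumption made here.
    """
    return "at_or_above_cutoff" if pci >= cutoff else "below_cutoff"


if __name__ == "__main__":
    # Toy measurements, not data from the study.
    pci = pleural_contact_index(interface_length_mm=24.0, tumour_diameter_mm=30.0)
    print(round(pci, 3), cutoff_group(pci))
```

    The two resulting groups would then be compared with a log-rank test, as the abstract describes; that step is not shown here.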

  1. A Metabolic Gene Cluster in the Wheat W1 and the Barley Cer-cqu Loci Determines β-Diketone Biosynthesis and Glaucousness.

    Science.gov (United States)

    Hen-Avivi, Shelly; Savin, Orna; Racovita, Radu C; Lee, Wing-Sham; Adamski, Nikolai M; Malitsky, Sergey; Almekias-Siegl, Efrat; Levy, Matan; Vautrin, Sonia; Bergès, Hélène; Friedlander, Gilgi; Kartvelishvily, Elena; Ben-Zvi, Gil; Alkan, Noam; Uauy, Cristobal; Kanyuka, Kostya; Jetter, Reinhard; Distelfeld, Assaf; Aharoni, Asaph

    2016-06-01

    The glaucous appearance of wheat (Triticum aestivum) and barley (Hordeum vulgare) plants, that is the light bluish-gray look of flag leaf, stem, and spike surfaces, results from deposition of cuticular β-diketone wax on their surfaces; this phenotype is associated with high yield, especially under drought conditions. Despite extensive genetic and biochemical characterization, the molecular genetic basis underlying the biosynthesis of β-diketones remains unclear. Here, we discovered that the wheat W1 locus contains a metabolic gene cluster mediating β-diketone biosynthesis. The cluster comprises genes encoding proteins of several families including type-III polyketide synthases, hydrolases, and cytochrome P450s related to known fatty acid hydroxylases. The cluster region was identified in both genetic and physical maps of glaucous and glossy tetraploid wheat, demonstrating entirely different haplotypes in these accessions. Complementary evidence obtained through gene silencing in planta and heterologous expression in bacteria supports a model for a β-diketone biosynthesis pathway involving members of these three protein families. Mutations in homologous genes were identified in the barley eceriferum mutants defective in β-diketone biosynthesis, demonstrating a gene cluster also in the β-diketone biosynthesis Cer-cqu locus in barley. Hence, our findings open new opportunities to breed major cereal crops for surface features that impact yield and stress response. © 2016 American Society of Plant Biologists. All rights reserved.

  2. Cloning of the staurosporine biosynthetic gene cluster from Streptomyces sp. TP-A0274 and its heterologous expression in Streptomyces lividans.

    Science.gov (United States)

    Onaka, Hiroyasu; Taniguchi, Shin-ichi; Igarashi, Yasuhiro; Furumai, Tamotsu

    2002-12-01

    Staurosporine is a representative member of indolocarbazole antibiotics. The entire staurosporine biosynthetic and regulatory gene cluster spanning 20 kb was cloned from Streptomyces sp. TP-A0274 and sequenced. The gene cluster consists of 14 ORFs, and the amino acid sequence homology search revealed that it contains three genes, staO, staD, and staP, coding for the enzymes involved in the indolocarbazole aglycone biosynthesis; two genes, staG and staN, for the bond formation between the aglycone and deoxysugar; eight genes, staA, staB, staE, staJ, staI, staK, staMA, and staMB, for the deoxysugar biosynthesis; and one gene, staR, for a transcriptional regulator. Heterologous gene expression of a 38-kb fragment containing a complete set of the biosynthetic genes for staurosporine cloned into pTOYAMAcos confirmed its role in staurosporine biosynthesis. Moreover, the distribution of the gene for chromopyrrolic acid synthase, the key enzyme for the biosynthesis of indolocarbazole aglycone, in actinomycetes was investigated, and rebD homologs were shown to exist only in the strains producing indolocarbazole antibiotics.

  3. Gene Clusters for Insecticidal Loline Alkaloids in the Grass-Endophytic Fungus Neotyphodium uncinatum

    OpenAIRE

    Spiering, Martin J.; Moon, Christina D.; Wilkinson, Heather H.; Schardl, Christopher L.

    2005-01-01

    Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same ...

  4. Using SNP genetic markers to elucidate the linkage of the Co-34/Phg-3 anthracnose and angular leaf spot resistance gene cluster with the Ur-14 resistance gene

    Science.gov (United States)

    The Ouro Negro common bean cultivar contains the Co-34/Phg-3 gene cluster that confers resistance to the anthracnose (ANT) and angular leaf spot (ALS) pathogens. These genes are tightly linked on chromosome 4. Ouro Negro also has the Ur-14 rust resistance gene, reportedly in the vicinity of Co-34; ...

  5. Prognostic significance of IDH 1 mutation in patients with glioblastoma multiforme.

    Science.gov (United States)

    Khan, Inamullah; Waqas, Muhammad; Shamim, Muhammad Shahzad

    2017-05-01

    The focus of brain tumour research is shifting towards tumour genesis and genetics, and the possible development of individualized treatment plans. Genetic analysis shows recurrent mutation in the isocitrate dehydrogenase 1 (IDH1) gene in most glioblastoma multiforme (GBM) cells. In this review we evaluated the prognostic significance of IDH1 mutation on the basis of published evidence. Multiple retrospective clinical analyses correlate the presence of IDH1 mutation in GBM with good prognostic outcomes compared to wild-type IDH1. A systematic review reported similar results. Based on the review of current literature, IDH1 mutation is an independent factor for longer overall survival (OS) and progression-free survival (PFS) in GBM patients when compared to wild-type IDH1. The prognostic significance opens up new avenues for treatment.

  6. Identification of the chelocardin biosynthetic gene cluster from Amycolatopsis sulphurea: a platform for producing novel tetracycline antibiotics.

    Science.gov (United States)

    Lukežič, Tadeja; Lešnik, Urška; Podgoršek, Ajda; Horvat, Jaka; Polak, Tomaž; Šala, Martin; Jenko, Branko; Raspor, Peter; Herron, Paul R; Hunter, Iain S; Petković, Hrvoje

    2013-12-01

    Tetracyclines (TCs) are medically important antibiotics from the polyketide family of natural products. Chelocardin (CHD), produced by Amycolatopsis sulphurea, is a broad-spectrum tetracyclic antibiotic with potent bacteriolytic activity against a number of Gram-positive and Gram-negative multi-resistant pathogens. CHD has an unknown mode of action that is different from TCs. It has some structural features that define it as 'atypical' and, notably, is active against tetracycline-resistant pathogens. Identification and characterization of the chelocardin biosynthetic gene cluster from A. sulphurea revealed 18 putative open reading frames including a type II polyketide synthase. Compared to typical TCs, the chd cluster contains a number of features that relate to its classification as 'atypical': an additional gene for a putative two-component cyclase/aromatase that may be responsible for the different aromatization pattern, a gene for a putative aminotransferase for C-4 with the opposite stereochemistry to TCs and a gene for a putative C-9 methylase that is a unique feature of this biosynthetic cluster within the TCs. Collectively, these enzymes deliver a molecule with different aromatization of ring C that results in an unusual planar structure of the TC backbone. This is a likely contributor to its different mode of action. In addition CHD biosynthesis is primed with acetate, unlike the TCs, which are primed with malonamate, and offers a biosynthetic engineering platform that represents a unique opportunity for efficient generation of novel tetracyclic backbones using combinatorial biosynthesis.

  7. Transcriptional analysis of the jamaicamide gene cluster from the marine cyanobacterium Lyngbya majuscula and identification of possible regulatory proteins

    Directory of Open Access Journals (Sweden)

    Dorrestein Pieter C

    2009-12-01

    Full Text Available. Abstract. Background: The marine cyanobacterium Lyngbya majuscula is a prolific producer of bioactive secondary metabolites. Although biosynthetic gene clusters encoding several of these compounds have been identified, little is known about how these clusters of genes are transcribed or regulated, and techniques targeting genetic manipulation in Lyngbya strains have not yet been developed. We conducted transcriptional analyses of the jamaicamide gene cluster from a Jamaican strain of Lyngbya majuscula, and isolated proteins that could be involved in jamaicamide regulation. Results: An unusually long untranslated leader region of approximately 840 bp is located between the jamaicamide transcription start site (TSS) and gene cluster start codon. All of the intergenic regions between the pathway ORFs were transcribed into RNA in RT-PCR experiments; however, a promoter prediction program indicated the possible presence of promoters in multiple intergenic regions. Because the functionality of these promoters could not be verified in vivo, we used a reporter gene assay in E. coli to show that several of these intergenic regions, as well as the primary promoter preceding the TSS, are capable of driving β-galactosidase production. A protein pulldown assay was also used to isolate proteins that may regulate the jamaicamide pathway. Pulldown experiments using the intergenic region upstream of jamA as a DNA probe isolated two proteins that were identified by LC-MS/MS. By BLAST analysis, one of these had close sequence identity to a regulatory protein in another cyanobacterial species. Protein comparisons suggest a possible correlation between secondary metabolism regulation and light dependent complementary chromatic adaptation. Electromobility shift assays were used to evaluate binding of the recombinant proteins to the jamaicamide promoter region. Conclusion: Insights into natural product regulation in cyanobacteria are of significant value to drug discovery

  8. Plasmid Complement of Lactococcus lactis NCDO712 Reveals a Novel Pilus Gene Cluster.

    Science.gov (United States)

    Tarazanova, Mariya; Beerthuyzen, Marke; Siezen, Roland; Fernandez-Gutierrez, Marcela M; de Jong, Anne; van der Meulen, Sjoerd; Kok, Jan; Bachmann, Herwig

    2016-01-01

    Lactococcus lactis MG1363 is an important gram-positive model organism. It is a plasmid-free and phage-cured derivative of strain NCDO712. Plasmid-cured strains facilitate studies on molecular biological aspects, but many properties which make L. lactis an important organism in the dairy industry are plasmid encoded. We sequenced the total DNA of strain NCDO712 and, contrary to earlier reports, revealed that the strain carries 6 rather than 5 plasmids. A new 50-kb plasmid, designated pNZ712, encodes functional nisin immunity (nisCIP) and copper resistance (lcoRSABC). The copper resistance could be used as a marker for the conjugation of pNZ712 to L. lactis MG1614. A genome comparison with the plasmid-cured daughter strain MG1363 showed that the number of single nucleotide polymorphisms that accumulated in the laboratory since the strains diverged more than 30 years ago is limited to 11, of which only 5 lead to amino acid changes. The 16-kb plasmid pSH74 was found to contain a novel 8-kb pilus gene cluster spaCB-spaA-srtC1-srtC2, which is predicted to encode a pilin tip protein SpaC, a pilus basal subunit SpaB, and a pilus backbone protein SpaA. The sortases SrtC1/SrtC2 are most likely involved in pilus polymerization while the chromosomally encoded SrtA could act to anchor the pilus to peptidoglycan in the cell wall. Overexpression of the pilus gene cluster from a multi-copy plasmid in L. lactis MG1363 resulted in cell chaining, aggregation, rapid sedimentation and increased conjugation efficiency of the cells. Electron microscopy showed that the overexpression of the pilus gene cluster leads to appendages on the cell surfaces. A deletion of the gene encoding the putative basal protein spaB, by truncating spaCB, led to more pilus-like structures on the cell surface, but cell aggregation and cell chaining were no longer observed. This is consistent with the prediction that spaB is involved in the anchoring of the pili to the cell.

  9. Heterologous expression of the Halothiobacillus neapolitanus carboxysomal gene cluster in Corynebacterium glutamicum.

    Science.gov (United States)

    Baumgart, Meike; Huber, Isabel; Abdollahzadeh, Iman; Gensch, Thomas; Frunzke, Julia

    2017-09-20

    Compartmentalization represents a ubiquitous principle used by living organisms to optimize metabolic flux and to avoid detrimental interactions within the cytoplasm. Proteinaceous bacterial microcompartments (BMCs) have therefore created strong interest for the encapsulation of heterologous pathways in microbial model organisms. However, attempts were so far mostly restricted to Escherichia coli. Here, we introduced the carboxysomal gene cluster of Halothiobacillus neapolitanus into the biotechnological platform species Corynebacterium glutamicum. Transmission electron microscopy, fluorescence microscopy and single molecule localization microscopy suggested the formation of BMC-like structures in cells expressing the complete carboxysome operon or only the shell proteins. Purified carboxysomes consisted of the expected protein components as verified by mass spectrometry. Enzymatic assays revealed the functional production of RuBisCO in C. glutamicum both in the presence and absence of carboxysomal shell proteins. Furthermore, we could show that eYFP is targeted to the carboxysomes by fusion to the large RuBisCO subunit. Overall, this study represents the first transfer of an α-carboxysomal gene cluster into a Gram-positive model species supporting the modularity and orthogonality of these microcompartments, but also identified important challenges which need to be addressed on the way towards biotechnological application. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Identification of the Biosynthetic Gene Clusters for the Lipopeptides Fusaristatin A and W493 B in Fusarium graminearum and F. pseudograminearum

    DEFF Research Database (Denmark)

    Sørensen, Jens Laurids; Sondergaard, Teis Esben; Covarelli, Lorenzo

    2014-01-01

    The closely related species Fusarium graminearum and Fusarium pseudograminearum differ in that each contains a gene cluster with a polyketide synthase (PKS) and a nonribosomal peptide synthetase (NRPS) that is not present in the other species. To identify their products, we deleted PKS6 and NRPS7...... Fusarium species. On the basis of genes in the putative gene clusters we propose a model for biosynthesis where the polyketide product is shuttled to the NRPS via a CoA ligase and a thioesterase in F. pseudograminearum. In F. graminearum the polyketide is proposed to be directly assimilated by the NRPS....

  11. High GC Content Cas9-Mediated Genome-Editing and Biosynthetic Gene Cluster Activation in Saccharopolyspora erythraea.

    Science.gov (United States)

    Liu, Yong; Wei, Wen-Ping; Ye, Bang-Ce

    2018-05-18

    The overexpression of bacterial secondary metabolite biosynthetic enzymes is the basis for industrial overproducing strains. Genome editing tools can be used to further improve gene expression and yield. Saccharopolyspora erythraea produces erythromycin, which has extensive clinical applications. In this study, the CRISPR-Cas9 system was used to edit genes in the S. erythraea genome. A temperature-sensitive plasmid containing the PermE promoter, to drive Cas9 expression, and the Pj23119 and PkasO promoters, to drive sgRNAs, was designed. Erythromycin esterase, encoded by S. erythraea SACE_1765, inactivates erythromycin by hydrolyzing the macrolactone ring. Sequencing and qRT-PCR confirmed that reporter genes were successfully inserted into the SACE_1765 gene. Deletion of SACE_1765 in a high-producing strain resulted in a 12.7% increase in erythromycin levels. Subsequent PermE-egfp knock-in at the SACE_0712 locus resulted in an 80.3% increase in erythromycin production compared with that of wild type. Further investigation showed that PermE promoter knock-in activated the erythromycin biosynthetic gene clusters at the SACE_0712 locus. Additionally, deletion of indA (SACE_1229) using dual sgRNA targeting without markers increased the editing efficiency to 65%. In summary, we have successfully applied Cas9-based genome editing to a bacterial strain, S. erythraea, with a high GC content. This system has potential application for both genome-editing and biosynthetic gene cluster activation in Actinobacteria.

  12. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.
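    The Kaplan-Meier step named in this abstract can be illustrated with a minimal sketch. The function name, the toy follow-up times and the event flags below are assumptions made for illustration only; they are not data or code from the study.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time of each subject
    observed:  True if the event (death) was seen, False if censored
    """
    deaths = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # every subject leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit update: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort (months); not taken from the ALS database.
toy_durations = [6, 12, 12, 20, 30]
toy_observed = [True, True, False, True, False]
km_curve = kaplan_meier(toy_durations, toy_observed)
```

    Computing one such curve per phenotypic class and comparing the curves is the usual way a grouping is checked for prognostic relevance; the comparison test itself is not shown here.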

  13. Identification of a trichothecene gene cluster and description of the harzianum A biosynthesis pathway in the fungus Trichoderma arundinaceum

    Science.gov (United States)

    Trichothecenes are sesquiterpenes that act like mycotoxins. Their biosynthesis has been mainly studied in the fungal genus Fusarium, where most of the biosynthetic genes (tri) are grouped in a cluster regulated by ambient conditions and regulatory genes. Unexpectedly, few studies are available abou...

  14. Fuzzy C-means method for clustering microarray data.

    Science.gov (United States)

    Dembélé, Doulaye; Kastner, Philippe

    2003-05-22

    Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes. A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes which are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster. Supplementary text and Matlab functions are available at http://www-igbmc.u-strasbg.fr/fcm/
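    To make the role of the fuzziness parameter m concrete, the sketch below applies the standard FCM membership update to one gene at a time and then thresholds the memberships. It is not the authors' Matlab code; the function names, toy distances and the 0.7 threshold are assumptions chosen for illustration.

```python
def fcm_memberships(distances, m=2.0):
    """Membership of one gene in each cluster, given its distances to the centroids.

    Standard FCM update: u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)).
    """
    if m <= 1.0:
        raise ValueError("fuzziness parameter m must be > 1")
    # A gene lying exactly on one or more centroids is shared among those only.
    zeros = [d == 0 for d in distances]
    if any(zeros):
        n = sum(zeros)
        return [1.0 / n if z else 0.0 for z in zeros]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


def select_core_genes(membership_rows, threshold):
    """Indices of genes whose highest membership reaches the threshold."""
    return [i for i, row in enumerate(membership_rows) if max(row) >= threshold]


# Hypothetical distances of two genes to two centroids.
toy_distances = [[1.0, 2.0], [1.0, 1.1]]
rows_m2 = [fcm_memberships(d, m=2.0) for d in toy_distances]
rows_m1_2 = [fcm_memberships(d, m=1.2) for d in toy_distances]
core = select_core_genes(rows_m2, threshold=0.7)
```

    With m = 2 the second gene is split almost evenly between the two clusters, whereas a smaller m pushes its memberships towards a crisp assignment, which is why the choice of m changes which genes pass a fixed threshold.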

  15. Microbial communication leading to the activation of silent fungal secondary metabolite gene clusters

    Directory of Open Access Journals (Sweden)

    Tina Netzker

    2015-04-01

    Microorganisms form diverse multispecies communities in various ecosystems. The high abundance of fungal and bacterial species in these consortia results in specific communication between the microorganisms. A key role in this communication is played by secondary metabolites (SMs), which are also called natural products. Recently, it was shown that interspecies ‘talk’ between microorganisms represents a physiological trigger to activate silent gene clusters leading to the formation of novel SMs by the involved species. This review focuses on mixed microbial cultivation, mainly between bacteria and fungi, with a special emphasis on the induced formation of fungal SMs in co-cultures. In addition, the role of chromatin remodeling in the induction is examined, and methodical perspectives for the analysis of natural products are presented. As an example for an intermicrobial interaction elucidated at the molecular level, we discuss the specific interaction between the filamentous fungi Aspergillus nidulans and Aspergillus fumigatus with the soil bacterium Streptomyces rapamycinicus, which provides an excellent model system to enlighten molecular concepts behind regulatory mechanisms and will pave the way to a novel avenue of drug discovery through targeted activation of silent SM gene clusters through co-cultivations of microorganisms.

  16. Mycobiota and identification of aflatoxin gene cluster in marketed spices in West Africa

    DEFF Research Database (Denmark)

    Gnonlonfin, G. J. B.; Adjovi, Y. C.; Tokpo, A. F.

    2013-01-01

    Fungal infection and aflatoxin contamination were evaluated on 114 samples of dried and milled spices such as ginger, garlic and black pepper from southern Benin and Togo collected in November 2008-January 2009. These products are dried to preserve them for lean periods available throughout...... of Aspergillus were dominant on all marketed dried and milled spices irrespective of country. Gene characterization and amplification analysis showed that most of the Aspergillus flavus isolates possess the cluster genes for aflatoxin production. Aflatoxin B1 assessment by Thin Layer Chromatography showed...... further for other products such as dried and milled spices. Crown Copyright (C) 2013 Published by Elsevier Ltd. All rights reserved....

  17. The prognostic implications of growth-related gene product β in laryngeal squamous cell carcinoma.

    Science.gov (United States)

    Tang, Mingming; Xu, Xinjiang; Chen, Juanjuan; Huang, Jiangfei; Jiang, Bin; Han, Liang

    2017-09-01

    Growth-related gene product β (GROβ) is an angiogenic chemokine that belongs to the CXC chemokine family, and a number of studies have suggested that GROβ is associated with tumor development and progression. However, few studies have investigated the association between GROβ expression and the clinical attributes of laryngeal squamous cell carcinoma (LSCC). In the present study, one-step quantitative polymerase chain reaction and immunohistochemistry analysis were used to detect GROβ expression and evaluate the association between its expression and the clinicopathological characteristics of LSCC. The results demonstrated that the GROβ mRNA and protein expression levels were significantly increased in LSCC compared with the corresponding non-cancerous tissues. GROβ protein expression in LSCC was associated with tumor-node-metastasis stage, lymph node metastasis and histopathological grade. The Kaplan-Meier method and Cox multi-factor analysis indicated that high GROβ expression, lymph node metastasis and histopathological grade were significantly associated with poor survival of patients with LSCC. These data indicated that GROβ may be a novel prognostic biomarker of LSCC.

  18. Distributed Prognostics and Health Management with a Wireless Network Architecture

    Science.gov (United States)

    Goebel, Kai; Saha, Sankalita; Sha, Bhaskar

    2013-01-01

    A heterogeneous set of system components monitored by a varied suite of sensors and a particle-filtering (PF) framework, with the power and the flexibility to adapt to the different diagnostic and prognostic needs, has been developed. Both the diagnostic and prognostic tasks are formulated as a particle-filtering problem in order to explicitly represent and manage uncertainties in state estimation and remaining life estimation. Current state-of-the-art prognostic health management (PHM) systems are mostly centralized in nature, where all the processing is reliant on a single processor. This can lead to a loss in functionality in case of a crash of the central processor or monitor. Furthermore, with increases in the volume of sensor data as well as the complexity of algorithms, traditional centralized systems become for a number of reasons somewhat ungainly for successful deployment, and efficient distributed architectures can be more beneficial. The distributed health management architecture is comprised of a network of smart sensor devices. These devices monitor the health of various subsystems or modules. They perform diagnostics operations and trigger prognostics operations based on user-defined thresholds and rules. The sensor devices, called computing elements (CEs), consist of a sensor, or set of sensors, and a communication device (i.e., a wireless transceiver beside an embedded processing element). The CE runs in either a diagnostic or prognostic operating mode. The diagnostic mode is the default mode where a CE monitors a given subsystem or component through a low-weight diagnostic algorithm. If a CE detects a critical condition during monitoring, it raises a flag. Depending on availability of resources, a networked local cluster of CEs is formed that then carries out prognostics and fault mitigation by efficient distribution of the tasks. It should be noted that the CEs are expected not to suspend their previous tasks in the prognostic mode. When the

  19. Structural Diversification of Lyngbyatoxin A by Host-Dependent Heterologous Expression of the tleABC Biosynthetic Gene Cluster.

    Science.gov (United States)

    Zhang, Lihan; Hoshino, Shotaro; Awakawa, Takayoshi; Wakimoto, Toshiyuki; Abe, Ikuro

    2016-08-03

    Natural products have enormous structural diversity, yet little is known about how such diversity is achieved in nature. Here we report the structural diversification of a cyanotoxin, lyngbyatoxin A, and its biosynthetic intermediates by heterologous expression of the Streptomyces-derived tleABC biosynthetic gene cluster in three different Streptomyces hosts: S. lividans, S. albus, and S. avermitilis. Notably, the isolated lyngbyatoxin derivatives, including four new natural products, were biosynthesized by crosstalk between the heterologous tleABC gene cluster and the endogenous host enzymes. The simple strategy described here has expanded the structural diversity of lyngbyatoxin A and its biosynthetic intermediates, and provides opportunities for investigation of the currently underestimated hidden biosynthetic crosstalk. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. The Genome Sequence of the Cyanobacterium Oscillatoria sp. PCC 6506 Reveals Several Gene Clusters Responsible for the Biosynthesis of Toxins and Secondary Metabolites

    Science.gov (United States)

    Méjean, Annick; Mazmouz, Rabia; Mann, Stéphane; Calteau, Alexandra; Médigue, Claudine; Ploux, Olivier

    2010-01-01

    We report a draft sequence of the genome of Oscillatoria sp. PCC 6506, a cyanobacterium that produces anatoxin-a and homoanatoxin-a, two neurotoxins, and cylindrospermopsin, a cytotoxin. Besides the clusters of genes responsible for the biosynthesis of these toxins, we have found other clusters of genes likely involved in the biosynthesis of not-yet-identified secondary metabolites. PMID:20675499

  1. A highly divergent gene cluster in honey bees encodes a novel silk family.

    Science.gov (United States)

    Sutherland, Tara D; Campbell, Peter M; Weisman, Sarah; Trueman, Holly E; Sriskantha, Alagacone; Wanjura, Wolfgang J; Haritos, Victoria S

    2006-11-01

    The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1-4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-rich amid low GC intergenic regions. The genes encode similar proteins that are highly helical and predicted to form unusually tight coiled coils. Despite the similarity in size, structure, and composition of the encoded proteins, the genes have low primary sequence identity. We propose that the four fiber genes have arisen from gene duplication events but have subsequently diverged significantly. The silk-associated genes encode proteins likely to act as a glue (AmelSA1) and involved in silk processing (AmelSA2). Although the silks of honey bees and silkmoths both originate in larval labial glands, the silk proteins are completely different in their primary, secondary, and tertiary structures as well as the genomic arrangement of the genes encoding them. This implies independent evolutionary origins for these functionally related proteins.

  2. [The diagnostic value of microsatellite LOH analysis and the prognostic relevance of angiogenic gene expression in urinary bladder cancer].

    Science.gov (United States)

    Szarvas, Tibor

    2009-12-01

    Bladder cancer is the second most common malignancy affecting the urinary system. Currently, histology is the only tool that determines therapy and patients' prognosis. As the treatment of non-invasive (Ta/T1) and muscle invasive (T2-T4) bladder tumors is completely different, correct staging is important, although it is often hampered by disturbing factors. Molecular methods offer new prospects for early disease detection, confirmation of unclear histological findings and prognostication. Applying molecular biological methods, the present study searches for answers to current diagnostic and prognostic problems in bladder carcinoma. We analyzed tumor, blood and/or urine samples of 334 bladder cancer patients and 117 control individuals. Genetic alterations were analyzed in urine samples of patients and controls, both by PCR-based microsatellite loss of heterozygosity (LOH) analysis using 12 fluorescently labeled primers and by DNA hybridization based UroVysion FISH technique using 4 probes, to assess the diagnostic values of these methods. Whole genome microsatellite analysis (with 400 markers) was performed in tumor and blood specimens of bladder cancer patients to find chromosomal regions, the loss of which may be associated with tumor stage. Furthermore, we assessed the prognostic value of Tie2, VEGF, Angiopoietin-1 and -2. We concluded that DNA analysis of voided urine samples by microsatellite analysis and FISH are sensitive and non-invasive methods to detect bladder cancer. Furthermore, we established a panel of microsatellite markers that could differentiate between non-invasive and invasive bladder cancer. However, further analyses in a larger cohort of patients are needed to assess their specificity and sensitivity. Finally, we identified high Ang-2 and low Tie2 gene expression as significant and independent risk factors of tumor recurrence and cancer related survival.

  3. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    KAUST Repository

    Li, Yongxin

    2015-03-24

    Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.

  4. Output ordering and prioritisation system (OOPS): ranking biosynthetic gene clusters to enhance bioactive metabolite discovery.

    Science.gov (United States)

    Peña, Alejandro; Del Carratore, Francesco; Cummings, Matthew; Takano, Eriko; Breitling, Rainer

    2017-12-18

    The rapid increase of publicly available microbial genome sequences has highlighted the presence of hundreds of thousands of biosynthetic gene clusters (BGCs) encoding valuable secondary metabolites. The experimental characterization of new BGCs is extremely laborious and struggles to keep pace with the in silico identification of potential BGCs. Therefore, the prioritisation of promising candidates among computationally predicted BGCs represents a pressing need. Here, we propose an output ordering and prioritisation system (OOPS) which helps sort identified BGCs by a wide variety of custom-weighted biological and biochemical criteria in a flexible and user-friendly interface. OOPS facilitates a judicious prioritisation of BGCs using G+C content, coding sequence length, gene number, cluster self-similarity and codon bias parameters, as well as enabling the user to rank BGCs based upon BGC type, novelty, and taxonomic distribution. Effective prioritisation of BGCs will help to reduce experimental attrition rates and improve the breadth of bioactive metabolites characterized.
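    The idea of ordering BGCs by custom-weighted criteria can be sketched as a weighted sum of normalised scores. This is only an illustration of the general technique and does not reproduce the scoring actually used by OOPS; the field names, weights and toy records are assumptions.

```python
def rank_bgcs(bgcs, weights):
    """Order BGC records by a weighted sum of min-max normalised criteria.

    A positive weight favours high values of a criterion, a negative weight
    favours low values, and a zero weight ignores the criterion.
    """
    def normalise(key):
        values = [b[key] for b in bgcs]
        lo, hi = min(values), max(values)
        span = hi - lo
        return [0.0 if span == 0 else (v - lo) / span for v in values]

    columns = {key: normalise(key) for key in weights}
    scores = [
        sum(weights[key] * columns[key][i] for key in weights)
        for i in range(len(bgcs))
    ]
    order = sorted(range(len(bgcs)), key=lambda i: scores[i], reverse=True)
    return [(bgcs[i]["name"], round(scores[i], 3)) for i in order]


# Hypothetical records; names and values are invented for illustration.
toy_bgcs = [
    {"name": "bgc_A", "gc_content": 0.72, "gene_number": 12, "self_similarity": 0.10},
    {"name": "bgc_B", "gc_content": 0.65, "gene_number": 30, "self_similarity": 0.40},
    {"name": "bgc_C", "gc_content": 0.70, "gene_number": 21, "self_similarity": 0.05},
]
toy_weights = {"gc_content": 1.0, "gene_number": 0.5, "self_similarity": -1.0}
ranking = rank_bgcs(toy_bgcs, toy_weights)
```

    Min-max normalisation keeps criteria on different scales (a fraction versus a gene count) comparable before the weights are applied; other normalisations, such as rank-based ones, would fit the same interface.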

  5. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    Science.gov (United States)

    Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan

    2015-03-01

    Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.

  6. Investigation of the prognostic value of the apoptotic marker p53 gene and vascular endothelial growth factor in evaluating the clinical course of nasopharyngeal angiofibroma

    Directory of Open Access Journals (Sweden)

    O. B. Abdurakhmanov

    2015-01-01

    Objective. To investigate the prognostic value of the apoptotic marker (p53) and vascular endothelial growth factor (VEGF) in evaluating the clinical course of juvenile nasopharyngeal angiofibroma (JNA). Subjects and methods. The investigation enrolled 43 patients with primary JNA (a study group) and 20 with its relapses (a control group). The expression of VEGF and the mutant p53 (mtp53) gene was immunohistochemically determined using DAKO kits (Denmark). The results of reactions with antibodies to VEGF-A and mtp53 located in the nuclei and membranes were expressed as percentages in terms of stained cell counts per 100 cells examined in different visual fields. Results. An associative analysis showed that both study and control group patients with high mtp53 gene expression in the tumor cells had clinical stages IIIA–B and IV and those in whom the expression of this gene in the tumor cells was weak or absent were found to have clinical stages I and II. The high (3+) and moderate (2+) mtp53 gene expressions suggest that the disease is severe. Consequently, this is of prognostic value as a predictor of poor outcome, whereas the absence of mutations or the decreased expression of this gene is associated with a favorable disease outcome. Our investigations indicated that the high expression of the VEGF gene was detected in none of the tumor specimens. In the study group, the tumor cell expression of this gene was found to be moderate (2+) in 18 (41.9 %) patients, weak in 6 (13.9 %) and absent in 19 (44.2 %) of the 43 patients. In the control group, the absence of VEGF gene expression in the tumor specimens was 9 times lower than that in the study group. A comparison with the clinical characteristics of the patients demonstrated that in both the study and control groups, the VEGF expression was observed to be moderate, or weak and absent in those with clinical stages IIIA–B and IV or in those with stage II and I, respectively. Conclusion. The associative analysis showed that both

  7. Investigation of pathogenic genes in peri-implantitis from implant clustering failure patients: a whole-exome sequencing pilot study.

    Directory of Open Access Journals (Sweden)

    Soohyung Lee

    Peri-implantitis is a frequently occurring gum disease linked to multi-factorial traits with various environmental and genetic causalities and no known concrete pathogenesis. The varying severity of peri-implantitis among patients with relatively similar environments suggests a genetic aspect which needs to be investigated to understand and regulate the pathogenesis of the disease. Six unrelated individuals with multiple clusterization implant failure due to severe peri-implantitis were chosen for this study. These six individuals had relatively healthy lifestyles, with minimal environmental causalities affecting peri-implantitis. Research was undertaken to investigate pathogenic genes in peri-implantitis albeit with a small number of subjects and incomplete elimination of environmental causalities. Whole-exome sequencing was performed on saliva samples collected via a self DNA collection kit. Common variants with minor allele frequencies (MAF) >= 0.05 from all control datasets were eliminated and variants having high and moderate impact and loss of function were used for comparison. Gene set enrichment analysis was performed to reveal functional groups associated with the genetic variants. 2,022 genes were left after filtering against dbSNP, the 1000 Genomes East Asian population, and healthy Korean randomized subsample data (GSK project). 175 (p-value <0.05) out of 927 gene sets were obtained via GSEA (DAVID). The top 10 were chosen (p-value <0.05) from cluster enrichment showing significance of cytoskeleton, cell adhesion, and metal ion binding. Network analysis was applied to find relationships between functional clusters. Among the functional groups, metal ion binding was located in the center of all clusters, indicating dysfunction of regulation in metal ion concentration might affect cell morphology or cell adhesion, resulting in implant failure. This result may demonstrate the feasibility of and provide pilot data for a larger research

  8. CMS-dependent prognostic impact of KRAS and BRAFV600E mutations in primary colorectal cancer.

    Science.gov (United States)

    Smeby, J; Sveen, A; Merok, M A; Danielsen, S A; Eilertsen, I A; Guren, M G; Dienstmann, R; Nesbakken, A; Lothe, R A

    2018-05-01

    The prognostic impact of KRAS and BRAFV600E mutations in primary colorectal cancer (CRC) varies with microsatellite instability (MSI) status. The gene expression-based consensus molecular subtypes (CMSs) of CRC define molecularly and clinically distinct subgroups, and represent a novel stratification framework in biomarker analysis. We investigated the prognostic value of these mutations within the CMS groups. In total, 1197 primary tumors from a Norwegian series of CRC stage I-IV were analyzed for MSI and mutation status in hotspots in KRAS (codons 12, 13 and 61) and BRAF (codon 600). A subset was analyzed for gene expression and confident CMS classification was obtained for 317 samples. This cohort was expanded with clinical and molecular data, including CMS classification, from 514 patients in the publicly available dataset GSE39582. Gene expression signatures associated with KRAS and BRAFV600E mutations were used to evaluate differential impact of mutations on gene expression among the CMS groups. BRAFV600E and KRAS mutations were both associated with inferior 5-year overall survival (OS) exclusively in MSS tumors (BRAFV600E mutation versus KRAS/BRAF wild-type: Hazard ratio (HR) 2.85, P CMS1, leading to negative prognostic impact in this subtype (OS: BRAFV600E mutation versus wild-type: HR 7.73, P = 0.001). In contrast, the poor prognosis of KRAS mutations was limited to MSS tumors with CMS2/CMS3 epithelial-like gene expression profiles (OS: KRAS mutation versus wild-type: HR 1.51, P = 0.011). The subtype-specific prognostic associations were substantiated by differential effects of BRAFV600E and KRAS mutations on gene expression signatures according to the MSI status and CMS group. BRAFV600E mutations are enriched and associated with metastatic disease in CMS1 MSS tumors, leading to poor prognosis in this subtype. KRAS mutations are associated with adverse outcome in epithelial (CMS2/CMS3) MSS tumors.

  9. The prognostic impact of mutations in spliceosomal genes for myelodysplastic syndrome patients without ring sideroblasts

    International Nuclear Information System (INIS)

    Kang, Min-Gu; Kim, Hye-Ran; Seo, Bo-Young; Lee, Jun Hyung; Choi, Seok-Yong; Kim, Soo-Hyun; Shin, Jong-Hee; Suh, Soon-Pal; Ahn, Jae-Sook; Shin, Myung-Geun

    2015-01-01

    Mutations in genes that are part of the splicing machinery for myelodysplastic syndromes (MDS), including MDS without ring sideroblasts (RS), have been widely investigated. The effects of these mutations on clinical outcomes have been diverse and contrasting. We examined a cohort of 129 de novo MDS patients, who did not harbor RS, for mutations affecting three spliceosomal genes (SF3B1, U2AF1, and SRSF2). The mutation rates of SF3B1, U2AF1, and SRSF2 were 7.0 %, 7.8 %, and 10.1 %, respectively. Compared with previously reported results, these rates were relatively low. The SRSF2 mutation strongly correlated with old age (P < 0.001), while the mutation status of SF3B1 did not affect overall survival (OS), progression-free survival (PFS), or acute myeloid leukemia (AML) transformation. In contrast, MDS patients with mutations in U2AF1 or SRSF2 exhibited inferior PFS. The U2AF1 mutation was associated with inferior OS in low-risk MDS patients (P = 0.035). The SRSF2 mutation was somewhat associated with AML transformation (P = 0.083). Our findings suggest that the frequencies of the SF3B1, U2AF1, and SRSF2 splicing gene mutations in MDS without RS were relatively low. We also demonstrated that the U2AF1 and SRSF2 mutations were associated with an unfavorable prognostic impact in MDS patients without RS. The online version of this article (doi:10.1186/s12885-015-1493-5) contains supplementary material, which is available to authorized users.

  10. Heterogeneous gene expression signatures correspond to distinct lung pathologies and biomarkers of disease severity in idiopathic pulmonary fibrosis.

    Science.gov (United States)

    DePianto, Daryle J; Chandriani, Sanjay; Abbas, Alexander R; Jia, Guiquan; N'Diaye, Elsa N; Caplazi, Patrick; Kauder, Steven E; Biswas, Sabyasachi; Karnik, Satyajit K; Ha, Connie; Modrusan, Zora; Matthay, Michael A; Kukreja, Jasleen; Collard, Harold R; Egen, Jackson G; Wolters, Paul J; Arron, Joseph R

    2015-01-01

    There is microscopic spatial and temporal heterogeneity of pathological changes in idiopathic pulmonary fibrosis (IPF) lung tissue, which may relate to heterogeneity in pathophysiological mediators of disease and clinical progression. We assessed relationships between gene expression patterns, pathological features, and systemic biomarkers to identify biomarkers that reflect the aggregate disease burden in patients with IPF. Gene expression microarrays (N=40 IPF; 8 controls) and immunohistochemical analyses (N=22 IPF; 8 controls) were performed on lung biopsies. Clinical characterisation and blood biomarker levels of MMP3 and CXCL13 were assessed in a separate cohort of patients with IPF (N=80). In total, 2940 genes were significantly differentially expressed between IPF and control samples (|fold change| >1.5, p<0.05). Two clusters of co-regulated genes related to bronchiolar epithelium or lymphoid aggregates exhibited substantial heterogeneity within the IPF population. Gene expression in bronchiolar and lymphoid clusters corresponded to the extent of bronchiolisation and lymphoid aggregates determined by immunohistochemistry in adjacent tissue sections. Elevated serum levels of MMP3, encoded in the bronchiolar cluster, and CXCL13, encoded in the lymphoid cluster, corresponded to disease severity and shortened survival time (p<10^-7 for MMP3 and p<10^-5 for CXCL13; Cox proportional hazards model). Microscopic pathological heterogeneity in IPF lung tissue corresponds to specific gene expression patterns related to bronchiolisation and lymphoid aggregates. MMP3 and CXCL13 are systemic biomarkers that reflect the aggregate burden of these pathological features across total lung tissue. These biomarkers may have clinical utility as prognostic and/or surrogate biomarkers of disease activity in interventional studies in IPF. Published by the BMJ Publishing Group Limited.
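    The differential-expression cut-off quoted in this abstract (|fold change| > 1.5, p < 0.05) amounts to a simple two-condition filter. The sketch below assumes that fold changes are stored as signed ratios (for example -2.0 for a two-fold decrease), which the abstract does not state; the gene names and values are hypothetical.

```python
def differentially_expressed(records, min_abs_fold=1.5, max_p=0.05):
    """Keep genes passing both the fold-change and the p-value cut-off.

    records: iterable of (gene, signed_fold_change, p_value) tuples
    """
    return [
        gene
        for gene, fold_change, p_value in records
        if abs(fold_change) > min_abs_fold and p_value < max_p
    ]


# Hypothetical toy records; not values from the study.
toy_records = [
    ("GENE_A", 2.3, 0.001),    # up, significant        -> kept
    ("GENE_B", -1.8, 0.03),    # down, significant      -> kept
    ("GENE_C", 1.2, 0.0004),   # change too small       -> dropped
    ("GENE_D", -3.0, 0.2),     # p-value above cut-off  -> dropped
]
hits = differentially_expressed(toy_records)
```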

  11. In planta functions of cytochrome P450 monooxygenase genes in the phytocassane biosynthetic gene cluster on rice chromosome 2.

    Science.gov (United States)

    Ye, Zhongfeng; Yamazaki, Kohei; Minoda, Hiromi; Miyamoto, Koji; Miyazaki, Sho; Kawaide, Hiroshi; Yajima, Arata; Nojiri, Hideaki; Yamane, Hisakazu; Okada, Kazunori

    2018-06-01

    In response to environmental stressors such as blast fungal infections, rice produces phytoalexins, antimicrobial diterpenoid compounds. Together with momilactones, phytocassanes are among the major diterpenoid phytoalexins. The biosynthetic genes of diterpenoid phytoalexins are organized on the chromosome in functional gene clusters, comprising diterpene cyclase, dehydrogenase, and cytochrome P450 monooxygenase genes. Their functions have been studied extensively using in vitro enzyme assay systems. Specifically, P450 genes (CYP71Z6, Z7; CYP76M5, M6, M7, M8) on rice chromosome 2 have multifunctional activities associated with ent-copalyl diphosphate-related diterpene hydrocarbons, but the in planta contribution of these genes to diterpenoid phytoalexin production remains unknown. Here, we characterized cyp71z7 T-DNA mutant and CYP76M7/M8 RNAi lines to find that potential phytoalexin intermediates accumulated in these P450-suppressed rice plants. The results suggested that in planta, CYP71Z7 is responsible for C2-hydroxylation of phytocassanes and that CYP76M7/M8 is involved in C11α-hydroxylation of 3-hydroxy-cassadiene. Based on these results, we proposed potential routes of phytocassane biosynthesis in planta.

  12. Prognostics for Microgrid Components

    Science.gov (United States)

    Saxena, Abhinav

    2012-01-01

    Prognostics is the science of predicting future performance and potential failures based on targeted condition monitoring. Moving away from the traditional reliability-centric view, prognostics aims at detecting and quantifying the time to impending failures. This advance warning provides the opportunity to take actions that can preserve uptime, reduce cost of damage, or extend the life of the component. The talk will focus on the concepts and basics of prognostics from the viewpoint of condition-based systems health management. Differences with other techniques used in systems health management and philosophies of prognostics used in other domains will be shown. Examples relevant to microgrid systems and subsystems will be used to illustrate various types of prediction scenarios and the resources it takes to set up a desired prognostic system. Specifically, the implementation results for power storage and power semiconductor components will demonstrate specific solution approaches of prognostics. The role of constituent elements of prognostics, such as model, prediction algorithms, failure threshold, run-to-failure data, requirements and specifications, and post-prognostic reasoning will be explained. A discussion on performance evaluation and performance metrics will conclude the technical discussion, followed by general comments on open research problems and challenges in prognostics.

  13. Prognostic significance of cyclin D1 protein expression and gene amplification in invasive breast carcinoma.

    Directory of Open Access Journals (Sweden)

    Angela B Ortiz

    The oncogenic capacity of cyclin D1 has long been established in breast cancer. CCND1 amplification has been identified in a subset of patients with poor prognosis, but there are conflicting data regarding the predictive value of cyclin D1 protein overexpression. This study was designed to analyze the expression of cyclin D1 and its correlation with CCND1 amplification and their prognostic implications in invasive breast cancer. By using the tissue microarray technique, we performed an immunohistochemical study of ER, PR, HER2, p53, cyclin D1, Ki67 and p16 in 179 invasive breast carcinoma cases. The FISH method was performed to detect HER2/Neu and CCND1 amplification. High cyclin D1 expression was identified in 94/179 (52%) of invasive breast cancers. Cyclin D1 overexpression and CCND1 amplification were significantly associated (p = 0.010). Overexpression of cyclin D1 correlated with ER expression, PR expression and Luminal subtypes (p<0.001), with a favorable impact on overall survival in the whole series. However, in the Luminal A group, high expression of cyclin D1 correlated with shorter disease-free survival, suggesting that the prognostic role of cyclin D1 depends on the molecular subtype. CCND1 gene amplification was detected in 17 cases (9%) and correlated significantly with high tumor grade (p = 0.038), high Ki-67 protein expression (p = 0.002), and the Luminal B subtype (p = 0.002). Patients with tumors with high amplification of CCND1 had an increased risk of recurrence (HR = 2.5; 95% CI, 1.2-4.9, p = 0.01). These findings suggest that CCND1 amplification could be useful for predicting recurrence in invasive breast cancer.

  14. Association of Interleukin-1 gene clusters polymorphisms with primary open-angle glaucoma: a meta-analysis.

    Science.gov (United States)

    Li, Junhua; Feng, Yifan; Sung, Mi Sun; Lee, Tae Hee; Park, Sang Woo

    2017-11-28

    Previous studies have associated polymorphisms of the Interleukin-1 (IL-1) gene cluster with the risk of primary open-angle glaucoma (POAG). However, the results were not consistent. Here, we performed a meta-analysis to evaluate the role of IL-1 gene cluster polymorphisms in POAG susceptibility. PubMed, EMBASE and Cochrane Library (up to July 15, 2017) were searched by two independent investigators. All case-control studies investigating the association between single-nucleotide polymorphisms (SNPs) of the IL-1 gene cluster and POAG risk were included. Odds ratios (ORs) with 95% confidence intervals (CIs) were calculated to quantify the strength of each association that had been examined in at least two studies. Five studies on IL-1β rs16944 (c. -511C > T) (1053 cases and 986 controls), 4 studies on IL-1α rs1800587 (c. -889C > T) (822 cases and 714 controls), and 4 studies on IL-1β rs1143634 (c. +3953C > T) (798 cases and 730 controls) were included. The results suggest that none of the three SNPs was associated with POAG risk. Stratification analyses indicated that rs1143634 has a suggestive association with high tension glaucoma (HTG) under dominant (P = 0.03), heterozygote (P = 0.04) and allelic models (P = 0.02); however, this weak association was nullified after Bonferroni adjustment for multiple tests. Based on the current meta-analysis, we found a lack of association between the three IL-1 SNPs and POAG. However, because of the low statistical power, this conclusion should be interpreted with caution, and further well-designed studies with large sample sizes are required to validate it.
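    The two statistics named in the abstract above, an odds ratio with a 95% confidence interval and a Bonferroni adjustment, can be sketched as follows. This is a minimal illustration under stated assumptions: the 2x2 counts are hypothetical, the interval uses the standard log (Woolf) method for a single study rather than the pooled estimate a meta-analysis would compute, and the number of tests (three) is assumed from the three genetic models listed, not taken from the paper.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed,
                  ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table."""
    odds_ratio = (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se = math.sqrt(1 / case_exposed + 1 / case_unexposed
                   + 1 / ctrl_exposed + 1 / ctrl_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical single-study counts (not from the meta-analysis).
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")

# The three nominal P values quoted in the abstract, assuming three tests.
adjusted = bonferroni([0.03, 0.04, 0.02])
print(adjusted)
```

    Under the three-test assumption every adjusted value exceeds 0.05, which is consistent with the abstract's statement that the association did not survive correction.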

  15. Gene Expression Programs in Response to Hypoxia: Cell Type Specificity and Prognostic Significance in Human Cancers.

    Directory of Open Access Journals (Sweden)

    2006-01-01

    BACKGROUND: Inadequate oxygen (hypoxia) triggers a multifaceted cellular response that has important roles in normal physiology and in many human diseases. A transcription factor, hypoxia-inducible factor (HIF), plays a central role in the hypoxia response; its activity is regulated by the oxygen-dependent degradation of the HIF-1alpha protein. Despite the ubiquity and importance of hypoxia responses, little is known about the variation in the global transcriptional response to hypoxia among different cell types or how this variation might relate to tissue- and cell-specific diseases. METHODS AND FINDINGS: We analyzed the temporal changes in global transcript levels in response to hypoxia in primary renal proximal tubule epithelial cells, breast epithelial cells, smooth muscle cells, and endothelial cells with DNA microarrays. The extent of the transcriptional response to hypoxia was greatest in the renal tubule cells. This heightened response was associated with a uniquely high level of HIF-1alpha RNA in renal cells, and it could be diminished by reducing HIF-1alpha expression via RNA interference. A gene-expression signature of the hypoxia response, derived from our studies of cultured mammary and renal tubular epithelial cells, showed coordinated variation in several human cancers, and was a strong predictor of clinical outcomes in breast and ovarian cancers. In an analysis of a large, published gene-expression dataset from breast cancers, we found that the prognostic information in the hypoxia signature was virtually independent of that provided by the previously reported wound signature and more predictive of outcomes than any of the clinical parameters in current use. CONCLUSIONS: The transcriptional response to hypoxia varies among human cells. Some of this variation is traceable to variation in expression of the HIF1A gene. A gene-expression signature of the cellular response to hypoxia is associated with a significantly poorer prognosis

  16. A seven-gene CpG-island methylation panel predicts breast cancer progression

    International Nuclear Information System (INIS)

    Li, Yan; Melnikov, Anatoliy A.; Levenson, Victor; Guerra, Emanuela; Simeone, Pasquale; Alberti, Saverio; Deng, Youping

    2015-01-01

    DNA methylation regulates gene expression, through the inhibition/activation of gene transcription of methylated/unmethylated genes. Hence, DNA methylation profiling can capture pivotal features of gene expression in cancer tissues from patients at the time of diagnosis. In this work, we analyzed a breast cancer case series, to identify DNA methylation determinants of metastatic versus non-metastatic tumors. CpG-island methylation was evaluated on a 56-gene cancer-specific biomarker microarray in metastatic versus non-metastatic breast cancers in a multi-institutional case series of 123 breast cancer patients. Global statistical modeling and unsupervised hierarchical clustering were applied to identify a multi-gene binary classifier with high sensitivity and specificity. Network analysis was utilized to quantify the connectivity of the identified genes. Seven genes (BRCA1, DAPK1, MSH2, CDKN2A, PGR, PRKCDBP, RANKL) were found informative for prognosis of metastatic diffusion and were used to calculate classifier accuracy versus the entire data-set. Individual-gene performances showed sensitivities of 63–79 %, 53–84 % specificities, positive predictive values of 59–83 % and negative predictive values of 63–80 %. When modelled together, these seven genes reached a sensitivity of 93 %, 100 % specificity, a positive predictive value of 100 % and a negative predictive value of 93 %, with high statistical power. Unsupervised hierarchical clustering independently confirmed these findings, in close agreement with the accuracy measurements. Network analyses indicated tight interrelationship between the identified genes, suggesting this to be a functionally-coordinated module, linked to breast cancer progression. Our findings identify CpG-island methylation profiles with deep impact on clinical outcome, paving the way for use as novel prognostic assays in clinical settings. The online version of this article (doi:10.1186/s12885-015-1412-9) contains supplementary
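    The four accuracy measures reported in the abstract above (sensitivity, specificity, positive and negative predictive value) all derive from one confusion matrix. The sketch below shows the arithmetic only; the counts are hypothetical, chosen so that the percentages resemble the reported pattern, and are not the study's data.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Standard binary-classifier measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # metastatic cases correctly flagged
        "specificity": tn / (tn + fp),  # non-metastatic cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged cases that are truly metastatic
        "npv": tn / (tn + fn),          # cleared cases that are truly non-metastatic
    }


# Hypothetical counts: 15 metastatic and 14 non-metastatic tumours.
metrics = classifier_metrics(tp=14, fp=0, tn=14, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

    With no false positives, specificity and positive predictive value are both 100 %, while a single missed case lowers sensitivity and negative predictive value together.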

  17. MicroRNA dysregulation as a prognostic biomarker in colorectal cancer

    International Nuclear Information System (INIS)

    Dong, Yujuan; Yu, Jun; Ng, Simon SM

    2014-01-01

    Colorectal cancer (CRC) is one of the most potentially curable cancers, yet it remains the fourth most common overall cause of cancer death worldwide. The identification of robust molecular prognostic biomarkers can refine the conventional tumor–node–metastasis staging system, avoid understaging of tumors, and help pinpoint patients with early-stage CRC who may benefit from aggressive treatments. Recently, epigenetic studies have provided new molecular evidence to better categorize the CRC subtypes and predict clinical outcomes. In this review, we summarize recent findings concerning the prognostic potential of microRNAs (miRNAs) in CRC. We first discuss the prognostic value of three tissue miRNAs (miR-21-5p, miR-29-3p, miR-148-3p) that have been examined in multiple studies. We also summarize the dysregulation of the miRNA processing machinery component DICER in CRC and its association with risk for mortality. We also review the potential application of miRNA-associated single-nucleotide polymorphisms as prognostic biomarkers for CRC, especially the miRNA-associated polymorphism in the KRAS gene. Last but not least, we discuss the microsatellite instability-related miRNA candidates. Among all these candidates, miR-21-5p is the most promising prognostic marker, yet further prospective validation studies are required before it can enter clinical use.

  18. Genome based analysis of type-I polyketide synthase and nonribosomal peptide synthetase gene clusters in seven strains of five representative Nocardia species.

    Science.gov (United States)

    Komaki, Hisayuki; Ichikawa, Natsuko; Hosoyama, Akira; Takahashi-Nakaguchi, Azusa; Matsuzawa, Tetsuhiro; Suzuki, Ken-ichiro; Fujita, Nobuyuki; Gonoi, Tohru

    2014-04-30

    Actinobacteria of the genus Nocardia usually live in soil or water and play saprophytic roles, but they also opportunistically infect the respiratory system, skin, and other organs of humans and animals. Primarily because of the clinical importance of the strains, some Nocardia genomes have been sequenced, and genome sequences have accumulated. Genome sizes of Nocardia strains are similar to those of Streptomyces strains, the producers of most antibiotics. In the present work, we compared secondary metabolite biosynthesis gene clusters of type-I polyketide synthase (PKS-I) and nonribosomal peptide synthetase (NRPS) among genomes of representative Nocardia species/strains based on domain organization and amino acid sequence homology. Draft genome sequences of Nocardia asteroides NBRC 15531(T), Nocardia otitidiscaviarum IFM 11049, Nocardia brasiliensis NBRC 14402(T), and N. brasiliensis IFM 10847 were read and compared with published complete genome sequences of Nocardia farcinica IFM 10152, Nocardia cyriacigeorgica GUH-2, and N. brasiliensis HUJEG-1. Genome sizes are as follows: N. farcinica, 6.0 Mb; N. cyriacigeorgica, 6.2 Mb; N. asteroides, 7.0 Mb; N. otitidiscaviarum, 7.8 Mb; and N. brasiliensis, 8.9-9.4 Mb. Predicted numbers of PKS-I, NRPS, and PKS-I/NRPS hybrid clusters ranged between 4-11, 7-13, and 1-6, respectively, depending on strains, and tended to increase with increasing genome size. Domain and module structures of representative or unique clusters are discussed in the text. We conclude the following: 1) genomes of Nocardia strains carry as many PKS-I and NRPS gene clusters as those of Streptomyces strains, 2) the number of PKS-I and NRPS gene clusters in Nocardia strains varies substantially depending on species, and N. brasiliensis strains carry the largest numbers of clusters among the species studied, 3) the seven Nocardia strains studied in the present work have seven common PKS-I and/or NRPS clusters, some of whose products are yet to be studied.

  19. Gene expression of the mismatch repair gene MSH2 in primary colorectal cancer

    DEFF Research Database (Denmark)

    Jensen, Lars Henrik; Kuramochi, Hidekazu; Crüger, Dorthe Gylling

    2011-01-01

    promoter was only detected in 14 samples and only at a low level with no correlation to gene expression. MSH2 gene expression was not a prognostic factor for overall survival in univariate or multivariate analysis. The gene expression of MSH2 is a potential quantitative marker ready for further clinical...

  20. Nottingham Prognostic Index in Triple-Negative Breast Cancer: a reliable prognostic tool?

    International Nuclear Information System (INIS)

    Albergaria, André; Ricardo, Sara; Milanezi, Fernanda; Carneiro, Vítor; Amendoeira, Isabel; Vieira, Daniella; Cameselle-Teijeiro, Jorge; Schmitt, Fernando

    2011-01-01

    A breast cancer prognostic tool should ideally be applicable to all types of invasive breast lesions. A number of studies have shown histopathological grade to be an independent prognostic factor in breast cancer, adding prognostic power to nodal stage and tumour size. The Nottingham Prognostic Index has been shown to accurately predict patient outcome in stratified groups with a follow-up period of 15 years after primary diagnosis of breast cancer. Clinically, breast tumours that lack the expression of Oestrogen Receptor, Progesterone Receptor and Human Epidermal growth factor Receptor 2 (HER2) are identified as presenting a 'triple-negative' phenotype or as triple-negative breast cancers (TNBC). These poor-outcome tumours represent an easily recognisable prognostic group of breast cancer with aggressive behaviour that currently lacks the benefit of available systemic therapy. There are conflicting results on the prevalence of lymph node metastasis at the time of diagnosis in triple-negative breast cancer patients, but it is currently accepted that triple-negative breast cancer does not metastasize to axillary nodes and bones as frequently as the non-triple-negative carcinomas, favouring instead a preferentially haematogenous spread. Hypothetically, this particular tumour dissemination pattern would impair the reliability of using the Nottingham Prognostic Index as a tool for triple-negative breast cancer prognostication. The present study tested the effectiveness of the Nottingham Prognostic Index in stratifying breast cancer patients of different subtypes, with special emphasis on a triple-negative breast cancer patient subset versus non-triple-negative breast cancer. We demonstrated that TNBC disseminate to axillary lymph nodes as frequently as luminal or HER2 tumours, and that TNBC are larger in size than other subtypes and almost all grade 3. Additionally, survival curves demonstrated that these prognostic factors are
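    The abstract above names the three inputs of the Nottingham Prognostic Index (tumour size, nodal stage, histological grade) without giving the formula. The sketch below uses the commonly published form, NPI = 0.2 x size in cm + node stage (1-3) + grade (1-3). The three-group cut-offs (3.4 and 5.4) are the conventional ones and are an assumption here; the abstract does not say which stratification the study applied, and the example patient is hypothetical.

```python
def nottingham_prognostic_index(size_cm, node_stage, grade):
    """NPI = 0.2 * tumour size (cm) + node stage (1-3) + grade (1-3)."""
    if node_stage not in (1, 2, 3) or grade not in (1, 2, 3):
        raise ValueError("node_stage and grade must each be 1, 2 or 3")
    return 0.2 * size_cm + node_stage + grade


def npi_group(npi):
    """Conventional three-group stratification (assumed cut-offs)."""
    if npi <= 3.4:
        return "good"
    if npi <= 5.4:
        return "moderate"
    return "poor"


# Hypothetical patient: 2.0 cm tumour, node-negative (stage 1), grade 3.
npi = nottingham_prognostic_index(2.0, node_stage=1, grade=3)
print(f"NPI = {npi:.1f} ({npi_group(npi)})")
```

    Because grade enters the sum directly, a tumour population that is almost entirely grade 3 starts from a higher baseline score before size and nodal stage are considered.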

  1. Two gene clusters co-ordinate for a functional N-acetylglucosamine catabolic pathway in Vibrio cholerae.

    Science.gov (United States)

    Ghosh, Swagata; Rao, K Hanumantha; Sengupta, Manjistha; Bhattacharya, Sujit K; Datta, Asis

    2011-06-01

    Pathogenic microorganisms like Vibrio cholerae are capable of adapting to diverse living conditions, especially when they transit from their environmental reservoirs to the human host. V. cholerae attaches to N-acetylglucosamine (GlcNAc) residues in glycoproteins and lipids present in the intestinal epithelium and chitinous surface of zoo-phytoplanktons in the aquatic environment for its survival and colonization. GlcNAc utilization thus appears to be important for the pathogen to reach sufficient titres in the intestine for producing clinical symptoms of cholera. We report here the involvement of a second cluster of genes working in combination with the classical genes of GlcNAc catabolism, suggesting the occurrence of a novel variant of the process of biochemical conversion of GlcNAc to Fructose-6-phosphate as has been described in other organisms. Colonization was severely attenuated in mutants that were incapable of utilizing GlcNAc. It was also shown that N-acetylglucosamine specific repressor (NagC) performs a dual role: while the classical GlcNAc catabolic genes are under its negative control, the genes belonging to the second cluster are positively regulated by it. Further application of tandem affinity purification to NagC revealed its interaction with a novel partner. Our results provide a genetic program that probably enables V. cholerae to successfully utilize amino sugars and also highlight a new mode of transcriptional regulation, not described in this organism. © 2011 Blackwell Publishing Ltd.

  2. vanI: a novel d-Ala-d-Lac vancomycin resistance gene cluster found in Desulfitobacterium hafniense

    NARCIS (Netherlands)

    Kruse, T.; Levisson, M.; Vos, de W.M.; Smidt, H.

    2014-01-01

    The glycopeptide vancomycin was until recently considered a drug of last resort against Gram-positive bacteria. Increasing numbers of bacteria, however, are found to carry genes that confer resistance to this antibiotic. So far, 10 different vancomycin resistance clusters have been described. A

  3. Regulation of the Apolipoprotein Gene Cluster by a Long Noncoding RNA

    Directory of Open Access Journals (Sweden)

    Paul Halley

    2014-01-01

    Apolipoprotein A1 (APOA1) is the major protein component of high-density lipoprotein (HDL) in plasma. We have identified an endogenously expressed long noncoding natural antisense transcript, APOA1-AS, which acts as a negative transcriptional regulator of APOA1 both in vitro and in vivo. Inhibition of APOA1-AS in cultured cells resulted in the increased expression of APOA1 and two neighboring genes in the APO cluster. Chromatin immunoprecipitation (ChIP) analyses of a ∼50 kb chromatin region flanking the APOA1 gene demonstrated that APOA1-AS can modulate distinct histone methylation patterns that mark active and/or inactive gene expression through the recruitment of histone-modifying enzymes. Targeting APOA1-AS with short antisense oligonucleotides also enhanced APOA1 expression in both human and monkey liver cells and induced an increase in hepatic RNA and protein expression in African green monkeys. Furthermore, the results presented here highlight the significant local modulatory effects of long noncoding antisense RNAs and demonstrate the therapeutic potential of manipulating the expression of these transcripts both in vitro and in vivo.

  4. Mapping in an apple (Malus x domestica) F1 segregating population based on physical clustering of differentially expressed genes.

    Science.gov (United States)

    Jensen, Philip J; Fazio, Gennaro; Altman, Naomi; Praul, Craig; McNellis, Timothy W

    2014-04-04

    Apple tree breeding is slow and difficult due to long generation times, self-incompatibility, and complex genetics. The identification of molecular markers linked to traits of interest is a way to expedite the breeding process. In the present study, we aimed to identify genes whose steady-state transcript abundance was associated with inheritance of specific traits segregating in an apple (Malus × domestica) rootstock F1 breeding population, including resistance to powdery mildew (Podosphaera leucotricha) disease and woolly apple aphid (Eriosoma lanigerum). Transcription profiling was performed for 48 individual F1 apple trees from a cross of two highly heterozygous parents, using RNA isolated from healthy, actively-growing shoot tips and a custom apple DNA oligonucleotide microarray representing 26,000 unique transcripts. Genome-wide expression profiles were not clear indicators of powdery mildew or woolly apple aphid resistance phenotype. However, standard differential gene expression analysis between phenotypic groups of trees revealed relatively small sets of genes with trait-associated expression levels. For example, thirty genes were identified that were differentially expressed between trees resistant and susceptible to powdery mildew. Interestingly, the genes encoding twenty-four of these transcripts were physically clustered on chromosome 12. Similarly, seven genes were identified that were differentially expressed between trees resistant and susceptible to woolly apple aphid, and the genes encoding five of these transcripts were also clustered, this time on chromosome 17. In each case, the gene clusters were in the vicinity of previously identified major quantitative trait loci for the corresponding trait. Similar results were obtained for a series of molecular traits. Several of the differentially expressed genes were used to develop DNA polymorphism markers linked to powdery mildew disease and woolly apple aphid resistance. Gene expression profiling

  5. The nitrate-reduction gene cluster components exert lineage-dependent contributions to optimization of Sinorhizobium symbiosis with soybeans.

    Science.gov (United States)

    Liu, Li Xue; Li, Qin Qin; Zhang, Yun Zeng; Hu, Yue; Jiao, Jian; Guo, Hui Juan; Zhang, Xing Xing; Zhang, Biliang; Chen, Wen Xin; Tian, Chang Fu

    2017-12-01

    Receiving nodulation and nitrogen fixation genes does not guarantee rhizobia an effective symbiosis with legumes. Here, variations in gene content were determined for three Sinorhizobium species showing contrasting symbiotic efficiency on soybeans. A nitrate-reduction gene cluster absent in S. sojae was found to be essential for symbiotic adaptations of S. fredii and S. sp. III. In S. fredii, the deletion mutation of nap (nitrate reductase), but not of nir (nitrite reductase) or nor (nitric oxide reductase), led to defects in nitrogen fixation (Fix-). By contrast, none of these core nitrate-reduction genes were required for the symbiosis of S. sp. III. However, within the same gene cluster, the deletion of hemN1 (encoding oxygen-independent coproporphyrinogen III oxidase) in both S. fredii and S. sp. III led to the formation of nitrogen-fixing (Fix+) but ineffective (Eff-) nodules. These Fix+/Eff- nodules were characterized by significantly lower enzyme activity of glutamine synthetase, indicating rhizobial modulation of nitrogen assimilation by plants. A distant homologue of HemN1 from S. sojae can complement this defect in S. fredii and S. sp. III, but exhibited a more pleiotropic role in symbiosis establishment. These findings highlighted the lineage-dependent optimization of symbiotic functions in different rhizobial species associated with the same host. © 2017 Society for Applied Microbiology and John Wiley & Sons Ltd.

  6. Gene expression profiling in women with breast cancer in a Saudi population

    International Nuclear Information System (INIS)

    Amer, Saud M. Bin; Maqbool, Z.; Nirmal, Maimoona S.; Hussain, Syed S.; Jeprel, Hatim A.; Qattan, Amal T.; Tulbah, Asma M.; Malik, Osama A.; Al-Tweigeri, Taher A.

    2008-01-01

    The objective was to generate consensus gene expression profiles of invasive breast tumors from a small cohort of Saudi females and to explore the possibility that they may be broadly conserved between Caucasian and Middle Eastern populations. This study was performed at King Faisal Specialist Hospital and Research Center, Riyadh, Kingdom of Saudi Arabia, from January 2005 to January 2007. Gene expression profiles were generated from 38 invasive breast tumors and 8 tumor adjacent tissues (TATs) using BD Atlas cDNA expression arrays containing 1176 genes. Results were confirmed by reverse transcriptase polymerase chain reaction and analyzed by 2-dimensional unsupervised hierarchical clustering. The analysis identified 48 differentially expressed genes in tumors, of which 25 have already been reported by various Western studies. Forty-three of these genes were also differentially expressed in TATs. Interestingly, the same data set was able to distinguish between tumors and TATs using only 4 of the differentially expressed genes. Moreover, we were able to group the patients according to prognosis to an extent by hierarchical clustering. Our results indicate that expression profiles between Saudi females with breast cancer and the Caucasian population are conserved to some extent, and can be used to classify patients according to prognostic groups. We also suggest that 3 differentially expressed genes (IGHG3, CDK3 and RPS9) in tumors may have a novel role in breast cancer. In addition, the role of TATs in breast cancer appears to be important and needs to be explored thoroughly. (author)

  7. De Novo Assembly and Genome Analyses of the Marine-Derived Scopulariopsis brevicaulis Strain LF580 Unravels Life-Style Traits and Anticancerous Scopularide Biosynthetic Gene Cluster.

    Science.gov (United States)

    Kumar, Abhishek; Henrissat, Bernard; Arvas, Mikko; Syed, Muhammad Fahad; Thieme, Nils; Benz, J Philipp; Sørensen, Jens Laurids; Record, Eric; Pöggeler, Stefanie; Kempken, Frank

    2015-01-01

    The marine-derived Scopulariopsis brevicaulis strain LF580 produces scopularides A and B, which have anticancerous properties. We carried out genome sequencing using three next-generation DNA sequencing methods. De novo hybrid assembly yielded 621 scaffolds with a total size of 32.2 Mb and 16298 putative gene models. We identified a large non-ribosomal peptide synthetase gene (nrps1) and a supporting pks2 gene in the same biosynthetic gene cluster. This cluster and the genes within the cluster are functionally active, as confirmed by RNA-Seq. Characterization of carbohydrate-active enzymes and major facilitator superfamily (MFS)-type transporters led us to postulate that S. brevicaulis originated from a soil fungus, which came into contact with the marine sponge Tethya aurantium. This marine sponge seems to provide shelter to this fungus and a micro-environment suitable for its survival in the ocean. This study also builds the platform for further investigations of the role of life-style and secondary metabolites from S. brevicaulis.

  8. Structure of the neutral capsular polysaccharide of Acinetobacter baumannii NIPH146 that carries the KL37 capsule gene cluster.

    Science.gov (United States)

    Arbatsky, Nikolay P; Shneider, Mikhail M; Kenyon, Johanna J; Shashkov, Alexander S; Popova, Anastasiya V; Miroshnikov, Konstantin A; Volozhantsev, Nikolay V; Knirel, Yuriy A

    2015-09-02

    Capsular polysaccharide (CPS) was isolated from Acinetobacter baumannii NIPH146, and the following structure of branched pentasaccharide repeating unit was established by sugar analyses along with 1D and 2D NMR spectroscopy: In comparison to most other known capsular polysaccharides of A. baumannii, the CPS studied is neutral and lacks any specific monosaccharide component. The synthesis, assembly and export of this structure could be attributed to genes in a novel capsule biosynthesis gene cluster, designated KL37, which was found in the NIPH146 genome. The CPS of A. baumannii NIPH146 shares the α-d-Galp-(1→6)-β-d-Glcp-(1→3)-d-GalpNAc-(1→ trisaccharide fragment with the CPS units of several A. baumannii strains, including ATCC 17978 and LUH 5537 that carry the KL3 and KL22 gene clusters, respectively. KL37 contains two genes for glycosyltransferases that are related to two glycosyltransferase genes present in both KL3 and KL22, and the encoded proteins could be tentatively assigned to linkages between sugars in the CPS repeat. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Simultaneous analysis of the expression of 14 genes with individual prognostic value in myelodysplastic syndrome patients at diagnosis: WT1 detection in peripheral blood adversely affects survival.

    Science.gov (United States)

    Santamaría, Carlos; Ramos, Fernando; Puig, Noemi; Barragán, Eva; de Paz, Raquel; Pedro, Carme; Insunza, Andrés; Tormo, Mar; Del Cañizo, Consuelo; Diez-Campelo, María; Xicoy, Blanca; Salido, Eduardo; Sánchez del Real, Javier; Hernández, Montserrat; Chillón, Carmen; Sanz, Guillermo F; García-Sanz, Ramón; San Miguel, Jesús F; González, Marcos

    2012-12-01

    Several studies have evaluated the prognostic value of the individual expression of certain genes in patients with myelodysplastic syndromes (MDS). However, none of them includes their simultaneous analysis by quantitative polymerase chain reaction (PCR). We evaluated relative expression levels of 14 molecular markers in 193 peripheral blood samples from untreated MDS patients using real-time PCR. Detectable WT1 expression levels, low TET2, and low IER3 gene expression were the only markers showing in univariate analysis a poor prognostic value for all of treatment-free (TFS), progression-free (PFS), and overall survival (OS). In multivariate analysis, molecular parameters associated with a shorter TFS were: WT1 detection (p = 0.014), low TET2 (p = 0.002), and low IER3 expression (p = 0.025). WT1 detection (p = 0.006) and low TET2 (p = 0.006) expression were associated with a shorter PFS when multivariate analysis was carried out by including only molecular markers. Molecular values with an independent value in OS were: WT1 detection (p = 0.003), high EVI1 expression (p = 0.001), and undetectable p15-CDKN2B (p = 0.037). WT1 expressers were associated with adverse clinical-biological features, high IPSS and WPSS scoring, and unfavorable molecular expression profile. In summary, detectable WT1 expression levels, and low TET2 and low IER3 expression in peripheral blood showed a strong association with adverse prognosis in MDS patients at diagnosis. However, WT1 was the only molecular marker displaying an independent prognostic value in both OS and TFS.

  10. Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

    Science.gov (United States)

    Xu, Min; Wang, Yemin; Zhao, Zhilong; Gao, Guixi; Huang, Sheng-Xiong; Kang, Qianjin; He, Xinyi; Lin, Shuangjun; Pang, Xiuhua; Deng, Zixin

    2016-01-01

    ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including

  11. Prognostic meta-signature of breast cancer developed by two-stage mixture modeling of microarray data

    Directory of Open Access Journals (Sweden)

    Ghosh Debashis

    2004-12-01

    Full Text Available Abstract Background An increasing number of studies have profiled tumor specimens using distinct microarray platforms and analysis techniques. With the accumulating amount of microarray data, one of the most intriguing yet challenging tasks is to develop robust statistical models to integrate the findings. Results By applying a two-stage Bayesian mixture modeling strategy, we were able to assimilate and analyze four independent microarray studies to derive an inter-study validated "meta-signature" associated with breast cancer prognosis. Combining multiple studies (n = 305 samples) on a common probability scale, we developed a 90-gene meta-signature, which was strongly associated with survival in breast cancer patients. Given the set of independent studies using different microarray platforms, which included spotted cDNAs, Affymetrix GeneChip, and inkjet oligonucleotides, the individually identified classifiers yielded gene sets predictive of survival in each study cohort. The study-specific gene signatures, however, had minimal overlap with each other, and performed poorly in pairwise cross-validation. The meta-signature, on the other hand, accommodated such heterogeneity and achieved comparable or better prognostic performance when compared with the individual signatures. Further, in comparison with a global standardization method, the mixture model based data transformation demonstrated superior properties for data integration and provided a solid basis for building classifiers at the second stage. Functional annotation revealed that genes involved in cell cycle and signal transduction activities were over-represented in the meta-signature. Conclusion The mixture modeling approach unifies disparate gene expression data on a common probability scale, allowing for robust, inter-study validated prognostic signatures to be obtained. With the emerging utility of microarrays for cancer prognosis, it will be important to establish paradigms to meta

  12. Ananke: temporal clustering reveals ecological dynamics of microbial communities

    Directory of Open Access Journals (Sweden)

    Michael W. Hall

    2017-09-01

    Full Text Available Taxonomic markers such as the 16S ribosomal RNA gene are widely used in microbial community analysis. A common first step in marker-gene analysis is grouping genes into clusters to reduce data sets to a more manageable size and potentially mitigate the effects of sequencing error. Instead of clustering based on sequence identity, marker-gene data sets collected over time can be clustered based on temporal correlation to reveal ecologically meaningful associations. We present Ananke, a free and open-source algorithm and software package that complements existing sequence-identity-based clustering approaches by clustering marker-gene data based on time-series profiles and provides interactive visualization of clusters, including highlighting of internal OTU inconsistencies. Ananke is able to cluster distinct temporal patterns from simulations of multiple ecological patterns, such as periodic seasonal dynamics and organism appearances/disappearances. We apply our algorithm to two longitudinal marker gene data sets: faecal communities from the human gut of an individual sampled over one year, and communities from a freshwater lake sampled over eleven years. Within the gut, the segregation of the bacterial community around a food-poisoning event was immediately clear. In the freshwater lake, we found that high sequence identity between marker genes does not guarantee similar temporal dynamics, and Ananke time-series clusters revealed patterns obscured by clustering based on sequence identity or taxonomy. Ananke is free and open-source software available at https://github.com/beiko-lab/ananke.
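
    The idea of grouping marker-gene sequences by the shape of their time series, rather than by sequence identity, can be sketched as follows. This is a minimal illustration and not Ananke's implementation: the slope-based distance, the z-scoring step, the single-linkage grouping rule, the `eps` cutoff and the toy abundance values are all assumptions made for this example.

```python
import numpy as np

def slope_distance(x, y, t):
    """Euclidean distance between the piecewise slopes of two time series."""
    dt = np.diff(t)
    return float(np.sqrt(np.sum((np.diff(x) / dt - np.diff(y) / dt) ** 2)))

def temporal_clusters(series, t, eps):
    """Group series whose slope distance is <= eps (single linkage, assumed rule)."""
    names = list(series)
    # z-score each profile so that abundance scale does not dominate shape
    # (a constant series has zero variance and would need special handling)
    z = {n: (np.asarray(v, float) - np.mean(v)) / np.std(v) for n, v in series.items()}
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links up to the representative of n's group
        while parent[n] != n:
            n = parent[n]
        return n

    for a_idx, a in enumerate(names):
        for b in names[a_idx + 1:]:
            if slope_distance(z[a], z[b], t) <= eps:
                parent[find(b)] = find(a)  # merge the two groups

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

# Hypothetical abundances at six time points (not data from the study)
toy_t = [0, 1, 2, 3, 4, 5]
toy_series = {
    "otuA": [10, 50, 10, 50, 10, 50],  # periodic, high abundance
    "otuB": [2, 9, 2, 10, 2, 9],       # same rhythm, low abundance
    "otuC": [0, 0, 0, 30, 30, 30],     # appears at t = 3
    "otuD": [1, 1, 2, 40, 41, 39],     # appears at t = 3
}
toy_groups = temporal_clusters(toy_series, toy_t, eps=0.5)
```

    With these toy values the two periodic profiles fall into one group and the two "appearance" profiles into another, even though their absolute abundances differ.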

  13. MPIGeneNet: Parallel Calculation of Gene Co-Expression Networks on Multicore Clusters.

    Science.gov (United States)

    Gonzalez-Dominguez, Jorge; Martin, Maria J

    2017-10-10

    In this work we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expense of relatively long runtimes for large-scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. Source code of MPIGeneNet, as well as a reference manual, is available at https://sourceforge.net/projects/mpigenenet/.
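
    The correlation step of a co-expression network can be illustrated with a short sequential sketch. It is not MPIGeneNet or RMTGeneNet code: the function name, the toy expression matrix and, in particular, the hand-picked `threshold` are assumptions for this example, whereas the abstract states that the real tools use Random Matrix Theory and parallel execution.

```python
import numpy as np

def coexpression_edges(expr, gene_names, threshold=0.8):
    """Return (gene_a, gene_b, r) for gene pairs with |Pearson r| >= threshold.

    expr: 2-D array, one row per gene, one column per sample.
    threshold: fixed cutoff chosen by hand (an assumption of this sketch).
    """
    corr = np.corrcoef(expr)  # gene-by-gene Pearson correlation matrix
    rows, cols = np.triu_indices(len(gene_names), k=1)  # visit each pair once
    edges = []
    for i, j in zip(rows, cols):
        if abs(corr[i, j]) >= threshold:
            edges.append((gene_names[i], gene_names[j], float(corr[i, j])))
    return edges

# Hypothetical toy data: 4 genes x 5 samples
toy_expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],    # geneA
    [2.0, 4.1, 6.0, 8.2, 10.0],   # geneB, tracks geneA
    [5.0, 4.0, 3.0, 2.0, 1.0],    # geneC, mirrors geneA
    [1.0, 3.0, 1.0, 3.0, 1.0],    # geneD, unrelated to the others
])
toy_genes = ["geneA", "geneB", "geneC", "geneD"]
toy_edges = coexpression_edges(toy_expr, toy_genes)
```

    In this toy case the network keeps the three edges among geneA, geneB and geneC and leaves geneD unconnected. The pairwise correlation matrix grows quadratically with the number of genes, which is why this step is a natural target for parallelisation.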

  14. Conserved syntenic clusters of protein coding genes are missing in birds.

    Science.gov (United States)

    Lovell, Peter V; Wirthlin, Morgan; Wilhelm, Larry; Minx, Patrick; Lazar, Nathan H; Carbone, Lucia; Warren, Wesley C; Mello, Claudio V

    2014-01-01

    Birds are one of the most highly successful and diverse groups of vertebrates, having evolved a number of distinct characteristics, including feathers and wings, a sturdy lightweight skeleton and unique respiratory and urinary/excretion systems. However, the genetic basis of these traits is poorly understood. Using comparative genomics based on extensive searches of 60 avian genomes, we have found that birds lack approximately 274 protein coding genes that are present in the genomes of most vertebrate lineages and are for the most part organized in conserved syntenic clusters in non-avian sauropsids and in humans. These genes are located in regions associated with chromosomal rearrangements, and are largely present in crocodiles, suggesting that their loss occurred subsequent to the split of dinosaurs/birds from crocodilians. Many of these genes are associated with lethality in rodents, human genetic disorders, or biological functions targeting various tissues. Functional enrichment analysis combined with orthogroup analysis and paralog searches revealed enrichments that were shared by non-avian species, present only in birds, or shared between all species. Together these results provide a clearer definition of the genetic background of extant birds, extend the findings of previous studies on missing avian genes, and provide clues about molecular events that shaped avian evolution. They also have implications for fields that largely benefit from avian studies, including development, immune system, oncogenesis, and brain function and cognition. With regards to the missing genes, birds can be considered ‘natural knockouts’ that may become invaluable model organisms for several human diseases.

  15. MicroRNA-424/503 cluster members regulate bovine granulosa cell proliferation and cell cycle progression by targeting SMAD7 gene through activin signalling pathway.

    Science.gov (United States)

    Pande, Hari Om; Tesfaye, Dawit; Hoelker, Michael; Gebremedhn, Samuel; Held, Eva; Neuhoff, Christiane; Tholen, Ernst; Schellander, Karl; Wondim, Dessie Salilew

    2018-05-01

    The granulosa cells are indispensable for follicular development and their function is orchestrated by several genes, which are in turn posttranscriptionally regulated by microRNAs (miRNA). In our previous study, the miRNA-424/503 cluster was found to be highly abundant in bovine granulosa cells (bGCs) of the preovulatory dominant follicle compared to its subordinate counterpart at day 19 of the bovine estrous cycle. Another study also indicated the involvement of the miR-424/503 cluster in tumour cell resistance to apoptosis, suggesting that this miRNA cluster may be involved in cell survival. However, the role of the miR-424/503 cluster in granulosa cell function remains elusive. Therefore, this study aimed to investigate the role of the miRNA-424/503 cluster in bGCs function using microRNA gain- and loss-of-function approaches. The role of miR-424/503 cluster members in granulosa cell function was investigated by overexpressing or inhibiting its activity in in vitro cultured granulosa cells using miR-424/503 mimic or inhibitor, respectively. Luciferase reporter assay showed that SMAD7 and ACVR2A are the direct targets of the miRNA-424/503 cluster members. In line with this, overexpression of miRNA-424/503 cluster members using its mimic and inhibition of its activity by its inhibitor reduced and increased, respectively, the expression of SMAD7 and ACVR2A. Furthermore, flow cytometric analysis indicated that overexpression of miRNA-424/503 cluster members enhanced bGCs proliferation by promoting G1- to S-phase cell cycle transition. Modulation of miRNA-424/503 cluster members tended to increase phosphorylation of SMAD2/3 in the activin signalling pathway. Moreover, sequence-specific knockdown of SMAD7, the target gene of miRNA-424/503 cluster members, using small interfering RNA also revealed phenotypic and molecular alterations similar to those observed when miRNA-424/503 cluster members were overexpressed. Similarly, to get more insight about the role of miRNA-424/503 cluster members in activin signalling

  16. Composition and genomic organization of arthropod Hox clusters.

    Science.gov (United States)

    Pace, Ryan M; Grbić, Miodrag; Nagy, Lisa M

    2016-01-01

    The ancestral arthropod is believed to have had a clustered arrangement of ten Hox genes. Within arthropods, Hox gene mutations result in transformation of segment identities. Despite the fact that variation in segment number/character was common in the diversification of arthropods, few examples of Hox gene gains/losses have been correlated with morphological evolution. Furthermore, a full appreciation of the variation in the genomic arrangement of Hox genes in extant arthropods has not been recognized, as genome sequences from each major arthropod clade have not been reported until recently. Initial genomic analysis of the chelicerate Tetranychus urticae suggested that loss of Hox genes and Hox gene clustering might be more common than previously assumed. To further characterize the genomic evolution of arthropod Hox genes, we compared the genomic arrangement and general characteristics of Hox genes from representative taxa from each arthropod subphylum. In agreement with others, we find arthropods generally contain ten Hox genes arranged in a common orientation in the genome, with an increasing number of sampled species missing either Hox3 or abdominal-A orthologs. The genomic clustering of Hox genes in species we surveyed varies significantly, ranging from 0.3 to 13.6 Mb. In all species sampled, arthropod Hox genes are dispersed in the genome relative to the vertebrate Mus musculus. Differences in Hox cluster size arise from variation in the number of intervening genes, intergenic spacing, and the size of introns and UTRs. In the arthropods surveyed, Hox gene duplications are rare and four microRNAs are, in general, conserved in similar genomic positions relative to the Hox genes. The tightly clustered Hox complexes found in the vertebrates are not evident within arthropods, and differential patterns of Hox gene dispersion are found throughout the arthropods. 
The comparative genomic data continue to support an ancestral arthropod Hox cluster of ten genes with

  17. Isoeugenol monooxygenase and its putative regulatory gene are located in the eugenol metabolic gene cluster in Pseudomonas nitroreducens Jin1.

    Science.gov (United States)

    Ryu, Ji-Young; Seo, Jiyoung; Unno, Tatsuya; Ahn, Joong-Hoon; Yan, Tao; Sadowsky, Michael J; Hur, Hor-Gil

    2010-03-01

    The plant-derived phenylpropanoids eugenol and isoeugenol have been proposed as useful precursors for the production of natural vanillin. Genes involved in the metabolism of eugenol and isoeugenol were clustered in a region of about 30 kb of Pseudomonas nitroreducens Jin1. Two of the 23 ORFs in this region, ORFs 26 (iemR) and 27 (iem), were predicted to be involved in the conversion of isoeugenol to vanillin. The deduced amino acid sequence of isoeugenol monooxygenase (Iem) of strain Jin1 had 81.4% identity to isoeugenol monooxygenase from Pseudomonas putida IE27, which also transforms isoeugenol to vanillin. Iem was expressed in E. coli BL21(DE3) and was found to transform isoeugenol to vanillin. Deletion and cloning analyses indicated that the gene iemR, located upstream of iem, is required for expression of iem in the presence of isoeugenol, suggesting it to be the iem regulatory gene. Reverse transcription, real-time PCR analyses indicated that the genes involved in the metabolism of eugenol and isoeugenol were differently induced by isoeugenol, eugenol, and vanillin.

  18. Prognostic, predictive and pharmacogenomic assessments of CDX2 refine stratification of colorectal cancer.

    Science.gov (United States)

    Bruun, Jarle; Sveen, Anita; Barros, Rita; Eide, Peter W; Eilertsen, Ina; Kolberg, Matthias; Pellinen, Teijo; David, Leonor; Svindland, Aud; Kallioniemi, Olli; Guren, Marianne G; Nesbakken, Arild; Almeida, Raquel; Lothe, Ragnhild A

    2018-06-14

    We aimed to refine the value of CDX2 as an independent prognostic and predictive biomarker in colorectal cancer (CRC) according to disease stage and chemotherapy sensitivity in preclinical models. CDX2 expression was evaluated in 1045 stage I-IV primary CRCs by gene expression (n=403) or immunohistochemistry (n=642) and in relation to 5-year relapse-free survival (RFS), overall survival (OS), and chemotherapy. Pharmacogenomic associations between CDX2 expression and 69 chemotherapeutics were assessed by drug screening of 35 CRC cell lines. CDX2 expression was lost in 11.6% of cases and showed independent poor prognostic value in multivariable models. For individual stages, CDX2 was prognostic only in stage IV, independent of chemotherapy. Among stage I-III patients not treated in an adjuvant setting, CDX2 loss was associated with a particularly poor survival in the BRAF-mutated subgroup, but prognostic value was independent of microsatellite instability status and the consensus molecular subtypes. In stage III, the 5-year RFS rate was higher among patients with loss of CDX2 who received adjuvant chemotherapy than among patients who did not. The CDX2-negative cell lines were significantly more sensitive to chemotherapeutics than CDX2-positive cells, and the multidrug resistance genes MDR1 and CFTR were significantly downregulated both in CDX2-negative cells and patient tumors. Molecular Oncology (2018) © 2018 The Authors. Published by FEBS Press and John Wiley & Sons Ltd.

  19. Open reading frame 176 in the photosynthesis gene cluster of Rhodobacter capsulatus encodes idi, a gene for isopentenyl diphosphate isomerase.

    OpenAIRE

    Hahn, F M; Baker, J A; Poulter, C D

    1996-01-01

    Isopentenyl diphosphate (IPP) isomerase catalyzes an essential activation step in the isoprenoid biosynthetic pathway. A database search based on probes from the highly conserved regions in three eukaryotic IPP isomerases revealed substantial similarity with ORF176 in the photosynthesis gene cluster in Rhodobacter capsulatus. The open reading frame was cloned into an Escherichia coli expression vector. The encoded 20-kDa protein, which was purified in two steps by ion exchange and hydrophobic...

  20. Synteny in toxigenic Fusarium species: the fumonisin gene cluster and the mating type region as examples

    NARCIS (Netherlands)

    Waalwijk, C.; Lee, van der T.A.J.; Vries, de P.M.; Hesselink, T.; Arts, J.; Kema, G.H.J.

    2004-01-01

    A comparative genomic approach was used to study the mating type locus and the gene cluster involved in toxin production (fumonisin) in Fusarium proliferatum, a pathogen with a wide host range and a complex toxin profile. A BAC library, generated from F. proliferatum isolate ITEM 2287, was used to

  1. Identification of an extensive gene cluster among a family of PPOs in Trifolium pratense L. (red clover) using a large insert BAC library

    Directory of Open Access Journals (Sweden)

    Thomas Ann

    2009-07-01

    Full Text Available Abstract Background Polyphenol oxidase (PPO) activity in plants is a trait with potential economic, agricultural and environmental impact. In relation to the food industry, PPO-induced browning causes unacceptable discolouration in fruit and vegetables: from an agriculture perspective, PPO can protect plants against pathogens and environmental stress, improve ruminant growth by increasing nitrogen absorption and decreasing nitrogen loss to the environment through the animal's urine. The high PPO legume, red clover, has a significant economic and environmental role in sustaining low-input organic and conventional farms. Molecular markers for a range of important agricultural traits are being developed for red clover and improved knowledge of PPO genes and their structure will facilitate molecular breeding. Results A bacterial artificial chromosome (BAC) library comprising 26,016 BAC clones with an average 135 Kb insert size, was constructed from Trifolium pratense L. (red clover), a diploid legume with a haploid genome size of 440–637 Mb. Library coverage of 6–8 genome equivalents ensured good representation of genes: the library was screened for polyphenol oxidase (PPO) genes. Two single copy PPO genes, PPO4 and PPO5, were identified to add to a family of three, previously reported, paralogous genes (PPO1–PPO3). Multiple PPO1 copies were identified and characterised revealing a subfamily comprising three variants PPO1/2, PPO1/4 and PPO1/5. Six PPO genes clustered within the genome: four separate BAC clones could be assembled onto a predicted 190–510 Kb single BAC contig. Conclusion A PPO gene family in red clover resides as a cluster of at least 6 genes. Three of these genes have high homology, suggesting a more recent evolutionary event. This PPO cluster covers a longer region of the genome than clusters detected in rice or previously reported in tomato. Full-length coding sequences from PPO4, PPO5, PPO1/5 and PPO1/4 will facilitate
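
    The library coverage quoted in this abstract can be checked with a short calculation, assuming the usual definition of coverage as total cloned DNA divided by haploid genome size. The function and variable names below are illustrative; the numbers are the ones given in the abstract.

```python
def genome_equivalents(n_clones, mean_insert_kb, genome_mb):
    """Library coverage = total cloned DNA / haploid genome size."""
    total_insert_mb = n_clones * mean_insert_kb / 1000.0  # kb -> Mb
    return total_insert_mb / genome_mb

# Figures quoted in the abstract
N_CLONES = 26016
MEAN_INSERT_KB = 135

coverage_small_genome = genome_equivalents(N_CLONES, MEAN_INSERT_KB, 440)  # lower genome size estimate
coverage_large_genome = genome_equivalents(N_CLONES, MEAN_INSERT_KB, 637)  # upper genome size estimate

print(f"{coverage_large_genome:.1f} to {coverage_small_genome:.1f} genome equivalents")
```

    The library holds about 3,512 Mb of cloned DNA, which gives roughly 5.5 to 8.0 genome equivalents across the stated genome size range, in line with the reported 6–8.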

  2. Surface Prognostic Charts

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Surface Prognostic Charts are historical surface prognostic (forecast) charts created by the United States Weather Bureau. They include fronts, isobars, cloud, and...

  3. Prognostic relevance of molecular subtypes and master regulators in pancreatic ductal adenocarcinoma

    International Nuclear Information System (INIS)

    Janky, Rekin’s; Binda, Maria Mercedes; Allemeersch, Joke; Van den broeck, Anke; Govaere, Olivier; Swinnen, Johannes V.; Roskams, Tania; Aerts, Stein; Topal, Baki

    2016-01-01

    Pancreatic cancer is poorly characterized at genetic and non-genetic levels. The current study evaluates in a large cohort of patients the prognostic relevance of molecular subtypes and key transcription factors in pancreatic ductal adenocarcinoma (PDAC). We performed gene expression analysis of whole-tumor tissue obtained from 118 surgically resected PDAC and 13 histologically normal pancreatic tissue samples. Cox regression models were used to study the effect on survival of molecular subtypes and 16 clinicopathological prognostic factors. In order to better understand the biology of PDAC we used iRegulon to identify transcription factors (TFs) as master regulators of PDAC and its subtypes. We confirmed the PDAssign gene signature as a classifier of PDAC into molecular subtypes with prognostic relevance. We found molecular subtypes, but not clinicopathological factors, to be independent predictors of survival. Regulatory network analysis predicted that HNF1A/B are, among a thousand TFs, the top enriched master regulators of the genes expressed in the normal pancreatic tissue compared to the PDAC regulatory network. On immunohistochemistry staining of PDAC samples, we observed HNF1B expression ranging from low in well differentiated to absent in poorly differentiated PDAC samples. We predicted IRF/STAT, AP-1, and ETS-family members as key transcription factors in gene signatures downstream of mutated KRAS. PDAC can be classified into molecular subtypes that independently predict survival. HNF1A/B seem to be good candidates as master regulators of pancreatic differentiation, which at the protein level loses its expression in malignant ductal cells of the pancreas, suggesting its putative role as tumor suppressor in pancreatic cancer. The study was registered at ClinicalTrials.gov under the number NCT01116791 (May 3, 2010). The online version of this article (doi:10.1186/s12885-016-2540-6) contains supplementary material, which is available to authorized users

  4. Prognostic factors in invasive bladder cancer

    International Nuclear Information System (INIS)

    Maulard-Durdux, C.; Housset, M.

    1998-01-01

    In France, invasive bladder cancer is the most frequent urologic malignancy after prostate carcinoma. Treatment of bladder cancer is radical cystectomy. New therapeutic approaches such as chemo-radiation combination for a conservative procedure, neo-adjuvant or adjuvant chemotherapy are still developing. For these approaches, a rigorous selection of patients is needed. This selection is based on prognostic criteria that can be divided into four groups: the volume of the tumor, including the tumor infiltration depth, the nodal status, the presence or absence of hydronephrosis and the residual tumor mass after trans-urethral resection; the histologic aspects of the tumor, including histologic grading and the presence or absence of an epidermoid metaplasia, of in situ carcinoma or of thrombi; the expression of tumor markers (tissue polypeptide antigen, bladder tumor antigen); and the biologic aspects of the tumor, such as ploidy, cytogenetic abnormalities, expression of Ki67, expression of oncogenes or tumor suppressor genes, and expression of tumor antigens or growth factor receptors. This paper reviews the prognostic value of the various parameters. (authors)

  5. Gene Cluster Responsible for Secretion of and Immunity to Multiple Bacteriocins, the NKR-5-3 Enterocins

    Science.gov (United States)

    Ishibashi, Naoki; Himeno, Kohei; Masuda, Yoshimitsu; Perez, Rodney Honrada; Iwatani, Shun; Wilaipun, Pongtep; Leelawatcharamas, Vichien; Nakayama, Jiro; Sonomoto, Kenji

    2014-01-01

    Enterococcus faecium NKR-5-3, isolated from Thai fermented fish, is characterized by the unique ability to produce five bacteriocins, namely, enterocins NKR-5-3A, -B, -C, -D, and -Z (Ent53A, Ent53B, Ent53C, Ent53D, and Ent53Z). Genetic analysis with a genome library revealed that the bacteriocin structural genes (enkA [ent53A], enkC [ent53C], enkD [ent53D], and enkZ [ent53Z]) that encode these peptides (except for Ent53B) are located in close proximity to each other. This NKR-5-3ACDZ (Ent53ACDZ) enterocin gene cluster (approximately 13 kb long) includes certain bacteriocin biosynthetic genes such as an ABC transporter gene (enkT), two immunity genes (enkIaz and enkIc), a response regulator (enkR), and a histidine protein kinase (enkK). Heterologous-expression studies of enkT and ΔenkT mutant strains showed that enkT is responsible for the secretion of Ent53A, Ent53C, Ent53D, and Ent53Z, suggesting that EnkT is a wide-range ABC transporter that contributes to the effective production of these bacteriocins. In addition, EnkIaz and EnkIc were found to confer self-immunity to the respective bacteriocins. Furthermore, bacteriocin induction assays performed with the ΔenkRK mutant strain showed that EnkR and EnkK are regulatory proteins responsible for bacteriocin production and that, together with Ent53D, they constitute a three-component regulatory system. Thus, the Ent53ACDZ gene cluster is essential for the biosynthesis and regulation of NKR-5-3 enterocins, and this is, to our knowledge, the first report that demonstrates the secretion of multiple bacteriocins by an ABC transporter. PMID:25149515

  6. West German Study Group Phase III PlanB Trial: First Prospective Outcome Data for the 21-Gene Recurrence Score Assay and Concordance of Prognostic Markers by Central and Local Pathology Assessment.

    Science.gov (United States)

    Gluz, Oleg; Nitz, Ulrike A; Christgen, Matthias; Kates, Ronald E; Shak, Steven; Clemens, Michael; Kraemer, Stefan; Aktas, Bahriye; Kuemmel, Sherko; Reimer, Toralf; Kusche, Manfred; Heyl, Volker; Lorenz-Salehi, Fatemeh; Just, Marianne; Hofmann, Daniel; Degenhardt, Tom; Liedtke, Cornelia; Svedman, Christer; Wuerstlein, Rachel; Kreipe, Hans H; Harbeck, Nadia

    2016-07-10

    The 21-gene Recurrence Score (RS) assay is a validated prognostic/predictive tool in early hormone receptor-positive breast cancer (BC); however, only a few prospective outcome results have been available so far. In the phase III PlanB trial, RS was prospectively used to define a subset of patients who received only endocrine therapy. We present 3-year outcome data and concordance analysis (among biomarkers/RS). Central tumor bank was established prospectively from PlanB (intermediate and high-risk, locally human epidermal growth factor receptor 2-negative BC). After an early amendment, HR-positive, pN0-1 patients with RS ≤ 11 were recommended to omit chemotherapy. From 2009 to 2011, PlanB enrolled 3,198 patients with a median age of 56 years; 41.1% had node-positive and 32.5% grade 3 disease. In 348 patients (15.3%), chemotherapy was omitted based on RS ≤ 11. After 35 months median follow-up, 3-year disease-free survival in patients with RS ≤ 11 and endocrine therapy alone was 98% versus 92% and 98% in RS > 25 and RS 12 to 25 in chemotherapy-treated patients, respectively. Nodal status, central and local grade, the Ki-67 protein encoded by the MKI67 gene, estrogen receptor, progesterone receptor, tumor size, and RS were univariate prognostic factors for disease-free survival; only nodal status, both central and local grade, and RS were independent multivariate factors. Histologic grade was discordant between central and local laboratories in 44%. RS was positively but moderately correlated with the Ki-67 protein encoded by the MKI67 gene and grade and negatively correlated with progesterone receptor and estrogen receptor. In this prospective trial, patients with enhanced clinical risk and omitted chemotherapy on the basis of RS ≤ 11 had excellent 3-year survival. The substantial discordance observed between traditional prognostic markers and RS emphasizes the need for standardized assessment and supports the potential integration of standardized, well

  7. Biosynthesis of actinorhodin and related antibiotics: discovery of alternative routes for quinone formation encoded in the act gene cluster.

    Science.gov (United States)

    Okamoto, Susumu; Taguchi, Takaaki; Ochi, Kozo; Ichinose, Koji

    2009-02-27

    All known benzoisochromanequinone (BIQ) biosynthetic gene clusters carry a set of genes encoding a two-component monooxygenase homologous to the ActVA-ORF5/ActVB system for actinorhodin biosynthesis in Streptomyces coelicolor A3(2). Here, we conducted molecular genetic and biochemical studies of this enzyme system. Inactivation of actVA-ORF5 yielded a shunt product, actinoperylone (ACPL), apparently derived from 6-deoxy-dihydrokalafungin. Similarly, deletion of actVB resulted in accumulation of ACPL, indicating a critical role for the monooxygenase system in C-6 oxygenation, a biosynthetic step common to all BIQ biosyntheses. Furthermore, in vitro, we showed a quinone-forming activity of the ActVA-ORF5/ActVB system in addition to that of a known C-6 monooxygenase, ActVA-ORF6, by using emodinanthrone as a model substrate. Our results demonstrate that the act gene cluster encodes two alternative routes for quinone formation by C-6 oxygenation in BIQ biosynthesis.

  8. Diverse and Abundant Secondary Metabolism Biosynthetic Gene Clusters in the Genomes of Marine Sponge Derived Streptomyces spp. Isolates

    Directory of Open Access Journals (Sweden)

    Stephen A. Jackson

    2018-02-01

    Full Text Available The genus Streptomyces produces secondary metabolic compounds that are rich in biological activity. Many of these compounds are genetically encoded by large secondary metabolism biosynthetic gene clusters (smBGCs) such as polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS) which are modular and can be highly repetitive. Due to the repeats, these gene clusters can be difficult to resolve using short read next generation datasets and are often quite poorly predicted using standard approaches. We have sequenced the genomes of 13 Streptomyces spp. strains isolated from shallow water and deep-sea sponges that display antimicrobial activities against a number of clinically relevant bacterial and yeast species. Draft genomes have been assembled and smBGCs have been identified using the antiSMASH (antibiotics and Secondary Metabolite Analysis Shell) web platform. We have compared the smBGCs amongst strains in the search for novel sequences conferring the potential to produce novel bioactive secondary metabolites. The strains in this study recruit to four distinct clades within the genus Streptomyces. The marine strains host abundant smBGCs which encode polyketides, NRPS, siderophores, bacteriocins and lantipeptides. The deep-sea strains appear to be enriched with gene clusters encoding NRPS. Marine adaptations are evident in the sponge-derived strains which are enriched for genes involved in the biosynthesis and transport of compatible solutes and for heat-shock proteins. Streptomyces spp. from marine environments are a promising source of novel bioactive secondary metabolites as the abundance and diversity of smBGCs show high degrees of novelty. Sponge derived Streptomyces spp. isolates appear to display genomic adaptations to marine living when compared to terrestrial strains.

  9. Myeloid clusters are associated with a pro-metastatic environment and poor prognosis in smoking-related early stage non-small cell lung cancer.

    Directory of Open Access Journals (Sweden)

    Wang Zhang

    Full Text Available This study aimed to understand the role of myeloid cell clusters in uninvolved regional lymph nodes from early stage non-small cell lung cancer patients. Uninvolved regional lymph node sections from 67 patients with stage I-III resected non-small cell lung cancer were immunostained to detect myeloid clusters, STAT3 activity and occult metastasis. Anthracosis intensity, myeloid cluster infiltration associated with anthracosis and pSTAT3 level were scored and correlated with patient survival. Multivariate Cox regression analysis was performed with prognostic variables. Human macrophages were used for in vitro nicotine treatment. CD68+ myeloid clusters associated with anthracosis and with an immunosuppressive and metastasis-promoting phenotype and elevated overall STAT3 activity were observed in uninvolved lymph nodes. In patients with a smoking history, myeloid cluster score significantly correlated with anthracosis intensity and pSTAT3 level (P<0.01). Nicotine activated STAT3 in macrophages in long-term culture. CD68+ myeloid clusters correlated and colocalized with occult metastasis. Myeloid cluster score was an independent prognostic factor (P = 0.049) and was associated with survival by Kaplan-Meier estimate in patients with a history of smoking (P = 0.055). The combination of myeloid cluster score with either lymph node stage or pSTAT3 level defined two populations with a significant difference in survival (P = 0.024 and P = 0.004, respectively). Myeloid clusters facilitate a pro-metastatic microenvironment in uninvolved regional lymph nodes and associate with occult metastasis in early stage non-small cell lung cancer. Myeloid cluster score is an independent prognostic factor for survival in patients with a history of smoking, and may present a novel method to inform therapy choices in the adjuvant setting. Further validation studies are warranted.
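
    For readers unfamiliar with the Kaplan-Meier estimate mentioned in this abstract, the following is a minimal sketch of the product-limit calculation. It is a generic illustration, not the study's analysis: the function name and the six follow-up times are hypothetical and are not data from the paper.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times: follow-up time per patient.
    events: 1 = death observed at that time, 0 = censored at that time.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        leaving = 0
        # Process all patients who share the same follow-up time together
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving this time
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve

# Hypothetical follow-up in months for six patients (not data from the study)
toy_times = [5, 8, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

    Censored patients reduce the number at risk without lowering the curve, which is what distinguishes this estimate from a simple proportion of survivors. Comparing two such curves, for example high versus low myeloid cluster score, would additionally need a test such as the log-rank test, which is not shown here.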

  10. Differential Gene Expression in Primary Breast Tumors Associated with Lymph Node Metastasis

    International Nuclear Information System (INIS)

    Ellsworth, R.E.; Field, L.A.; Kane, J.L.; Love, B.; Hooke, J.A.; Shriver, C.D.

    2011-01-01

    Lymph node status remains one of the most useful prognostic indicators in breast cancer; however, current methods to assess nodal status disrupt the lymphatic system and may lead to secondary complications. Identification of molecular signatures discriminating lymph node-positive from lymph node-negative primary tumors would allow for stratification of patients requiring surgical assessment of lymph nodes. Primary breast tumors from women with negative (n = 41) and positive (n = 35) lymph node status matched for possible confounding factors were subjected to laser microdissection and gene expression data were generated. Although ANOVA analysis (P 1.5) revealed 13 differentially expressed genes, hierarchical clustering classified 90% of node-negative but only 66% of node-positive tumors correctly. The inability to derive molecular profiles of metastasis in primary tumors may reflect tumor heterogeneity, paucity of cells within the primary tumor with metastatic potential, influence of the microenvironment, or inherited host susceptibility to metastasis.

  11. Differential Gene Expression in Primary Breast Tumors Associated with Lymph Node Metastasis

    Science.gov (United States)

    Ellsworth, Rachel E.; Field, Lori A.; Love, Brad; Kane, Jennifer L.; Hooke, Jeffrey A.; Shriver, Craig D.

    2011-01-01

    Lymph node status remains one of the most useful prognostic indicators in breast cancer; however, current methods to assess nodal status disrupt the lymphatic system and may lead to secondary complications. Identification of molecular signatures discriminating lymph node-positive from lymph node-negative primary tumors would allow for stratification of patients requiring surgical assessment of lymph nodes. Primary breast tumors from women with negative (n = 41) and positive (n = 35) lymph node status matched for possible confounding factors were subjected to laser microdissection and gene expression data were generated. Although ANOVA analysis (P 1.5) revealed 13 differentially expressed genes, hierarchical clustering classified 90% of node-negative but only 66% of node-positive tumors correctly. The inability to derive molecular profiles of metastasis in primary tumors may reflect tumor heterogeneity, paucity of cells within the primary tumor with metastatic potential, influence of the microenvironment, or inherited host susceptibility to metastasis. PMID:22295210

  12. Differential Gene Expression in Primary Breast Tumors Associated with Lymph Node Metastasis

    Directory of Open Access Journals (Sweden)

    Rachel E. Ellsworth

    2011-01-01

    Lymph node status remains one of the most useful prognostic indicators in breast cancer; however, current methods to assess nodal status disrupt the lymphatic system and may lead to secondary complications. Identification of molecular signatures discriminating lymph node-positive from lymph node-negative primary tumors would allow for stratification of patients requiring surgical assessment of lymph nodes. Primary breast tumors from women with negative (n = 41) and positive (n = 35) lymph node status matched for possible confounding factors were subjected to laser microdissection and gene expression data were generated. Although ANOVA analysis (P 1.5) revealed 13 differentially expressed genes, hierarchical clustering classified 90% of node-negative but only 66% of node-positive tumors correctly. The inability to derive molecular profiles of metastasis in primary tumors may reflect tumor heterogeneity, paucity of cells within the primary tumor with metastatic potential, influence of the microenvironment, or inherited host susceptibility to metastasis.

  13. Prognostics 101: A tutorial for particle filter-based prognostics algorithm using Matlab

    International Nuclear Information System (INIS)

    An, Dawn; Choi, Joo-Ho; Kim, Nam Ho

    2013-01-01

    This paper presents a Matlab-based tutorial for model-based prognostics, which combines a physical model with observed data to identify model parameters, from which the remaining useful life (RUL) can be predicted. Among many model-based prognostics algorithms, the particle filter is used in this tutorial for parameter estimation of damage or a degradation model. The tutorial is presented using a Matlab script with 62 lines, including detailed explanations. As examples, a battery degradation model and a crack growth model are used to explain the updating process of model parameters, damage progression, and RUL prediction. In order to illustrate the results, the RUL at an arbitrary cycle is predicted in the form of a distribution along with the median and 90% prediction interval. This tutorial will be helpful for beginners in prognostics to understand and use the prognostics method, and we hope it provides a standard of particle filter based prognostics. Highlights: a Matlab-based tutorial for model-based prognostics is presented; a battery degradation model and a crack growth model are used as examples; the RUL at an arbitrary cycle is predicted using the particle filter.
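
    The particle-filter idea summarized in this record can be illustrated with a short sketch. It is not the authors' 62-line Matlab script. It is a minimal Python example that assumes, purely for illustration, an exponential capacity-fade model exp(-b*t), a Gaussian measurement error, a uniform prior on b, and a failure threshold of 0.7. The names particle_filter_rul and summarize, and all default values, are hypothetical.

```python
import math
import random


def particle_filter_rul(times, measurements, threshold=0.7,
                        n_particles=5000, sigma=0.02, seed=0):
    """Estimate the decay rate b of an assumed model exp(-b*t) and the RUL.

    Returns (particles, rul): posterior samples of b and the matching
    remaining-useful-life samples measured from the last observation time.
    """
    rng = random.Random(seed)
    # Prior on b: an assumed, purely illustrative range.
    particles = [rng.uniform(1e-4, 0.02) for _ in range(n_particles)]
    for t, z in zip(times, measurements):
        # Update step: Gaussian likelihood of the measurement per particle.
        weights = [math.exp(-0.5 * ((z - math.exp(-b * t)) / sigma) ** 2)
                   for b in particles]
        # Resample in proportion to the weights, then add a small jitter
        # so that the particle set does not collapse onto a few values.
        resampled = rng.choices(particles, weights=weights, k=n_particles)
        particles = [abs(b + rng.gauss(0.0, 1e-5)) for b in resampled]
    t_now = times[-1]
    # The model crosses the threshold at t = -ln(threshold) / b.
    rul = [-math.log(threshold) / b - t_now for b in particles]
    return particles, rul


def summarize(samples):
    """Return (median, 5th percentile, 95th percentile) of the samples."""
    s = sorted(samples)
    n = len(s)
    return s[n // 2], s[int(0.05 * n)], s[int(0.95 * n)]
```

    Calling summarize on the returned RUL samples gives a median and a 90% interval, which mirrors the form of result described in the abstract. Multinomial resampling with jitter is one common choice among several, and it is used here only to keep the sketch short.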

  14. Polymorphisms within the APOBR gene are highly associated with milk levels of prognostic ketosis biomarkers in dairy cows.

    Science.gov (United States)

    Tetens, Jens; Heuer, Claas; Heyer, Iris; Klein, Matthias S; Gronwald, Wolfram; Junge, Wolfgang; Oefner, Peter J; Thaller, Georg; Krattenmacher, Nina

    2015-04-01

    Essentially all high-yielding dairy cows experience a negative energy balance during early lactation leading to increased lipomobilization, which is a normal physiological response. However, a severe energy deficit may lead to high levels of ketone bodies and, subsequently, to subclinical or clinical ketosis. It has previously been reported that the ratio of glycerophosphocholine to phosphocholine in milk is a prognostic biomarker for the risk of ketosis in dairy cattle. It was hypothesized that this ratio reflects the ability to break down blood phosphatidylcholine as a fatty acid resource. In the current study, 248 animals from a previous study were genotyped with the Illumina BovineSNP50 BeadChip, and genome-wide association studies were carried out for the milk levels of phosphocholine, glycerophosphocholine, and the ratio of both metabolites. It was demonstrated that the latter two traits are heritable with h² = 0.43 and h² = 0.34, respectively. A major quantitative trait locus was identified on cattle chromosome 25. The APOBR gene, coding for the apolipoprotein B receptor, is located within this region and was analyzed as a candidate gene. The analysis revealed highly significant associations of polymorphisms within the gene with glycerophosphocholine as well as the metabolite ratio. These findings support the hypothesis that differences in the ability to take up blood phosphatidylcholine from low-density lipoproteins play an important role in early lactation metabolic stability of dairy cows and indicate APOBR to contain a causative variant. Copyright © 2015 the American Physiological Society.

  15. Clusters of ancestrally related genes that show paralogy in whole or in part are a major feature of the genomes of humans and other species.

    Directory of Open Access Journals (Sweden)

    Michael B Walker

    Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets, InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.

  16. Evolution of homeobox genes.

    Science.gov (United States)

    Holland, Peter W H

    2013-01-01

    Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78. The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.

  17. The ArcD1 and ArcD2 arginine/ornithine exchangers encoded in the arginine deiminase (ADI) pathway gene cluster of Lactococcus lactis

    NARCIS (Netherlands)

    Noens, Elke E E; Kaczmarek, Michał B; Żygo, Monika; Lolkema, Juke S

    2015-01-01

    The arginine deiminase pathway (ADI) gene cluster in Lactococcus lactis contains two copies of a gene encoding an L-arginine/L-ornithine exchanger, the arcD1 and arcD2 genes. The physiological function of ArcD1 and ArcD2 was studied by deleting the two genes. Deletion of arcD1 resulted in loss of

  18. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    OpenAIRE

    Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V

    2007-01-01

    Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...

  19. Prognostic and predictive factors in colorectal cancer.

    Science.gov (United States)

    Bolocan, A; Ion, D; Ciocan, D N; Paduraru, D N

    2012-01-01

    Colorectal cancer (CRC) is an important public health problem; it is a leading cause of cancer mortality in the industrialized world, second to lung cancer: each year there are nearly one million new cases of CRC diagnosed worldwide and half a million deaths (1). This review aims to summarise the most important currently available markers for CRC that provide prognostic or predictive information. Amongst others, it covers serum markers such as CEA and CA19-9, markers expressed by tumour tissues, such as thymidylate synthase, and also the expression/loss of expression of certain oncogenes and tumour suppressor genes such as K-ras and p53. The prognostic value of genomic instability, angiogenesis and proliferative indices, such as the apoptotic index, is discussed. The advent of new therapies created the pathway for a personalized approach to the patient. This will take into consideration the complex genetic mechanisms involved in tumorigenesis, besides the classical clinical and pathological staging. The growing number of therapeutic agents and known molecular targets in oncology makes it necessary to study the clinical use of biomarkers with a role in improving response and survival, as well as in reducing toxicity and establishing economic stability. The potential predictive and prognostic biomarkers which have arisen from the study of the genetic basis of colorectal cancer and their therapeutic significance are discussed. RevistaChirurgia.

  20. The PROgnostic Value of unrequested Information in Diagnostic Imaging (PROVIDI) Study: rationale and design

    International Nuclear Information System (INIS)

    Gondrie, M. J. A.; Mali, W. P. Th. M.; Buckens, C. F. M.; Jacobs, P. C. A.; Grobbee, D. E.; Graaf, Y van der

    2010-01-01

    We describe the rationale for a new study examining the prognostic value of unrequested findings in diagnostic imaging. The deployment of more advanced imaging modalities in routine care means that such findings are being detected with increasing frequency. However, as the prognostic significance of many types of unrequested findings is unknown, the optimal response to such findings remains uncertain and in many cases an overly defensive approach is adopted, to the detriment of patient-care. Additionally, novel and promising image findings that are newly available on many routine scans cannot be used to improve patient care until their prognostic value is properly determined. The PROVIDI study seeks to address these issues using an innovative multi-center case-cohort study design. PROVIDI is to consist of a series of studies investigating specific, selected disease entities and clusters. Computed Tomography images from the participating hospitals are reviewed for unrequested findings. Subsequently, this data is pooled with outcome data from a central population registry. Study populations consist of patients with endpoints relevant to the (group of) disease(s) under study along with a random control sample from the cohort. This innovative design allows PROVIDI to evaluate selected unrequested image findings for their true prognostic value in a series of manageable studies. By incorporating unrequested image findings and outcomes data relevant to patients, truly meaningful conclusions about the prognostic value of unrequested and emerging image findings can be reached and used to improve patient-care.

  1. Opposite prognostic roles of HIF1α and HIF2α expressions in bone metastatic clear cell renal cell cancer

    DEFF Research Database (Denmark)

    Szendroi, Attila; Szász, A. Marcell; Kardos, Magdolna

    2016-01-01

    BACKGROUND: Prognostic markers of bone metastatic clear cell renal cell cancer (ccRCC) are poorly established. We tested the prognostic value of HIF1α/HIF2α and their selected target genes in primary tumors and corresponding bone metastases. RESULTS: Expression of HIF2α was lower in mRCC both at m...

  2. Multi-walled carbon nanotube-induced gene expression in the mouse lung: Association with lung pathology

    International Nuclear Information System (INIS)

    Pacurari, M.; Qian, Y.; Porter, D.W.; Wolfarth, M.; Wan, Y.; Luo, D.; Ding, M.; Castranova, V.; Guo, N.L.

    2011-01-01

    Due to the fibrous shape and durability of multi-walled carbon nanotubes (MWCNT), concerns regarding their potential for producing environmental and human health risks, including carcinogenesis, have been raised. This study sought to investigate how previously identified lung cancer prognostic biomarkers and the related cancer signaling pathways are affected in the mouse lung following pharyngeal aspiration of well-dispersed MWCNT. A total of 63 identified lung cancer prognostic biomarker genes and major signaling biomarker genes were analyzed in mouse lungs (n = 80) exposed to 0, 10, 20, 40, or 80 μg of MWCNT by pharyngeal aspiration at 7 and 56 days post-exposure using quantitative PCR assays. At 7 and 56 days post-exposure, a set of 7 genes and a set of 11 genes, respectively, showed differential expression in the lungs of mice exposed to MWCNT vs. the control group. Additionally, these significant genes could separate the control group from the treated group over the time series in a hierarchical gene clustering analysis. Furthermore, 4 genes from these two sets of significant genes, coiled-coil domain containing-99 (Ccdc99), muscle segment homeobox gene-2 (Msx2), nitric oxide synthase-2 (Nos2), and wingless-type inhibitory factor-1 (Wif1), showed significant mRNA expression perturbations at both time points. It was also found that the expression changes of these 4 overlapping genes at 7 days post-exposure were attenuated at 56 days post-exposure. Ingenuity Pathway Analysis (IPA) found that several carcinogenic-related signaling pathways and carcinogenesis itself were associated with both the 7 and 11 gene signatures. Taken together, this study identifies that MWCNT exposure affects a subset of lung cancer biomarkers in mouse lungs. - Research highlights: → Multi-Walled Carbon Nanotubes affect lung cancer biomarkers in mouse lungs. → The results suggest potentially harmful effects of MWCNT exposure on human lungs. → The results could potentially be used

  3. Prevalence of the lmo0036-0043 gene cluster encoding arginine deiminase and agmatine deiminase systems in Listeria monocytogenes.

    Science.gov (United States)

    Chen, Jianshun; Chen, Fan; Cheng, Changyong; Fang, Weihuan

    2013-04-01

    Arginine deiminase and agmatine deiminase systems are involved in acid tolerance, and their encoding genes form the cluster lmo0036-0043 in Listeria monocytogenes. While lmo0042 and lmo0043 were conserved in all L. monocytogenes strains, the lmo0036-0041 region of this cluster was identified in all lineage I and II strains and in the majority of lineage IV strains (83.3%), but was absent in all lineage III strains and a small fraction of lineage IV strains (16.7%), suggesting that the presence of the complete lmo0036-0043 cluster is lineage-dependent. Lineage IV strains with a complete or a deficient lmo0036-0043 cluster exhibit specific ascB-dapE profiles, which might represent two subpopulations with distinct genetic characteristics.

  4. Full structure and insight into the gene cluster of the O-specific polysaccharide of Yersinia intermedia H9-36/83 (O:17).

    Science.gov (United States)

    Sizova, Olga V; Shashkov, Alexander S; Kondakova, Anna N; Knirel, Yuriy A; Shaikhutdinova, Rima Z; Ivanov, Sergei A; Kislichkina, Angelina A; Kadnikova, Lidia A; Bogun, Aleksandr G; Dentovskaya, Svetlana V

    2018-05-02

    Lipopolysaccharide was isolated from bacteria Yersinia intermedia H9-36/83 (O:17) and degraded with mild acid to give an O-specific polysaccharide, which was isolated by GPC on Sephadex G-50 and studied by sugar analysis and 1D and 2D NMR spectroscopy. The polysaccharide was found to contain 3-deoxy-3-[(R)-3-hydroxybutanoylamino]-d-fucose (d-Fuc3NR3Hb) and the following structure of the heptasaccharide repeating unit was established: The structure established is consistent with the gene content of the O-antigen gene cluster. The O-polysaccharide structure and gene cluster of Y. intermedia are related to those of Hafnia alvei 1211 and Escherichia coli O:103. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. The Widespread Multidrug-Resistant Serotype O12 Pseudomonas aeruginosa Clone Emerged through Concomitant Horizontal Transfer of Serotype Antigen and Antibiotic Resistance Gene Clusters

    DEFF Research Database (Denmark)

    Thrane, Sandra Wingaard; Taylor, Véronique L.; Freschi, Luca

    2015-01-01

    ... in clinical settings and outbreaks. These serotype O12 isolates exhibit high levels of resistance to various classes of antibiotics. Here, we explore how the P. aeruginosa OSA biosynthesis gene clusters evolve in the population by investigating the association between the phylogenetic relationships among 83 P. aeruginosa strains and their serotypes. While most serotypes were closely linked to the core genome phylogeny, we observed horizontal exchange of OSA biosynthesis genes among phylogenetically distinct P. aeruginosa strains. Specifically, we identified a "serotype island" ranging from 62 kb to 185 kb containing the P. aeruginosa O12 OSA gene cluster, an antibiotic resistance determinant (gyrAC248T), and other genes that have been transferred between P. aeruginosa strains with distinct core genome architectures. We showed that these genes were likely acquired from an O12 serotype strain that is closely related to P...

  6. Polymorphisms in Fatty Acid Desaturase (FADS) Gene Cluster: Effects on Glycemic Controls Following an Omega-3 Polyunsaturated Fatty Acids (PUFA) Supplementation

    Science.gov (United States)

    Cormier, Hubert; Rudkowska, Iwona; Thifault, Elisabeth; Lemieux, Simone; Couture, Patrick; Vohl, Marie-Claude

    2013-01-01

    Changes in desaturase activity are associated with insulin sensitivity and may be associated with type 2 diabetes mellitus (T2DM). Polymorphisms (SNPs) in the fatty acid desaturase (FADS) gene cluster have been associated with the homeostasis model assessment of insulin sensitivity (HOMA-IS) and serum fatty acid composition. Objective: To investigate whether common genetic variations in the FADS gene cluster influence fasting glucose (FG) and fasting insulin (FI) responses following a 6-week n-3 polyunsaturated fatty acids (PUFA) supplementation. Methods: 210 subjects completed a 2-week run-in period followed by a 6-week supplementation with 5 g/d of fish oil (providing 1.9 g–2.2 g of EPA + 1.1 g of DHA). Genotyping of 18 SNPs of the FADS gene cluster covering 90% of all common genetic variations (minor allele frequency ≥ 0.03) was performed. Results: Carriers of the minor allele for rs482548 (FADS2) had increased plasma FG levels after the n-3 PUFA supplementation in a model adjusted for FG levels at baseline, age, sex, and BMI. A significant genotype*supplementation interaction effect on FG levels was observed for rs482548 (p = 0.008). For FI levels, a genotype effect was observed with one SNP (rs174456). For HOMA-IS, several genotype*supplementation interaction effects were observed for rs7394871, rs174602, rs174570, rs7482316 and rs482548 (p = 0.03, p = 0.01, p = 0.03, p = 0.05 and p = 0.07; respectively). Conclusion: Results suggest that SNPs in the FADS gene cluster may modulate plasma FG, FI and HOMA-IS levels in response to n-3 PUFA supplementation. PMID:24705214

  7. Polymorphisms in Fatty Acid Desaturase (FADS) Gene Cluster: Effects on Glycemic Controls Following an Omega-3 Polyunsaturated Fatty Acids (PUFA) Supplementation

    Directory of Open Access Journals (Sweden)

    Patrick Couture

    2013-09-01

    Changes in desaturase activity are associated with insulin sensitivity and may be associated with type 2 diabetes mellitus (T2DM). Polymorphisms (SNPs) in the fatty acid desaturase (FADS) gene cluster have been associated with the homeostasis model assessment of insulin sensitivity (HOMA-IS) and serum fatty acid composition. Objective: To investigate whether common genetic variations in the FADS gene cluster influence fasting glucose (FG) and fasting insulin (FI) responses following a 6-week n-3 polyunsaturated fatty acids (PUFA) supplementation. Methods: 210 subjects completed a 2-week run-in period followed by a 6-week supplementation with 5 g/d of fish oil (providing 1.9 g–2.2 g of EPA + 1.1 g of DHA). Genotyping of 18 SNPs of the FADS gene cluster covering 90% of all common genetic variations (minor allele frequency ≥ 0.03) was performed. Results: Carriers of the minor allele for rs482548 (FADS2) had increased plasma FG levels after the n-3 PUFA supplementation in a model adjusted for FG levels at baseline, age, sex, and BMI. A significant genotype*supplementation interaction effect on FG levels was observed for rs482548 (p = 0.008). For FI levels, a genotype effect was observed with one SNP (rs174456). For HOMA-IS, several genotype*supplementation interaction effects were observed for rs7394871, rs174602, rs174570, rs7482316 and rs482548 (p = 0.03, p = 0.01, p = 0.03, p = 0.05 and p = 0.07; respectively). Conclusion: Results suggest that SNPs in the FADS gene cluster may modulate plasma FG, FI and HOMA-IS levels in response to n-3 PUFA supplementation.

  8. [Time for cluster C personality disorders: state of the art].

    Science.gov (United States)

    Hutsebaut, J; Willemsen, E M C; Van, H L

    Compared to cluster B personality disorders, the assessment and treatment of people with obsessive-compulsive, dependent, and avoidant personality disorders (cluster C) receive little attention in research and clinical practice. The aim was to present the current state of affairs with regard to cluster C personality disorders. A systematic literature search was conducted using the main databases. Cluster C personality disorders are present in approximately 3-9% of the general population. In about half of the cases of mood, anxiety, and eating disorders, there is co-morbid cluster C pathology. This has a major influence on the progression of symptoms, treatment effectiveness and potential relapse. Hardly any well-conducted randomized studies on the treatment of cluster C disorders exist. Open cohort studies, however, show strong, lasting treatment effects. Given the frequent occurrence of cluster C personality disorders, the burden of disease, associated societal costs and the prognostic implications in case of a co-morbid cluster C personality disorder, early detection and treatment of these disorders is warranted.

  9. Cluster editing

    DEFF Research Database (Denmark)

    Böcker, S.; Baumbach, Jan

    2013-01-01

    The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side. The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications...
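
    The problem statement in this record (fewest edge insertions and deletions needed to turn a graph into a disjoint union of cliques) can be made concrete with a brute-force sketch. This is not one of the exact or heuristic methods reviewed in the paper. It simply enumerates every partition of the vertices, so it is only usable on very small graphs. The function names editing_cost, all_partitions and cluster_editing are hypothetical.

```python
from itertools import combinations


def editing_cost(nodes, edges, labels):
    """Count the edge insertions plus deletions needed to realize a clustering."""
    edge_set = {frozenset(e) for e in edges}
    cost = 0
    for u, v in combinations(nodes, 2):
        same_cluster = labels[u] == labels[v]
        has_edge = frozenset((u, v)) in edge_set
        # A missing edge inside a cluster must be inserted, and an edge
        # between two clusters must be deleted.
        if same_cluster != has_edge:
            cost += 1
    return cost


def all_partitions(nodes):
    """Yield every partition of nodes as a dict mapping node -> cluster id."""
    nodes = list(nodes)
    labels = {}

    def assign(i, used):
        if i == len(nodes):
            yield dict(labels)
            return
        # A node may join any existing cluster or open exactly one new one.
        for c in range(used + 1):
            labels[nodes[i]] = c
            yield from assign(i + 1, max(used, c + 1))

    yield from assign(0, 0)


def cluster_editing(nodes, edges):
    """Return (minimum cost, clustering) by exhaustive search."""
    best_cost, best_labels = None, None
    for labels in all_partitions(nodes):
        cost = editing_cost(nodes, edges, labels)
        if best_cost is None or cost < best_cost:
            best_cost, best_labels = cost, labels
    return best_cost, best_labels
```

    For a path a-b-c the minimum cost is 1, because either one edge is deleted or the missing edge a-c is inserted. The number of partitions grows as the Bell numbers, which is consistent with the abstract's point that the problem is NP-complete and needs cleverer exact or heuristic algorithms in practice.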

  10. SPINE: SParse eIgengene NEtwork linking gene expression clusters in Dehalococcoides mccartyi to perturbations in experimental conditions.

    Directory of Open Access Journals (Sweden)

    Cresten B Mansfeldt

    We present a statistical model designed to identify the effect of experimental perturbations on the aggregate behavior of the transcriptome expressed by the bacterium Dehalococcoides mccartyi strain 195. Strains of Dehalococcoides are used in sub-surface bioremediation applications because they organohalorespire tetrachloroethene and trichloroethene (common chlorinated solvents that contaminate the environment) to non-toxic ethene. However, the biochemical mechanism of this process remains incompletely described. Additionally, the response of Dehalococcoides to stress-inducing conditions that may be encountered at field-sites is not well understood. The constructed statistical model captured the aggregate behavior of gene expression phenotypes by modeling the distinct eigengenes of 100 transcript clusters, determining stable relationships among these clusters of gene transcripts with a sparse network-inference algorithm, and directly modeling the effect of changes in experimental conditions by constructing networks conditioned on the experimental state. Based on the model predictions, we discovered new response mechanisms for DMC, notably when the bacterium is exposed to solvent toxicity. The network identified a cluster containing thirteen gene transcripts directly connected to the solvent toxicity condition. Transcripts in this cluster include an iron-dependent regulator (DET0096-97) and a methylglyoxal synthase (DET0137). To validate these predictions, additional experiments were performed. Continuously fed cultures were exposed to saturating levels of tetrachloroethene, thereby causing solvent toxicity, and transcripts that were predicted to be linked to solvent toxicity were monitored by quantitative reverse-transcription polymerase chain reaction. Twelve hours after being shocked with saturating levels of tetrachloroethene, the control transcripts (encoding for a key hydrogenase and the 16S rRNA) did not significantly change. By contrast

  11. antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification

    DEFF Research Database (Denmark)

    Blin, Kai; Wolf, Thomas; Chevrette, Marc G.

    2017-01-01

    Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' has assisted researchers in efficiently performing this, both as a web server and a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features...

  12. A cluster merging method for time series microarray with production values.

    Science.gov (United States)

    Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

    2014-09-01

    A challenging task in time-course microarray data analysis is to cluster genes meaningfully by combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal, obtaining groups of highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured at the same time points) and merging them by taking into account the frequency with which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim of finding co-expressed genes related to the production and growth of a certain bacterium. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
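
    The merging idea described in this record, counting how often two genes are grouped together across per-replicate clusterings, can be sketched as follows. This is a generic co-association approach under stated assumptions and not the authors' algorithm: each replicate clustering is assumed to be a dict mapping gene name to cluster label, and genes are merged when their co-clustering frequency reaches a hypothetical min_agreement threshold. The function names co_association and merge_clusters are illustrative only.

```python
from collections import defaultdict
from itertools import combinations


def co_association(clusterings):
    """Fraction of clusterings in which each gene pair shares a cluster."""
    counts = defaultdict(int)
    for labels in clusterings:
        for g, h in combinations(sorted(labels), 2):
            if labels[g] == labels[h]:
                counts[(g, h)] += 1
    n = len(clusterings)
    return {pair: c / n for pair, c in counts.items()}


def merge_clusters(clusterings, min_agreement=1.0):
    """Group genes linked by pairs that co-cluster often enough."""
    freq = co_association(clusterings)
    genes = sorted(set().union(*clusterings))
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), f in freq.items():
        if f >= min_agreement:
            parent[find(g)] = find(h)

    groups = defaultdict(list)
    for g in genes:
        groups[find(g)].append(g)
    # Largest groups first, ties broken alphabetically.
    return sorted(groups.values(), key=lambda grp: (-len(grp), grp))
```

    With min_agreement=1.0 only genes that are grouped together in every replicate are merged. Lowering the threshold trades agreement for larger groups, which loosely corresponds to the abstract's remark that merged groups can be ranked by the level of agreement among the individual time series.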

  13. Phylogeography of var gene repertoires reveals fine-scale geospatial clustering of Plasmodium falciparum populations in a highly endemic area.

    Science.gov (United States)

    Tessema, Sofonias K; Monk, Stephanie L; Schultz, Mark B; Tavul, Livingstone; Reeder, John C; Siba, Peter M; Mueller, Ivo; Barry, Alyssa E

    2015-01-01

    Plasmodium falciparum malaria is a major global health problem that is being targeted for progressive elimination. Knowledge of local disease transmission patterns in endemic countries is critical to these elimination efforts. To investigate fine-scale patterns of malaria transmission, we have compared repertoires of rapidly evolving var genes in a highly endemic area. A total of 3680 high-quality DBLα-sequences were obtained from 68 P. falciparum isolates from ten villages spread over two distinct catchment areas on the north coast of Papua New Guinea (PNG). Modelling of the extent of var gene diversity in the two parasite populations predicts more than twice as many var gene alleles circulating within each catchment (Mugil = 906; Wosera = 1094) than previously recognized in PNG (Amele = 369). In addition, there were limited levels of var gene sharing between populations, consistent with local parasite population structure. Phylogeographic analyses demonstrate that while neutrally evolving microsatellite markers identified population structure only at the catchment level, var gene repertoires reveal further fine-scale geospatial clustering of parasite isolates. The clustering of parasite isolates by village in Mugil, but not in Wosera was consistent with the physical and cultural isolation of the human populations in the two catchments. The study highlights the microheterogeneity of P. falciparum transmission in highly endemic areas and demonstrates the potential of var genes as markers of local patterns of parasite population structure. © 2014 John Wiley & Sons Ltd.

  14. Hessian regularization based symmetric nonnegative matrix factorization for clustering gene expression and microbiome data.

    Science.gov (United States)

    Ma, Yuanyuan; Hu, Xiaohua; He, Tingting; Jiang, Xingpeng

    2016-12-01

    Nonnegative matrix factorization (NMF) has received considerable attention due to its interpretation of observed samples as combinations of different components, and has been successfully used as a clustering method. As an extension of NMF, Symmetric NMF (SNMF) inherits the advantages of NMF. Unlike NMF, however, SNMF takes a nonnegative similarity matrix as an input, and two lower rank nonnegative matrices (H, H^T) are computed as an output to approximate the original similarity matrix. Laplacian regularization has improved the clustering performance of NMF and SNMF. However, Laplacian regularization (LR), as a classic manifold regularization method, suffers some problems because of its weak extrapolating ability. In this paper, we propose a novel variant of SNMF, called Hessian regularization based symmetric nonnegative matrix factorization (HSNMF), for this purpose. In contrast to Laplacian regularization, Hessian regularization fits the data perfectly and extrapolates nicely to unseen data. We conduct extensive experiments on several datasets including text data, gene expression data and HMP (Human Microbiome Project) data. The results show that the proposed method outperforms other methods, which suggests the potential application of HSNMF in biological data clustering. Copyright © 2016. Published by Elsevier Inc.
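
    As a rough illustration of the SNMF step described here (approximating a nonnegative similarity matrix A by H H^T), the sketch below uses a damped multiplicative update that is common in the SNMF literature. It is plain SNMF only: the Hessian regularization term of HSNMF is not included, and the update rule, the `beta` damping value, the helper names and the toy matrix are assumptions rather than the authors' implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def snmf(A, k, iters=300, beta=0.5, seed=0, eps=1e-12):
    """Approximate a symmetric nonnegative matrix A by H H^T with H >= 0."""
    rng = random.Random(seed)
    n = len(A)
    H = [[rng.random() for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        AH = matmul(A, H)
        HHtH = matmul(H, matmul(transpose(H), H))
        # Damped multiplicative update; keeps every entry of H nonnegative.
        H = [[H[i][j] * (1 - beta + beta * AH[i][j] / (HHtH[i][j] + eps))
              for j in range(k)] for i in range(n)]
    return H


def frob_error(A, H):
    """Frobenius norm of A - H H^T."""
    R = matmul(H, transpose(H))
    n = len(A)
    return sum((A[i][j] - R[i][j]) ** 2 for i in range(n) for j in range(n)) ** 0.5


# Toy similarity matrix with two obvious blocks (hypothetical data).
A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
H = snmf(A, k=2)
labels = [max(range(2), key=lambda j: row[j]) for row in H]  # cluster = largest column
print(frob_error(A, H), labels)
```

    Reading the cluster of each sample from the largest entry in its row of H is one simple convention; other assignment rules are possible.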

  15. A brain-specific gene cluster isolated from the region of the mouse obesity locus is expressed in the adult hypothalamus and during mouse development

    Energy Technology Data Exchange (ETDEWEB)

    Laig-Webster, M.; Lim, M.E.; Chehab, F.F. [Univ. of California, San Francisco, CA (United States)]

    1994-09-01

    The molecular defect underlying an autosomal recessive form of genetic obesity in a classical mouse model C57 BL/6J-ob/ob has not yet been elucidated. Whereas metabolic and physiological disturbances such as diabetes and hypertension are associated with obesity, the site of expression and the nature of the primary lesion responsible for this cascade of events remain elusive. Our efforts aimed at the positional cloning of the ob gene by YAC contig mapping and gene identification have resulted in the cloning of a brain-specific gene cluster from the ob critical region. The expression of this gene cluster is remarkably complex owing to the multitude of brain-specific mRNA transcripts detected on Northern blots. cDNA cloning of these transcripts suggests that they are expressed from different genes as well as by alternate splicing mechanisms. Furthermore, the genomic organization of the cluster appears to consist of at least two identical promoters displaying CpG islands characteristic of housekeeping genes, yet clearly involving tissue-specific expression. Sense and anti-sense synthetic RNA probes were derived from a common DNA sequence on 3 cDNA clones and hybridized to 8-16 days mouse embryonic stages and mouse adult brain sections. Expression in development was noticeable as of the 11th day of gestation and confined to the central nervous system mainly in the telencephalon and spinal cord. Coronal and sagittal sections of the adult mouse brain showed expression only in 3 different regions of the brain stem. In situ hybridization to mouse hypothalamus sections revealed the presence of a localized and specialized group of cells expressing high levels of mRNA, suggesting that this gene cluster may also be involved in the regulation of hypothalamic activities. The hypothalamus has long been hypothesized as a primary candidate tissue for the expression of the obesity gene mainly because of its well-established role in the regulation of energy metabolism and food intake.

  16. Identification of a cluster IV pleiotropic drug resistance transporter gene expressed in the style of Nicotiana plumbaginifolia.

    Science.gov (United States)

    Trombik, Tomasz; Jasinski, Michal; Crouzet, Jérome; Boutry, Marc

    2008-01-01

    ATP-binding cassette transporters of the pleiotropic drug resistance (PDR) subfamily are composed of five clusters. We have cloned a gene, NpPDR2, belonging to the still uncharacterized cluster IV from Nicotiana plumbaginifolia. NpPDR2 transcripts were found in the roots and mature flowers. In the latter, NpPDR2 expression was restricted to the style and only after pollination. A 1.5-kb genomic sequence containing the putative NpPDR2 transcription promoter was fused to the beta-glucuronidase reporter gene. The GUS expression pattern confirmed the RT-PCR results that NpPDR2 was expressed in roots and the flower style and showed that it was localized around the conductive tissues. Unlike other PDR genes, NpPDR2 expression was not induced in leaf tissues by any of the hormones typically involved in biotic and abiotic stress response. Moreover, unlike NpPDR1 known to be involved in biotic stress response, NpPDR2 expression was not induced in the style upon Botrytis cinerea infection. In N. plumbaginifolia plants in which NpPDR2 expression was prevented by RNA interference, no unusual phenotype was observed, including at the flowering stage, which suggests that NpPDR2 is not essential in the reproductive process under the tested conditions.

  17. Characterization of a multicopper oxidase gene cluster in Phanerochaete chrysosporium and evidence of altered splicing of the mco transcripts

    Science.gov (United States)

    Luis F. Larrondo; Bernardo Gonzalez; Dan Cullen; Rafael Vicuna

    2004-01-01

    A cluster of multicopper oxidase genes (mco1, mco2, mco3, mco4) from the lignin-degrading basidiomycete Phanerochaete chrysosporium is described. The four genes share the same transcriptional orientation within a 25 kb region. mco1, mco2 and mco3 are tightly grouped, with intergenic regions of 2.3 and 0.8 kb, respectively, whereas mco4 is located 11 kb upstream of mco1...

  18. Prognostic factors of breast cancer

    International Nuclear Information System (INIS)

    Gonzalez Ortega, Jose Maria; Morales Wong, Mario Miguel; Lopez Cuevas, Zoraida; Diaz Valdez, Marilin

    2011-01-01

    Prognostic factors must be distinguished from predictive ones. A prognostic factor is any measurement available at the time of surgery that correlates with the disease-free interval or overall survival in the absence of systemic adjuvant treatment and, as a result, correlates with the natural history of the disease. In contrast, a predictive factor is any measurement associated with the response to a given treatment. The prognostic factors of breast cancer include clinical, histological, biological, genetic and psychosocial factors. The present review of psychosocial prognostic factors shows that stress and depression are negative prognostic factors in patients presenting with breast cancer. It is essential to remember that the assessment of a single prognostic parameter is helpful but is not sufficient for the clinical and therapeutic management of the patient. (author)

  19. Outcome-Driven Cluster Analysis with Application to Microarray Data.

    Directory of Open Access Journals (Sweden)

    Jessie J Hsu

    One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
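
    One way to write down the random effects model sketched in this abstract is shown below. The notation is an assumption introduced here for illustration (the abstract gives no symbols): x_{ij} is the expression of gene j in patient i, c(j) is the cluster of gene j, u_{ik} is the random effect for patient i and cluster k, and y_i is the outcome. The normal distributions and the outcome noise term \eta_i are also assumptions; the abstract states only that the outcome is a linear combination of the cluster random effects.

```latex
\begin{align}
  x_{ij} &= u_{i,c(j)} + \varepsilon_{ij},
    & u_{ik} &\sim \mathcal{N}(0, \tau_k^2), \quad
      \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2), \\
  y_i &= \sum_{k=1}^{K} \beta_k\, u_{ik} + \eta_i,
    & \eta_i &\sim \mathcal{N}(0, \sigma_y^2).
\end{align}
```

    Under this reading, genes in the same cluster are correlated because they share u_{ik}, and they have the same effect on the outcome because that effect enters only through \beta_k.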

  20. CDO1 promoter methylation is associated with gene silencing and is a prognostic biomarker for biochemical recurrence-free survival in prostate cancer patients.

    Science.gov (United States)

    Meller, Sebastian; Zipfel, Lisa; Gevensleben, Heidrun; Dietrich, Jörn; Ellinger, Jörg; Majores, Michael; Stein, Johannes; Sailer, Verena; Jung, Maria; Kristiansen, Glen; Dietrich, Dimo

    2016-12-01

    Molecular biomarkers may facilitate the distinction between aggressive and clinically insignificant prostate cancer (PCa), thereby potentially aiding individualized treatment. We analyzed cysteine dioxygenase 1 (CDO1) promoter methylation and mRNA expression in order to evaluate its potential as prognostic biomarker. CDO1 methylation and mRNA expression were determined in cell lines and formalin-fixed paraffin-embedded prostatectomy specimens from a first cohort of 300 PCa patients using methylation-specific qPCR and qRT-PCR. Univariate and multivariate Cox proportional hazards and Kaplan-Meier analyses were performed to evaluate biochemical recurrence (BCR)-free survival. Results were confirmed in an independent second cohort comprising 498 PCa cases. Methylation and mRNA expression data from the second cohort were generated by The Cancer Genome Atlas (TCGA) Research Network by means of Infinium HumanMethylation450 BeadChip and RNASeq. CDO1 was hypermethylated in PCa compared to normal adjacent tissues and benign prostatic hyperplasia (P < 0.001) and was associated with reduced gene expression (ρ = -0.91, P = 0.005). Using two different methodologies for methylation quantification, high CDO1 methylation as continuous variable was associated with BCR in univariate analysis (first cohort: HR = 1.02, P = 0.002, 95% CI [1.01-1.03]; second cohort: HR = 1.02, P = 0.032, 95% CI [1.00-1.03]) but failed to reach statistical significance in multivariate analysis. CDO1 promoter methylation is involved in gene regulation and is a potential prognostic biomarker for BCR-free survival in PCa patients following radical prostatectomy. Further studies are needed to validate CDO1 methylation assays and to evaluate the clinical utility of CDO1 methylation for the management of PCa.

  1. Heterologous expression of oxytetracycline biosynthetic gene cluster in Streptomyces venezuelae WVR2006 to improve production level and to alter fermentation process.

    Science.gov (United States)

    Yin, Shouliang; Li, Zilong; Wang, Xuefeng; Wang, Huizhuan; Jia, Xiaole; Ai, Guomin; Bai, Zishang; Shi, Mingxin; Yuan, Fang; Liu, Tiejun; Wang, Weishan; Yang, Keqian

    2016-12-01

    Heterologous expression is an important strategy to activate biosynthetic gene clusters of secondary metabolites. Here, it is employed to activate and manipulate the oxytetracycline (OTC) gene cluster and to alter the OTC fermentation process. To achieve these goals, a fast-growing heterologous host, Streptomyces venezuelae WVR2006, was rationally selected among several potential hosts. It shows rapid and dispersed growth and intrinsic high resistance to OTC. By manipulating the expression of two cluster-situated regulators (CSR), OtcR and OtrR, and precursor supply, the OTC production level was significantly increased in this heterologous host from 75 to 431 mg/l in only 48 h, a level comparable to that of the native producer Streptomyces rimosus M4018 in 8 days. This work shows that S. venezuelae WVR2006 is a promising chassis for the production of secondary metabolites, and the engineered heterologous OTC producer has the potential to completely alter the fermentation process of OTC production.

  2. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita

    2014-10-28

    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC (https://img.jgi.doe.gov/abc/) -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  3. Incorporating genomic, transcriptomic and clinical data: a prognostic and stem cell-like MYC and PRC imbalance in high-risk neuroblastoma.

    Science.gov (United States)

    Yang, Xinan Holly; Tang, Fangming; Shin, Jisu; Cunningham, John M

    2017-10-03

    Previous studies suggested that cancer cells possess traits reminiscent of the biological mechanisms ascribed to normal embryonic stem cells (ESCs) regulated by MYC and Polycomb repressive complex 2 (PRC2). Several poorly differentiated adult tumors showed preferentially high expression levels in targets of MYC, coincident with low expression levels in targets of PRC2. This paper will reveal this ESC-like cancer signature in high-risk neuroblastoma (HR-NB), the most common extracranial solid tumor in children. We systematically assembled genomic variants, gene expression changes, prior knowledge of gene functions, and clinical outcomes to identify prognostic multigene signatures. First, we assigned a new, individualized prognostic index using the relative expressions between the poor- and good-outcome signature genes. We then characterized HR-NB aggressiveness beyond these prognostic multigene signatures through the imbalanced effects of MYC and PRC2 signaling. We further analyzed Retinoic acid (RA)-induced HR-NB cells to model tumor cell differentiation. Finally, we performed in vitro validation on ZFHX3, a cell differentiation marker silenced by PRC2, and compared cell morphology changes before and after blocking PRC2 in HR-NB cells. A significant concurrence existed between exons with verified variants and genes showing MYCN-dependent expression in HR-NB. From these biomarker candidates, we identified two novel prognostic gene-set pairs with multi-scale oncogenic defects. Intriguingly, MYC targets over-represented an unfavorable component of the identified prognostic signatures while PRC2 targets over-represented a favorable component. The cell cycle arrest and neuronal differentiation marker ZFHX3 was identified as one of PRC2-silenced tumor suppressor candidates. Blocking PRC2 reduced tumor cell growth and increased the mRNA expression levels of ZFHX3 in an early treatment stage. This hypothesis-driven systems bioinformatics work offered novel insights into

  4. BRCA1 gene expression in relation to prognostic parameters of breast cancer

    Directory of Open Access Journals (Sweden)

    Manal Kamal

    2011-09-01

    The tumor suppressor gene BRCA1 has been reported to increase the susceptibility to breast cancer in younger women. This work studied the expression of BRCA1 (mRNA) in women with breast cancer in relation to other prognostic parameters such as histological type and grade of cancer, hormone receptor status, human epidermal growth factor receptor 2 (HER2/neu) and CA15-3. Thirty patients with positive family history of breast cancer and a control group of 20 healthy subjects were also included for the study. Ribonucleic acid (RNA) extraction from breast cancer tissues was done (considered suitable for RNA extraction if 70% or more of the tissue section contained tumor) and was followed by real-time reverse transcription polymerase chain reaction. BRCA1 expression was assessed and correlated with age, histological type and grade of breast cancer, estrogen and progesterone receptor (ER, PR) status, HER2/neu expression and CA15-3 levels. The mean age of patients was 54.8 ± 10.49 years. Of the 30 breast cancer cases studied, the majority (77%) was of high histological grade and the most common histological type was infiltrating ductal carcinoma (20 cases). ER expression was positive in 53.3% of breast cancers, while PR expression was positive in 50% of cancers. BRCA1 mRNA was found in 6 patient samples (20% of the breast cancer patients), while the remaining 24 patients (80%) showed negative BRCA1 mRNA expression, as did the control group. A positive significant relationship was demonstrated between BRCA1 (mRNA) expression and high histological grade, negative estrogen and progesterone receptor status, and high levels of serum CA15-3. A significant negative correlation was found between BRCA1 mRNA expression and age (r = −0.683; p < 0.01). The study demonstrated lack of BRCA1 gene expression (mRNA) in the majority of breast cancer cases and confirmed the relationship between BRCA1 expression and parameters that determine poor prognosis in breast cancer. The

  5. Identification of a novel prophage-like gene cluster actively expressed in both virulent and avirulent strains of Leptospira interrogans serovar Lai.

    Science.gov (United States)

    Qin, Jin-Hong; Zhang, Qing; Zhang, Zhi-Ming; Zhong, Yi; Yang, Yang; Hu, Bao-Yu; Zhao, Guo-Ping; Guo, Xiao-Kui

    2008-06-01

    DNA microarray analysis was used to compare the differential gene expression profiles between Leptospira interrogans serovar Lai type strain 56601 and its corresponding attenuated strain IPAV. A 22-kb genomic island covering a cluster of 34 genes (i.e., genes LA0186 to LA0219) was actively expressed in both strains but concomitantly upregulated in strain 56601 in contrast to that of IPAV. Reverse transcription-PCR assays proved that the gene cluster comprised five transcripts. Gene annotation of this cluster revealed characteristics of a putative prophage-like remnant with at least 8 of 34 sequences encoding prophage-like proteins, of which the LA0195 protein is probably a putative prophage CI-like regulator. The transcription initiation activities of putative promoter-regulatory sequences of transcripts I, II, and III, all proximal to the LA0195 gene, were further analyzed in the Escherichia coli promoter probe vector pKK232-8 by assaying the reporter chloramphenicol acetyltransferase (CAT) activities. The strong promoter activities of both transcripts I and II indicated by the E. coli CAT assay were well correlated with the in vitro sequence-specific binding of the recombinant LA0195 protein to the corresponding promoter probes detected by the electrophoresis mobility shift assay. On the other hand, the promoter activity of transcript III was very low in E. coli and failed to show active binding to the LA0195 protein in vitro. These results suggested that the LA0195 protein is likely involved in the transcription of transcripts I and II. However, the identical complete DNA sequences of this prophage remnant from these two strains strongly suggest that possible regulatory factors or signal transduction systems residing outside of this region within the genome may be responsible for the differential expression profiling in these two strains.

  6. Genome mining of the sordarin biosynthetic gene cluster from Sordaria araneosa Cain ATCC 36386: characterization of cycloaraneosene synthase and GDP-6-deoxyaltrose transferase.

    Science.gov (United States)

    Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi

    2016-07-01

    Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway that may include a Diels-Alder-type [4+2]cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases was identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed.

  7. Characterization and transcriptional analysis of two gene clusters for type IV secretion machinery in Wolbachia of Armadillidium vulgare

    DEFF Research Database (Denmark)

    Félix, Christine; Pichon, Samuel; Braquart-Varnier, Christine

    2008-01-01

    Wolbachia are maternally inherited alpha-proteobacteria that induce feminization of genetic males in most terrestrial crustacean isopods. Two clusters of vir genes for a type IV secretion machinery have been identified at two separate loci and characterized for the first time in a feminizing Wolb...

  8. Prognostic breast cancer signature identified from 3D culture model accurately predicts clinical outcome across independent datasets

    Energy Technology Data Exchange (ETDEWEB)

    Martin, Katherine J.; Patrick, Denis R.; Bissell, Mina J.; Fournier, Marcia V.

    2008-10-20

    One of the major tenets in breast cancer research is that early detection is vital for patient survival by increasing treatment options. To that end, we have previously used a novel unsupervised approach to identify a set of genes whose expression predicts prognosis of breast cancer patients. The predictive genes were selected in a well-defined three dimensional (3D) cell culture model of non-malignant human mammary epithelial cell morphogenesis as down-regulated during breast epithelial cell acinar formation and cell cycle arrest. Here we examine the ability of this gene signature (3D-signature) to predict prognosis in three independent breast cancer microarray datasets having 295, 286, and 118 samples, respectively. Our results show that the 3D-signature accurately predicts prognosis in three unrelated patient datasets. At 10 years, the probability of positive outcome was 52, 51, and 47 percent in the group with a poor-prognosis signature and 91, 75, and 71 percent in the group with a good-prognosis signature for the three datasets, respectively (Kaplan-Meier survival analysis, p<0.05). Hazard ratios for poor outcome were 5.5 (95% CI 3.0 to 12.2, p<0.0001), 2.4 (95% CI 1.6 to 3.6, p<0.0001) and 1.9 (95% CI 1.1 to 3.2, p = 0.016) and remained significant for the two larger datasets when corrected for estrogen receptor (ER) status. Hence the 3D-signature accurately predicts breast cancer outcome in both ER-positive and ER-negative tumors, though individual genes differed in their prognostic ability in the two subtypes. Genes that were prognostic in ER+ patients are AURKA, CEP55, RRM2, EPHA2, FGFBP1, and VRK1, while genes prognostic in ER− patients include ACTB, FOXM1 and SERPINE2 (Kaplan-Meier p<0.05). Multivariable Cox regression analysis in the largest dataset showed that the 3D-signature was a strong independent factor in predicting breast cancer outcome. The 3D-signature accurately predicts breast cancer outcome across multiple datasets and holds prognostic
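
    The 10-year outcome probabilities quoted in this abstract come from Kaplan-Meier survival analysis. The sketch below shows the standard product-limit estimator on invented follow-up data; the function name, the toy times and the event flags are hypothetical and unrelated to the study's datasets.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time.

    events[i] is 1 if the outcome was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical follow-up times (years) and event indicators for five patients.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

    Comparing two such curves (for example, poor- versus good-prognosis signature groups) is usually done with a log-rank test, which this sketch does not implement.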

  9. [Prognostic and predictive molecular markers for urologic cancers].

    Science.gov (United States)

    Hartmann, A; Schlomm, T; Bertz, S; Heinzelmann, J; Hölters, S; Simon, R; Stoehr, R; Junker, K

    2014-04-01

    Molecular prognostic factors and genetic alterations as predictive markers for cancer-specific targeted therapies are used today in the clinic for many malignancies. In recent years, many molecular markers for urogenital cancers have also been identified. However, these markers are not clinically used yet. In prostate cancer, novel next-generation sequencing methods revealed a detailed picture of the molecular changes. There is growing evidence that a combination of classical histopathological and validated molecular markers could lead to a more precise estimation of prognosis, thus, resulting in an increasing number of patients with active surveillance as a possible treatment option. In patients with urothelial carcinoma, histopathological factors but also the proliferation of the tumor, mutations in oncogenes leading to an increasing proliferation rate and changes in genes responsible for invasion and metastasis are important. In addition, gene expression profiles which could distinguish aggressive tumors with high risk of metastasis from nonmetastasizing tumors have been recently identified. In the future, this could potentially allow better selection of patients needing systemic perioperative treatment. In renal cell carcinoma, many molecular markers that are associated with metastasis and survival have been identified. Some of these markers were also validated as independent prognostic markers. Selection of patients with primarily organ-confined tumors and increased risk of metastasis for adjuvant systemic therapy could be clinically relevant in the future.

  10. Id-1 and Id-2 genes and products as markers of epithelial cancer

    Science.gov (United States)

    Desprez, Pierre-Yves [El Cerrito, CA; Campisi, Judith [Berkeley, CA

    2008-09-30

    A method for detection and prognosis of breast cancer and other types of cancer. The method comprises detecting expression, if any, of both the Id-1 and Id-2 genes or gene products, or the ratio thereof, in samples of breast tissue obtained from a patient. When expressed, the Id-1 gene is a prognostic indicator that breast cancer cells are invasive and metastatic, whereas the Id-2 gene is a prognostic indicator that breast cancer cells are localized and noninvasive in the breast tissue.

  11. Promoter methylation of APC and RAR-β genes as prognostic markers in non-small cell lung cancer (NSCLC).

    Science.gov (United States)

    Feng, Hongxiang; Zhang, Zhenrong; Qing, Xin; Wang, Xiaowei; Liang, Chaoyang; Liu, Deruo

    2016-02-01

    Aberrant promoter hypermethylations of tumor suppressor genes are promising markers for lung cancer diagnosis and prognosis. The purpose of this study was to determine methylation status at APC and RAR-β promoters in primary NSCLC, and whether they have any relationship with survival. APC and RAR-β promoter methylation status were determined in 41 NSCLC patients using methylation specific PCR. APC promoter methylation was detectable in 9 (22.0%) tumor samples and 6 (14.6%) corresponding non-tumor samples (P=0.391). RAR-β promoter methylation was detectable in 13 (31.7%) tumor samples and 4 (9.8%) corresponding non-tumor samples (P=0.049) in the NSCLC patients. APC promoter methylation was found to be associated with T stage (P=0.046) and nodal status (P=0.019) in non-tumor samples, and with smoking (P=0.004) in tumor samples. RAR-β promoter methylation was found associated with age (P=0.031) in non-tumor samples and with primary tumor site in tumor samples. Patients with APC promoter methylation in tumor samples showed significantly longer survival than patients without it (Log-rank P=0.014). In a multivariate analysis of prognostic factors, APC methylation in tumor samples was an independent prognostic factor (P=0.012), as were N1 positive lymph node number (P=0.025) and N2 positive lymph node number (P=0.06). Our study shows that RAR-β methylation detected in lung tissue may be used as a predictive marker for NSCLC diagnosis and that APC methylation in tumor sample may be a useful marker for superior survival in NSCLC patients. Copyright © 2015. Published by Elsevier Inc.

  12. Identification and Analysis of a Novel Gene Cluster Involves in Fe2+ Oxidation in Acidithiobacillus ferrooxidans ATCC 23270, a Typical Biomining Acidophile.

    Science.gov (United States)

    Ai, Chenbing; Liang, Yuting; Miao, Bo; Chen, Miao; Zeng, Weimin; Qiu, Guanzhou

    2018-07-01

    Iron-oxidizing Acidithiobacillus spp. are applied worldwide in biomining industry to extract metals from sulfide minerals. They derive energy for survival through Fe2+ oxidation and generate Fe3+ for the dissolution of sulfide minerals. However, molecular mechanisms of their iron oxidation still remain elusive. A novel two-cytochrome-encoding gene cluster (named tce gene cluster) encoding a high-molecular-weight cytochrome c (AFE_1428) and a c4-type cytochrome c552 (AFE_1429) in A. ferrooxidans ATCC 23270 was first identified in this study. Bioinformatic analysis together with transcriptional study showed that AFE_1428 and AFE_1429 were the corresponding paralog of Cyc2 (AFE_3153) and Cyc1 (AFE_3152) which were encoded by the extensively studied rus operon and had been proven involving in ferrous iron oxidation. Both AFE_1428 and AFE_1429 contained signal peptide and the classic heme-binding motif(s) as their corresponding paralog. The modeled structure of AFE_1429 showed high resemblance to Cyc1. AFE_1428 and AFE_1429 were preferentially transcribed as their corresponding paralogs in the presence of ferrous iron as sole energy source as compared with sulfur. The tce gene cluster is highly conserved in the genomes of four phylogenetic-related A. ferrooxidans strains that were originally isolated from different sites separated with huge geographical distance, which further implies the importance of this gene cluster. Collectively, AFE_1428 and AFE_1429 involve in Fe2+ oxidation like their corresponding paralog by integrating with the metalloproteins encoded by rus operon. This study provides novel insights into the Fe2+ oxidation mechanism in Fe2+-oxidizing A. ferrooxidans ssp.

  13. Gene mutations in hepatocellular adenomas

    DEFF Research Database (Denmark)

    Raft, Marie B; Jørgensen, Ernö N; Vainer, Ben

    2015-01-01

    is associated with bi-allelic mutations in the TCF1 gene and morphologically has marked steatosis. β-catenin activating HCA has increased activity of the Wnt/β-catenin pathway and is associated with possible malignant transformation. Inflammatory HCA is characterized by an oncogene-induced inflammation due...... to alterations in the Janus kinase/signal transducer and activator of transcription (JAK/STAT) pathway. In the diagnostic setting, subclassification of HCA is based primarily on immunohistochemical analyses, and has had an increasing impact on choice of treatment and individual prognostic assessment....... This review offers an overview of the reported gene mutations associated with hepatocellular adenomas together with a discussion of the diagnostic and prognostic value....

  14. Prognostic and Predictive Value of the 21-Gene Recurrence Score Assay in a Randomized Trial of Chemotherapy for Postmenopausal, Node-Positive, Estrogen Receptor-Positive Breast Cancer

    Science.gov (United States)

    Albain, Kathy S.; Barlow, William E.; Shak, Steven; Hortobagyi, Gabriel N.; Livingston, Robert B.; Yeh, I-Tien; Ravdin, Peter; Bugarini, Roberto; Baehner, Frederick L.; Davidson, Nancy E.; Sledge, George W.; Winer, Eric P.; Hudis, Clifford; Ingle, James N.; Perez, Edith A.; Pritchard, Kathleen I.; Shepherd, Lois; Gralow, Julie R.; Yoshizawa, Carl; Allred, D. Craig; Osborne, C. Kent; Hayes, Daniel F.

    2010-01-01

    SUMMARY Background The 21-gene Recurrence Score assay (RS) is prognostic for women with node-negative, estrogen receptor (ER)-positive breast cancer (BC) treated with tamoxifen. A low RS predicts little benefit of chemotherapy. For node-positive BC, we investigated whether RS was prognostic in women treated with tamoxifen alone and whether it identified those who might not benefit from anthracycline-based chemotherapy, despite higher recurrence risks. Methods The phase III trial S8814 for postmenopausal women with node-positive, ER-positive BC showed that CAF chemotherapy prior to tamoxifen (CAF-T) added survival benefit to tamoxifen alone. Optional tumor banking yielded specimens for RS determination by RT-PCR. We evaluated the effect of RS on disease-free survival (DFS) by treatment group (tamoxifen versus CAF-T) using Cox regression adjusting for number of positive nodes. Findings There were 367 specimens (40% of parent trial) with sufficient RNA (tamoxifen, 148; CAF-T, 219). The RS was prognostic in the tamoxifen arm (p=0.006). There was no CAF benefit in the low RS group (logrank p=0.97; HR=1.02, 95% CI (0.54,1.93)), but major DFS improvement for the high RS subset (logrank p=.03; HR=0.59, 95% CI (0.35, 1.01)), adjusting for number of positive nodes. The RS-by-treatment interaction was significant in the first 5 years (p=0.029), with no additional prediction beyond 5 years (p=0.58), though the cumulative benefit remained at 10 years. Results were similar for overall survival and BC-specific survival. Interpretation In this retrospective analysis, the RS is prognostic for tamoxifen-treated patients with positive nodes and predicts significant CAF benefit in tumors with a high RS. A low RS identifies women who may not benefit from anthracycline-based chemotherapy despite positive nodes. PMID:20005174

  15. A molecular prognostic model predicts esophageal squamous cell carcinoma prognosis.

    Directory of Open Access Journals (Sweden)

    Hui-Hui Cao

    Full Text Available Esophageal squamous cell carcinoma (ESCC) has the highest mortality rates in China. The 5-year survival rate of ESCC remains dismal despite improvements in treatments such as surgical resection and adjuvant chemoradiation, and current clinical staging approaches are limited in their ability to effectively stratify patients for treatment options. The aim of the present study, therefore, was to develop an immunohistochemistry-based prognostic model to improve clinical risk assessment for patients with ESCC. We developed a molecular prognostic model based on the combined expression of axis of epidermal growth factor receptor (EGFR), phosphorylated Specificity protein 1 (p-Sp1), and Fascin proteins. The presence of this prognostic model and associated clinical outcomes were analyzed for 130 formalin-fixed, paraffin-embedded esophageal curative resection specimens (generation dataset) and validated using an independent cohort of 185 specimens (validation dataset). The expression of these three genes at the protein level was used to build a molecular prognostic model that was highly predictive of ESCC survival in both generation and validation datasets (P = 0.001). Regression analysis showed that this molecular prognostic model was strongly and independently predictive of overall survival (hazard ratio = 2.358 [95% CI, 1.391-3.996], P = 0.001 in generation dataset; hazard ratio = 1.990 [95% CI, 1.256-3.154], P = 0.003 in validation dataset). Furthermore, the predictive ability of these 3 biomarkers in combination was more robust than that of each individual biomarker. This technically simple immunohistochemistry-based molecular model accurately predicts ESCC patient survival and thus could serve as a complement to current clinical risk stratification approaches.

  16. Cloning, reassembling and integration of the entire nikkomycin biosynthetic gene cluster into Streptomyces ansochromogenes lead to an improved nikkomycin production

    Directory of Open Access Journals (Sweden)

    Yang Haihua

    2010-01-01

    Full Text Available Abstract Background Nikkomycins are a group of peptidyl nucleoside antibiotics produced by Streptomyces ansochromogenes. They are competitive inhibitors of chitin synthase and show potent fungicidal, insecticidal, and acaricidal activities. Nikkomycin X and Z are the main components produced by S. ansochromogenes. Generation of a high-producing strain is crucial to scale up nikkomycin production for further clinical trials. Results To increase the yields of nikkomycins, an additional copy of the nikkomycin biosynthetic gene cluster (35 kb) was introduced into the nikkomycin-producing strain S. ansochromogenes 7100. The gene cluster was first reassembled into an integrative plasmid by Red/ET technology combined with classic cloning methods, and the resulting plasmid (pNIK) was then introduced into S. ansochromogenes by conjugal transfer. Introduction of pNIK led to enhanced production of nikkomycins (880 mg L-1, 4-fold nikkomycin X and 210 mg L-1, 1.8-fold nikkomycin Z) in the resulting exconjugants compared with the parent strain (220 mg L-1 nikkomycin X and 120 mg L-1 nikkomycin Z). The exconjugants are genetically stable in the absence of antibiotic resistance selection pressure. Conclusion A high nikkomycin-producing strain (1100 mg L-1 nikkomycins) was obtained by introduction of an extra nikkomycin biosynthetic gene cluster into the genome of S. ansochromogenes. The strategies presented here could be applicable to other bacteria to improve the yields of secondary metabolites.

  17. Promoter DNA methylation pattern identifies prognostic subgroups in childhood T-cell acute lymphoblastic leukemia.

    Directory of Open Access Journals (Sweden)

    Magnus Borssén

    Full Text Available BACKGROUND: Treatment of pediatric T-cell acute lymphoblastic leukemia (T-ALL) has improved, but there is a considerable fraction of patients experiencing a poor outcome. There is a need for better prognostic markers and aberrant DNA methylation is a candidate in other malignancies, but its potential prognostic significance in T-ALL is hitherto undecided. DESIGN AND METHODS: Genome wide promoter DNA methylation analysis was performed in pediatric T-ALL samples (n = 43) using arrays covering >27000 CpG sites. Clinical outcome was evaluated in relation to methylation status and compared with a contemporary T-ALL group not tested for methylation (n = 32). RESULTS: Based on CpG island methylator phenotype (CIMP), T-ALL samples were subgrouped as CIMP+ (high methylation) and CIMP- (low methylation). CIMP- T-ALL patients had significantly worse overall and event free survival (p = 0.02 and p = 0.001, respectively) compared to CIMP+ cases. CIMP status was an independent factor for survival in multivariate analysis including age, gender and white blood cell count. Analysis of differently methylated genes in the CIMP subgroups showed an overrepresentation of transcription factors, ligands and polycomb target genes. CONCLUSIONS: We identified global promoter methylation profiling as being of relevance for subgrouping and prognostication of pediatric T-ALL.

  18. Evaluation of potential prognostic value of Bmi-1 gene product and selected markers of proliferation (Ki-67) and apoptosis (p53) in the neuroblastoma group of tumors

    Directory of Open Access Journals (Sweden)

    Katarzyna Taran

    2016-02-01

    Full Text Available Introduction: Cancer in children is a very important issue in pediatrics. The least satisfactory treatment outcome occurs among patients with clinically advanced neuroblastomas. Despite much research, the biology of this tumor still remains unclear, and new prognostic factors are sought. The Bmi-1 gene product is a currently highly investigated protein which belongs to the Polycomb group (PcG) and has been identified as a regulator of primary neural crest cells. It is believed that Bmi-1 and N-myc act together and are both involved in the pathogenesis of neuroblastoma. The aim of the study was to assess the potential prognostic value of Bmi-1 protein and its relations with mechanisms of proliferation and apoptosis in the neuroblastoma group of tumors. Material/Methods: 29 formalin-fixed and paraffin-embedded neuroblastoma tissue sections were examined using mouse monoclonal antibodies anti-Bmi-1, anti-p53 and anti-Ki-67 according to the manufacturer’s instructions. Results: Statistically significant correlations were found between Bmi-1 expression and both tumor histology and patient age. Conclusions: Bmi-1 seems to be a promising marker in the neuroblastoma group of tumors whose expression correlates with widely accepted prognostic parameters. The pattern of Bmi-1 expression may indicate that the examined protein is also involved in maturation processes in tumor tissue.

  19. Google goes cancer: improving outcome prediction for cancer patients by network-based ranking of marker genes.

    Directory of Open Access Journals (Sweden)

    Christof Winter

    Full Text Available Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state of the art methods used so far often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles which we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state of the art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data of individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploit these data for personalized cancer therapies in clinical practice.
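
    The ranking idea in the abstract above can be sketched as a personalized-PageRank-style iteration, in which each gene's score mixes its own expression-based relevance with rank flowing in from its network neighbours. The abstract does not give the update rule, so everything here is an assumption for illustration: the function name `network_rank`, the damping value, the toy network, and the prior scores are hypothetical and are not the authors' implementation.

```python
def network_rank(adjacency, scores, damping=0.3, iterations=100):
    """Rank genes by combining a prior score with network connectivity.

    adjacency: dict mapping each gene to a list of its neighbours
               (undirected network, so links appear in both directions).
    scores:    dict mapping each gene to a non-negative prior relevance,
               e.g. the absolute correlation of expression with survival.
    damping:   weight given to the network term (hypothetical default).
    """
    genes = list(adjacency)
    total = sum(scores[g] for g in genes)
    # Normalise the prior so that the ranks form a probability distribution.
    prior = {g: scores[g] / total for g in genes}
    rank = dict(prior)
    for _ in range(iterations):
        new_rank = {}
        for g in genes:
            # Each neighbour shares its current rank equally among its own links.
            incoming = sum(rank[n] / len(adjacency[n]) for n in adjacency[g])
            new_rank[g] = (1 - damping) * prior[g] + damping * incoming
        rank = new_rank
    return rank


# Toy network: gene B is a hub connected to A, C and D.
toy_network = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
toy_scores = {"A": 0.9, "B": 0.1, "C": 0.1, "D": 0.1}

for gene, value in sorted(network_rank(toy_network, toy_scores).items(),
                          key=lambda item: item[1], reverse=True):
    print(gene, round(value, 4))
```

    In this toy case the hub gene B ends up ranked above C and D even though all three start with the same prior, which is the effect a network-aware ranking is meant to capture.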

  20. Polymorphisms of ST2-IL18R1-IL18RAP gene cluster: a new risk for autoimmune thyroid diseases.

    Science.gov (United States)

    Wang, X; Zhu, Y F; Li, D M; Qin, Q; Wang, Q; Muhali, F S; Jiang, W J; Zhang, J A

    2016-02-01

    The interleukin 33 (IL33)/ST2 pathway and the ST2-interleukin 18 receptor 1-interleukin 18 receptor accessory protein (ST2-IL18R1-IL18RAP) gene cluster have been implicated in many autoimmune diseases, but there are few reports in autoimmune thyroid diseases (AITD). In this study, we investigated whether polymorphisms of IL33, ST2, IL18R1, and IL18RAP are associated with Graves' disease (GD) and Hashimoto's thyroiditis (HT), two major forms of AITD, among a Chinese population. A total of 11 SNPs of IL33 and the ST2-IL18R1-IL18RAP gene cluster (rs1929992, rs10975519, rs10208293, rs6543116, rs1041973, rs3732127, rs11465597, rs1035130, rs2293225, rs1035127, rs917997) were explored in a case-control study including 417 patients with GD, 250 HT patients and 301 controls. Genotyping of these SNPs was performed using the matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) platform from Sequenom. The frequencies of allele A and the AA+AG genotype of rs6543116 (ST2) in HT patients were significantly increased compared with those of the controls (P = 0.029/0.021, OR = 1.31/1.62). For another SNP, rs917997, the AA+AG genotype presented an increased frequency in HT subjects compared with controls (P = 0.046, OR = 1.53). Furthermore, the haplotype GAGCCCG from the ST2-IL18R1-IL18RAP gene cluster (rs6543116, rs1041973, rs1035130, rs3732127, rs1035127, rs2293225, rs917997) was associated with increased susceptibility to GD with an OR of 2.03 (P = 0.022, 95% CI = 1.07-3.86). Some SNPs of the ST2-IL18R1-IL18RAP gene cluster might increase susceptibility to HT and GD in the Chinese Han population. © 2015 John Wiley & Sons Ltd.

  1. Cloning and Characterization of the Polyether Salinomycin Biosynthesis Gene Cluster of Streptomyces albus XM211

    Science.gov (United States)

    Jiang, Chunyan; Wang, Hougen; Kang, Qianjin; Liu, Jing

    2012-01-01

    Salinomycin is widely used in animal husbandry as a food additive due to its antibacterial and anticoccidial activities. However, its biosynthesis had only been studied by feeding experiments with isotope-labeled precursors. A strategy with degenerate primers based on the polyether-specific epoxidase sequences was successfully developed to clone the salinomycin gene cluster. Using this strategy, a putative epoxidase gene, slnC, was cloned from the salinomycin producer Streptomyces albus XM211. The targeted replacement of slnC and subsequent trans-complementation proved its involvement in salinomycin biosynthesis. A 127-kb DNA region containing slnC was sequenced, including genes for polyketide assembly and release, oxidative cyclization, modification, export, and regulation. In order to gain insight into the salinomycin biosynthesis mechanism, 13 gene replacements and deletions were conducted. Including slnC, 7 genes were identified as essential for salinomycin biosynthesis and putatively responsible for polyketide chain release, oxidative cyclization, modification, and regulation. Moreover, 6 genes were found to be relevant to salinomycin biosynthesis and possibly involved in precursor supply, removal of aberrant extender units, and regulation. Sequence analysis and a series of gene replacements suggest a proposed pathway for the biosynthesis of salinomycin. The information presented here expands the understanding of polyether biosynthesis mechanisms and paves the way for targeted engineering of salinomycin activity and productivity. PMID:22156425

  2. Aberrant DNA methylation of ESR1 and p14ARF genes could be useful as prognostic indicators in osteosarcoma

    Directory of Open Access Journals (Sweden)

    Sonaglio V

    2013-06-01

    Full Text Available Viviane Sonaglio,1 Ana C de Carvalho,2 Silvia R C Toledo,3,4 Carolina Salinas-Souza,3,4 André L Carvalho,5 Antonio S Petrilli,3 Beatriz de Camargo,6 André L Vettore2. 1Pediatrics Department, A C Camargo Hospital, São Paulo, Brazil; 2Biological Science Department, Federal University of São Paulo, Diadema, Brazil; 3Department of Pediatrics, Pediatric Oncology Institute, GRAACC/Federal University of São Paulo, São Paulo, Brazil; 4Department of Morphology and Genetics, Federal University of São Paulo, São Paulo, Brazil; 5Department of Head and Neck Surgery, PIO XII Foundation, Barretos Cancer Hospital, Barretos, São Paulo, Brazil; 6Research Program Pediatric Oncology Program, CPNq, Instituto Nacional do Cancer, Rio de Janeiro, Brazil. Abstract: Osteosarcoma (OS) is the eighth most common form of childhood and adolescence cancer. Approximately 10%–20% of patients present metastatic disease at diagnosis and the 5-year overall survival remains around 70% for nonmetastatic patients and around 30% for metastatic patients. Metastatic disease at diagnosis and the necrosis grade induced by preoperative treatment are the only well-established prognostic factors for osteosarcoma. Aberrant DNA methylation is a frequent epigenetic alteration in humans and has been described as a molecular marker in different tumor types. This study evaluated the aberrant DNA methylation status of 18 genes in 34 OS samples without previous chemotherapy treatment and in four normal bone specimens and compared the methylation profile with clinicopathological characteristics of the patients. We were able to define a three-gene panel (AIM1, p14ARF, and ESR1) in which methylation was correlated with OS cases. The hypermethylation of p14ARF showed a significant association with the absence of metastases at diagnosis, while ESR1 hypermethylation was marginally associated with worse overall survival. This study demonstrated that aberrant promoter methylation is a common event

  3. Prognostic and Clinical Significance of miRNA-205 in Endometrioid Endometrial Cancer.

    Directory of Open Access Journals (Sweden)

    Milosz Wilczynski

    Full Text Available Endometrial cancer is one of the most common malignancies of the female reproductive tract, with endometrioid endometrial cancer being the most frequent type. Despite the relatively favourable prognosis in cases of endometrial cancer, there is a need to evaluate the clinical and prognostic utility of new molecular markers. MiRNAs are small, non-coding RNA molecules that take part in RNA silencing and post-transcriptional regulation of gene expression. Altered expression of miRNAs may be associated with cancer initiation, progression and metastatic capabilities. MiRNA-205 seems to be one of the key regulators of gene expression in endometrial cancer. In this study, we investigated the clinical and prognostic role of miRNA-205 in endometrioid endometrial cancer. After total RNA extraction from 100 archival formalin-fixed paraffin-embedded tissues, real-time quantitative RT-PCR was used to define miRNA-205 expression levels. The aim of the study was to evaluate miRNA-205 expression levels in regard to patients' clinical and histopathological features, such as survival rate, recurrence rate, staging, myometrial invasion, grading and lymph node involvement. Higher levels of miRNA-205 expression were observed in tumours with less than half of myometrial invasion and non-advanced cancers. Kaplan-Meier analysis revealed that higher levels of miRNA-205 were associated with better overall survival (p = 0.034). These results indicate potential clinical utility of miRNA-205 as a prognostic marker.

  4. Targeted capture and heterologous expression of the Pseudoalteromonas alterochromide gene cluster in Escherichia coli represents a promising natural product exploratory platform.

    Science.gov (United States)

    Ross, Avena C; Gulland, Lauren E S; Dorrestein, Pieter C; Moore, Bradley S

    2015-04-17

    Marine pseudoalteromonads represent a very promising source of biologically important natural product molecules. To access and exploit the full chemical capacity of these cosmopolitan Gram-(-) bacteria, we sought to apply universal synthetic biology tools to capture, refactor, and express biosynthetic gene clusters for the production of complex organic compounds in reliable host organisms. Here, we report a platform for the capture of proteobacterial gene clusters using a transformation-associated recombination (TAR) strategy coupled with direct pathway manipulation and expression in Escherichia coli. The ~34 kb pathway for production of alterochromide lipopeptides by Pseudoalteromonas piscicida JCM 20779 was captured and heterologously expressed in E. coli utilizing native and E. coli-based T7 promoter sequences. Our approach enabled both facile production of the alterochromides and in vivo interrogation of gene function associated with alterochromide's unusual brominated lipid side chain. This platform represents a simple but effective strategy for the discovery and biosynthetic characterization of natural products from marine proteobacteria.

  5. Id-1 gene and gene products as therapeutic targets for treatment of breast cancer and other types of carcinoma

    Science.gov (United States)

    Desprez, Pierre-Yves; Campisi, Judith

    2014-08-19

    A method for treatment of breast cancer and other types of cancer. The method comprises targeting and modulating Id-1 gene expression, if any, for the Id-1 gene, or gene products in breast or other epithelial cancers in a patient by delivering products that modulate Id-1 gene expression. When expressed, Id-1 gene is a prognostic indicator that cancer cells are invasive and metastatic.

  6. Identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard.

    Directory of Open Access Journals (Sweden)

    Xiao-Juan Jiang

    Full Text Available BACKGROUND: The vertebrate protocadherins are a subfamily of cell adhesion molecules that are predominantly expressed in the nervous system and are believed to play an important role in establishing the complex neural network during animal development. Genes encoding these molecules are organized into a cluster in the genome. Comparative analysis of the protocadherin subcluster organization and gene arrangements in different vertebrates has provided interesting insights into the history of vertebrate genome evolution. Among tetrapods, protocadherin clusters have been fully characterized only in mammals. In this study, we report the identification and comparative analysis of the protocadherin cluster in a reptile, the green anole lizard (Anolis carolinensis). METHODOLOGY/PRINCIPAL FINDINGS: We show that the anole protocadherin cluster spans over a megabase and encodes a total of 71 genes. The number of genes in the anole protocadherin cluster is significantly higher than that in the coelacanth (49 genes) and mammalian (54-59 genes) clusters. The anole protocadherin genes are organized into four subclusters: the delta, alpha, beta and gamma. This subcluster organization is identical to that of the coelacanth protocadherin cluster, but differs from the mammalian clusters which lack the delta subcluster. The gene number expansion in the anole protocadherin cluster is largely due to the extensive gene duplication in the gammab subgroup. Similar to coelacanth and elephant shark protocadherin genes, the anole protocadherin genes have experienced a low frequency of gene conversion. CONCLUSIONS/SIGNIFICANCE: Our results suggest that similar to the protocadherin clusters in other vertebrates, the evolution of anole protocadherin cluster is driven mainly by lineage-specific gene duplications and degeneration. Our analysis also shows that loss of the protocadherin delta subcluster in the mammalian lineage occurred after the divergence of mammals and reptiles.

  7. Convex Clustering: An Attractive Alternative to Hierarchical Clustering

    Science.gov (United States)

    Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

    2015-01-01

    The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
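
    For orientation, the objective that convex clustering minimizes is usually written in the following form in the literature. The abstract above does not state it, so the symbols here are assumptions for illustration: x_i are the n data points, u_i is the cluster centre assigned to point i, w_ij are non-negative pairwise weights, and gamma >= 0 is the tuning parameter.

```latex
\[
  F_\gamma(U) \;=\; \frac{1}{2}\sum_{i=1}^{n} \lVert x_i - u_i \rVert_2^2
  \;+\; \gamma \sum_{i<j} w_{ij}\, \lVert u_i - u_j \rVert
\]
```

    At gamma = 0 each centre u_i coincides with its own point x_i. As gamma grows, the penalty term pulls centres together until they coalesce, and points whose centres have merged are assigned to the same cluster. Tracing the minimizer over increasing gamma gives the solution path that the abstract contrasts with static methods such as k-means.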

  8. Histone and Ribosomal RNA Repetitive Gene Clusters of the Boll Weevil are Linked in a Tandem Array

    Science.gov (United States)

    Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and ...

  9. Prognostic influence of pre-operative C-reactive protein in node-negative breast cancer patients.

    Directory of Open Access Journals (Sweden)

    Isabel Sicking

    Full Text Available The importance of inflammation is increasingly noticed in cancer. The aim of this study was to analyze the prognostic influence of pre-operative serum C-reactive protein (CRP) in a cohort of 148 lymph node-negative breast cancer patients. The prognostic significance of CRP level for disease-free survival (DFS), metastasis-free survival (MFS) and overall survival (OS) was evaluated using univariate and multivariate Cox regression, also including information on age at diagnosis, tumor size, tumor grade, estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) status, proliferation index (Ki67) and molecular subtype, as well as an assessment of the presence of necrosis and inflammation in the tumor tissue. Univariate analysis showed that CRP, as a continuous variable, was significantly associated with DFS (P = 0.002, hazard ratio [HR] = 1.04, 95% confidence interval [CI] = 1.02-1.07) and OS (P = 0.036, HR = 1.03, 95% CI = 1.00-1.06), whereas a trend was observed for MFS (P = 0.111). In the multivariate analysis, CRP retained its significance for DFS (P = 0.033, HR = 1.01, 95% CI = 1.00-1.07) as well as OS (P = 0.023, HR = 1.03, 95% CI = 1.00-1.06), independent of established prognostic factors. Furthermore, large-scale gene expression analysis by Affymetrix HG-U133A arrays was performed for 72 (48.6%) patients. The correlations between serum CRP and gene expression levels in the corresponding carcinoma of the breast were assessed using Spearman's rank correlation, controlled for false-discovery rate. No significant correlation was observed between CRP level and gene expression indicative of an ongoing local inflammatory process. In summary, pre-operatively elevated CRP levels at the time of diagnosis were associated with shorter DFS and OS independent of established prognostic factors in node-negative breast cancer, supporting a possible link between inflammation and

  10. Related structures of neutral capsular polysaccharides of Acinetobacter baumannii isolates that carry related capsule gene clusters KL43, KL47, and KL88.

    Science.gov (United States)

    Shashkov, Alexander S; Kenyon, Johanna J; Arbatsky, Nikolay P; Shneider, Mikhail M; Popova, Anastasiya V; Miroshnikov, Konstantin A; Hall, Ruth M; Knirel, Yuriy A

    2016-11-29

    Capsular polysaccharides were recovered from four Acinetobacter baumannii isolates, and the following related structures of oligosaccharide repeating units were established by sugar analyses along with 1D and 2D 1H and 13C NMR spectroscopy: NIPH 60 and LUH5544 (K43); NIPH 601 (K47). The K locus for capsule biosynthesis in the genome sequences available for NIPH 60 and LUH5544, designated KL43, was found to be related to gene clusters KL47 in NIPH 601 and KL88 in LUH5548. The three clusters share most gene content, differing in only a small portion that includes an additional glycosyltransferase genes in KL47 and KL88, as well as genes encoding distinct Wzy polymerases that were found to form the same α-d-GlcpNAc-(1 → 6)-α-d-GlcpNAc linkage in K43 and K47. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. METHOD OF CONSTRUCTION OF GENETIC DATA CLUSTERS

    Directory of Open Access Journals (Sweden)

    N. A. Novoselova

    2016-01-01

    Full Text Available The paper presents a method of constructing genetic data clusters (functional modules) using randomized matrices. To build the functional modules, the eigenvalues of the gene profile correlation matrix are selected and analyzed. The principal components corresponding to eigenvalues that differ significantly from those obtained for a randomly generated correlation matrix are used for the analysis. Each selected principal component forms a gene cluster. In a comparative experiment with analogous methods, the proposed method shows an advantage in identifying statistically significant clusters of different sizes, and the ability to filter non-informative genes and to extract biologically interpretable functional modules matching the real data structure.
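
    One common way to decide which eigenvalues differ from those of a random correlation matrix is to compare them with the upper edge of the Marchenko-Pastur spectrum. The abstract above does not say which criterion the author uses, so the sketch below is only an assumed stand-in; the function names and the example eigenvalues are hypothetical.

```python
import math


def marchenko_pastur_upper(n_genes, n_samples):
    """Upper edge of the Marchenko-Pastur eigenvalue spectrum.

    For the correlation matrix of pure-noise data with n_genes variables
    and n_samples observations (unit variance), eigenvalues are expected
    to stay below this bound when both dimensions are large.
    """
    return (1.0 + math.sqrt(n_genes / n_samples)) ** 2


def informative_components(eigenvalues, n_genes, n_samples):
    """Return the indices of eigenvalues that exceed the random-matrix bound."""
    threshold = marchenko_pastur_upper(n_genes, n_samples)
    return [i for i, value in enumerate(eigenvalues) if value > threshold]


# Hypothetical eigenvalues of a gene-profile correlation matrix.
example_eigenvalues = [5.0, 2.3, 2.0, 0.4]
print(informative_components(example_eigenvalues, n_genes=100, n_samples=400))
```

    Under this assumed criterion, each retained index would correspond to one principal component, and hence to one candidate gene cluster in the sense of the abstract.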

  12. Single Nucleotide Polymorphisms in the FADS Gene Cluster but not the ELOVL2 Gene are Associated with Serum Polyunsaturated Fatty Acid Composition and Development of Allergy (in a Swedish Birth Cohort)

    Directory of Open Access Journals (Sweden)

    Malin Barman

    2015-12-01

    Full Text Available Exposure to polyunsaturated fatty acids (PUFA) influences immune function and may affect the risk of allergy development. Long chain PUFAs are produced from dietary precursors catalyzed by desaturases and elongases encoded by FADS and ELOVL genes. In 211 subjects, we investigated whether polymorphisms in the FADS gene cluster and the ELOVL2 gene were associated with allergy or PUFA composition in serum phospholipids in a Swedish birth-cohort sampled at birth and at 13 years of age; allergy was diagnosed at 13 years of age. Minor allele carriers of rs102275 and rs174448 (FADS gene cluster) had decreased proportions of 20:4 n-6 in cord and adolescent serum and increased proportions of 20:3 n-6 in cord serum as well as a nominally reduced risk of developing atopic eczema, but not respiratory allergy, at 13 years of age. Minor allele carriers of rs17606561 in the ELOVL2 gene had nominally decreased proportions of 20:4 n-6 in cord serum but ELOVL polymorphisms (rs2236212 and rs17606561) were not associated with allergy development. Thus, reduced capacity to desaturate n-6 PUFAs due to FADS polymorphisms was nominally associated with reduced risk for eczema development, which could indicate a pathogenic role for long-chain PUFAs in allergy development.

  13. Functional Analysis of the Chaperone-Usher Fimbrial Gene Clusters of Salmonella enterica serovar Typhi.

    Science.gov (United States)

    Dufresne, Karine; Saulnier-Bellemare, Julie; Daigle, France

    2018-01-01

    The human-specific pathogen Salmonella enterica serovar Typhi causes typhoid, a major public health issue in developing countries. Several aspects of its pathogenesis are still poorly understood. S. Typhi possesses 14 fimbrial gene clusters including 12 chaperone-usher fimbriae (stg, sth, bcf, fim, saf, sef, sta, stb, stc, std, ste, and tcf). These fimbriae are weakly expressed in laboratory conditions and only a few are actually characterized. In this study, expression of all S. Typhi chaperone-usher fimbriae and their potential roles in pathogenesis such as interaction with host cells, motility, or biofilm formation were assessed. All S. Typhi fimbriae were better expressed in minimal broth. Each system was overexpressed and only the fimbrial gene clusters without pseudogenes demonstrated a putative major subunit of about 17 kDa on SDS-PAGE. Six of these (Fim, Saf, Sta, Stb, Std, and Tcf) also show extracellular structure by electron microscopy. The impact of fimbrial deletion in a wild-type strain or addition of each individual fimbrial system to an S. Typhi afimbrial strain was tested for interactions with host cells, biofilm formation and motility. Several fimbriae modified bacterial interactions with human cells (THP-1 and INT-407) and biofilm formation. However, only Fim fimbriae had a deleterious effect on motility when overexpressed. Overall, chaperone-usher fimbriae seem to be an important part of the balance between the different steps (motility, adhesion, host invasion and persistence) of S. Typhi pathogenesis.

  14. Haplotypes in the APOA1-C3-A4-A5 gene cluster affect plasma lipids in both humans and baboons

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Qian-fei; Liu, Xin; O'Connell, Jeff; Peng, Ze; Krauss, Ronald M.; Rainwater, David L.; VandeBerg, John L.; Rubin, Edward M.; Cheng, Jan-Fang; Pennacchio, Len A.

    2003-09-15

    Genetic studies in non-human primates serve as a potential strategy for identifying genomic intervals where polymorphisms impact upon human disease-related phenotypes. It remains unclear, however, whether independently arising polymorphisms in orthologous regions of non-human primates lead to similar variation in a quantitative trait found in both species. To explore this paradigm, we studied a baboon apolipoprotein gene cluster (APOA1/C3/A4/A5) for which the human gene orthologs have well established roles in influencing plasma HDL-cholesterol and triglyceride concentrations. Our extensive polymorphism analysis of this 68 kb gene cluster in 96 pedigreed baboons identified several haplotype blocks each with limited diversity, consistent with haplotype findings in humans. To determine whether baboons, like humans, also have particular haplotypes associated with lipid phenotypes, we genotyped 634 well characterized baboons using 16 haplotype tagging SNPs. Genetic analysis of single SNPs, as well as haplotypes, revealed an association of APOA5 and APOC3 variants with HDL cholesterol and triglyceride concentrations, respectively. Thus, independent variation in orthologous genomic intervals does associate with similar quantitative lipid traits in both species, supporting the possibility of uncovering human QTL genes in a highly controlled non-human primate model.

  15. Context-dependent interpretation of the prognostic value of BRAF and KRAS mutations in colorectal cancer

    International Nuclear Information System (INIS)

    Popovici, Vlad; Budinska, Eva; Bosman, Fred T; Tejpar, Sabine; Roth, Arnaud D; Delorenzi, Mauro

    2013-01-01

    The mutation status of the BRAF and KRAS genes has been proposed as a prognostic biomarker in colorectal cancer. Of them, only the BRAF V600E mutation has been validated independently as prognostic for overall survival and survival after relapse, while the prognostic value of KRAS mutation is still unclear. We investigated the prognostic value of BRAF and KRAS mutations in various contexts defined by stratifications of the patient population. We retrospectively analyzed a cohort of patients with stage II and III colorectal cancer from the PETACC-3 clinical trial (N = 1,423), by assessing the prognostic value of the BRAF and KRAS mutations in subpopulations defined by all possible combinations of the following clinico-pathological variables: T stage, N stage, tumor site, tumor grade and microsatellite instability status. In each such subpopulation, the prognostic value was assessed by log rank test for three endpoints: overall survival, relapse-free survival, and survival after relapse. The significance level was set to 0.01 for Bonferroni-adjusted p-values, and a second threshold for a trend towards statistical significance was set at 0.05 for unadjusted p-values. The significance of the interactions was tested by Wald test, with a significance level of 0.05. In stage II-III colorectal cancer, BRAF mutation was confirmed as a marker of poor survival only in subpopulations involving microsatellite stable and left-sided tumors, with higher effects than in the whole population. There was no evidence for prognostic value in microsatellite instable or right-sided tumor groups. We found that BRAF was also prognostic for relapse-free survival in some subpopulations. We found no evidence that KRAS mutations had prognostic value, although a trend was observed in some stratifications. We also show evidence of heterogeneity in survival of patients with BRAF V600E mutation. The BRAF mutation represents an additional risk factor only in some subpopulations of colorectal cancers, in

  16. Genetic studies on the APOA1-C3-A5 gene cluster in Asian Indians with premature coronary artery disease

    Directory of Open Access Journals (Sweden)

    Hebbagodi Sridhara

    2008-09-01

    Full Text Available Abstract Background The APOA1-C3-A5 gene cluster plays an important role in the regulation of lipids. Asian Indians have an increased tendency for abnormal lipid levels and high risk of Coronary Artery Disease (CAD). Therefore, the present study aimed to elucidate the relationship of four single nucleotide polymorphisms (SNPs) in the Apo11q cluster, namely the -75G>A, +83C>T SNPs in the APOA1 gene, the Sac1 SNP in the APOC3 gene and the S19W variant in the APOA5 gene to plasma lipids and CAD in 190 affected sibling pairs (ASPs) belonging to Asian Indian families with a strong CAD history. Methods & results Genotyping and lipid assays were carried out using standard protocols. Plasma lipids showed a strong heritability (h2 48% – 70%; P P A (LOD score 2.77 SNPs by single-point analysis (P A (pi 0.56 and +83C>T (pi 0.52 (P P A SNPs along with hypertension showed maximized correlations with TC, TG and Apo B by association analysis. Conclusion The APOC3-Sac1 SNP is an important genetic variant that is associated with CAD through its interaction with plasma lipids and other standard risk factors among Asian Indians.

  17. Prognostics

    Data.gov (United States)

    National Aeronautics and Space Administration — Prognostics has received considerable attention recently as an emerging sub-discipline within SHM. Prognosis is here strictly defined as “predicting the time at...

  18. Clinical Implications of Cluster Analysis-Based Classification of Acute Decompensated Heart Failure and Correlation with Bedside Hemodynamic Profiles.

    Directory of Open Access Journals (Sweden)

    Tariq Ahmad

    Full Text Available Classification of acute decompensated heart failure (ADHF) is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies. To derive cluster analysis-based groupings for patients hospitalized with ADHF, and compare their prognostic performance to hemodynamic classifications derived at the bedside. We performed a cluster analysis on baseline clinical variables and PAC measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry). We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data. We identified four advanced HF clusters: (1) male Caucasians with ischemic cardiomyopathy, multiple comorbidities, lowest B-type natriuretic peptide (BNP) levels; (2) females with non-ischemic cardiomyopathy, few comorbidities, most favorable hemodynamics; (3) young African American males with non-ischemic cardiomyopathy, most adverse hemodynamics, advanced disease; and (4) older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70). For all adverse clinical outcomes, Cluster 4 had the highest risk, and Cluster 2, the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not. By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernible relationship to hemodynamic profiles, but distinct associations with adverse outcomes. Our analysis suggests that ADHF classification using

  19. Integrated pathway clusters with coherent biological themes for target prioritisation.

    Directory of Open Access Journals (Sweden)

    Yi-An Chen

    Full Text Available Prioritising candidate genes for further experimental characterisation is an essential, yet challenging task in biomedical research. One way of achieving this goal is to identify specific biological themes that are enriched within the gene set of interest to obtain insights into the biological phenomena under study. Biological pathway data have been particularly useful in identifying functional associations of genes and/or gene sets. However, biological pathway information as compiled in varied repositories often differs in scope and content, preventing a more effective and comprehensive characterisation of gene sets. Here we describe a new approach to constructing biologically coherent gene sets from pathway data in major public repositories and employing them for functional analysis of large gene sets. We first revealed significant overlaps in gene content between different pathways and then defined a clustering method based on the shared gene content and the similarity of gene overlap patterns. We established the biological relevance of the constructed pathway clusters using independent quantitative measures and we finally demonstrated the effectiveness of the constructed pathway clusters in comparative functional enrichment analysis of gene sets associated with diverse human diseases gathered from the literature. The pathway clusters and gene mappings have been integrated into the TargetMine data warehouse and are likely to provide a concise, manageable and biologically relevant means of functional analysis of gene sets and to facilitate candidate gene prioritisation.
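    Grouping pathways by shared gene content, as described in this abstract, can be sketched with a simple overlap measure. This is a simplified stand-in, not the authors' method: it uses only the Jaccard index of gene sets and merges any pair above a cutoff (single-linkage via union-find), whereas the abstract also mentions similarity of gene overlap patterns. The function names, the `threshold` parameter, and the pathway and gene labels in the usage note are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two gene collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_pathways(pathways, threshold=0.5):
    """Merge pathways whose gene sets overlap at or above `threshold`.

    pathways: dict mapping pathway name -> iterable of gene identifiers.
    Returns a sorted list of clusters, each a sorted list of pathway names.
    """
    parent = {name: name for name in pathways}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for p, q in combinations(pathways, 2):
        if jaccard(pathways[p], pathways[q]) >= threshold:
            parent[find(p)] = find(q)

    clusters = {}
    for name in pathways:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())
```

    For example, two hypothetical pathways sharing three of four genes would fall into one cluster at the default cutoff, while a pathway with no shared genes would remain on its own.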

  20. Human microRNA target analysis and gene ontology clustering by GOmir, a novel stand-alone application.

    Science.gov (United States)

    Roubelakis, Maria G; Zotos, Pantelis; Papachristoudis, Georgios; Michalopoulos, Ioannis; Pappa, Kalliopi I; Anagnou, Nicholas P; Kossida, Sophia

    2009-06-16

    microRNAs (miRNAs) are single-stranded RNA molecules of about 20-23 nucleotides in length found in a wide variety of organisms. miRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying mRNA targets of specific miRNAs is a difficult process of great importance. GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools with the experimentally supported targets from TarBase, and by providing a full gene description and functional analysis for each target gene. The TAGGO application, on the other hand, is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir represents a new tool incorporating two separate Java applications integrated into one stand-alone Java application. GOmir (by using up to five different databases) introduces miRNA predicted targets accompanied by (a) full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can also be conducted. GOmir can freely be downloaded from BRFAA.

  1. ATNT: an enhanced system for expression of polycistronic secondary metabolite gene clusters in Aspergillus niger.

    Science.gov (United States)

    Geib, Elena; Brock, Matthias

    2017-01-01

    Fungi are treasure chests for yet unexplored natural products. However, exploitation of their real potential remains difficult as a significant proportion of biosynthetic gene clusters appears silent under standard laboratory conditions. Therefore, elucidation of novel products requires gene activation or heterologous expression. For heterologous gene expression, we previously developed an expression platform in Aspergillus niger that is based on the transcriptional regulator TerR and its target promoter PterA. In this study, we extended this system by regulating expression of terR by the doxycycline inducible Tet-on system. Reporter genes cloned under the control of the target promoter PterA remained silent in the absence of doxycycline, but were strongly expressed when doxycycline was added. Reporter quantification revealed that the coupled system results in about five times higher expression rates compared to gene expression under direct control of the Tet-on system. As production of secondary metabolites generally requires the expression of several biosynthetic genes, the suitability of the self-cleaving viral peptide sequence P2A was tested in this optimised expression system. P2A allowed polycistronic expression of genes required for Asp-melanin formation in combination with the gene coding for the red fluorescent protein tdTomato. Gene expression and Asp-melanin formation were prevented in the absence of doxycycline and strongly induced by addition of doxycycline. Fluorescence studies confirmed the correct subcellular localisation of the respective enzymes. This tightly regulated but strongly inducible expression system enables high level production of secondary metabolites, most likely even those with toxic potential. Furthermore, this system is compatible with polycistronic gene expression and, thus, suitable for the discovery of novel natural products.

  2. A Distributed Approach to System-Level Prognostics

    Science.gov (United States)

    Daigle, Matthew J.; Bregon, Anibal; Roychoudhury, Indranil

    2012-01-01

    Prognostics, which deals with predicting remaining useful life of components, subsystems, and systems, is a key technology for systems health management that leads to improved safety and reliability with reduced costs. The prognostics problem is often approached from a component-centric view. However, in most cases, it is not specifically component lifetimes that are important, but, rather, the lifetimes of the systems in which these components reside. The system-level prognostics problem can be quite difficult due to the increased scale and scope of the prognostics problem and the relative lack of scalability and efficiency of typical prognostics approaches. In order to address these issues, we develop a distributed solution to the system-level prognostics problem, based on the concept of structural model decomposition. The system model is decomposed into independent submodels. Independent local prognostics subproblems are then formed based on these local submodels, resulting in a scalable, efficient, and flexible distributed approach to the system-level prognostics problem. We provide a formulation of the system-level prognostics problem and demonstrate the approach on a four-wheeled rover simulation testbed. The results show that the system-level prognostics problem can be accurately and efficiently solved in a distributed fashion.

  3. Interleukin‑1 gene cluster variants in hemodialysis patients with end stage renal disease: An association and meta‑analysis

    Directory of Open Access Journals (Sweden)

    G Tripathi

    2015-01-01

    Full Text Available We evaluated whether polymorphisms in the interleukin (IL)-1 gene cluster (IL-1 alpha [IL-1A], IL-1 beta [IL-1B], and IL-1 receptor antagonist [IL-1RN]) are associated with end stage renal disease (ESRD). A total of 258 ESRD patients and 569 ethnicity matched controls were examined for the IL-1 gene cluster. These were genotyped for five single-nucleotide gene polymorphisms in the IL-1A, IL-1B and IL-1RN genes and a variable number of tandem repeats (VNTR) in the IL-1RN. The IL-1B − 3953 and IL-1RN + 8006 polymorphism frequencies were significantly different between the two groups. At IL-1B, the T allele of − 3953C/T was increased among ESRD (P = 0.0001). A logistic regression model demonstrated that two repeat (240 base pair [bp]) of the IL-1Ra VNTR polymorphism was associated with ESRD (P = 0.0001). The C/C/C/C/C/1 haplotype was more prevalent in ESRD (P = 0.007). No linkage disequilibrium (LD) was observed between six loci of IL-1 gene. We further conducted a meta-analysis of existing studies and found that there is a strong association of IL-1 RN VNTR 86 bp repeat polymorphism with susceptibility to ESRD (odds ratio = 2.04, 95% confidence interval = 1.48-2.82; P = 0.000). IL-1B − 5887, +8006 and the IL-1RN VNTR polymorphisms have been implicated as potential risk factors for ESRD. The meta-analysis showed a strong association of IL-1RN 86 bp VNTR polymorphism with susceptibility to ESRD.

  4. Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma.

    Science.gov (United States)

    Young, Jonathan D; Cai, Chunhui; Lu, Xinghua

    2017-10-03

    One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene expression data. Deep learning is a group of machine learning algorithms that use multiple layers of hidden units to capture hierarchically related, alternative representations of the input data. We hypothesize that this hierarchical structure learned by deep learning will be related to the cellular signaling system. Robust deep learning model selection identified a network architecture that is biologically plausible. Our model selection results indicated that the 1st hidden layer of our deep learning model should contain about 1300 hidden units to most effectively capture the covariance structure of the input data. This agrees with the estimated number of human transcription factors, which is approximately 1400. This result lends support to our hypothesis that the 1st hidden layer of a deep learning model trained on gene expression data may represent signals related to transcription factor activation. Using the 3rd hidden layer representation of each tumor as learned by our unsupervised deep learning model, we performed consensus clustering on all tumor samples-leading to the discovery of clusters of glioblastoma multiforme with differential survival. One of these clusters contained all of the glioblastoma samples with G-CIMP, a known methylation phenotype driven by the IDH1 mutation and associated with favorable prognosis, suggesting that the hidden units in the 3rd hidden layer representations captured a methylation signal without explicitly using methylation data as input. We also found differentially expressed genes and well-known mutations (NF1, IDH1, EGFR) that were uniquely correlated with each of these clusters. Exploring these unique genes and mutations will allow us to

  5. Extensive polycistronism and antisense transcription in the mammalian Hox clusters.

    Directory of Open Access Journals (Sweden)

    Gaëll Mainguy

    Full Text Available The Hox clusters play a crucial role in body patterning during animal development. They encode both Hox transcription factor and micro-RNA genes that are activated in a precise temporal and spatial sequence that follows their chromosomal order. These remarkable collinear properties confer functional unit status for Hox clusters. We developed the TranscriptView platform to establish high resolution transcriptional profiling and report here that transcription in the Hox clusters is far more complex than previously described in both human and mouse. Unannotated transcripts can represent up to 60% of the total transcriptional output of a cluster. In particular, we identified 14 non-coding Transcriptional Units antisense to Hox genes, 10 of which (70%) have a detectable mouse homolog. Most of these Transcriptional Units in both human and mouse present conserved sizeable sequences (>40 bp) overlapping Hox transcripts, suggesting that these Hox antisense transcripts are functional. Hox clusters also display at least seven polycistronic clusters, i.e., different genes being co-transcribed on long isoforms (up to 30 kb). This work provides a reevaluated framework for understanding Hox gene function and dysfunction. Such extensive transcriptions may provide a structural explanation for Hox clustering.

  6. Generic Software Architecture for Prognostics (GSAP) User Guide

    Science.gov (United States)

    Teubert, Christopher Allen; Daigle, Matthew John; Watkins, Jason; Sankararaman, Shankar; Goebel, Kai

    2016-01-01

    The Generic Software Architecture for Prognostics (GSAP) is a framework for applying prognostics. It makes applying prognostics easier by implementing many of the common elements across prognostic applications. The standard interface enables reuse of prognostic algorithms and models across systems using the GSAP framework.

  7. Prognostic methods in medicine

    NARCIS (Netherlands)

    Lucas, P. J.; Abu-Hanna, A.

    1999-01-01

    Prognosis--the prediction of the course and outcome of disease processes--plays an important role in patient management tasks like diagnosis and treatment planning. As a result, prognostic models form an integral part of a number of systems supporting these tasks. Furthermore, prognostic models

  8. Aircraft Anomaly Prognostics, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Ridgetop Group will leverage its proven Electromechanical Actuator (EMA) prognostics methodology to develop an advanced model-based actuator prognostic reasoner...

  9. An indigoidine biosynthetic gene cluster from Streptomyces chromofuscus ATCC 49982 contains an unusual IndB homologue.

    Science.gov (United States)

    Yu, Dayu; Xu, Fuchao; Valiente, Jonathan; Wang, Siyuan; Zhan, Jixun

    2013-01-01

    A putative indigoidine biosynthetic gene cluster was located in the genome of Streptomyces chromofuscus ATCC 49982. The silent 9.4-kb gene cluster consists of five open reading frames, named orf1, Sc-indC, Sc-indA, Sc-indB, and orf2, respectively. Sc-IndC was functionally characterized as an indigoidine synthase through heterologous expression of the enzyme in both Streptomyces coelicolor CH999 and Escherichia coli BAP1. The yield of indigoidine in E. coli BAP1 reached 2.78 g/l under the optimized conditions. The predicted protein product of Sc-indB is unusual and much larger than any other reported IndB-like protein. The N-terminal portion of this enzyme resembles IdgB and the C-terminal portion is a hypothetical protein. Sc-IndA and/or Sc-IndB were co-expressed with Sc-IndC in E. coli BAP1, which demonstrated the involvement of Sc-IndB, but not Sc-IndA, in the biosynthetic pathway of indigoidine. The yield of indigoidine was dramatically increased by 41.4 % (3.93 g/l) when Sc-IndB was co-expressed with Sc-IndC in E. coli BAP1. Indigoidine is more stable at low temperatures.

  10. Meta-analysis of Cancer Gene Profiling Data.

    Science.gov (United States)

    Roy, Janine; Winter, Christof; Schroeder, Michael

    2016-01-01

    The simultaneous measurement of thousands of genes gives the opportunity to personalize and improve cancer therapy. In addition, the integration of meta-data such as protein-protein interaction (PPI) information into the analyses helps in the identification and prioritization of genes from these screens. Here, we describe a computational approach that identifies genes prognostic for outcome by combining gene profiling data from any source with a network of known relationships between genes.

  11. Prognostic value of X-chromosome inactivation in symptomatic female carriers of dystrophinopathy

    Directory of Open Access Journals (Sweden)

    Juan-Mateu Jonàs

    2012-10-01

    Full Text Available Abstract Background Between 8% and 22% of female carriers of DMD mutations exhibit clinical symptoms of variable severity. Development of symptoms in DMD mutation carriers without chromosomal rearrangements has been attributed to skewed X-chromosome inactivation (XCI) favouring predominant expression of the DMD mutant allele. However, the prognostic use of XCI analysis is controversial. We aimed to evaluate the correlation between X-chromosome inactivation and development of clinical symptoms in a series of symptomatic female carriers of dystrophinopathy. Methods We reviewed the clinical, pathological and genetic features of twenty-four symptomatic carriers covering a wide spectrum of clinical phenotypes. DMD gene analysis was performed using MLPA and whole gene sequencing in blood DNA and muscle cDNA. Blood and muscle DNA was used for X-chromosome inactivation (XCI) analysis through the AR methylation assay in symptomatic carriers and their female relatives, asymptomatic carriers as well as non-carrier females. Results Symptomatic carriers exhibited 49.2% more skewed XCI profiles than asymptomatic carriers. The extent of XCI skewing in blood tended to increase in line with the severity of muscle symptoms. Skewed XCI patterns were found in at least one first-degree female relative in 78.6% of symptomatic carrier families. No mutations altering XCI in the XIST gene promoter were found. Conclusions Skewed XCI is in many cases familially inherited. The extent of XCI skewing is related to phenotype severity. However, the assessment of XCI by means of the AR methylation assay has a poor prognostic value, probably because the methylation status of the AR gene in muscle may not reflect in all cases the methylation status of the DMD gene.

  12. Prognostic indicators for failed nonsurgical reduction of intussusception

    Directory of Open Access Journals (Sweden)

    Khorana J

    2016-08-01

    Full Text Available Jiraporn Khorana,1 Jesda Singhavejsakul,1 Nuthapong Ukarapol,2 Mongkol Laohapensang,3 Jakraphan Siriwongmongkol,1 Jayanton Patumanond4 1Division of Pediatric Surgery, Department of Surgery, 2Division of Gastroenterology, Department of Pediatrics, Chiang Mai University Hospital, Chiang Mai, 3Division of Pediatric Surgery, Department of Surgery, Siriraj Hospital, Mahidol University, Bangkok, 4Center of Excellence in Applied Epidemiology, Thammasat University Hospital, Pathumthani, Thailand Purpose: To identify the risk factors for failure of nonsurgical reduction of intussusception. Methods: Data from intussusception patients who were treated with nonsurgical reduction in Chiang Mai University Hospital and Siriraj Hospital between January 2006 and December 2012 were collected. Patients aged 0–15 years and without contraindications (peritonitis, abdominal X-ray signs of perforation, and/or hemodynamic instability) were included for nonsurgical reduction. The success and failure groups were divided according to the results of the reduction. Prognostic indicators for failed reduction were identified by using a generalized linear model for exponential risk regression. The risk ratio (RR) was used to report each factor. Results: One hundred and ninety cases of intussusception were enrolled. Twenty cases were excluded due to contraindications. A total of 170 cases of intussusception were included for the final analysis. The significant risk factors for reduction failure clustered by an age of 3 years were weight <12 kg (RR =1.48, P=0.004), symptom duration >3 days (RR =1.26, P<0.001), vomiting (RR =1.63, P<0.001), rectal bleeding (RR =1.50, P<0.001), abdominal distension (RR =1.60, P=0.003), temperature >37.8°C (RR =1.51, P<0.001), palpable abdominal mass (RR =1.26, P<0.001), location of mass (left over right side) (RR =1.48, P<0.001), poor prognostic signs on ultrasound scans (RR =1.35, P<0.001), and method of reduction (hydrostatic over pneumatic) (RR =1

  13. Evaluating biomarkers for prognostic enrichment of clinical trials.

    Science.gov (United States)

    Kerr, Kathleen F; Roth, Jeremy; Zhu, Kehao; Thiessen-Philbrook, Heather; Meisner, Allison; Wilson, Francis Perry; Coca, Steven; Parikh, Chirag R

    2017-12-01

    A potential use of biomarkers is to assist in prognostic enrichment of clinical trials, where only patients at relatively higher risk for an outcome of interest are eligible for the trial. We investigated methods for evaluating biomarkers for prognostic enrichment. We identified five key considerations when considering a biomarker and a screening threshold for prognostic enrichment: (1) clinical trial sample size, (2) calendar time to enroll the trial, (3) total patient screening costs and the total per-patient trial costs, (4) generalizability of trial results, and (5) ethical evaluation of trial eligibility criteria. Items (1)-(3) are amenable to quantitative analysis. We developed the Biomarker Prognostic Enrichment Tool for evaluating biomarkers for prognostic enrichment at varying levels of screening stringency. We demonstrate that both modestly prognostic and strongly prognostic biomarkers can improve trial metrics using Biomarker Prognostic Enrichment Tool. Biomarker Prognostic Enrichment Tool is available as a webtool at http://prognosticenrichment.com and as a package for the R statistical computing platform. In some clinical settings, even biomarkers with modest prognostic performance can be useful for prognostic enrichment. In addition to the quantitative analysis provided by Biomarker Prognostic Enrichment Tool, investigators must consider the generalizability of trial results and evaluate the ethics of trial eligibility criteria.
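    The quantitative side of prognostic enrichment, items (1) and (3) in this abstract, can be illustrated with a textbook calculation. This is a hedged sketch, not the Biomarker Prognostic Enrichment Tool itself: it assumes a two-arm trial with a binary outcome, uses the standard normal-approximation sample size formula for comparing two proportions, and treats the treatment effect as a fixed relative risk reduction. The function names and all parameters are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample_size(event_rate, relative_reduction, alpha=0.05, power=0.8):
    """Patients per arm to detect a relative risk reduction (two-sided test).

    event_rate: control-arm event probability among enrolled patients.
    relative_reduction: assumed treatment effect, e.g. 0.3 for a 30% reduction.
    """
    p1 = event_rate
    p2 = event_rate * (1.0 - relative_reduction)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return ceil(z ** 2 * variance / (p1 - p2) ** 2)


def patients_to_screen(total_enrolled, fraction_eligible):
    """Patients screened so that `total_enrolled` pass the biomarker cutoff."""
    return ceil(total_enrolled / fraction_eligible)
```

    Under these assumptions, raising the event rate among enrolled patients by screening on a prognostic biomarker lowers the required sample size, while a stricter threshold lowers the eligible fraction and raises the number of patients who must be screened; this is the trade-off the abstract describes.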

  14. A Cluster of Five Genes Essential for the Utilization of Dihydroxamate Xenosiderophores in Synechocystis sp. PCC 6803.

    Science.gov (United States)

    Obando S, Tobias A; Babykin, Michael M; Zinchenko, Vladislav V

    2018-05-21

    The unicellular freshwater cyanobacterium Synechocystis sp. PCC 6803 is capable of using dihydroxamate xenosiderophores, either ferric schizokinen (FeSK) or a siderophore of the filamentous cyanobacterium Anabaena variabilis ATCC 29413 (SAV), as the sole source of iron in a TonB-dependent manner. The fecCDEB1-schT gene cluster encoding a siderophore transport system that is involved in the utilization of FeSK and SAV in Synechocystis sp. PCC 6803 was identified. The gene schT encodes a TonB-dependent outer membrane transporter, whereas the remaining four genes encode the ABC-type transporter FecB1CDE formed by the periplasmic binding protein FecB1, the transmembrane permease proteins FecC and FecD, and the ATPase FecE. Inactivation of any of these genes resulted in the inability of cells to utilize FeSK and SAV. Our data strongly suggest that Synechocystis sp. PCC 6803 can readily internalize Fe-siderophores via the classic TonB-dependent transport system.

  15. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, and (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  16. Prognostic value of contrast-enhanced MR mammography in patients with breast cancer.

    Science.gov (United States)

    Fischer, U; Kopka, L; Brinck, U; Korabiowska, M; Schauer, A; Grabbe, E

    1997-01-01

    The objective of this study was to evaluate the prognostic value of contrast-enhanced MR mammography in patients with breast cancer. A total of 190 patients with breast cancer (37 noninvasive carcinomas, 153 invasive carcinomas) underwent dynamic contrast-enhanced MR mammography preoperatively. Using a 1.5-T unit, T1-weighted sequences (2D FLASH) were obtained once before and five times after IV administration of 0.1 mmol gadopentetate-dimeglumine per kilogram body weight. The findings on MR imaging were correlated with histopathologically defined prognostic factors (histological type, tumor size, tumor grading, metastasis in lymph nodes). In addition, immunohistochemically defined prognostic factors (c-erbB-1, c-erbB-2, p53, Ki-67) were correlated with the signal increase on MR mammogram in 40 patients. There was no significant correlation between the findings on MR mammography and the histopathological type of carcinoma, the grading, and the lymphonodular status. Noninvasive carcinomas showed a higher rate of moderate (38%) or low (27%) enhancement on MR imaging than invasive carcinomas (6% and 3%). The results on MR mammography and the results of immunohistochemical stainings did not correlate significantly. Noninvasive carcinomas showed significantly lower enhancement than invasive carcinomas. However, the signal behavior of contrast-enhanced MR mammography is not related to established histopathological prognostic parameters such as subtyping, grading, nodal status, and the expression of certain oncogenes/tumor suppressor genes.

  17. CRC-113 gene expression signature for predicting prognosis in patients with colorectal cancer.

    Science.gov (United States)

    Nguyen, Minh Nam; Choi, Tae Gyu; Nguyen, Dinh Truong; Kim, Jin-Hwan; Jo, Yong Hwa; Shahid, Muhammad; Akter, Salima; Aryal, Saurav Nath; Yoo, Ji Youn; Ahn, Yong-Joo; Cho, Kyoung Min; Lee, Ju-Seog; Choe, Wonchae; Kang, Insug; Ha, Joohun; Kim, Sung Soo

    2015-10-13

    Colorectal cancer (CRC) is the third leading cause of global cancer mortality. Recent studies have proposed several gene signatures to predict CRC prognosis, but none of those have proven reliable for predicting prognosis in clinical practice yet due to poor reproducibility and molecular heterogeneity. Here, we have established a prognostic signature of 113 probe sets (CRC-113) that include potential biomarkers and reflect the biological and clinical characteristics. Robustness and accuracy were significantly validated in external data sets from 19 centers in five countries. In multivariate analysis, CRC-113 gene signature showed a stronger prognostic value for survival and disease recurrence in CRC patients than current clinicopathological risk factors and molecular alterations. We also demonstrated that the CRC-113 gene signature reflected both genetic and epigenetic molecular heterogeneity in CRC patients. Furthermore, incorporation of the CRC-113 gene signature into a clinical context and molecular markers further refined the selection of the CRC patients who might benefit from postoperative chemotherapy. Conclusively, CRC-113 gene signature provides new possibilities for improving prognostic models and personalized therapeutic strategies.

  18. Pathway analysis of gene signatures predicting metastasis of node-negative primary breast cancer

    International Nuclear Information System (INIS)

    Yu, Jack X; Sieuwerts, Anieta M; Zhang, Yi; Martens, John WM; Smid, Marcel; Klijn, Jan GM; Wang, Yixin; Foekens, John A

    2007-01-01

    Published prognostic gene signatures in breast cancer have few genes in common. Here we provide a rationale for this observation by studying the prognostic power and the underlying biological pathways of different gene signatures. Gene signatures to predict the development of metastases in estrogen receptor-positive and estrogen receptor-negative tumors were identified using 500 re-sampled training sets and mapping to Gene Ontology Biological Process to identify over-represented pathways. The Global Test program confirmed that gene expression profiles in the common pathways were associated with the metastasis of the patients. The apoptotic pathway and cell division, or cell growth regulation and G-protein coupled receptor signal transduction, were most significantly associated with the metastatic capability of estrogen receptor-positive or estrogen receptor-negative tumors, respectively. A gene signature derived from the common pathways predicted metastasis in an independent cohort. Mapping of the pathways represented by different published prognostic signatures showed that they share 53% of the identified pathways. We show that divergent gene sets classifying patients for the same clinical endpoint represent similar biological processes and that pathway-derived signatures can be used to predict prognosis. Furthermore, our study reveals that the underlying biology related to aggressiveness of estrogen receptor subgroups of breast cancer is quite different.

  19. Genome-wide identification of key modulators of gene-gene interaction networks in breast cancer.

    Science.gov (United States)

    Chiu, Yu-Chiao; Wang, Li-Ju; Hsiao, Tzu-Hung; Chuang, Eric Y; Chen, Yidong

    2017-10-03

    With the advances in high-throughput gene profiling technologies, a large volume of gene interaction maps has been constructed. A higher-level layer of gene-gene interaction, namely modulated gene interaction, is composed of gene pairs whose interaction strengths are modulated by (i.e., dependent on) the expression level of a key modulator gene. Systematic investigations into the modulation by estrogen receptor (ER), the best-known modulator gene, have revealed the functional and prognostic significance in breast cancer. However, a genome-wide identification of key modulator genes that may further unveil the landscape of modulated gene interaction is still lacking. We proposed a systematic workflow to screen for key modulators based on genome-wide gene expression profiles. We designed four modularity parameters to measure the ability of a putative modulator to perturb gene interaction networks. Applying the method to a dataset of 286 breast tumors, we comprehensively characterized the modularity parameters and identified a total of 973 key modulator genes. The modularity of these modulators was verified in three independent breast cancer datasets. ESR1, the gene encoding ER, appeared in the list, and abundant novel modulators were identified. For instance, a prognostic predictor of breast cancer, SFRP1, was found to be the second-ranked modulator. Functional annotation analysis of the 973 modulators revealed involvement in ER-related cellular processes as well as immune- and tumor-associated functions. Here we present, as far as we know, the first comprehensive analysis of key modulator genes on a genome-wide scale. The validity of the filtering parameters as well as the conservation of modulators among cohorts were corroborated. Our data bring new insights into the modulated layer of gene-gene interaction and provide candidates for further biological investigations.

  20. Integrative cluster analysis in bioinformatics

    CERN Document Server

    Abu-Jamous, Basel; Nandi, Asoke K

    2015-01-01

    Clustering techniques are increasingly being put to use in the analysis of high-throughput biological datasets. Novel computational techniques to analyse high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. This book details the complete pathway of cluster analysis, from the basics of molecular biology to the generation of biological knowledge. The book also presents the latest clustering methods and clustering validation, thereby offering the reader a comprehensive review o

  1. Nearly complete mitogenome of hairy sawfly, Corynis lateralis (Brullé, 1832) (Hymenoptera: Cimbicidae): rearrangements in the IQM and ARNS1EF gene clusters.

    Science.gov (United States)

    Doğan, Özgül; Korkmaz, E Mahir

    2017-10-01

    The Cimbicidae is a small family of the primitive and relatively less diverse suborder Symphyta (Hymenoptera). Here, nearly complete mitochondrial genome (mitogenome) of hairy sawfly, Corynis lateralis (Hymenoptera: Cimbicidae) was sequenced using next generation sequencing and comparatively analysed with the mitogenome of Trichiosoma anthracinum. The sequenced length of C. lateralis mitogenome was 14,899 bp with an A+T content of 80.60%. All protein coding genes (PCGs) are initiated by ATN codons and all are terminated with TAR or T- stop codon. All tRNA genes preferred usual anticodons. Compared with the inferred insect ancestral mitogenome, two tRNA rearrangements were observed in the IQM and ARNS1EF gene clusters, representing a new event not previously reported in Symphyta. An illicit priming of replication and/or intra/inter-mitochondrial recombination and TDRL seem to be responsible mechanisms for the rearrangement events in these gene clusters. Phylogenetic analyses confirmed the position of Corynis within Cimbicidae and recovered a relationship of Tenthredinoidea + (Cephoidea + Orussoidea) in Symphyta.

  2. Presence of the vancomycin resistance gene cluster vanC1, vanXYc, and vanT in Enterococcus casseliflavus.

    Science.gov (United States)

    Hölzel, Christina; Bauer, Johann; Stegherr, Eva-Maria; Schwaiger, Karin

    2014-04-01

    The three chromosomally located clustered genes vanC1, vanXYc, and vanT confer intrinsic resistance to vancomycin and are used for species identification of Enterococcus gallinarum. In this study, 28 strains belonging to the E. gallinarum/casseliflavus group isolated from cloacal swabs from laying hens were screened for the presence of vanC1. As confirmed by species-specific multiplex PCR, 11 vanC1-positive strains were identified as E. gallinarum. Surprisingly, one yellow pigmented strain, verified as E. casseliflavus by species-specific multiplex PCR, was also vanC1 positive; vanXYc and vanT were additionally detectable in this strain. To our knowledge, this is the first report of vanC1, vanXYc, and vanT in E. casseliflavus. The minimum inhibitory concentration of vancomycin was 4 mg/L. Real-time reverse transcription-PCR revealed that none of the clustered genes was expressed in this strain. Even if the genes seem not to be active, there is a certain risk that they will be transferred to other bacteria where they might be functionally expressed. Therefore, it may be advisable to expand the search for vanC1, vanXYc, and vanT from E. gallinarum to other (enterococcal) species. This study confirms that enterococci live up to their name as being reservoir bacteria and should therefore always be closely monitored.

  3. Co-expression modules construction by WGCNA and identify potential prognostic markers of uveal melanoma.

    Science.gov (United States)

    Wan, Qi; Tang, Jing; Han, Yu; Wang, Dan

    2018-01-01

    Uveal melanoma is an aggressive cancer with a high recurrence rate and a poor prognosis. Identifying potential prognostic markers of uveal melanoma may provide information for early detection of recurrence and for treatment. RNA sequencing data of uveal melanoma and patient clinical traits were obtained from The Cancer Genome Atlas (TCGA) database. Co-expression modules were built by weighted gene co-expression network analysis (WGCNA) and used to investigate the relationship between modules and clinical traits. In addition, functional enrichment analysis was performed on the co-expressed genes from the modules of interest. First, using WGCNA, 21 co-expression modules were constructed from the 10975 genes of the 80 human uveal melanoma samples. The number of genes in these modules ranged from 42 to 5091. Four co-expression modules were significantly correlated with three clinical traits (status, recurrence and recurrence time). The red and purple modules correlated positively with patient's life status and recurrence time. The green module correlated positively with recurrence. Functional enrichment analysis showed that the magenta module was mainly enriched in genetic material assembly processes, the purple module was mainly enriched in tissue homeostasis and melanosome membrane, and the red module was mainly enriched in metastasis of cells, suggesting its critical role in the recurrence and development of the disease. Additionally, the hub gene (the gene with the highest connectivity to other genes) was identified in each module. The hub genes SLC17A7, NTRK2, ABTB1 and ADPRHL1 might play a vital role in the recurrence of uveal melanoma. Our findings provide a framework of co-expression gene modules of uveal melanoma and identify some prognostic markers that might be used for detection of recurrence and for treatment of uveal melanoma.

  4. Four-miRNA signature as a prognostic tool for lung adenocarcinoma.

    Science.gov (United States)

    Lin, Yan; Lv, Yufeng; Liang, Rong; Yuan, Chunling; Zhang, Jinyan; He, Dan; Zheng, Xiaowen; Zhang, Jianfeng

    2018-01-01

    The aim of this study was to generate a novel miRNA expression signature to accurately predict prognosis for patients with lung adenocarcinoma (LUAD). Using expression profiles downloaded from The Cancer Genome Atlas database, we identified multiple miRNAs with differential expression between LUAD and paired healthy tissues. We then evaluated the prognostic values of the differentially expressed miRNAs using univariate/multivariate Cox regression analysis. This analysis was ultimately used to construct a four-miRNA signature that effectively predicted patient survival. Finally, we analyzed potential functional roles of the target genes for these four miRNAs using Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses. Based on our cutoff criteria ( P 1.0), we identified a total of 187 differentially expressed miRNAs, including 148 that were upregulated in LUAD tissues and 39 that were downregulated. Four miRNAs (miR-148a-5p, miR-31-5p, miR-548v, and miR-550a-5p) were independently associated with survival based on Kaplan-Meier analysis. We generated a signature index based on the expression of these four miRNAs and stratified patients into low- and high-risk groups. Patients in the high-risk group had significantly shorter survival times than those in the low-risk group ( P =0.002). A functional enrichment analysis suggested that the target genes of these four miRNAs were involved in protein phosphorylation and the Hippo and sphingolipid signaling pathways. Taken together, our results suggest that our four-miRNA signature can be used as a prognostic tool for patients with LUAD.
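
    The abstract does not state how the signature index was computed or where the risk cutoff was placed. A common construction, sketched here purely as an assumption, is a weighted sum of the expression values (weights taken from a Cox model) followed by a split at the median score. The coefficients, expression values, and patient labels below are hypothetical.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Weighted sum of expression values per patient (a linear risk index)."""
    return {
        patient: sum(coefficients[m] * values[m] for m in coefficients)
        for patient, values in expression.items()
    }


def stratify(scores):
    """Split patients at the median score into 'high' and 'low' risk groups."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"miR-148a-5p": -0.4, "miR-31-5p": 0.3, "miR-548v": 0.2, "miR-550a-5p": 0.5}
expression = {
    "patient_1": {"miR-148a-5p": 2.0, "miR-31-5p": 1.0, "miR-548v": 0.5, "miR-550a-5p": 0.2},
    "patient_2": {"miR-148a-5p": 0.5, "miR-31-5p": 3.0, "miR-548v": 1.0, "miR-550a-5p": 2.0},
    "patient_3": {"miR-148a-5p": 1.0, "miR-31-5p": 1.0, "miR-548v": 1.0, "miR-550a-5p": 1.0},
    "patient_4": {"miR-148a-5p": 3.0, "miR-31-5p": 0.5, "miR-548v": 0.2, "miR-550a-5p": 0.1},
}

groups = stratify(risk_scores(expression, coefficients))
print(groups)
```

    The resulting groups would then be compared with a survival analysis such as the Kaplan-Meier method named in the abstract; that step is omitted here.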

  5. LAMININS IN COLORECTAL CANCER: EXPRESSION, FUNCTION, PROGNOSTIC POWER AND MOLECULAR MECHANISMS

    Directory of Open Access Journals (Sweden)

    S. A. Rodin

    2017-01-01

    Extracellular matrix (ECM) proteins are a major component of the tumor stroma. Laminins emerge as one of the main families of ECM proteins with signaling properties. Apart from the structural function, laminins and products of their degradation affect survival and differentiation of cancer cells, motility of cancer and stromal cells, angiogenesis, invasion into distant organs, and other aspects of cancer development. Here, we discuss the expression of laminins in colorectal cancer (CRC), the study of laminin functions in in vitro and in vivo models of CRC, and the use of laminins as prognostic markers of CRC. Recently, we have reported a new approach to assessing prognostic power using classifiers constructed from sets of laminin genes. The method allows for accurate prognosis of CRC and provides additional information that may suggest possible molecular mechanisms of laminin function in CRC progression.

  6. Clustering-based approaches to SAGE data mining

    Directory of Open Access Journals (Sweden)

    Wang Haiying

    2008-07-01

    Serial analysis of gene expression (SAGE) is one of the most powerful tools for global gene expression profiling. It has led to several biological discoveries and biomedical applications, such as the prediction of new gene functions and the identification of biomarkers in human cancer research. Clustering techniques have become fundamental approaches in these applications. This paper reviews relevant clustering techniques specifically designed for this type of data. It places an emphasis on current limitations and opportunities in this area for supporting biologically-meaningful data mining and visualisation.

  7. Assessment of Quantitative and Allelic MGMT Methylation Patterns as a Prognostic Marker in Glioblastoma

    DEFF Research Database (Denmark)

    Kristensen, Lasse S; Michaelsen, Signe R; Dyrbye, Henrik

    2016-01-01

    Methylation of the O(6)-methylguanine-DNA methyltransferase (MGMT) gene is a predictive and prognostic marker in newly diagnosed glioblastoma patients treated with temozolomide but how MGMT methylation should be assessed to ensure optimal detection accuracy is debated. We developed a novel quanti...

  8. Prediction of the prognosis of breast cancer in routine histologic specimens using a simplified, low-cost gene expression signature

    DEFF Research Database (Denmark)

    Marcell, S.A.; Balazs, A.; Emese, A.

    2013-01-01

    Background: Grade 2 breast carcinomas do not form a uniform prognostic group. Aim: To extend the number of patients and the investigated genes of a previously identified prognostic signature described by the authors that reflects chromosomal instability in order to refine characterization of grade 2 breast cancers and identify driver genes. Methods: Using publicly available databases, the authors selected 9 target and 3 housekeeping genes that are capable of dividing grade 2 breast carcinomas into prognostic groups. Gene expression was investigated by polymerase chain reaction in 249 formalin-fixed, paraffin-embedded breast tumors. The results were correlated with relapse-free survival. Results: Histologically grade 2 carcinomas were split into good and a poor...

  9. Prognostic significance of 1p36 locus deletion in adenoid cystic carcinoma of the salivary glands

    DEFF Research Database (Denmark)

    Šteiner, Petr; Andreasen, Simon; Grossmann, Petr

    2018-01-01

    Adenoid cystic carcinoma (AdCC) of the salivary glands is characterized by MYB-NFIB or MYBL1-NFIB fusion, prolonged but relentlessly progressive clinical course with frequent recurrences, and development of distant metastasis resulting in high long-term mortality. Currently, no effective therapy is available for patients with advanced non-resectable and/or metastatic disease. Complicating the clinical management of this patient group is the lack of prognostic markers. The purpose of this study is to investigate the prognostic value of 1p36 loss in patients with AdCC. The presence of 1p36 deletion and gene fusions involving the MYB, NFIB, and MYBL1 genes in a cohort of 93 salivary gland AdCCs was studied using fluorescence in situ hybridization. These results were statistically correlated with clinical data and outcome. Deletion of 1p36 in AdCC was identified in 13 of 85 analyzable cases (15...

  10. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2012-12-01

    Background: Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. Results: The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major ‘highways’ of horizontal gene transfer. Conclusions: The updated collection

  11. Profiling of secondary metabolite gene clusters regulated by LaeA in Aspergillus niger FGSC A1279 based on genome sequencing and transcriptome analysis.

    Science.gov (United States)

    Wang, Bin; Lv, Yangyong; Li, Xuejie; Lin, Yiying; Deng, Hai; Pan, Li

    The global regulator LaeA controls the production of many fungal secondary metabolites, possibly via chromatin remodeling. Here we aimed to survey the secondary metabolite profile regulated by LaeA in Aspergillus niger FGSC A1279 by genome sequencing and comparative transcriptomics between the laeA deletion (ΔlaeA) and overexpressing (OE-laeA) mutants. Genome sequencing revealed four putative polyketide synthase genes specific to FGSC A1279, suggesting that the corresponding polyketide compounds might be unique to FGSC A1279. RNA-seq data revealed 281 putative secondary metabolite genes upregulated in the OE-laeA mutants, including 22 secondary metabolite backbone genes. LC-MS chemical profiling illustrated that many secondary metabolites were produced in OE-laeA mutants compared to wild type and ΔlaeA mutants, providing potential resources for drug discovery. KEGG analysis annotated 16 secondary metabolite clusters putatively linked to metabolic pathways. Furthermore, 34 of 61 Zn2Cys6 transcription factors located in secondary metabolite clusters were differentially expressed between ΔlaeA and OE-laeA mutants. Three secondary metabolite clusters (clusters 18, 30 and 33) containing Zn2Cys6 transcription factors that were upregulated in OE-laeA mutants were putatively linked to KEGG pathways, suggesting that Zn2Cys6 transcription factors might play an important role in synthesizing secondary metabolites regulated by LaeA. Taken together, LaeA dramatically influences the secondary metabolite profile in FGSC A1279.

  12. Duplicated Gephyrin Genes Showing Distinct Tissue Distribution and Alternative Splicing Patterns Mediate Molybdenum Cofactor Biosynthesis, Glycine Receptor Clustering, and Escape Behavior in Zebrafish

    Science.gov (United States)

    Ogino, Kazutoyo; Ramsden, Sarah L.; Keib, Natalie; Schwarz, Günter; Harvey, Robert J.; Hirata, Hiromi

    2011-01-01

    Gephyrin mediates the postsynaptic clustering of glycine receptors (GlyRs) and GABAA receptors at inhibitory synapses and molybdenum-dependent enzyme (molybdoenzyme) activity in non-neuronal tissues. Gephyrin knock-out mice show a phenotype resembling both defective glycinergic transmission and molybdenum cofactor (Moco) deficiency and die within 1 day of birth due to starvation and dyspnea resulting from deficits in motor and respiratory networks, respectively. To address whether gephyrin function is conserved among vertebrates and whether gephyrin deficiency affects molybdoenzyme activity and motor development, we cloned and characterized zebrafish gephyrin genes. We report here that zebrafish have two gephyrin genes, gphna and gphnb. The former is expressed in all tissues and has both C3 and C4 cassette exons, and the latter is expressed predominantly in the brain and spinal cord and harbors only C4 cassette exons. We confirmed that all of the gphna and gphnb splicing isoforms have Moco synthetic activity. Antisense morpholino knockdown of either gphna or gphnb alone did not disturb synaptic clusters of GlyRs in the spinal cord and did not affect touch-evoked escape behaviors. However, on knockdown of both gphna and gphnb, embryos showed impairments in GlyR clustering in the spinal cord and, as a consequence, demonstrated touch-evoked startle response behavior by contracting antagonistic muscles simultaneously, instead of displaying early coiling and late swimming behaviors, which are executed by side-to-side muscle contractions. These data indicate that duplicated gephyrin genes mediate Moco biosynthesis and control postsynaptic clustering of GlyRs, thereby mediating key escape behaviors in zebrafish. PMID:20843816

  13. Analysis of multiplex gene expression maps obtained by voxelation

    Directory of Open Access Journals (Sweden)

    Smith Desmond J

    2009-04-01

    Background: Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. Results: To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in

  14. Analysis of multiplex gene expression maps obtained by voxelation.

    Science.gov (United States)

    An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios

    2009-04-29

    Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. The experimental
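
    The query-by-similarity step described above (rank genes by the Euclidean distance between feature vectors) can be sketched as follows. The gene names and three-component vectors are hypothetical stand-ins for the wavelet features, whose extraction is not shown; the function name `most_similar` is likewise an assumption.

```python
from math import dist


def most_similar(query, features, k=2):
    """Return the k genes whose feature vectors lie closest to the query gene."""
    others = (g for g in features if g != query)
    return sorted(others, key=lambda g: dist(features[query], features[g]))[:k]


# Hypothetical gene names and feature vectors, for illustration only.
features = {
    "geneA": [0.9, 0.1, 0.3],
    "geneB": [0.8, 0.2, 0.3],
    "geneC": [0.1, 0.9, 0.7],
    "geneD": [0.7, 0.1, 0.4],
}

print(most_similar("geneA", features))
```

    A smaller distance means more similar expression maps under this measure; the abstract's separate comparison of gene functions within the gene ontology is not covered by this sketch.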

  15. Prognostic and predictive potential molecular biomarkers in colon cancer.

    Science.gov (United States)

    Nastase, A; Pâslaru, L; Niculescu, A M; Ionescu, M; Dumitraşcu, T; Herlea, V; Dima, S; Gheorghe, C; Lazar, V; Popescu, I

    2011-01-01

    An important objective of current research is the discovery of new biomarkers that can detect colon tumours in early stages and accurately indicate the status of the disease. The aim of our study was to identify potential biomarkers for colon cancer onset and progression. We assessed gene expression profiles of a list of 10 candidate genes (MMP-1, MMP-3, MMP-7, DEFA-1, DEFA-5, DEFA-6, IL-8, CXCL-1, SPP-1, CTHRC-1) by quantitative real-time PCR in triplets of colonic mucosa (normal, adenoma, tumoral tissue) collected from the same patient during surgery, for a group of 20 patients. Additionally, we performed immunohistochemistry for DEFA1-3 and SPP1. We observed that DEFA5 and DEFA6 are key factors in adenoma formation (p < 0.05). MMP7 is important in the transition from a benign to a malignant status (p < 0.01) and further in metastasis, being a prognostic indicator for tumor transformation and for the metastatic potential of cancer cells. IL8, irrespective of tumor stage, has a high mRNA level in adenocarcinoma (p < 0.05). The level of expression of SPP1 is correlated with tumor level. We suggest that high levels of DEFA5, DEFA6 (key elements in adenoma formation), MMP7 (marker of colon cancer onset and progression to metastasis), SPP1 (marker of progression) and IL8 could be used to diagnose early-stage colon cancer and to evaluate the prognosis of progression for colon tumors. Further, if DEFA5 and DEFA6 expression levels are low but MMP7, SPP1 and IL8 levels are high, this could indicate that the transition from adenoma to adenocarcinoma has already occurred. Thus, DEFA5, DEFA6, MMP7, IL8 and SPP1 constitute a valuable panel of biomarkers that can be used for early detection, for monitoring progressive disease, and for prognosis of colon cancer.

  16. Distributed Prognostic Health Management with Gaussian Process Regression

    Science.gov (United States)

    Saha, Sankalita; Saha, Bhaskar; Saxena, Abhinav; Goebel, Kai Frank

    2010-01-01

    Distributed prognostics architecture design is an enabling step for efficient implementation of health management systems. A major challenge encountered in such design is the formulation of optimal distributed prognostics algorithms. In this paper, we present a distributed GPR-based prognostics algorithm whose target platform is a wireless sensor network. In addition to the challenges encountered in a distributed implementation, a wireless network poses constraints on communication patterns, thereby making the problem more challenging. The prognostics application used to demonstrate our new algorithms is battery prognostics. To present the trade-offs among different prognostic approaches, we compare against a distributed implementation of particle-filter-based prognostics for the same battery data.
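
    For readers unfamiliar with Gaussian process regression (GPR), the following is a minimal single-node sketch of the predictive step, not the distributed algorithm of the paper. The squared-exponential kernel, the hyperparameter values, the constant prior mean, and the capacity-fade numbers are all assumptions chosen for illustration.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gpr_predict(x_train, y_train, x_new, noise_var=1e-4, length_scale=1.0):
    """Posterior mean and variance at x_new, using the sample mean as prior mean."""
    y_bar = sum(y_train) / len(y_train)
    # Gram matrix of the training inputs, with observation noise on the diagonal.
    gram = [[rbf(a, b, length_scale) + (noise_var if i == j else 0.0)
             for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_new = [rbf(a, x_new, length_scale) for a in x_train]
    alpha = solve(gram, [y - y_bar for y in y_train])
    beta = solve(gram, k_new)
    mean = y_bar + sum(k * a for k, a in zip(k_new, alpha))
    var = rbf(x_new, x_new, length_scale) - sum(k * b for k, b in zip(k_new, beta))
    return mean, var

# Hypothetical normalized battery capacity after each of five cycles.
cycles = [0.0, 1.0, 2.0, 3.0, 4.0]
capacity = [1.00, 0.97, 0.95, 0.92, 0.90]
mean_next, var_next = gpr_predict(cycles, capacity, 5.0)
```

    The predictive variance grows as the query point moves away from the observed cycles, which is the property that makes GPR attractive for prognostics with uncertainty estimates.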

  17. Unsupervised versus Supervised Identification of Prognostic Factors in Patients with Localized Retroperitoneal Sarcoma: A Data Clustering and Mahalanobis Distance Approach

    Directory of Open Access Journals (Sweden)

    Rita De Sanctis

    2018-01-01

    The aim of this report is to unveil specific prognostic factors for retroperitoneal sarcoma (RPS) patients by univariate and multivariate statistical techniques. A phase I-II study on localized RPS treated with high-dose ifosfamide and radiotherapy followed by surgery (ISG-STS 0303 protocol) demonstrated that chemo/radiotherapy was safe and increased the 3-year relapse-free survival (RFS) with respect to historical controls. Of 70 patients, 26 developed local, 10 distant, and 5 combined relapse. Median disease-free interval (DFI) was 29.47 months. According to a discriminant function analysis, DFI, histology, relapse pattern, and the first treatment approach at relapse had a statistically significant prognostic impact. Based on scientific literature and clinical expertise, clinicopathological data were analyzed using both a supervised and an unsupervised classification method to predict the prognosis, with similar sample sizes (66 and 65, respectively, in the casewise approach and 70 in the mean-substitution one). This is the first attempt to predict patients’ prognosis by means of multivariate statistics, and in this light, it is notable that (i) some clinical data have a well-defined prognostic value, (ii) the unsupervised model produced results comparable to those of the supervised one, and (iii) the appropriate combination of both models appears fruitful and easily extensible to different clinical contexts.
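
    The Mahalanobis distance named in the title measures how far an observation lies from a group centroid after accounting for the variances and correlation of the features. The sketch below restricts itself to two features so that the covariance inverse has a closed form; the feature choice and all numbers are hypothetical and are not taken from the study.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature rows."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def covariance_2d(rows):
    """Unbiased 2x2 sample covariance matrix of two-feature rows."""
    m = mean_vector(rows)
    n = len(rows)

    def cov(i, j):
        return sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)

    return [[cov(0, 0), cov(0, 1)], [cov(1, 0), cov(1, 1)]]

def mahalanobis_2d(point, center, cov):
    """sqrt((x - mu)^T S^-1 (x - mu)) for two features, with S inverted in closed form."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [point[0] - center[0], point[1] - center[1]]
    quad = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    return math.sqrt(quad)

# Hypothetical reference group: (disease-free interval in months, tumor size in cm).
reference = [[30.0, 8.0], [24.0, 10.0], [36.0, 7.0], [28.0, 9.5], [32.0, 8.5]]
center = mean_vector(reference)
spread = covariance_2d(reference)
distance = mahalanobis_2d([12.0, 14.0], center, spread)
```

    A larger distance indicates a patient profile that is less typical of the reference group; with an identity covariance matrix the measure reduces to the ordinary Euclidean distance.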

  18. cluML: A markup language for clustering and cluster validity assessment of microarray data.

    Science.gov (United States)

    Bolshakova, Nadia; Cunningham, Pádraig

    2005-01-01

    cluML is a new markup language for microarray data clustering and cluster validity assessment. The XML-based format has been designed to address some of the limitations observed in traditional formats, such as the inability to store multiple clustering (including biclustering) and validation results within a dataset. cluML is an effective tool to support biomedical knowledge representation in gene expression data analysis. Although cluML was developed for DNA microarray analysis applications, it can be used without limitation to represent clustering and validation results for other biomedical and physical data.

  19. Assessment of clusters of transcription factor binding sites in relationship to human promoter, CpG islands and gene expression

    Directory of Open Access Journals (Sweden)

    Sakaki Yoshiyuki

    2004-02-01

    Background: Gene expression is regulated mainly by transcription factors (TFs) that interact with regulatory cis-elements on DNA sequences. To identify functional regulatory elements, computer searching can predict TF binding sites (TFBS) using position weight matrices (PWMs) that represent positional base frequencies of collected experimentally determined TFBS. A disadvantage of this approach is the large output of results for genomic DNA. One strategy to identify genuine TFBS is to utilize local concentrations of predicted TFBS. It is unclear whether there is a general tendency for TFBS to cluster at promoter regions, although this is the case for certain TFBS. It is also unclear which TFs have TFBS concentrated in promoters and to what level this occurs. This study hopes to answer some of these questions. Results: We developed the cluster score measure to evaluate the correlation between predicted TFBS clusters and promoter sequences for each PWM. Non-promoter sequences were used as a control. Using the cluster score, we identified a PWM group called PWM-PCP, in which TFBS clusters positively correlate with promoters, and another PWM group called PWM-NCP, in which TFBS clusters negatively correlate with promoters. The PWM-PCP group comprises 47% of the 199 vertebrate PWMs, while the PWM-NCP group comprises 11%. After reducing the effect of CpG islands (CGI) against the clusters using partial correlation coefficients among three properties (promoter, CGI and predicted TFBS cluster), we identified two PWM groups: those strongly correlated with CGI and those not correlated with CGI. Conclusion: Not all PWMs predict TFBS correlated with human promoter sequences. Two main PWM groups were identified: (1) those that show TFBS clustered in promoters associated with CGI, and (2) those that show TFBS clustered in promoters independent of CGI. Assessment of PWM matches will allow more positive interpretation of TFBS in

  20. VDR mRNA overexpression is associated with worse prognostic factors in papillary thyroid carcinoma

    Directory of Open Access Journals (Sweden)

    June Young Choi

    2017-03-01

    The purpose of this study was to assess the relationship between vitamin D receptor gene (VDR) expression and prognostic factors in papillary thyroid cancer (PTC). mRNA sequencing and somatic mutation data from The Cancer Genome Atlas (TCGA) were analyzed. VDR mRNA expression was compared to clinicopathologic variables by linear regression. Tree-based classification was applied to find a cutoff, and patients were split into low and high VDR groups. Logistic regression, Kaplan–Meier analysis, differentially expressed gene (DEG) test and pathway analysis were performed to assess the differences between the two VDR groups. VDR mRNA expression was higher in PTC than in normal thyroid tissue. VDR expression was high in classic and tall-cell variant PTC and when lateral neck node metastasis was present. The high VDR group was also associated with classic and tall-cell subtype, AJCC stage IV and lower recurrence-free survival. The DEG test revealed that 545 genes were upregulated in the high VDR group. Thyroid cancer-related pathways were enriched in the high VDR group in pathway analyses. VDR mRNA overexpression was correlated with worse prognostic factors such as subtypes of papillary thyroid carcinoma that are known to have a worse prognosis, lateral neck node metastasis, advanced stage and lower recurrence-free survival.
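
    The abstract does not state which tree algorithm, outcome variable, or split criterion was used to find the VDR cutoff. As one plausible illustration only, the sketch below fits a depth-one classification tree (a decision stump) that picks the expression threshold minimizing the weighted Gini impurity of a binary outcome. The function names, expression values, and outcome labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_cutoff(values, labels):
    """Threshold on `values` that minimizes the weighted Gini impurity of `labels`.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns None when all values are identical and no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_score, best_threshold = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_score = score
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
    return best_threshold

# Hypothetical expression values and a binary outcome (1 = event observed).
expression = [2.1, 2.4, 2.9, 3.3, 5.8, 6.1, 6.5, 7.0]
outcome = [0, 0, 0, 1, 1, 1, 0, 1]
cutoff = best_cutoff(expression, outcome)
groups = ["high" if v > cutoff else "low" for v in expression]
```

    Once a cutoff is chosen this way, the resulting low and high groups can be compared with survival methods such as Kaplan–Meier analysis, as the abstract describes.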