WorldWideScience

Sample records for prognostic gene clusters

  1. Bioinformatics analysis to screen the key prognostic genes in ovarian cancer.

    Science.gov (United States)

    Li, Li; Cai, Shengyun; Liu, Shengnan; Feng, Hao; Zhang, Junjie

    2017-04-13

    Ovarian cancer (OC) is a gynecological malignancy with a poor prognosis and high mortality. This study was conducted to identify the key genes implicated in the prognosis of OC by bioinformatic analysis. Gene expression data (including 568 primary OC tissues, 17 recurrent OC tissues, and 8 adjacent normal tissues) and the relevant clinical information of OC patients were downloaded from The Cancer Genome Atlas database. After data preprocessing, cluster analysis was conducted using the ConsensusClusterPlus package in R. Using the limma package in R, differential analysis was performed to identify feature genes. Based on Kaplan-Meier (KM) survival analysis, prognostic seed genes were selected from the feature genes. After key prognostic genes were further screened by cluster analysis and KM survival analysis, they were subjected to functional enrichment analysis and multivariate survival analysis. Using the survival package in R, Cox regression analysis was conducted on the microarray data of GSE17260 to validate the key prognostic genes. A total of 3668 feature genes were obtained, among which 75 genes were identified as prognostic seed genes. Then, 25 key prognostic genes were screened, including AXL, FOS, KLF6, WDR77, DUSP1, GADD45B, and SLIT3. Notably, AXL and SLIT3 were enriched in the ovulation cycle. Multivariate survival analysis showed that the key prognostic genes could effectively differentiate the samples and were significantly associated with prognosis. Additionally, GSE17260 confirmed that the key prognostic genes were associated with the prognosis of OC. AXL, FOS, KLF6, WDR77, DUSP1, GADD45B, and SLIT3 might affect the prognosis of OC.
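    The KM-based selection of prognostic seed genes described above typically rests on a two-group log-rank comparison. The following is a minimal sketch of that idea, not the authors' R workflow: the function names, the median split, and the input layout are assumptions made for illustration.

```python
import math
import statistics


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi2, p_value) with 1 degree of freedom.

    events_* hold 1 for an observed event (death) and 0 for censoring.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        # Expected deaths in group A under the null hypothesis of equal hazards.
        observed_minus_expected += deaths_a - d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def split_by_median(expression, times, events):
    """Hypothetical helper: split samples into high/low expression groups."""
    cutoff = statistics.median(expression)
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cutoff]
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cutoff]
    return high, low
```

    A gene would be kept as a candidate seed gene when the p-value for its high versus low expression groups falls below a chosen threshold; the threshold used in the study is not stated in this record.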

  2. FunGeneClusterS

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Brandl, Julian; Andersen, Mikael Rørdam

    2016-01-01

    Secondary metabolites of fungi are receiving an increasing amount of interest due to their prolific bioactivities and the fact that fungal biosynthesis of secondary metabolites often occurs from co-regulated and co-located gene clusters. This makes the gene clusters attractive for synthetic biology and industrial biotechnology applications. We have previously published a method for accurate prediction of clusters from genome and transcriptome data, which could also suggest cross-chemistry, however, this method was limited both in the number of parameters which could be adjusted as well as in user...

  3. Combination of meta-analysis and graph clustering to identify prognostic markers of ESCC

    Directory of Open Access Journals (Sweden)

    Hongyun Gao

    2012-01-01

    Esophageal squamous cell carcinoma (ESCC) is one of the most malignant gastrointestinal cancers and occurs at a high frequency in China and other Asian countries. Recently, several molecular markers were identified for predicting ESCC. Notwithstanding, additional prognostic markers, with a clear understanding of their underlying roles, are still required. Through bioinformatics, a graph-clustering method by DPClus was used to detect co-expressed modules. The aim was to identify a set of discriminating genes that could be used for predicting ESCC through graph-clustering and GO-term analysis. The results showed that CXCL12, CYP2C9, TGM3, MAL, S100A9, EMP-1 and SPRR3 were highly associated with ESCC development. In our study, all their predicted roles were in line with previous reports, supporting the assumption that a combination of meta-analysis, graph-clustering and GO-term analysis is effective both for identifying differentially expressed genes and for reflecting on their functions in ESCC.

  4. The prognostic value of autotaxin activity and gene expression ...

    African Journals Online (AJOL)

    The prognostic value of autotaxin activity and gene expression, matrix metalloproteinase-9 and p53 antibodies in breast cancer patients. ... included in this study and subjected to determination of ATX (both activity by colorimetric method and gene expression by RT-PCR) and both p53 Abs and MMP-9 by ELISA technique.

  5. Persistence drives gene clustering in bacterial genomes

    Directory of Open Access Journals (Sweden)

    Rocha Eduardo PC

    2008-01-01

    Background: Gene clustering plays an important role in the organization of the bacterial chromosome, and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products nor the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those present in a majority of organisms (persistent genes) and those present in very few organisms (rare genes). Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, in a permanent state of flux of genes in bacterial genomes, maintaining their size fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering.

  6. Biological cluster evaluation for gene function prediction.

    Science.gov (United States)

    Klie, Sebastian; Nikoloski, Zoran; Selbig, Joachim

    2014-06-01

    Recent advances in high-throughput omics techniques render it possible to decode the function of genes by using the "guilt-by-association" principle on biologically meaningful clusters of gene expression data. However, the existing frameworks for biological evaluation of gene clusters are hindered by two bottleneck issues: (1) the choice of the number of clusters, and (2) external measures which do not take into consideration the structure of the analyzed data and the ontology of the existing biological knowledge. Here, we address the identified bottlenecks by developing a novel framework that allows not only for biological evaluation of gene expression clusters based on existing structured knowledge, but also for prediction of putative gene functions. The proposed framework facilitates propagation of statistical significance at each of the following steps: (1) estimating the number of clusters, (2) evaluating the clusters in terms of novel external structural measures, (3) selecting an optimal clustering algorithm, and (4) predicting gene functions. The framework also includes a method for evaluation of gene clusters based on the structure of the employed ontology. Moreover, our method for obtaining a probabilistic range for the number of clusters is demonstrated to be valid on synthetic data and available gene expression profiles from Saccharomyces cerevisiae. Finally, we propose a network-based approach for gene function prediction which relies on the clustering of optimal score and the employed ontology. Our approach effectively predicts gene function on the Saccharomyces cerevisiae data set and is also employed to obtain putative gene functions for an Arabidopsis thaliana data set.

  7. Multiconstrained gene clustering based on generalized projections

    Directory of Open Access Journals (Sweden)

    Zhu Shanfeng

    2010-03-01

    Background: Gene clustering for annotating gene functions is one of the fundamental issues in bioinformatics. The best clustering solution is often regularized by multiple constraints such as gene expressions, Gene Ontology (GO) annotations and gene network structures. How to integrate multiple constraints into an optimal clustering solution remains an unsolved problem. Results: We propose a novel multiconstrained gene clustering (MGC) method within the generalized projection onto convex sets (POCS) framework used widely in image reconstruction. Each constraint is formulated as a corresponding set. The generalized projector iteratively projects the clustering solution onto these sets in order to find a consistent solution included in the intersection set that satisfies all constraints. Compared with previous MGC methods, POCS can integrate multiple constraints of different natures without distorting the original constraints. To evaluate the clustering solution, we also propose a new performance measure referred to as Gene Log Likelihood (GLL) that considers genes having more than one function and hence belonging to more than one cluster. Comparative experimental results show that our POCS-based gene clustering method outperforms current state-of-the-art MGC methods. Conclusions: The POCS-based MGC method can successfully combine multiple constraints of different natures for gene clustering. Also, the proposed GLL is an effective performance measure for soft clustering solutions.
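    The projection step described above can be illustrated with classical alternating projections onto two simple convex sets. This is only a toy sketch of the general POCS idea, not the MGC method itself: the two constraint sets (memberships summing to one, and memberships bounded in [0, 1]) and all function names are assumptions chosen for illustration.

```python
def project_onto_hyperplane(x, total=1.0):
    """Euclidean projection onto the set {x : sum(x) == total}."""
    shift = (total - sum(x)) / len(x)
    return [xi + shift for xi in x]


def project_onto_box(x, low=0.0, high=1.0):
    """Euclidean projection onto the set {x : low <= x_i <= high}."""
    return [min(max(xi, low), high) for xi in x]


def pocs(x, projections, iterations=200):
    """Cyclically apply each projection, moving toward the intersection."""
    for _ in range(iterations):
        for project in projections:
            x = project(x)
    return x


# Start from an arbitrary soft-membership vector for one gene over three
# clusters and look for a point that satisfies both constraints at once.
membership = pocs([2.0, -1.0, 0.5], [project_onto_hyperplane, project_onto_box])
print(membership)
```

    Because each set is closed and convex and the intersection is non-empty, the iterates approach a point that satisfies every constraint, which is the property the abstract relies on when it speaks of a consistent solution in the intersection set.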

  8. Prognostic Gene Expression Profiles in Breast Cancer

    DEFF Research Database (Denmark)

    Sørensen, Kristina Pilekær

    Each year approximately 4,800 Danish women are diagnosed with breast cancer. Several clinical and pathological factors are used as prognostic and predictive markers to categorize the patients into groups of high or low risk. Around 90% of all patients are allocated to the high risk group and offered systemic adjuvant therapy, and 50% of these patients receive chemotherapy. However, approximately 25-30% of the lymph node negative and 50% of the lymph node positive high risk patients would experience recurrence if left untreated with systemic adjuvant therapy. Consequently, considerable......, hormone receptor status, histological grade, age of patient at diagnosis, and year of surgery. All patients included in the study had not received any kind of systemic adjuvant therapy; hence, the study results were not influenced by treatment response. We compared lncRNA expression in metastatic and non...

  9. Finding approximate gene clusters with Gecko 3

    Science.gov (United States)

    Winter, Sascha; Jahn, Katharina; Wehner, Stefanie; Kuchenbecker, Leon; Marz, Manja; Stoye, Jens; Böcker, Sebastian

    2016-01-01

    Gene-order-based comparison of multiple genomes provides signals for functional analysis of genes and the evolutionary process of genome organization. Gene clusters are regions of co-localized genes on genomes of different species. The rapid increase in sequenced genomes necessitates bioinformatics tools for finding gene clusters in hundreds of genomes. Existing tools are often restricted to few (in many cases, only two) genomes, and often make restrictive assumptions such as short perfect conservation, conserved gene order or monophyletic gene clusters. We present Gecko 3, an open-source software for finding gene clusters in hundreds of bacterial genomes, that comes with an easy-to-use graphical user interface. The underlying gene cluster model is intuitive, can cope with low degrees of conservation as well as misannotations and is complemented by a sound statistical evaluation. To evaluate the biological benefit of Gecko 3 and to exemplify our method, we search for gene clusters in a dataset of 678 bacterial genomes using Synechocystis sp. PCC 6803 as a reference. We confirm detected gene clusters reviewing the literature and comparing them to a database of operons; we detect two novel clusters, which were confirmed by publicly available experimental RNA-Seq data. The computational analysis is carried out on a laptop computer in <40 min. PMID:27679480

  10. Prognostic value of cluster analysis of severe asthma phenotypes.

    Science.gov (United States)

    Bourdin, Arnaud; Molinari, Nicolas; Vachier, Isabelle; Varrin, Muriel; Marin, Grégory; Gamez, Anne-Sophie; Paganin, Fabrice; Chanez, Pascal

    2014-11-01

    Cross-sectional severe asthma cluster analysis identified different phenotypes. We tested the hypothesis that these clusters will follow different courses. We aimed to identify which asthma outcomes are specific and coherently associated with these different phenotypes in a prospective longitudinal cohort. In a longitudinal cohort of 112 patients with severe asthma, the 5 Severe Asthma Research Program (SARP) clusters were identified by means of algorithm application. Because patients of the present cohort all had severe asthma compared with the SARP cohort, homemade clusters were identified and also tested. At the subsequent visit, we investigated several outcomes related to asthma control at 1 year (6-item Asthma Control Questionnaire [ACQ-6], lung function, and medication requirement) and then recorded the 3-year exacerbations rate and time to first exacerbation. The SARP algorithm discriminated the 5 clusters at entry for age, asthma duration, lung function, blood eosinophil measurement, ACQ-6 scores, and diabetes comorbidity. Four homemade clusters were mostly segregated by best ever achieved FEV1 values and discriminated the groups by a few clinical characteristics. Nonetheless, all these clusters shared similar asthma outcomes related to asthma control as follows. The ACQ-6 score did not change in any cluster. Exacerbation rate and time to first exacerbation were similar, as were treatment requirements. Severe asthma phenotypes identified by using a previously reported cluster analysis or newly homemade clusters do not behave differently concerning asthma control-related outcomes, which are used to assess the response to innovative therapies. This study demonstrates a potential limitation of the cluster analysis approach in the field of severe asthma.

  11. Pichia stipitis genomics, transcriptomics, and gene clusters.

    Science.gov (United States)

    Jeffries, Thomas W; Van Vleet, Jennifer R Headman

    2009-09-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the result of several gene products acting together. When coinheritance is necessary for the overall physiological function, recombination and selection favor colocation of these genes in a cluster. These are particularly evident in strongly conserved and idiomatic traits. In some cases, the functional clusters consist of multiple gene families. Phylogenetic analyses of the members in each family show that once formed, functional clusters undergo duplication and differentiation. Genome-wide expression analysis reveals that regulatory patterns of clusters are similar after they have duplicated and that the expression profiles evolve along with functional differentiation of the clusters. Orthologous gene families appear to arise through tandem gene duplication, followed by differentiation in the regulatory and coding regions of the gene. Genome-wide expression analysis combined with cross-species comparisons of functional gene clusters should reveal many more aspects of eukaryotic physiology.

  12. A New Multivariate Approach for Prognostics Based on Extreme Learning Machine and Fuzzy Clustering.

    Science.gov (United States)

    Javed, Kamran; Gouriveau, Rafael; Zerhouni, Noureddine

    2015-12-01

    Prognostics is a core process of prognostics and health management (PHM) discipline, that estimates the remaining useful life (RUL) of a degrading machinery to optimize its service delivery potential. However, machinery operates in a dynamic environment and the acquired condition monitoring data are usually noisy and subject to a high level of uncertainty/unpredictability, which complicates prognostics. The complexity further increases, when there is absence of prior knowledge about ground truth (or failure definition). For such issues, data-driven prognostics can be a valuable solution without deep understanding of system physics. This paper contributes a new data-driven prognostics approach namely, an "enhanced multivariate degradation modeling," which enables modeling degrading states of machinery without assuming a homogeneous pattern. In brief, a predictability scheme is introduced to reduce the dimensionality of the data. Following that, the proposed prognostics model is achieved by integrating two new algorithms namely, the summation wavelet-extreme learning machine and subtractive-maximum entropy fuzzy clustering to show evolution of machine degradation by simultaneous predictions and discrete state estimation. The prognostics model is equipped with a dynamic failure threshold assignment procedure to estimate RUL in a realistic manner. To validate the proposition, a case study is performed on turbofan engines data from PHM challenge 2008 (NASA), and results are compared with recent publications.

  13. AKT pathway genes define 5 prognostic subgroups in glioblastoma

    National Research Council Canada - National Science Library

    Joy, Anna; Ramesh, Archana; Smirnov, Ivan; Reiser, Mark; Misra, Anjan; Shapiro, William R; Mills, Gordon B; Kim, Seungchan; Feuerstein, Burt G

    2014-01-01

    ... robust. We hypothesized variations in the pathway between tumors contribute to poor response. We clustered GBM based on AKT pathway genes and discovered new subtypes then characterized their clinical and molecular features...

  14. Comparison of scores for bimodality of gene expression distributions and genome-wide evaluation of the prognostic relevance of high-scoring genes

    Science.gov (United States)

    2010-01-01

    Background: A major goal of the analysis of high-dimensional RNA expression data from tumor tissue is to identify prognostic signatures for discriminating patient subgroups. For this purpose, genome-wide identification of bimodally expressed genes from gene array data is relevant because high and low expression groups are easier to distinguish than for genes with unimodal expression distributions. Recently, several methods for the identification of genes with bimodal distributions have been introduced. A straightforward approach is to cluster the expression values and score the distance between the two distributions. Other scores directly measure properties of the distribution. The kurtosis, e.g., measures divergence from a normal distribution. An alternative is the outlier-sum statistic that identifies genes with extremely high or low expression values in a subset of the samples. Results: We compare and discuss scores for bimodality for expression data. For the genome-wide identification of bimodal genes we apply all scores to expression data from 194 patients with node-negative breast cancer. Further, we present the first comprehensive genome-wide evaluation of the prognostic relevance of bimodal genes. We first rank genes according to bimodality scores and define two patient subgroups based on expression values. Then we assess the prognostic significance of the top ranking bimodal genes by comparing the survival functions of the two patient subgroups. We also evaluate the global association between the bimodal shape of expression distributions and survival times with an enrichment type analysis. Various cluster-based methods lead to a significant overrepresentation of prognostic genes. A striking result is obtained with the outlier-sum statistic (p ...) genes with heavy tails generate subgroups of patients with different prognosis. Conclusions: Genes with high bimodality scores are promising candidates for defining prognostic patient subgroups from expression
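    Two of the score families mentioned above can be sketched in a few lines. The excess kurtosis below is the standard moment-based definition; the outlier sum is a simplified one-sample variant (median and MAD standardization, upper tail only, cutoff at the third quartile plus one interquartile range), which is an assumption for illustration and may differ from the exact statistic used in the paper.

```python
import statistics


def excess_kurtosis(values):
    """Moment-based excess kurtosis; strongly negative for balanced bimodality."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def outlier_sum(values):
    """Simplified upper-tail outlier sum for one gene across samples.

    Values are standardized with the median and the median absolute
    deviation (MAD); standardized values above q3 + IQR are summed.
    """
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the score is undefined for this gene")
    q1, _, q3 = statistics.quantiles(values, n=4)
    upper = q3 + (q3 - q1)
    return sum((v - med) / mad for v in values if v > upper)


# Toy expression vectors: a balanced two-mode gene and a gene with a
# small subgroup of very high expressors.
balanced = [0.0] * 10 + [10.0] * 10
heavy_tail = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50, 60]
print(excess_kurtosis(balanced), outlier_sum(heavy_tail))
```

    The two scores pick up different shapes: kurtosis reacts to two modes of similar size, while the outlier sum reacts to a small subgroup with extreme values, which is the heavy-tail situation the abstract highlights.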

  15. Robust method for identification of prognostic gene signatures from gene expression profiles.

    Science.gov (United States)

    Sim, Woogwang; Lee, Jungsul; Choi, Chulhee

    2017-12-05

    In the last decade, many attempts have been made to use gene expression profiles to identify prognostic genes for various types of cancer. Previous studies evaluating the prognostic value of genes suffered by failing to solve the critical problem of classifying patients into different risk groups based on specific gene expression threshold levels. Here, we present a novel method, called iterative patient partitioning (IPP), which was inspired by the receiver operating characteristic (ROC) curve, is based on the log-rank test and overcomes the threshold decision problem. We applied IPP to analyze datasets pertaining to various subtypes of breast cancer. Using IPP, we discovered both novel and well-studied prognostic genes related to cell cycle/proliferation or the immune response. The novel genes were further analyzed using copy-number alteration and mutation data, and these results supported their relationship with prognosis.

  16. Significance analysis of prognostic signatures.

    Directory of Open Access Journals (Sweden)

    Andrew H Beck

    A major goal in translational cancer research is to identify biological signatures driving cancer progression and metastasis. A common technique applied in genomics research is to cluster patients using gene expression data from a candidate prognostic gene set, and if the resulting clusters show statistically significant outcome stratification, to associate the gene set with prognosis, suggesting its biological and clinical importance. Recent work has questioned the validity of this approach by showing in several breast cancer data sets that "random" gene sets tend to cluster patients into prognostically variable subgroups. This work suggests that new rigorous statistical methods are needed to identify biologically informative prognostic gene sets. To address this problem, we developed Significance Analysis of Prognostic Signatures (SAPS), which integrates standard prognostic tests with a new prognostic significance test based on stratifying patients into prognostic subtypes with random gene sets. SAPS ensures that a significant gene set is not only able to stratify patients into prognostically variable groups, but is also enriched for genes showing strong univariate associations with patient prognosis, and performs significantly better than random gene sets. We use SAPS to perform a large meta-analysis (the largest completed to date) of prognostic pathways in breast and ovarian cancer and their molecular subtypes. Our analyses show that only a small subset of the gene sets found statistically significant using standard measures achieve significance by SAPS. We identify new prognostic signatures in breast and ovarian cancer and their corresponding molecular subtypes, and we show that prognostic signatures in ER negative breast cancer are more similar to prognostic signatures in ovarian cancer than to prognostic signatures in ER positive breast cancer. SAPS is a powerful new method for deriving robust prognostic biological signatures from clinically
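    The comparison against random gene sets can be sketched as an empirical p-value. This illustrates only that one ingredient, not the full SAPS procedure: the scoring function, the gene identifiers, and the parameter names below are hypothetical.

```python
import random


def random_set_p_value(candidate, universe, score_fn, n_random=1000, seed=0):
    """Empirical p-value of a gene set's score against random sets of equal size.

    score_fn is any callable mapping a collection of genes to a prognostic
    score where larger means more prognostic (an assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = score_fn(candidate)
    genes = sorted(universe)
    hits = sum(
        1
        for _ in range(n_random)
        if score_fn(rng.sample(genes, len(candidate))) >= observed
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Toy universe: 100 hypothetical genes with made-up per-gene association scores.
gene_scores = {f"G{i}": float(i) for i in range(100)}


def mean_score(genes):
    return sum(gene_scores[g] for g in genes) / len(genes)


top_set = [f"G{i}" for i in range(95, 100)]
print(random_set_p_value(top_set, gene_scores, mean_score))
```

    A candidate set whose score is matched by many random sets receives a large p-value, which is the situation the abstract describes for gene sets that look significant under standard measures alone.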

  17. Latent cluster analysis of ALS phenotypes identifies prognostically differing groups.

    Directory of Open Access Journals (Sweden)

    Jeban Ganesalingam

    2009-09-01

    Amyotrophic lateral sclerosis (ALS) is a degenerative disease predominantly affecting motor neurons and manifesting as several different phenotypes. Whether these phenotypes correspond to different underlying disease processes is unknown. We used latent cluster analysis to identify groupings of clinical variables in an objective and unbiased way to improve phenotyping for clinical and research purposes. Latent class cluster analysis was applied to a large database consisting of 1467 records of people with ALS, using discrete variables which can be readily determined at the first clinic appointment. The model was tested for clinical relevance by survival analysis of the phenotypic groupings using the Kaplan-Meier method. The best model generated five distinct phenotypic classes that strongly predicted survival (p<0.0001). Eight variables were used for the latent class analysis, but a good estimate of the classification could be obtained using just two variables: site of first symptoms (bulbar or limb) and time from symptom onset to diagnosis (p<0.00001). The five phenotypic classes identified using latent cluster analysis can predict prognosis. They could be used to stratify patients recruited into clinical trials and to generate more homogeneous disease groups for genetic, proteomic and risk factor research.

  18. Minimum Information about a Biosynthetic Gene cluster

    NARCIS (Netherlands)

    Medema, M.H.; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, J.B.; Blin, Kai; Bruijn, De Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R.C.; Cruz-Morales, Pablo; Duddela, Srikanth; Düsterhus, Stephanie; Edwards, Daniel J.; Fewer, David P.; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S.; Helfrich, Eric J.N.; Hillwig, Matthew L.; Ishida, Keishi; Jones, Adam C.; Jones, Carla S.; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kötter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V.; Mantovani, Simone M.; Monroe, Emily A.; Moore, Marcus; Moss, Nathan; Nützmann, Hans Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F.J.; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J.; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K.; Balibar, Carl J.; Balskus, Emily P.; Barona-Gómez, Francisco; Bechthold, Andreas; Bode, Helge B.; Borriss, Rainer; Brady, Sean F.; Brakhage, Axel A.; Caffrey, Patrick; Cheng, Yi Qiang; Clardy, Jon; Cox, Russell J.; Mot, De René; Donadio, Stefano; Donia, Mohamed S.; Donk, Van Der Wilfred A.; Dorrestein, Pieter C.; Doyle, Sean; Driessen, Arnold J.M.; Ehling-Schulz, Monika; Entian, Karl Dieter; Fischbach, Michael A.; Gerwick, Lena; Gerwick, William H.; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Höfte, Monica; Jensen, Susan E.; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L.; Keller, Nancy P.; Kormanec, Jan; Kuipers, Oscar P.; Kuzuyama, Tomohisa; Kyrpides, Nikos C.; Kwon, Hyung Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y.; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Méndez, Carmen; Metsä-Ketelä, Mikko; Micklefield, Jason; Mitchell, Douglas A.; Moore, Bradley S.; Moreira, Leonilde M.; Müller, Rolf; Neilan, Brett A.; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; 
Osbourn, Anne; Osburne, Marcia S.; Ostash, Bohdan; Payne, Shelley M.; Pernodet, Jean Luc; Petricek, Miroslav; Piel, Jörn; Ploux, Olivier; Raaijmakers, Jos M.; Salas, José A.; Schmitt, Esther K.; Scott, Barry; Seipke, Ryan F.; Shen, Ben; Sherman, David H.; Sivonen, Kaarina; Smanski, Michael J.; Sosio, Margherita; Stegmann, Evi; Süssmuth, Roderich D.; Tahlan, Kapil; Thomas, Christopher M.; Tang, Yi; Truman, Andrew W.; Viaud, Muriel; Walton, Jonathan D.; Walsh, Christopher T.; Weber, Tilmann; Wezel, Van Gilles P.; Wilkinson, Barrie; Willey, Joanne M.; Wohlleben, Wolfgang; Wright, Gerard D.; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B.; Breitling, Rainer; Takano, Eriko; Glöckner, Frank Oliver

    2015-01-01

    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to

  19. Prognostics

    Data.gov (United States)

    National Aeronautics and Space Administration — Prognostics has received considerable attention recently as an emerging sub-discipline within SHM. Prognosis is here strictly defined as “predicting the time at...

  20. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  1. Prognostic Power of a Tumor Differentiation Gene Signature for Bladder Urothelial Carcinomas.

    Science.gov (United States)

    Mo, Qianxing; Nikolos, Fotis; Chen, Fengju; Tramel, Zoe; Lee, Yu-Cheng; Hayashi, Kazukuni; Xiao, Jing; Shen, Jianjun; Chan, Keith Syson

    2018-01-12

    Muscle-invasive bladder cancers (MIBCs) cause approximately 150 000 deaths per year worldwide. Survival for MIBC patients is heterogeneous, with no clinically validated molecular markers that predict clinical outcome. Non-MIBCs (NMIBCs) generally have favorable outcome; however, a portion progress to MIBC. Hence, development of a prognostic tool that can guide decision-making is crucial for improving clinical management of bladder urothelial carcinomas. Tumor grade is defined by pathologic evaluation of tumor cell differentiation, and it often associates with clinical outcome. The current study extrapolates this conventional wisdom and combines it with molecular profiling. We developed an 18-gene signature that molecularly defines urothelial cellular differentiation, thus classifying MIBCs and NMIBCs into two subgroups: basal and differentiated. We evaluated the prognostic capability of this "tumor differentiation signature" and three other existing gene signatures including The Cancer Genome Atlas (TCGA; 2707 genes), MD Anderson Cancer Center (MDA; 2252 genes/2697 probes), and University of North Carolina at Chapel Hill (UNC; 47 genes) using five gene expression data sets derived from MIBC and NMIBC patients. All statistical tests were two-sided. The tumor differentiation signature demonstrated consistency and statistical robustness toward stratifying MIBC patients into different overall survival outcomes (TCGA cohort 1, P = .03; MDA discovery, P = .009; MDA validation, P = .01), while the other signatures were not as consistent. In addition, we analyzed the progression (Ta/T1 progressing to ≥T2) probability of NMIBCs. NMIBC patients with a basal tumor differentiation signature associated with worse progression outcome (P = .008). Gene functional term enrichment and gene set enrichment analyses revealed that genes involved in the biologic process of immune response and inflammatory response are among the most elevated within basal bladder cancers

  2. Prognostically distinct clinical patterns of systemic lupus erythematosus identified by cluster analysis.

    Science.gov (United States)

    To, C H; Mok, C C; Tang, S S K; Ying, S K Y; Wong, R W S; Lau, C S

    2009-12-01

    The objective of this study was to evaluate the patterns of clinical manifestations and their mortality in a large cohort of Chinese patients with systemic lupus erythematosus. The cumulative clinical manifestations of a large group of Chinese systemic lupus erythematosus patients who fulfilled at least four American College of Rheumatology criteria for systemic lupus erythematosus were studied. Patients were divided into distinct groups by using the K-mean cluster analysis. Clinical features, prevalence of proliferative lupus nephritis (World Health Organization class III, IV), autoantibody profile, and treatment data were compared and the standardized mortality ratios were calculated for each cluster of patients. There were 1082 patients included in the study (mean age at systemic lupus erythematosus diagnosis 30.5 years; mean systemic lupus erythematosus duration 10.3 years). Three distinct groups of patients were identified. Cluster 1 (n = 347) was characterized predominantly by mucocutaneous manifestations (malar rash, discoid rash, photosensitivity, oral ulcer) and arthritis but having the lowest prevalence of serositis, hematologic manifestations (hemolytic anemia, leukopenia, and thrombocytopenia), and proliferative lupus nephritis. Patients in cluster 2 (n = 409) had mainly renal and hematological manifestations but having the lowest prevalence of mucocutaneous manifestations. Pulmonary and gastrointestinal manifestations were significantly more frequent in cluster 2 than the other clusters. Cluster 3 patients (n = 326) had the most heterogeneous features. Besides having a high prevalence of mucocutaneous manifestations, serositis and hematologic manifestations, renal involvement, and proliferative lupus nephritis was also most prevalent among the three clusters. 
Patients in cluster 2 had a much higher standardized mortality ratio [standardized mortality ratio 7.23 (6.7-7.7), p lupus erythematosus could be clustered into prognostically distinct patterns of
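
    The grouping step described in this record can be illustrated with a minimal k-means sketch. It is not the study's procedure: the binary feature vectors (rash, arthritis, nephritis, hematologic involvement) are hypothetical, and the deterministic farthest-first initialisation is an assumption made here only so that the result is reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, centroids)."""
    # Farthest-first initialisation (assumption): start from the first point,
    # then repeatedly add the point farthest from all chosen centroids.
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical patients: (rash, arthritis, nephritis, hematologic)
patients = [(1, 1, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0),
            (0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 1)]
labels, centroids = kmeans(patients, k=2)
```

    Each centroid can be read as the prevalence of each manifestation within its cluster, which is how clusters of this kind are usually characterised.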

  3. Clustering context-specific gene regulatory networks.

    Science.gov (United States)

    Ramesh, Archana; Trevino, Robert; Von Hoff, Daniel D; Kim, Seungchan

    2010-01-01

    Gene regulatory networks (GRNs) learned from high throughput genomic data are often hard to visualize due to the large number of nodes and edges involved, rendering them difficult to appreciate. This becomes an important issue when modular structures are inherent in the inferred networks, such as in the recently proposed context-specific GRNs.(12) In this study, we investigate the application of graph clustering techniques to discern modularity in such highly complex graphs, focusing on context-specific GRNs. Identified modules are then associated with a subset of samples and the key pathways enriched in the module. Specifically, we study the use of Markov clustering and spectral clustering on cancer datasets to yield evidence on the possible association amongst different tumor types. Two sets of gene expression profiling data were analyzed to reveal context-specificity as well as modularity in genomic regulations.

  4. Classification of microvascular patterns via cluster analysis reveals their prognostic significance in glioblastoma.

    Science.gov (United States)

    Chen, Long; Lin, Zhi-Xiong; Lin, Guo-Shi; Zhou, Chang-Fu; Chen, Yu-Peng; Wang, Xing-Fu; Zheng, Zong-Qing

    2015-01-01

    Little research has focused on microvascular patterns (MVPs) in human glioblastoma and their prognostic impact. We evaluated MVPs of 78 glioblastomas by CD34/periodic acid-Schiff dual staining and by cluster analysis of the percentage of microvascular area for distinct microvascular formations. The distribution of 5 types of basic microvascular formations, that is, microvascular sprouting (MS), vascular cluster (VC), vascular garland (VG), glomeruloid vascular proliferation (GVP), and vasculogenic mimicry (VM), was variable. Accordingly, cluster analysis classified MVPs into 2 types: type I MVP displayed prominent MSs and VCs, whereas type II MVP had numerous VGs, GVPs, and VMs. By analyzing the proportion of microvascular area for each type of formation, we determined that glioblastomas with few MSs and VCs had many GVPs and VMs, and vice versa. VG seemed to be a transitional type of formation. In type I MVP, expression of Ki-67 and p53, but not MGMT, was significantly higher than in type II MVP (P < .05). Survival analysis showed that MVP type was an independent prognostic factor of progression-free survival (PFS) and overall survival (OS) (both P < .001). Type II MVP had a more negative influence on PFS and OS than did type I MVP. We conclude that the heterogeneous MVPs in glioblastoma can be categorized properly by certain histopathologic and statistical analyses and may influence clinical outcome. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  5. Semi-supervised consensus clustering for gene expression data analysis

    OpenAIRE

    Wang, Yunli; Pan, Youlian

    2014-01-01

    Background: Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis, but they are unable to deal with the noise and high dimensionality associated with microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in the clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...

  6. An accurate prostate cancer prognosticator using a seven-gene signature plus Gleason score and taking cell type heterogeneity into account.

    Directory of Open Access Journals (Sweden)

    Xin Chen

    One of the major challenges in the development of prostate cancer prognostic biomarkers is the cellular heterogeneity in tissue samples. We developed an objective Cluster-Correlation (CC) analysis to identify gene expression changes in various cell types that are associated with progression. In the Cluster step, samples were clustered (unsupervised) based on the expression values of each gene through a mixture model combined with a multiple linear regression model in which cell-type percent data were used for decomposition. In the Correlation step, a Chi-square test was used to select potential prognostic genes. With CC analysis, we identified 324 significantly expressed genes (68 tumor and 256 stroma cell expressed genes) which were strongly associated with the observed biochemical relapse status. Significance Analysis of Microarray (SAM) was then utilized to develop a seven-gene classifier. The classifier has been validated using two independent data sets. The overall prediction accuracy and sensitivity are 71% and 76%, respectively. The inclusion of the Gleason sum in the seven-gene classifier raised the prediction accuracy and sensitivity to 83% and 76%, respectively, based on independent testing. These results indicated that our prognostic model, which includes cell type adjustments and uses the Gleason score and the seven-gene signature, has some utility for predicting outcomes for prostate cancer for individual patients at the time of prognosis. The strategy could have applications for improving marker performance in other cancers and other diseases.

  7. Prognostic Discrimination Using a 70-Gene Signature among Patients with Estrogen Receptor-Positive Breast Cancer and an Intermediate 21-Gene Recurrence Score

    Directory of Open Access Journals (Sweden)

    Sung Gwe Ahn

    2013-12-01

    The Oncotype DX® recurrence score (RS) predictor has been clinically utilized to appropriately select adjuvant chemotherapy for patients with estrogen receptor (ER)-positive early breast cancer. However, the selection of chemotherapy for patients with intermediate RSs remains controversial. We assessed the prognostic value of a 70-gene signature (70GS) among patients with ER-positive breast cancer and intermediate RSs. In addition, we sought to identify genes associated with poor 70GS scores based on gene expression profiling (GEP). GEP was performed using gene expression data from 186 patients with ER-positive breast cancer. The RS and 70GS score were calculated on the basis of GEP. Among 186 patients, 82 ER-positive patients with intermediate RSs were identified. When these patients were stratified by 70GS, overall survival (OS) differed significantly according to 70GS (p = 0.013). In a supervised hierarchical analysis according to 70GS, the expression of several representative genes for cell proliferation was significantly higher in the poor 70GS cluster than in the good 70GS cluster. Furthermore, among these patients, FOXM1, AURKA, AURKB, and BIRC5 displayed prognostic significance for OS. In conclusion, 70GS can help to discriminate survival differences among ER-positive patients with intermediate RSs. FOXM1, AURKA, AURKB, and BIRC5 are associated with poor 70GS scores.

  8. AKT pathway genes define 5 prognostic subgroups in glioblastoma.

    Directory of Open Access Journals (Sweden)

    Anna Joy

    Activity of GFR/PI3K/AKT pathway inhibitors in glioblastoma clinical trials has not been robust. We hypothesized variations in the pathway between tumors contribute to poor response. We clustered GBM based on AKT pathway genes and discovered new subtypes then characterized their clinical and molecular features. There are at least 5 GBM AKT subtypes having distinct DNA copy number alterations, enrichment in oncogenes and tumor suppressor genes and patterns of expression for PI3K/AKT/mTOR signaling components. Gene Ontology terms indicate a different cell of origin or dominant phenotype for each subgroup. Evidence suggests one subtype is very sensitive to BCNU or CCNU (median survival 5.8 vs. 1.5 years; BCNU/CCNU vs other treatments; respectively). AKT subtyping advances previous approaches by revealing additional subgroups with unique clinical and molecular features. Evidence indicates it is a predictive marker for response to BCNU or CCNU and PI3K/AKT/mTOR pathway inhibitors. We anticipate Akt subtyping may help stratify patients for clinical trials and augment discovery of class-specific therapeutic targets.

  9. AKT Pathway Genes Define 5 Prognostic Subgroups in Glioblastoma

    Science.gov (United States)

    Smirnov, Ivan; Reiser, Mark; Misra, Anjan; Shapiro, William R.; Mills, Gordon B.; Kim, Seungchan; Feuerstein, Burt G.

    2014-01-01

    Activity of GFR/PI3K/AKT pathway inhibitors in glioblastoma clinical trials has not been robust. We hypothesized variations in the pathway between tumors contribute to poor response. We clustered GBM based on AKT pathway genes and discovered new subtypes then characterized their clinical and molecular features. There are at least 5 GBM AKT subtypes having distinct DNA copy number alterations, enrichment in oncogenes and tumor suppressor genes and patterns of expression for PI3K/AKT/mTOR signaling components. Gene Ontology terms indicate a different cell of origin or dominant phenotype for each subgroup. Evidence suggests one subtype is very sensitive to BCNU or CCNU (median survival 5.8 vs. 1.5 years; BCNU/CCNU vs other treatments; respectively). AKT subtyping advances previous approaches by revealing additional subgroups with unique clinical and molecular features. Evidence indicates it is a predictive marker for response to BCNU or CCNU and PI3K/AKT/mTOR pathway inhibitors. We anticipate Akt subtyping may help stratify patients for clinical trials and augment discovery of class-specific therapeutic targets. PMID:24984002

  10. AKT pathway genes define 5 prognostic subgroups in glioblastoma.

    Science.gov (United States)

    Joy, Anna; Ramesh, Archana; Smirnov, Ivan; Reiser, Mark; Misra, Anjan; Shapiro, William R; Mills, Gordon B; Kim, Seungchan; Feuerstein, Burt G

    2014-01-01

    Activity of GFR/PI3K/AKT pathway inhibitors in glioblastoma clinical trials has not been robust. We hypothesized variations in the pathway between tumors contribute to poor response. We clustered GBM based on AKT pathway genes and discovered new subtypes then characterized their clinical and molecular features. There are at least 5 GBM AKT subtypes having distinct DNA copy number alterations, enrichment in oncogenes and tumor suppressor genes and patterns of expression for PI3K/AKT/mTOR signaling components. Gene Ontology terms indicate a different cell of origin or dominant phenotype for each subgroup. Evidence suggests one subtype is very sensitive to BCNU or CCNU (median survival 5.8 vs. 1.5 years; BCNU/CCNU vs other treatments; respectively). AKT subtyping advances previous approaches by revealing additional subgroups with unique clinical and molecular features. Evidence indicates it is a predictive marker for response to BCNU or CCNU and PI3K/AKT/mTOR pathway inhibitors. We anticipate Akt subtyping may help stratify patients for clinical trials and augment discovery of class-specific therapeutic targets.

  11. Prognostic Biomarker Identification Through Integrating the Gene Signatures of Hepatocellular Carcinoma Properties

    Directory of Open Access Journals (Sweden)

    Jialin Cai

    2017-05-01

    Many molecular classification and prognostic gene signatures for hepatocellular carcinoma (HCC) patients have been established based on genome-wide gene expression profiling; however, their generalizability is unclear. Herein, we systematically assessed the prognostic effects of these gene signatures and identified valuable prognostic biomarkers by integrating these gene signatures. With two independent HCC datasets (GSE14520, N = 242 and GSE54236, N = 78), 30 published gene signatures were evaluated, and 11 were significantly associated with the overall survival (OS) of postoperative HCC patients in both datasets. The random survival forest models suggested that the gene signatures were superior to clinical characteristics for predicting the prognosis of the patients. Based on the 11 gene signatures, a functional protein-protein interaction (PPI) network with 1406 nodes and 10,135 edges was established. With tissue microarrays of HCC patients (N = 60), we determined the prognostic values of the core genes in the network and found that RAD21, CDK1, and HDAC2 expression levels were negatively associated with OS for HCC patients. The multivariate Cox regression analyses suggested that CDK1 was an independent prognostic factor, which was validated in an independent case cohort (N = 78). In cellular models, inhibition of CDK1 by siRNA or a specific inhibitor, RO-3306, reduced cellular proliferation and viability for HCC cells. These results suggest that the prognostic predictive capacities of these gene signatures are reproducible and that CDK1 is a potential prognostic biomarker or therapeutic target for HCC patients.

  12. Gene ordering in partitive clustering using microarray expressions

    Indian Academy of Sciences (India)

    2007-06-28

    We validated our hybrid approach using yeast and fibroblast data and showed that our approach improves the result quality of partitive clustering solution, by identifying subclusters within big clusters, grouping functionally correlated genes within clusters, minimization of summation of gene expression ...

  13. Prognostic immune-related gene models for breast cancer: a pooled analysis.

    Science.gov (United States)

    Zhao, Jianli; Wang, Ying; Lao, Zengding; Liang, Siting; Hou, Jingyi; Yu, Yunfang; Yao, Herui; You, Na; Chen, Kai

    2017-01-01

    Breast cancer, the most common cancer among women, is a clinically and biologically heterogeneous disease. Numerous prognostic tools have been proposed, including gene signatures. Unlike proliferation-related prognostic gene signatures, many immune-related gene signatures have emerged as principal biology-driven predictors of breast cancer. Diverse statistical methods and data sets were used for building these immune-related prognostic models, making it difficult to compare or use them in clinically meaningful ways. This study evaluated successfully published immune-related prognostic gene signatures through systematic validations of publicly available data sets. Eight prognostic models that were built upon immune-related gene signatures were evaluated. The performances of these models were compared and ranked in ten publicly available data sets, comprising a total of 2,449 breast cancer cases. Predictive accuracies were measured as concordance indices (C-indices). All tests of statistical significance were two-sided. Immune-related gene models performed better in estrogen receptor-negative (ER-) and lymph node-positive (LN+) breast cancer subtypes. The three top-ranked ER- breast cancer models achieved overall C-indices of 0.62-0.63. Two models predicted better than chance for ER+ breast cancer, with C-indices of 0.53 and 0.59, respectively. For LN+ breast cancer, four models showed predictive advantage, with C-indices between 0.56 and 0.61. Predicted prognostic values were positively correlated with ER status when evaluated using univariate analyses in most of the models under investigation. Multivariate analyses indicated that prognostic values of the three models were independent of known clinical prognostic factors. Collectively, these analyses provided a comprehensive evaluation of immune-related prognostic gene signatures. 
By synthesizing C-indices in multiple independent data sets, immune-related gene signatures were ranked for ER+, ER-, LN+, and LN- breast
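
    The concordance index used in this record to score the models can be sketched in a few lines. This is a minimal illustration of Harrell's C under simple assumptions (right-censored data, ties in follow-up time treated as not comparable, ties in risk score counted as half); the follow-up times, event flags, and risk scores are hypothetical, and the function is not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk score.

    times       -- follow-up time per patient
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's last follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event, higher risk: correct
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: four patients, the third one censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.3, 0.2, 0.4],
)
```

    A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why the C-indices of 0.53 to 0.63 reported above indicate only modest discrimination.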

  14. A Combinatory Approach for Selecting Prognostic Genes in Microarray Studies of Tumour Survivals

    Directory of Open Access Journals (Sweden)

    Qihua Tan

    2009-01-01

    Unlike significant gene expression analysis, which looks for genes that are differentially regulated, feature selection in microarray-based prognostic gene expression analysis aims at finding a subset of marker genes that are not only differentially expressed but also informative for prediction. Unfortunately, feature selection in the microarray literature is dominated by the simple heuristic univariate gene filter paradigm, which selects differentially expressed genes according to their statistical significance. We introduce a combinatory feature selection strategy that integrates differential gene expression analysis with the Gram-Schmidt process to identify prognostic genes that are both statistically significant and highly informative for predicting tumour survival outcomes. Empirical application to leukemia and ovarian cancer survival data through within- and cross-study validations shows that the feature space can be largely reduced while achieving improved testing performances.
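
    Why an orthogonalisation step helps can be shown with a small sketch of generic Gram-Schmidt forward selection. It is an assumption-laden illustration, not the authors' algorithm: the outcome is a plain numeric vector rather than censored survival data, the gene names and values are hypothetical, and the ranking score is the squared cosine between each residual gene vector and the residual outcome.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def remove_component(v, basis):
    """One Gram-Schmidt step: subtract from v its projection onto basis."""
    scale = dot(v, basis) / dot(basis, basis)
    return [a - scale * b for a, b in zip(v, basis)]


def gram_schmidt_select(features, outcome, n_select, eps=1e-12):
    """features: dict mapping gene name -> expression values per sample.
    Returns gene names in the order they were selected."""
    resid = {name: center(vals) for name, vals in features.items()}
    y = center(outcome)
    chosen = []
    while len(chosen) < n_select and resid:
        def score(name):
            v = resid[name]
            norm = dot(v, v) * dot(y, y)
            return dot(v, y) ** 2 / norm if norm > eps else 0.0
        best = max(resid, key=score)
        if score(best) <= eps:
            break  # nothing left explains the remaining outcome
        basis = resid.pop(best)
        chosen.append(best)
        # Orthogonalise the outcome and the remaining genes against the pick,
        # so redundant genes lose their apparent relevance.
        y = remove_component(y, basis)
        resid = {name: remove_component(v, basis) for name, v in resid.items()}
    return chosen


# Hypothetical data: g2 is nearly redundant with g1; g3 carries new information.
genes = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [1, 2, 3, 5, 5],
    "g3": [1, -1, 0, -1, 1],
}
outcome = [4, 5, 9, 11, 16]
selected = gram_schmidt_select(genes, outcome, n_select=2)
```

    In this toy data a univariate filter would rank g1 and g2 first because both correlate strongly with the outcome, whereas the orthogonalised search picks g1 and then g3, since g2 adds little once g1 is accounted for.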

  15. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value.

    Directory of Open Access Journals (Sweden)

    Laetitia Marisa

    Colon cancer (CC) pathological staging fails to accurately predict recurrence, and to date, no gene expression signature has proven reliable for prognosis stratification in clinical practice, perhaps because CC is a heterogeneous disease. The aim of this study was to establish a comprehensive molecular classification of CC based on mRNA expression profile analyses. Fresh-frozen primary tumor samples from a large multicenter cohort of 750 patients with stage I to IV CC who underwent surgery between 1987 and 2007 in seven centers were characterized for common DNA alterations, including BRAF, KRAS, and TP53 mutations, CpG island methylator phenotype, mismatch repair status, and chromosomal instability status, and were screened with whole genome and transcriptome arrays. 566 samples fulfilled RNA quality requirements. Unsupervised consensus hierarchical clustering applied to gene expression data from a discovery subset of 443 CC samples identified six molecular subtypes. These subtypes were associated with distinct clinicopathological characteristics, molecular alterations, specific enrichments of supervised gene expression signatures (stem cell phenotype-like, normal-like, serrated CC phenotype-like), and deregulated signaling pathways. Based on their main biological characteristics, we distinguished a deficient mismatch repair subtype, a KRAS mutant subtype, a cancer stem cell subtype, and three chromosomal instability subtypes, including one associated with down-regulated immune pathways, one with up-regulation of the Wnt pathway, and one displaying a normal-like gene expression profile. The classification was validated in the remaining 123 samples plus an independent set of 1,058 CC samples, including eight public datasets. Furthermore, prognosis was analyzed in the subset of stage II-III CC samples. The subtypes C4 and C6, but not the subtypes C1, C2, C3, and C5, were independently associated with shorter relapse-free survival, even after

  16. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor A Gene in Colorectal Cancer

    Energy Technology Data Exchange (ETDEWEB)

    Hansen, Torben F., E-mail: torben.hansen@slb.regionsyddanmark.dk; Spindler, Karen-Lise G. [Department of Oncology, Vejle Hospital, Vejle (Denmark); Andersen, Rikke F. [Department of Biochemistry, Vejle Hospital, Vejle (Denmark); Lindebjerg, Jan [Department of Clinical Pathology, Vejle Hospital, Vejle (Denmark); Kølvraa, Steen [Department of Clinical Genetics, Vejle Hospital, Vejle (Denmark); Brandslund, Ivan [Department of Biochemistry, Vejle Hospital, Vejle (Denmark); Jakobsen, Anders [Department of Oncology, Vejle Hospital, Vejle (Denmark)

    2010-06-28

    New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study was to investigate the prognostic importance of haplotypes in the VEGF-A gene in patients with CRC. The study included 486 patients surgically resected for stage II and III CRC, divided into two independent cohorts. Three SNPs in the VEGF-A gene were analyzed by polymerase chain reaction. Haplotypes were estimated using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible for this effect, was present in approximately 30% of the patients and demonstrated a significant relationship with poor survival, and it remained an independent prognostic marker after multivariate analysis, hazard ratio 2.46 (95% confidence interval 1.49–4.06), p < 0.001. Validation was provided by consistent findings in a second and independent cohort. Haplotype combinations call for further investigation.
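
    The Kaplan-Meier plots mentioned in this record rest on the product-limit estimator, sketched below in its textbook form. The follow-up times and event flags are hypothetical and the function name is an assumption; it does not reproduce the study's analysis, which would also involve the log rank test and Cox regression.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up time per patient
    events -- 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical cohort: months of follow-up, with two censored patients.
curve = kaplan_meier(times=[5, 8, 8, 12, 15], events=[1, 1, 0, 0, 1])
```

    Censored patients lower the number at risk without producing a step in the curve, which is what distinguishes this estimate from a simple proportion of survivors.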

  17. The Prognostic Value of Haplotypes in the Vascular Endothelial Growth Factor A Gene in Colorectal Cancer

    Directory of Open Access Journals (Sweden)

    Torben F. Hansen

    2010-06-01

    New prognostic markers in patients with colorectal cancer (CRC) are a prerequisite for individualized treatment. Prognostic importance of single nucleotide polymorphisms (SNPs) in the vascular endothelial growth factor A (VEGF-A) gene has been proposed. The objective of the present study was to investigate the prognostic importance of haplotypes in the VEGF-A gene in patients with CRC. The study included 486 patients surgically resected for stage II and III CRC, divided into two independent cohorts. Three SNPs in the VEGF-A gene were analyzed by polymerase chain reaction. Haplotypes were estimated using the PHASE program. The prognostic influence was evaluated using Kaplan-Meier plots and log rank tests. Cox regression method was used to analyze the independent prognostic importance of different markers. All three SNPs were significantly related to survival. A haplotype combination, responsible for this effect, was present in approximately 30% of the patients and demonstrated a significant relationship with poor survival, and it remained an independent prognostic marker after multivariate analysis, hazard ratio 2.46 (95% confidence interval 1.49–4.06), p < 0.001. Validation was provided by consistent findings in a second and independent cohort. Haplotype combinations call for further investigation.

  18. A combinatory approach for selecting prognostic genes in microarray studies of tumour survivals

    DEFF Research Database (Denmark)

    Tan, Qihua; Thomassen, Mads; Jochumsen, Kirsten M

    2009-01-01

    ... for prediction. Unfortunately, feature selection in the microarray literature is dominated by the simple heuristic univariate gene filter paradigm, which selects differentially expressed genes according to their statistical significance. We introduce a combinatory feature selection strategy that integrates differential gene expression analysis with the Gram-Schmidt process to identify prognostic genes that are both statistically significant and highly informative for predicting tumour survival outcomes. Empirical application to leukemia and ovarian cancer survival data through within- and cross-study validations...

  19. Phylogenetic detection of conserved gene clusters in microbial genomes

    Directory of Open Access Journals (Sweden)

    Anton Brian P

    2005-10-01

    Background: Microbial genomes contain an abundance of genes with conserved proximity forming clusters on the chromosome. However, the conservation can be a result of many factors such as vertical inheritance, or functional selection. Thus, identification of conserved gene clusters that are under functional selection provides an effective channel for gene annotation, microarray screening, and pathway reconstruction. The problem of devising a robust method to identify these conserved gene clusters and to evaluate the significance of the conservation in multiple genomes has a number of implications for comparative, evolutionary and functional genomics as well as synthetic biology. Results: In this paper we describe a new method for detecting conserved gene clusters that incorporates the information captured by a genome phylogenetic tree. We show that our method can overcome the common problem of overestimation of significance due to the bias in the genome database and thereby achieve better accuracy when detecting functionally connected gene clusters. Our results can be accessed at database GeneChords http://genomics10.bu.edu/GeneChords. Conclusion: The methodology described in this paper gives a scalable framework for discovering conserved gene clusters in microbial genomes. It serves as a platform for many other functional genomic analyses in microorganisms, such as operon prediction, regulatory site prediction, functional annotation of genes, evolutionary origin and development of gene clusters.

  20. Phylogenetic detection of conserved gene clusters in microbial genomes.

    Science.gov (United States)

    Zheng, Yu; Anton, Brian P; Roberts, Richard J; Kasif, Simon

    2005-10-03

    Microbial genomes contain an abundance of genes with conserved proximity forming clusters on the chromosome. However, the conservation can be a result of many factors such as vertical inheritance, or functional selection. Thus, identification of conserved gene clusters that are under functional selection provides an effective channel for gene annotation, microarray screening, and pathway reconstruction. The problem of devising a robust method to identify these conserved gene clusters and to evaluate the significance of the conservation in multiple genomes has a number of implications for comparative, evolutionary and functional genomics as well as synthetic biology. In this paper we describe a new method for detecting conserved gene clusters that incorporates the information captured by a genome phylogenetic tree. We show that our method can overcome the common problem of overestimation of significance due to the bias in the genome database and thereby achieve better accuracy when detecting functionally connected gene clusters. Our results can be accessed at database GeneChords http://genomics10.bu.edu/GeneChords. The methodology described in this paper gives a scalable framework for discovering conserved gene clusters in microbial genomes. It serves as a platform for many other functional genomic analyses in microorganisms, such as operon prediction, regulatory site prediction, functional annotation of genes, evolutionary origin and development of gene clusters.

  1. Detecting Sequence Homology at the Gene Cluster Level with MultiGeneBlast

    NARCIS (Netherlands)

    Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Nowick, Katja

    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The

  2. Unique nucleotide polymorphism of ankyrin gene cluster in ...

    Indian Academy of Sciences (India)

    The ankyrin (ANK) gene cluster is a part of a multigene family encoding ANK transmembrane proteins in Arabidopsis thaliana, and plays an important role in protein–protein interactions and in signal pathways. In contrast to other regions of a genome, the ANK gene cluster exhibits an extremely high level of DNA ...

  3. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (< 60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  4. Some statistical properties of gene expression clustering for array data

    DEFF Research Database (Denmark)

    Abreu, G C G; Pinheiro, A; Drummond, R D

    2010-01-01

    DNA arrays have been a rich source of data for the study of genomic expression of a wide variety of biological systems. Gene clustering is one of the paradigms widely used to assess the significance of a gene (or group of genes). However, most gene clustering techniques are applied to cDNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap re-sampling to evaluate the statistical error of the nodes provided by SOM-based clustering. Comparisons between SOM and parametric clustering are presented for simulated as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM that improves the false discovery ratio of differentially expressed genes. Code in Matlab is freely available, as well as some supplementary material, at the following address: https...
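    The bootstrap idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' Matlab code and does not train a SOM: it assumes the expression values assigned to one cluster node are already known and estimates the standard error of that node's mean by resampling with replacement. The function name `bootstrap_se`, the toy values, and the resample count are hypothetical.

```python
import random
import statistics


def bootstrap_se(values, n_boot=1000, seed=0):
    """Bootstrap estimate of the standard error of the mean of `values`.

    `values` stands in for the expression levels assigned to a single
    cluster node (an assumption made for this illustration only).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to resample")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    boot_means = []
    for _ in range(n_boot):
        # Draw n values with replacement and record the resampled mean.
        sample = rng.choices(values, k=n)
        boot_means.append(statistics.mean(sample))
    # The spread of the resampled means approximates the standard error.
    return statistics.stdev(boot_means)


if __name__ == "__main__":
    node_values = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical toy data
    print(bootstrap_se(node_values))
```

    A node whose members all share the same value yields a standard error of zero, while a node with widely scattered members yields a larger one, which is the kind of per-node error measure the abstract argues is usually missing.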

  5. Prognostic Fifteen-Gene Signature for Early Stage Pancreatic Ductal Adenocarcinoma.

    Directory of Open Access Journals (Sweden)

    Dung-Tsa Chen

    Full Text Available The outcomes of patients treated with surgery for early stage pancreatic ductal adenocarcinoma (PDAC) are variable, with median survival ranging from 6 months to more than 5 years. This challenge underscores an unmet need for developing personalized medicine strategies to refine the current treatment decision-making process. To derive a prognostic gene signature for patients with early stage PDAC, a PDAC cohort from Moffitt Cancer Center (n = 63) was used with overall survival (OS) as the primary endpoint. This was further evaluated using an independent microarray cohort dataset (Stratford et al.: n = 102). Technical validation was performed on the NanoString platform. A prognostic 15-gene signature was developed and showed a statistically significant association with OS in the Moffitt cohort (hazard ratio [HR] = 3.26; p<0.001) and the Stratford et al. cohort (HR = 2.07; p = 0.02), and was independent of other prognostic variables. Moreover, integration of the signature with the TNM staging system improved risk prediction (p<0.01) in both cohorts. In addition, NanoString validation showed that the signature was robust with a high degree of reproducibility, and the association with OS remained significant in the two cohorts. The gene signature could be a potential prognostic tool to allow risk-adapted stratification of PDAC patients into personalized treatment protocols, possibly improving the currently poor clinical outcomes of these patients.

  6. Clustering Algorithms: Their Application to Gene Expression Data

    Science.gov (United States)

    Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel

    2016-01-01

    Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a valuable opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, and is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has proven useful in revealing the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867

  7. Minimum Information about a Biosynthetic Gene cluster : commentary

    NARCIS (Netherlands)

    Medema, Marnix H; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, John B; Blin, Kai; de Bruijn, Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R Cameron; Cruz-Morales, Pablo; Duddela, Srikanth; Dusterhus, Stephanie; Edwards, Daniel J; Fewer, David P; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S; Helfrich, Eric J N; Hillwig, Matthew L; Ishida, Keishi; Jones, Adam C; Jones, Carla S; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kotter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V; Mantovani, Simone M; Monroe, Emily A; Moore, Marcus; Moss, Nathan; Nutzmann, Hans-Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F Jerry; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K; Balibar, Carl J; Balskus, Emily P; Barona-Gomez, Francisco; Bechthold, Andreas; Bode, Helge B; Borriss, Rainer; Brady, Sean F; Brakhage, Axel A; Caffrey, Patrick; Cheng, Yi-Qiang; Clardy, Jon; Cox, Russell J; De Mot, Rene; Donadio, Stefano; Donia, Mohamed S; van der Donk, Wilfred A; Dorrestein, Pieter C; Doyle, Sean; Driessen, Arnold J M; Ehling-Schulz, Monika; Entian, Karl-Dieter; Fischbach, Michael A; Gerwick, Lena; Gerwick, William H; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Hofte, Monica; Jensen, Susan E; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L; Keller, Nancy P; Kormanec, Jan; Kuipers, Oscar P; Kuzuyama, Tomohisa; Kyrpides, Nikos C; Kwon, Hyung-Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Mendez, Carmen; Metsa-Ketela, Mikko; Micklefield, Jason; Mitchell, Douglas A; Moore, Bradley S; Moreira, Leonilde M; Muller, Rolf; Neilan, Brett A; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; Osbourn, Anne; 
Osburne, Marcia S; Ostash, Bohdan; Payne, Shelley M; Pernodet, Jean-Luc; Petricek, Miroslav; Piel, Jorn; Ploux, Olivier; Raaijmakers, Jos M; Salas, Jose A; Schmitt, Esther K; Scott, Barry; Seipke, Ryan F; Shen, Ben; Sherman, David H; Sivonen, Kaarina; Smanski, Michael J; Sosio, Margherita; Stegmann, Evi; Sussmuth, Roderich D; Tahlan, Kapil; Thomas, Christopher M; Tang, Yi; Truman, Andrew W; Viaud, Muriel; Walton, Jonathan D; Walsh, Christopher T; Weber, Tilmann; van Wezel, Gilles P; Wilkinson, Barrie; Willey, Joanne M; Wohlleben, Wolfgang; Wright, Gerard D; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B; Breitling, Rainer; Takano, Eriko; Glockner, Frank Oliver

    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit.

  8. ColoGuidePro: a prognostic 7-gene expression signature for stage III colorectal cancer patients.

    Science.gov (United States)

    Sveen, Anita; Ågesen, Trude H; Nesbakken, Arild; Meling, Gunn Iren; Rognum, Torleiv O; Liestøl, Knut; Skotheim, Rolf I; Lothe, Ragnhild A

    2012-11-01

    Improved prognostic stratification of patients with stage II and III colorectal cancer is warranted for postoperative clinical decision making. This study was conducted to develop a clinically feasible and robust prognostic classifier for these patients independent of adjuvant treatment. Global gene expression profiles from altogether 387 stage II and III colorectal cancer tissue samples from three independent patient series were included in the study. ColoGuidePro, a seven-gene prognostic classifier, was developed from a selected Norwegian learning series (n = 95; no adjuvant treatment) using lasso-penalized multivariate survival modeling with cross-validation. The expression signature significantly stratified patients in a consecutive Norwegian test series, in which patients were treated according to current standards [HR, 2.9 (1.1-7.5); P = 0.03; n = 77] and an external validation series [HR, 3.7 (2.0-6.8); P < 0.001; n = 215] according to survival. ColoGuidePro was also an independent predictor of prognosis in multivariate models including tumor stage in both series (HR, ≥ 3.1; P ≤ 0.03). In the validation series, which consisted of patients from other populations (United States and Australia), 5-year relapse-free survival was significantly predicted for stage III patients only (P < 0.001; n = 107). Here, prognostic stratification was independent of adjuvant treatment (P = 0.001). We present ColoGuidePro, a prognostic classifier developed for patients with stage II and III colorectal cancer. The test is suitable for transfer to clinical use and has best prognostic prediction potential for stage III patients. ©2012 AACR.
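    The hazard ratios, confidence intervals, and P values quoted in this abstract can be cross-checked with a short calculation. The Python sketch below assumes the reported intervals are 95% Wald intervals that are symmetric on the log scale, which the abstract does not state; the function name `wald_p_from_hr` is hypothetical. Under that assumption, the standard error of log(HR) is recovered from the interval width and converted to a two-sided P value.

```python
import math


def wald_p_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Two-sided Wald P value implied by a hazard ratio and its 95% CI.

    Assumes the interval was built as exp(log(HR) +/- z_crit * SE),
    which is an assumption of this sketch, not a statement in the abstract.
    """
    if hr <= 0 or not (0 < ci_low < ci_high):
        raise ValueError("hazard ratio and interval bounds must be positive")
    # Standard error of log(HR) from the width of the log-scale interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # erfc(|z| / sqrt(2)) equals the two-sided normal tail probability.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Values as quoted in the abstract: test series and validation series.
    print(wald_p_from_hr(2.9, 1.1, 7.5))
    print(wald_p_from_hr(3.7, 2.0, 6.8))
```

    For the test series (HR 2.9, interval 1.1-7.5) this gives a P value of roughly 0.03, and for the validation series (HR 3.7, interval 2.0-6.8) a value well below 0.001, both consistent with the figures reported above.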

  9. Gene expression profiles as prognostic markers in women with ovarian cancer

    DEFF Research Database (Denmark)

    Jochumsen, Kirsten M; Tan, Qihua; Høgdall, Estrid V

    2009-01-01

    The purpose was to find a gene expression profile that could distinguish short-term from long-term survivors in our collection of serous epithelial ovarian carcinomas. Furthermore, it should be able to stratify in an external validation set. Such a classifier profile will take us a step forward toward investigations for more individualized therapies and the use of gene expression profiles in the clinical practice. RNA from tumor tissue from 43 Danish patients with serous epithelial ovarian carcinoma (11 International Federation of Gynecology and Obstetrics [FIGO] stage I/II, 32 FIGO stage III... disease. Furthermore, its ability to classify in an external validation set was demonstrated. The identified 14-gene prognostic profile was able to predict survival (short- vs long-term survival) with a strength that is better than any other prognostic factor in epithelial ovarian cancer including FIGO...

  10. Gene Expression Profiles as Prognostic Marker in Women with Ovarian Cancer

    DEFF Research Database (Denmark)

    Jochumsen, Kirsten Marie; Tan, Qihua; Høgdall, EV

    2009-01-01

    The purpose was to find a gene expression profile that could distinguish short-term from long-term survivors in our collection of serous epithelial ovarian carcinomas. Furthermore, it should be able to stratify in an external validation set. Such a classifier profile will take us a step forward toward investigations for more individualized therapies and the use of gene expression profiles in the clinical practice. RNA from tumor tissue from 43 Danish patients with serous epithelial ovarian carcinoma (11 International Federation of Gynecology and Obstetrics [FIGO] stage I/II, 32 FIGO stage III... disease. Furthermore, its ability to classify in an external validation set was demonstrated. The identified 14-gene prognostic profile was able to predict survival (short- vs long-term survival) with a strength that is better than any other prognostic factor in epithelial ovarian cancer including FIGO...

  11. Microarray data mining using landmark gene-guided clustering

    Directory of Open Access Journals (Sweden)

    Cho HyungJun

    2008-02-01

    Full Text Available Abstract. Background: Clustering is a popular data exploration technique widely used in microarray data analysis. Most conventional clustering algorithms, however, generate only one set of clusters independent of the biological context of the analysis. This is often inadequate to explore data from different biological perspectives and gain new insights. We propose a new clustering model that can generate multiple versions of different clusters from a single dataset, each of which highlights a different aspect of the given dataset. Results: By applying our SigCalc algorithm to three yeast Saccharomyces cerevisiae datasets, we show two results. First, we show that different sets of clusters can be generated from the same dataset using different sets of landmark genes. Each set of clusters groups genes differently and reveals new biological associations between genes that were not apparent from clustering the original microarray expression data. Second, we show that many of these newfound biological associations are common across datasets. These results also provide strong evidence of a link between the choice of landmark genes and the new biological associations found in gene clusters. Conclusion: We have used the SigCalc algorithm to project the microarray data onto a completely new subspace whose co-ordinates are genes (called landmark genes) known to belong to a Biological Process. The projected space is not a true vector space in mathematical terms. However, we use the term subspace to refer to one of virtually infinite numbers of projected spaces that our proposed method can produce. By changing the biological process and thus the landmark genes, we can change this subspace. We have shown how clustering on this subspace reveals new, biologically meaningful clusters which were not evident in the clusters generated by conventional methods. The R scripts (source code) are freely available under the GPL license. The source code is available [see

  12. [Transcriptional regulation of aco gene cluster in Bacillus thuringiensis].

    Science.gov (United States)

    Huang, Minzhong; Zhang, Jie; Gao, Jiguo; Song, Fuping

    2015-09-04

    We analyzed the transcriptional regulation of the aco gene cluster and the phenotype of an acoR mutant to determine the effect of acoR deletion on sporulation efficiency and Cry protein production. The sequence of the aco gene cluster in Bacillus thuringiensis was analyzed by sequence alignment. RT-PCR was carried out to reveal the transcriptional units of the aco gene cluster. An acoR insertion mutant was constructed by homologous recombination. Transcriptional activity was analyzed by promoter fusions with the lacZ gene. Cry1Ac protein production was compared by protein quantitation. The aco gene cluster was composed of four genes. acoABCL formed one transcriptional unit. The transcriptional activity of the acoA promoter decreased sharply in both the sigL and acoR mutants. Deletion of acoR had no effect on growth or Cry protein production, but decreased the motility of cells and sporulation efficiency. The aco gene cluster is controlled by Sigma 54 and activated by AcoR. Deletion of acoR has no effect on Cry protein production, but decreases the motility of the cells.

  13. Application of Gene Shaving and Mixture Models to Cluster Microarray Gene Expression Data

    Directory of Open Access Journals (Sweden)

    S. Wen

    2007-01-01

    Full Text Available Researchers are frequently faced with the analysis of microarray data of a relatively large number of genes using a small number of tissue samples. We examine the application of two statistical methods for clustering such microarray expression data: EMMIX-GENE and GeneClust. EMMIX-GENE is a mixture-model based clustering approach, designed primarily to cluster tissue samples on the basis of the genes. GeneClust is an implementation of the gene shaving methodology, motivated by research to identify distinct sets of genes for which variation in expression could be related to a biological property of the tissue samples. We illustrate the use of these two methods in the analysis of Affymetrix oligonucleotide arrays of well-known data sets from colon tissue samples with and without tumors, and of tumor tissue samples from patients with leukemia. Although the two approaches have been developed from different perspectives, the results demonstrate a clear correspondence between gene clusters produced by GeneClust and EMMIX-GENE for the colon tissue data. It is demonstrated, for the case of ribosomal proteins and smooth muscle genes in the colon data set, that both methods can classify genes into co-regulated families. It is further demonstrated that tissue types (tumor and normal) can be separated on the basis of subtle distributed patterns of genes. Application to the leukemia tissue data produces a division of tissues corresponding closely to the external classification, acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL), for both methods. In addition, we also identify genes specific for the subgroup of ALL-T cell samples. Overall, we find that the gene shaving method produces gene clusters at great speed; allows variable cluster sizes and can incorporate partial or full supervision; and finds clusters of genes in which the gene expression varies greatly over the tissue samples while maintaining a high level of coherence between the

  14. AKT Pathway Genes Define 5 Prognostic Subgroups in Glioblastoma: e100827

    National Research Council Canada - National Science Library

    Anna Joy; Archana Ramesh; Ivan Smirnov; Mark Reiser; Anjan Misra; William R Shapiro; Gordon B Mills; Seungchan Kim; Burt G Feuerstein

    2014-01-01

    ... robust. We hypothesized variations in the pathway between tumors contribute to poor response. We clustered GBM based on AKT pathway genes and discovered new subtypes then characterized their clinical and molecular features...

  15. Gene ordering in partitive clustering using microarray expressions

    Indian Academy of Sciences (India)

    PRAKASH KUMAR

    Although there is a rich literature on gene ordering in hierarchical clustering framework for gene expression analysis, there is no .... Step 1: Create the string representation (chromosome of GA) for a .... The expression profiles are represented as lines of coloured boxes using Expander (Sharan et al 2003), each of which.

  16. Clustering gene expression data using a diffraction‐inspired framework

    Directory of Open Access Journals (Sweden)

    Dinger Steven C

    2012-11-01

    Full Text Available Abstract. Background: Recent developments in microarray technology have allowed for the simultaneous measurement of gene expression levels. The large amount of captured data challenges conventional statistical tools for analysing and finding inherent correlations between genes and samples. The unsupervised clustering approach is often used, resulting in the development of a wide variety of algorithms. Typical clustering algorithms require selecting certain parameters to operate, for instance the number of expected clusters, as well as defining a similarity measure to quantify the distance between data points. The diffraction‐based clustering algorithm, however, is designed to overcome this necessity for user‐defined parameters, as it is able to automatically search the data for any underlying structure. Methods: The diffraction‐based clustering algorithm presented in this paper is tested using five well‐known expression datasets pertaining to cancerous tissue samples. The clustering results are then compared to those obtained from conventional algorithms such as k‐means, fuzzy c‐means, self‐organising map, hierarchical clustering, Gaussian mixture model and density‐based spatial clustering of applications with noise (DBSCAN). The performance of each algorithm is measured using an average external criterion and an average validity index. Results: The diffraction‐based clustering algorithm is shown to be independent of the number of clusters, as the algorithm searches the feature space and requires no form of parameter selection. The results show that the diffraction‐based clustering algorithm performs significantly better on the real biological datasets compared to the other existing algorithms. Conclusion: The results of the diffraction‐based clustering algorithm presented in this paper suggest that the method can provide researchers with a new tool for successfully analysing microarray data.
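    The abstract above scores each algorithm with an external criterion but does not say which one. As an illustration only, the Python sketch below computes the Rand index, one common external criterion that compares a clustering against known sample labels; the choice of the Rand index and the function name `rand_index` are assumptions of this sketch, not details taken from the paper.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of sample pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two
    samples in the same group, or both put them in different groups.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two samples")
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    known = [0, 0, 1, 1]          # hypothetical tissue classes
    clustered = [1, 1, 0, 0]      # same grouping, different label names
    print(rand_index(known, clustered))
```

    Because the index only looks at whether pairs are grouped together, it is unaffected by how the clusters happen to be numbered, which makes it usable across algorithms that label their clusters arbitrarily.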

  17. Prognostic gene signature profiles of hepatitis C-related early-stage liver cirrhosis

    Directory of Open Access Journals (Sweden)

    Anu Venkatesh

    2014-12-01

    Full Text Available The rate of hepatitis C virus (HCV)-related liver cirrhosis and subsequent cancer development is increasing, raising the risk of related mortality and morbidity. To address this issue, we aimed to develop a prognostic index that can be used to stratify patients for risk of disease progression. This index was developed in part by using a gene signature test implemented in a clinically applicable digital transcript counting platform (NanoString nCounter system). A cohort of 145 U.S. patients with HCV-related early-stage cirrhosis was analyzed by using the assay. This dataset (GEO accession number GPL17230) provides information on the expression levels of the prognostic genes in the cohort.

  18. Characterization of the largest effector gene cluster of Ustilago maydis.

    Directory of Open Access Journals (Sweden)

    Thomas Brefort

    2014-07-01

    Full Text Available In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function.

  19. Tumor Microenvironment Gene Signature as a Prognostic Classifier and Therapeutic Target

    Science.gov (United States)

    2015-06-01

    Metagenomic Count Data. Bioinformatics. 2014. 14. Verhaak RG, Tamayo P, Yang JY, Hubbard D, Zhang H, Creighton CJ, et al. Prognostically relevant gene...invasion by upregulating CAF-derived versican in the tumor microenvironment. Cancer Res. 2013;73:5016-28. 24. Iwano M, Plieth D, Danoff TM, Xue C...signatures, analyzed and interpreted the data, wrote and published one manuscript, and prepared another one for publication. Name: Dong Joo Cheon

  20. [Transcriptional regulation of bkd gene cluster in Bacillus thuringiensis].

    Science.gov (United States)

    Wang, Guannan; Peng, Qi; Zheng, Qingyun; Li, Jie; Zhang, Jie

    2014-10-04

    To determine the effect of bkdR deletion on Cry protein production, we analyzed the transcriptional regulation of the bkd gene cluster and the phenotype of a bkdR mutant. The sequence of the bkd gene cluster in Bacillus thuringiensis was analyzed by sequence alignment. RT-PCR was used to reveal the transcriptional units of the bkd gene cluster. A bkdR insertion mutant was constructed by homologous recombination. Transcriptional activity was analyzed by promoter fusions with the lacZ gene. Cry1Ac protein production was compared by protein quantitation. The bkd gene cluster was composed of eight genes. ptb-bkdB formed one transcriptional unit. The transcriptional activity of ptb decreased sharply in the sigL and bkdR mutants. Deletion of bkdR decreased the motility of cells, but had no effect on growth, sporulation efficiency or Cry protein production. The bkd gene cluster is controlled by Sigma 54 and activated by BkdR. Deletion of bkdR has no effect on Cry protein production, but decreases the motility of the cells. This suggests that, unlike the sigL mutation, deletion of bkdR does not affect Cry protein production. The decrease in Cry protein production in the sigL mutant was therefore not caused by the mutation of a single EBP, but may reflect multiple roles.

  1. Methylome sequencing in triple-negative breast cancer reveals distinct methylation clusters with prognostic value.

    Science.gov (United States)

    Stirzaker, Clare; Zotenko, Elena; Song, Jenny Z; Qu, Wenjia; Nair, Shalima S; Locke, Warwick J; Stone, Andrew; Armstong, Nicola J; Robinson, Mark D; Dobrovic, Alexander; Avery-Kiejda, Kelly A; Peters, Kate M; French, Juliet D; Stein, Sandra; Korbie, Darren J; Trau, Matt; Forbes, John F; Scott, Rodney J; Brown, Melissa A; Francis, Glenn D; Clark, Susan J

    2015-02-02

    Epigenetic alterations in the cancer methylome are common in breast cancer and provide novel options for tumour stratification. Here, we perform whole-genome methylation capture sequencing on small amounts of DNA isolated from formalin-fixed, paraffin-embedded tissue from triple-negative breast cancer (TNBC) and matched normal samples. We identify differentially methylated regions (DMRs) enriched with promoters associated with transcription factor binding sites and DNA hypersensitive sites. Importantly, we stratify TNBCs into three distinct methylation clusters associated with better or worse prognosis and identify 17 DMRs that show a strong association with overall survival, including DMRs located in the Wilms tumour 1 (WT1) gene, bi-directional-promoter and antisense WT1-AS. Our data reveal that coordinated hypermethylation can occur in oestrogen receptor-negative disease, and that characterizing the epigenetic framework provides a potential signature to stratify TNBCs. Together, our findings demonstrate the feasibility of profiling the cancer methylome with limited archival tissue to identify regulatory regions associated with cancer.

  2. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    Directory of Open Access Journals (Sweden)

    Zhimin Dai

    Full Text Available Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms, and little is known about the diversity of nitrogen-fixing genes and the structure of the nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes, including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. The nifQ gene falls into the same transcription unit as the fixABCX genes, which has not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  3. Genome classification by gene distribution: An overlapping subspace clustering approach

    Directory of Open Access Journals (Sweden)

    Halgamuge Saman K

    2008-04-01

    Full Text Available Abstract. Background: Genomes of lower organisms have been observed to undergo a large amount of horizontal gene transfer, which causes difficulties in their evolutionary study. Bacteriophage genomes are a typical example. One recent approach that addresses this problem is the unsupervised clustering of genomes based on gene order and genome position, which helps to reveal species relationships that may not be apparent from traditional phylogenetic methods. Results: We propose the use of an overlapping subspace clustering algorithm for such genome classification problems. The advantage of subspace clustering over traditional clustering is that it can associate clusters with gene arrangement patterns, preserving genomic information in the clusters produced. Additionally, overlapping capability is desirable for the discovery of multiple conserved patterns within a single genome, such as those acquired from different species via horizontal gene transfers. The proposed method involves a novel strategy to vectorize genomes based on their gene distribution. A number of existing subspace clustering and biclustering algorithms were evaluated to identify the best framework upon which to develop our algorithm; we extended a generic subspace clustering algorithm called HARP to incorporate overlapping capability. The proposed algorithm was assessed and applied on bacteriophage genomes. The phage grouping results are consistent overall with the Phage Proteomic Tree and showed common genomic characteristics among the TP901-like, Sfi21-like and sk1-like phage groups. Among 441 phage genomes, we identified four significantly conserved distribution patterns structured by the terminase, portal, integrase, holin and lysin genes. We also observed a subgroup of Sfi21-like phages comprising a distinctive divergent genome organization and identified nine new phage members to the Sfi21-like genus: Staphylococcus 71, phiPVL108, Listeria A118, 2389, Lactobacillus phi AT3, A2

  4. High diversity of polyketide synthase genes and the melanin biosynthesis gene cluster in Penicillium marneffei.

    Science.gov (United States)

    Woo, Patrick C Y; Tam, Emily W T; Chong, Ken T K; Cai, James J; Tung, Edward T K; Ngan, Antonio H Y; Lau, Susanna K P; Yuen, Kwok-Yung

    2010-09-01

    Despite the unique phenotypic properties and clinical importance of Penicillium marneffei, the polyketide synthase genes in its genome have never been characterized. Twenty-three putative polyketide synthase genes and two putative polyketide synthase nonribosomal peptide-synthase hybrid genes were identified in the P. marneffei genome, a diversity much higher than found in other pathogenic thermal dimorphic fungi, such as Histoplasma capsulatum (one polyketide synthase gene) and Coccidioides immitis (10 polyketide synthase genes). These genes were evenly distributed on the phylogenetic tree with polyketide synthase genes of Aspergillus and other fungi, indicating that the high diversity was not a result of lineage-specific gene expansion through recent gene duplication. The melanin-biosynthesis gene cluster had gene order and orientations identical to those in the Talaromyces stipitatus (a teleomorph of Penicillium emmonsii) genome. Phylogenetically, all six genes of the melanin-biosynthesis gene cluster in P. marneffei were also most closely related to those in T. stipitatus, with high bootstrap supports. The polyketide synthase gene of the melanin-biosynthesis gene cluster (alb1) in P. marneffei was knocked down, which was accompanied by loss of melanin pigment production and reduced ornamentation in conidia. The survival of mice challenged with the alb1 knockdown mutant was significantly better than those challenged with wild-type P. marneffei (P melanin-biosynthesis gene cluster contributed to virulence through decreased susceptibility to killing by hydrogen peroxide. © 2010 The Authors Journal compilation © 2010 FEBS.

  5. [GST genes expression as prognostic factor in papillary thyroid cancer].

    Science.gov (United States)

    Gonçalves, Antonio Jose; Monte, Osmar; Morari, Eliane Cristina; Ward, Laura Sterian; Nakasako, Diana Shimoda; Nieto, Juliana; Nakai, Marianne Yumi

    2009-01-01

    The aim was to analyze the relationship between the AMES classification and molecular factors of the glutathione S-transferase (GST) system, specifically GSTT1 and GSTM1, in patients with well-differentiated thyroid cancer. Thyroid tissue samples were obtained from 66 patients with papillary thyroid carcinoma (53 women and 13 men). Patients were divided into two groups (high and low risk) according to the AMES classification. In each group, the presence of the null genotype of both GST genes was studied, and the results were compared with the AMES classification. Samples were obtained in the operating room immediately after thyroidectomy, placed in cryotubes, immersed in liquid nitrogen and stored in a freezer at -80 °C. DNA was extracted by the phenol-chloroform method. There were 17 high-risk patients and 49 low-risk patients. The null genotype was found in 5.8% of the high-risk group and 6.1% of the low-risk group. There was no relationship between the absence of the GSTT1 and GSTM1 genes and the prognosis of papillary thyroid carcinoma as assessed by the AMES classification.

  6. Calcitonin gene-related peptide antagonism and cluster headache

    DEFF Research Database (Denmark)

    Ashina, Håkan; Newman, Lawrence; Ashina, Sait

    2017-01-01

    Calcitonin gene-related peptide (CGRP) is a key signaling molecule involved in migraine pathophysiology. Efficacy of CGRP monoclonal antibodies and antagonists in migraine treatment has fueled an increasing interest in the prospect of treating cluster headache (CH) with CGRP antagonism. The exact role of CGRP and its mechanism of action in CH have not been fully clarified. A search for original studies and randomized controlled trials (RCTs) published in English was performed in PubMed and in ClinicalTrials.gov. The search terms used were "cluster headache and calcitonin gene related peptide" and "primary headaches and calcitonin gene related peptide." Reference lists of identified articles were also searched for additional relevant papers. Human experimental studies have reported elevated plasma CGRP levels during both spontaneous and glyceryl trinitrate-induced cluster attacks. CGRP may play...

  7. Co-evolution of secondary metabolite gene clusters and their host

    DEFF Research Database (Denmark)

    Kjærbølling, Inge; Vesth, Tammi Camilla; Frisvad, Jens Christian

    Secondary metabolite gene cluster evolution is mainly driven by two events: gene duplication and annexation, and horizontal gene transfer. Here we use comparative genomics of Aspergillus species to investigate the evolution of secondary metabolite (SM) gene clusters across a wide spectrum of species. We investigate the dynamic evolutionary relationship between the cluster and the host by examining the genes within the cluster and the number of homologous genes found within the host and in closely related species....

  8. A simple but highly effective approach to evaluate the prognostic performance of gene expression signatures.

    Directory of Open Access Journals (Sweden)

    Maud H W Starmans

    Full Text Available BACKGROUND: Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures. PRINCIPAL FINDINGS: A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis by calculating the area under the curve (AUC) in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has, on average, a 10% chance of reaching significance when assessed in a single dataset, but this can range from 1% to ∼40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number. CONCLUSIONS: We have shown that an arbitrary cut-off value for evaluating signature significance is not suitable for this type of research; the cut-off should instead be defined for each dataset separately. Our method can be used to establish and evaluate the performance of any derived gene signature in a dataset by comparing its performance to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.
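
    The idea of benchmarking a signature against randomly drawn gene sets can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the scoring rule (mean expression of the signature genes), the function names and the toy data layout (one row per patient, one column per gene, binary outcome labels) are all assumptions made for the example.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve, computed from pairwise comparisons
    between positive and negative cases (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def signature_score(expr, genes):
    """Assumed scoring rule: mean expression of the signature genes per patient."""
    return [sum(row[g] for g in genes) / len(genes) for row in expr]


def empirical_p(expr, labels, signature, n_random=1000, seed=0):
    """Compare a signature's AUC with AUCs of random signatures of equal size.

    Returns the observed AUC and the fraction of random signatures that
    separate the outcome groups at least as well (in either direction).
    """
    rng = random.Random(seed)
    n_genes = len(expr[0])
    observed = auc(signature_score(expr, signature), labels)
    observed_dev = abs(observed - 0.5)
    hits = 0
    for _ in range(n_random):
        random_sig = rng.sample(range(n_genes), len(signature))
        dev = abs(auc(signature_score(expr, random_sig), labels) - 0.5)
        if dev >= observed_dev:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_random + 1)
```

    Because the null distribution is rebuilt from each dataset, the resulting threshold is dataset-specific, which is the point the abstract makes about avoiding a single arbitrary cut-off.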

  9. A novel hierarchical clustering algorithm for gene sequences

    Directory of Open Access Journals (Sweden)

    Wei Dan

    2012-07-01

    Full Text Available Background: Clustering DNA sequences into functional groups is an important problem in bioinformatics. We propose a new alignment-free algorithm, mBKM, based on a new distance measure, DMk, for clustering gene sequences. This method transforms DNA sequences into feature vectors that contain the occurrence, location and order relation of k-tuples in the DNA sequence. A hierarchical procedure is then applied to cluster the DNA sequences based on the feature vectors. Results: The proposed distance measure and clustering method are evaluated by clustering functionally related genes and by phylogenetic analysis. The method is also compared with BlastClust, CD-HIT-EST and some others. The experimental results show that our method is effective in classifying DNA sequences with similar biological characteristics and in discovering the underlying relationship among the sequences. Conclusions: We introduced a novel clustering algorithm based on a new sequence similarity measure. It is effective in classifying DNA sequences with similar biological characteristics and in discovering the relationship among the sequences.
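
    As a rough illustration of alignment-free clustering, the sketch below turns each sequence into a vector of k-tuple frequencies and merges sequences bottom-up. It is not mBKM and does not implement the DMk measure: it uses occurrence counts only (the record also mentions location and order information), plain Euclidean distance and average linkage, all of which are simplifying assumptions.

```python
from itertools import product


def kmer_vector(seq, k=2):
    """Relative frequency of every k-tuple over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows with ambiguous bases such as N
            counts[word] += 1
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in kmers]


def euclidean(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of indices."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(euclidean(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)
```

    For example, `agglomerate([kmer_vector(s) for s in seqs], 2)` groups AT-rich and GC-rich toy sequences separately without computing any alignment.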

  10. Mining Association Rules among Gene Functions in Clusters of Similar Gene Expression Maps.

    Science.gov (United States)

    An, Li; Obradovic, Zoran; Smith, Desmond; Bodenreider, Olivier; Megalooikonomou, Vasileios

    2009-11-01

    Association rules mining methods have been recently applied to gene expression data analysis to reveal relationships between genes and different conditions and features. However, not much effort has focused on detecting the relation between gene expression maps and related gene functions. Here we describe such an approach to mine association rules among gene functions in clusters of similar gene expression maps on mouse brain. The experimental results show that the detected association rules make sense biologically. By inspecting the obtained clusters and the genes having the gene functions of frequent itemsets, interesting clues were discovered that provide valuable insight to biological scientists. Moreover, discovered association rules can be potentially used to predict gene functions based on similarity of gene expression maps.
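
    To make the notion of an association rule among gene functions concrete, the following sketch computes support and confidence for simple one-to-one rules (function A implies function B) over the genes of a single cluster. The data layout (one set of function labels per gene), the thresholds and the function labels used in the usage note are hypothetical and are not taken from the study.

```python
from itertools import permutations


def pairwise_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return rules (a, b, support, confidence) meaning 'a implies b'.

    transactions: list of sets, one set of function labels per gene.
    support    = share of genes annotated with both a and b
    confidence = share of genes with a that also have b
    """
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        n_a = sum(1 for t in transactions if a in t)
        n_ab = sum(1 for t in transactions if a in t and b in t)
        support = n_ab / n
        confidence = n_ab / n_a if n_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules
```

    With four genes annotated as {transport, binding}, {transport, binding}, {binding} and {transport, binding, signaling}, the only rule passing the default thresholds is "transport implies binding" (support 0.75, confidence 1.0); the reverse rule fails on confidence (0.75).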

  11. Cloning and Heterologous Expression of the Grecocycline Biosynthetic Gene Cluster.

    Directory of Open Access Journals (Sweden)

    Oksana Bilyk

    Full Text Available Transformation-associated recombination (TAR) in yeast is a rapid and inexpensive method for cloning and assembly of large DNA fragments, which relies on natural homologous recombination. Two vectors, based on p15a and F-factor replicons that can be maintained in yeast, E. coli and streptomycetes have been constructed. These vectors have been successfully employed for assembly of the grecocycline biosynthetic gene cluster from Streptomyces sp. Acta 1362. Fragments of the cluster were obtained by PCR and transformed together with the "capture" vector into the yeast cells, yielding a construct carrying the entire gene cluster. The obtained construct was heterologously expressed in S. albus J1074, yielding several grecocycline congeners. Grecocyclines have unique structural moieties such as a disaccharide side chain, an additional amino sugar at the C-5 position and a thiol group. Enzymes from this pathway may be used for the derivatization of known active angucyclines in order to improve their desired biological properties.

  12. The ergot alkaloid gene cluster: Functional analyses and evolutionary aspects

    Czech Academy of Sciences Publication Activity Database

    Lorenz, N.; Haarmann, T.; Pažoutová, Sylvie; Jung, M.; Tudzynski, P.

    2009-01-01

    Vol. 70, No. 15-16 (2009), pp. 1822-1832. ISSN 0031-9422. Institutional research plan: CEZ:AV0Z50200510. Keywords: Claviceps purpurea; ergot fungus; ergot alkaloid gene cluster. Subject RIV: EE - Microbiology, Virology. Impact factor: 3.104 (2009)

  13. PEACE: Parallel Environment for Assembly and Clustering of Gene Expression.

    Science.gov (United States)

    Rao, D M; Moler, J C; Ozden, M; Zhang, Y; Liang, C; Karro, J E

    2010-07-01

    We present PEACE, a stand-alone tool for high-throughput ab initio clustering of transcript fragment sequences produced by Next Generation or Sanger Sequencing technologies. It is freely available from www.peace-tools.org. Installed and managed through a downloadable user-friendly graphical user interface (GUI), PEACE can process large data sets of transcript fragments of length 50 bases or greater, grouping the fragments by gene associations with a sensitivity comparable to leading clustering tools. Once clustered, the user can employ the GUI's analysis functions, facilitating the easy collection of statistics and allowing them to single out specific clusters for more comprehensive study or assembly. Using a novel minimum spanning tree-based clustering method, PEACE is the equal of leading tools in the literature, with an interface making it accessible to any user. It produces results of quality virtually identical to those of the WCD tool when applied to Sanger sequences, significantly improved results over WCD and TGICL when applied to the products of Next Generation Sequencing Technology and significantly improved results over Cap3 in both cases. In short, PEACE provides an intuitive GUI and a feature-rich, parallel clustering engine that proves to be a valuable addition to the leading cDNA clustering tools.
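
    The general idea behind minimum spanning tree-based clustering can be sketched as follows: build a spanning tree over pairwise distances, then cut the edges that are too long and read off the remaining connected pieces as clusters. This is a generic textbook version, not PEACE's algorithm; the mismatch-fraction distance, the fixed threshold and all function names are assumptions chosen to keep the example small.

```python
def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def minimum_spanning_tree(n, weighted_edges):
    """Kruskal's algorithm; edges are (weight, i, j) tuples."""
    parent = list(range(n))
    tree = []
    for w, i, j in sorted(weighted_edges):
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def mst_clusters(items, dist, threshold):
    """Cluster items by cutting spanning-tree edges longer than threshold."""
    n = len(items)
    edges = [(dist(items[i], items[j]), i, j)
             for i in range(n) for j in range(i + 1, n)]
    tree = minimum_spanning_tree(n, edges)
    parent = list(range(n))
    for w, i, j in tree:
        if w <= threshold:  # keep short edges, drop long ones
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(items[i])
    return sorted(groups.values())


def mismatch_fraction(a, b):
    """Toy distance for equal-length fragments (an assumption for this demo)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)
```

    Computing all pairwise distances is quadratic in the number of fragments, so a sketch like this only suits small inputs; tools built for high-throughput data rely on faster distance estimates and heuristics.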

  14. Should treatment of (sub)acute low back pain be aimed at psychosocial prognostic factors? Cluster randomised clinical trial in general practice

    NARCIS (Netherlands)

    Jellema, Petra; van der Windt, Daniëlle A. W. M.; van der Horst, Henriëtte E.; Twisk, Jos W. R.; Stalman, Wim A. B.; Bouter, Lex M.

    2005-01-01

    To compare the effects of a minimal intervention strategy aimed at assessment and modification of psychosocial prognostic factors and usual care for treatment of (sub)acute low back pain in general practice. Cluster randomised clinical trial. 60 general practitioners in 41 general practices. 314

  15. The Fusarium graminearum Genome Reveals More Secondary Metabolite Gene Clusters and Hints of Horizontal Gene Transfer

    Science.gov (United States)

    Wong, Philip; Münsterkötter, Martin; Mewes, Hans-Werner; Schmeitzl, Clemens; Varga, Elisabeth; Berthiller, Franz; Adam, Gerhard; Güldener, Ulrich

    2014-01-01

    Fungal secondary metabolite biosynthesis genes are of major interest due to the pharmacological properties of their products (like mycotoxins and antibiotics). The genome of the plant pathogenic fungus Fusarium graminearum codes for a large number of candidate enzymes involved in secondary metabolite biosynthesis. However, the chemical nature of most enzymatic products of proteins encoded by putative secondary metabolism biosynthetic genes is largely unknown. Based on our analysis we present 67 gene clusters with significant enrichment of predicted secondary metabolism related enzymatic functions. 20 gene clusters with unknown metabolites exhibit strong gene expression correlation in planta and presumably play a role in virulence. Furthermore, the identification of conserved and over-represented putative transcription factor binding sites serves as additional evidence for cluster co-regulation. Orthologous cluster search provided insight into the evolution of secondary metabolism clusters. Some clusters are characteristic for the Fusarium phylum while others show evidence of horizontal gene transfer as orthologs can be found in representatives of the Botrytis or Cochliobolus lineage. The presented candidate clusters provide valuable targets for experimental examination. PMID:25333987

  16. The Fusarium graminearum genome reveals more secondary metabolite gene clusters and hints of horizontal gene transfer.

    Directory of Open Access Journals (Sweden)

    Christian M K Sieber

    Full Text Available Fungal secondary metabolite biosynthesis genes are of major interest due to the pharmacological properties of their products (like mycotoxins and antibiotics). The genome of the plant pathogenic fungus Fusarium graminearum codes for a large number of candidate enzymes involved in secondary metabolite biosynthesis. However, the chemical nature of most enzymatic products of proteins encoded by putative secondary metabolism biosynthetic genes is largely unknown. Based on our analysis we present 67 gene clusters with significant enrichment of predicted secondary metabolism related enzymatic functions. 20 gene clusters with unknown metabolites exhibit strong gene expression correlation in planta and presumably play a role in virulence. Furthermore, the identification of conserved and over-represented putative transcription factor binding sites serves as additional evidence for cluster co-regulation. Orthologous cluster search provided insight into the evolution of secondary metabolism clusters. Some clusters are characteristic for the Fusarium phylum while others show evidence of horizontal gene transfer as orthologs can be found in representatives of the Botrytis or Cochliobolus lineage. The presented candidate clusters provide valuable targets for experimental examination.

  17. Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes.

    Science.gov (United States)

    Patron, Nicola J; Waller, Ross F; Cozijnsen, Anton J; Straney, David C; Gardiner, Donald M; Nierman, William C; Howlett, Barbara J

    2007-09-26

    Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed, however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades, however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of these clusters is discussed with consideration to multiple

  18. Origin and distribution of epipolythiodioxopiperazine (ETP) gene clusters in filamentous ascomycetes

    Directory of Open Access Journals (Sweden)

    Gardiner Donald M

    2007-09-01

    Full Text Available Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed, however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades, however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of

  19. The Diagnostic and Prognostic Role of Interleukin 12B and Interleukin 6R Gene Polymorphism in Patients With Ankylosing Spondylitis.

    Science.gov (United States)

    Ruan, Wen-Feng; Xie, Jiang-Tao; Jin, Qi; Wang, Wen-Da; Ping, An-Song

    2018-01-01

    The interleukin 23 (IL-23) pathway and IL-1 cluster genes play a prominent role in the etiopathology of ankylosing spondylitis (AS). The aim of this study was to investigate the diagnostic and prognostic role of 5 single-nucleotide polymorphisms related to the IL-23 pathway and IL-1 cluster genes in AS patients. Four hundred thirty-one patients with AS and 206 age- and sex-matched healthy controls were recruited in this prospective cohort study. Five potential single-nucleotide polymorphisms (IL-23R [rs11209026], IL-12B [rs6871626], TYK2 [rs6511701], IL-6R [rs4129267], and IL-1R2 [rs2192752]) related to the IL-23 pathway and IL-1 cluster genes, selected by analyzing previous studies, were genotyped. Among the 431 AS patients, 198 active cases were treated and followed up for 24 weeks. Frequencies of the IL-12B AA (rs6871626) and IL-6R TT (rs4129267) genotypes were increased in AS patients compared with healthy controls (both P < 0.001), and the IL-12B A (rs6871626) and IL-6R T (rs4129267) alleles each independently increased the risk of AS (both P < 0.001). The Bath Ankylosing Spondylitis Disease Activity Index score was elevated in AS patients with IL-12B AA (rs6871626) compared with patients with the CA and CC genotypes (P = 0.002 and P < 0.001, respectively), and the Bath Ankylosing Spondylitis Functional Index score was also higher in AS patients with IL-12B AA (rs6871626) than in those with the CA and CC genotypes (P = 0.001 and P < 0.001). In addition, the IL-6R T (rs4129267) allele independently predicted a worse ASAS-20 (Assessment of SpondyloArthritis international Society) response at week 24 in multivariate logistic regression analysis with an additive model (P = 0.011). Interleukin 12B (rs6871626) and IL-6R (rs4129267) gene polymorphisms could serve as promising biomarkers for diagnosis and prognosis in AS patients.

  20. K-ras gene mutation as an early prognostic marker of colon cancer.

    Science.gov (United States)

    Szpon, Łukasz; Stal, Aleksander; Zawadzki, Marcin; Lis-Nawara, Anna; Kielan, Wojciech; Grzebieniak, Zygmunt

    2016-01-01

    Because the incidence of colorectal cancer is rising, new prognostic and predictive factors are needed to support the development of new diagnostic tests. The K-ras gene appears to be such a factor, and its mutations are considered an early marker of colorectal cancer progression. The aim of the study was to look for a correlation between K-ras gene mutation in patients with diagnosed colorectal cancer and selected clinical parameters. A total of 104 patients (41 women and 63 men) with diagnosed colorectal cancer were included in this study. The average age was 68.3 years in the male group and 65.9 years in the female group. Samples were taken from paraffin blocks containing tissue from the diagnosed patients, and K-ras gene mutations were identified. Statistical analysis was then performed to look for correlations between the incidence of K-ras gene mutation and clinical TNM stage, tumour localisation, histological type, sex and age. K-ras gene mutations were detected in 20.1% of all colorectal cancers. A significantly higher rate of K-ras gene mutations was found among patients classified as stage I (40%), stage IIC (50%) and stage IV (50%) according to the TNM classification. The results of our study are consistent with other studies and indicate a correlation between K-ras gene mutation and colorectal cancer incidence. Identification of K-ras gene mutation may complement other diagnostic methods at an early stage of colorectal cancer.

  1. Evolution and differential expression of a vertebrate vitellogenin gene cluster

    Directory of Open Access Journals (Sweden)

    Kongshaug Heidi

    2009-01-01

    Full Text Available Background: The multiplicity or loss of the vitellogenin (vtg) gene family in vertebrates has been argued to have broad implications for the mode of reproduction (placental or non-placental), cleavage pattern (meroblastic or holoblastic) and character of the egg (pelagic or benthic). Earlier proposals for the existence of three forms of vertebrate vtgs present conflicting models for their origin and subsequent duplication. Results: By integrating phylogenetics of novel vtg transcripts from old and modern teleosts with syntenic analyses of all available genomic variants of non-metatherian vertebrates we identify the gene orthologies between the Sarcopterygii (tetrapod branch) and Actinopterygii (fish branch). We argue that the vertebrate vtg gene cluster originated in proto-chromosome m, but that vtg genes have subsequently duplicated and rearranged following whole genome duplications. Sequencing of a novel fourth vtg transcript in labrid species, and the presence of duplicated paralogs in certain model organisms, supports the notion that lineage-specific gene duplications frequently occur in teleosts. The data show that the vtg gene cluster is more conserved between acanthomorph teleosts and tetrapods than in ostariophysan teleosts such as the zebrafish. The differential expression of the labrid vtg genes is further consistent with the notion that neofunctionalized Aa-type vtgs are important determinants of the pelagic or benthic character of the eggs in acanthomorph teleosts. Conclusion: The vertebrate vtg gene cluster existed prior to the separation of Sarcopterygii from Actinopterygii >450 million years ago, a period associated with the second round of whole genome duplication. The presence of higher copy numbers in a more highly expressed subcluster is particularly prevalent in teleosts. The differential expression and latent neofunctionalization of vtg genes in acanthomorph teleosts is an adaptive feature associated with oocyte hydration

  2. Gene Expression Profiling for In Silico Microdissection of Hodgkin's Lymphoma Microenvironment and Identification of Prognostic Features

    Directory of Open Access Journals (Sweden)

    François Bertucci

    2011-01-01

    Full Text Available Gene expression profiling studies based on DNA microarrays have demonstrated their ability to define the interaction pathways between neoplastic and nonmalignant stromal cells in cancer tissues. During the past ten years, a number of approaches including microdissection have tried to resolve the variability in DNA microarray measurements stemming from cancer tissue sample heterogeneity. Another approach, designated as virtual or in silico microdissection, avoids the laborious and time-consuming step of anatomic microdissection. It consists of confronting the gene expression profiles of complex tissue samples to those of cell lines representative of different cell lineages, different differentiation stages, or different signaling pathways. This strategy has been used in recent studies aiming to analyze microenvironment alterations using gene expression profiling of nonmicrodissected classical Hodgkin lymphoma tissues in order to generate new prognostic factors. These recent contributions are detailed and discussed in the present paper.

  3. From green to red: horizontal gene transfer of the phycoerythrin gene cluster between Planktothrix strains.

    Science.gov (United States)

    Tooming-Klunderud, Ave; Sogge, Hanne; Rounge, Trine Ballestad; Nederbragt, Alexander J; Lagesen, Karin; Glöckner, Gernot; Hayes, Paul K; Rohrlack, Thomas; Jakobsen, Kjetill S

    2013-11-01

    Horizontal gene transfer is common in cyanobacteria, and transfer of large gene clusters may lead to acquisition of new functions and conceivably niche adaption. In the present study, we demonstrate that horizontal gene transfer between closely related Planktothrix strains can explain the production of the same oligopeptide isoforms by strains of different colors. Comparison of the genomes of eight Planktothrix strains revealed that strains producing the same oligopeptide isoforms are closely related, regardless of color. We have investigated genes involved in the synthesis of the photosynthetic pigments phycocyanin and phycoerythrin, which are responsible for green and red appearance, respectively. Sequence comparisons suggest the transfer of a functional phycoerythrin gene cluster generating a red phenotype in a strain that is otherwise more closely related to green strains. Our data show that the insertion of a DNA fragment containing the 19.7-kb phycoerythrin gene cluster has been facilitated by homologous recombination, also replacing a region of the phycocyanin operon. These findings demonstrate that large DNA fragments spanning entire functional gene clusters can be effectively transferred between closely related cyanobacterial strains and result in a changed phenotype. Further, the results shed new light on the discussion of the role of horizontal gene transfer in the sporadic distribution of large gene clusters in cyanobacteria, as well as the appearance of red and green strains.

  4. An alanine tRNA gene cluster from Nephila clavipes.

    Science.gov (United States)

    Luciano, E; Candelas, G C

    1996-06-01

    We report the sequence of a 2.3-kb genomic DNA fragment from the orb-web spider, Nephila clavipes (Nc). The fragment contains four regions of high homology to tRNA(Ala). The members of this irregularly spaced cluster of genes are oriented in the same direction and have the same anticodon (GCA), but their sequence differs at several positions. Initiation and termination signals, as well as consensus intragenic promoter sequences characteristic of tRNA genes, have been identified in all genes. tRNA(Ala) are involved in the regulation of the fibroin synthesis in the large ampullate Nc glands.

  5. A prognostic gene signature for metastasis-free survival of triple negative breast cancer patients.

    Directory of Open Access Journals (Sweden)

    Unjin Lee

    Full Text Available Although triple negative breast cancers (TNBC) are the most aggressive subtype of breast cancer, they currently lack targeted therapies. Because this classification still includes a heterogeneous collection of tumors, new tools to classify TNBCs are urgently required in order to improve our prognostic capability for high risk patients and predict response to therapy. We previously defined a gene expression signature, RKIP Pathway Metastasis Signature (RPMS), based upon a metastasis-suppressive signaling pathway initiated by Raf Kinase Inhibitory Protein (RKIP). We have now generated a new BACH1 Pathway Metastasis gene signature (BPMS) that utilizes targets of the metastasis regulator BACH1. Specifically, we substituted experimentally validated target genes to generate a new BACH1 metagene, developed an approach to optimize patient tumor stratification, and reduced the number of signature genes to 30. The BPMS significantly and selectively stratified metastasis-free survival in basal-like and, in particular, TNBC patients. In addition, the BPMS further stratified patients identified as having a good or poor prognosis by other signatures including the Mammaprint® and Oncotype® clinical tests. The BPMS is thus complementary to existing signatures and is a prognostic tool for high risk ER-HER2- patients. We also demonstrate the potential clinical applicability of the BPMS as a single sample predictor. Together, these results reveal the potential of this pathway-based BPMS gene signature to identify high risk TNBC patients that can respond effectively to targeted therapy, and highlight BPMS genes as novel drug targets for therapeutic development.

  6. [The prognostic value of kcnq1 gene mutations in patients after myocardial infarction].

    Science.gov (United States)

    Olszak-Waśkiewicz, Marlena; Kramarz, Elżbieta

    Ion channel gene mutations are risk factors for sudden cardiac death (SCD). The aim was to assess the prognostic value of the A2753831C, C2505734T, C2505846A, G2753881A, T2755854C and T2755875G mutations in the KCNQ1 gene in patients after myocardial infarction (MI). The study group of 100 patients after MI was divided into two groups: patients with mutations (n=23) and patients without mutations (n=77). The subjects underwent physical examinations, laboratory tests, ECG, Holter ECG and echocardiography. The examinations were repeated every 12 months. Cardiac events, including deaths, occurred during the observation period. The mean observation time was 8 ± 4.3 years. KCNQ1 gene mutations were found in 23 subjects and were four times more frequent in men than in women. Parameters such as QRS ≥ 110 ms, QTc ≥ 440 ms, VEBs ≥ 100 per 24 hours, nsVT and LVEF ≤ 40% showed statistically significant differences between the group of patients who died and the group of patients who survived. LVEF ≤ 40% and VEBs ≥ 100/24 h were the factors that correlated most strongly with deaths. KCNQ1 gene mutations, PQ interval ≥ 200 ms and QTd ≥ 60 ms had no impact on death. KCNQ1 gene mutations in patients after MI occur more often in men than in women. The presence of KCNQ1 gene mutations is not an additional risk factor for increased mortality in patients after MI. LVEF ≤ 40% and VEBs ≥ 100/24 h have a significant prognostic value in predicting deaths in patients after MI.

  7. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    Energy Technology Data Exchange (ETDEWEB)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

    2003-12-31

    Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes make it possible to identify functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  8. Multiscale mutation clustering algorithm identifies pan-cancer mutational clusters associated with pathway-level changes in gene expression.

    Science.gov (United States)

    Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady

    2017-02-01

    Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.

  9. Identification of lung adenocarcinoma specific dysregulated genes with diagnostic and prognostic value across 27 TCGA cancer types.

    Science.gov (United States)

    Shang, Jun; Song, Qian; Yang, Zuyi; Li, Dongyao; Chen, Wenjie; Luo, Lei; Wang, Yongkun; Yang, Jingcheng; Li, Shikang

    2017-10-20

    As the most common histologic subtype of lung cancer, lung adenocarcinoma (LUAD) contributes to a majority of cancer-related deaths worldwide annually. To find specific biomarkers that distinguish LUAD from other types of cancer, and so improve early diagnostic and prognostic power in LUAD, we analyzed 10098 tumor tissue samples across 27 TCGA cancer types and identified 112 specific expressed genes in LUAD. Meanwhile, 8240 LUAD dysregulated genes in tumor and normal samples were identified. Combining the specific expressed genes with the dysregulated genes in LUAD, we found 70 specific dysregulated genes in LUAD (LUAD-SDGs). ROC curve analysis then revealed six LUAD-SDGs that may be of strong diagnostic value to predict the existence of cancer (area under the curve [AUC] > 95%). Kaplan-Meier survival analysis was performed to identify 6 LUAD-SDGs associated with patients' prognosis (P-values SDGs were independent prognostic factors. Then, we used the six overall survival (OS)-related LUAD-SDGs to construct a six-gene signature. Multivariate Cox regression analysis suggested that the six-gene signature was an independent prognostic factor of other clinical variables (hazard ratio [HR] = 1.5098, 95%CI = 1.2996-1.7538, P SDGs for LUAD diagnosis and prognosis. Our results may provide efficient biomarkers for clinical diagnostic and prognostic evaluation in LUAD.

  10. Prognostic modeling of oral cancer by gene profiles and clinicopathological co-variables.

    Science.gov (United States)

    Mes, Steven W; Te Beest, Dennis; Poli, Tito; Rossi, Silvia; Scheckenbach, Kathrin; van Wieringen, Wessel N; Brink, Arjen; Bertani, Nicoletta; Lanfranco, Davide; Silini, Enrico M; van Diest, Paul J; Bloemena, Elisabeth; Leemans, C René; van de Wiel, Mark A; Brakenhoff, Ruud H

    2017-08-29

    Accurate staging and outcome prediction is a major problem in clinical management of oral cancer patients, hampering high precision treatment and adjuvant therapy planning. Here, we have built and validated multivariable models that integrate gene signatures with clinical and pathological variables to improve staging and survival prediction of patients with oral squamous cell carcinoma (OSCC). Gene expression profiles from 249 human papillomavirus (HPV)-negative OSCCs were explored to identify a 22-gene lymph node metastasis signature (LNMsig) and a 40-gene overall survival signature (OSsig). To facilitate future clinical implementation and increase performance, these signatures were transferred to quantitative polymerase chain reaction (qPCR) assays and validated in an independent cohort of 125 HPV-negative tumors. When applied in the clinically relevant subgroup of early-stage (cT1-2N0) OSCC, the LNMsig could prevent overtreatment in two-third of the patients. Additionally, the integration of RT-qPCR gene signatures with clinical and pathological variables provided accurate prognostic models for oral cancer, strongly outperforming TNM. Finally, the OSsig gene signature identified a subpopulation of patients, currently considered at low-risk for disease-related survival, who showed an unexpected poor prognosis. These well-validated models will assist in personalizing primary treatment with respect to neck dissection and adjuvant therapies.

  11. EGR1 and FOSB gene expressions in cancer stroma are independent prognostic indicators for epithelial ovarian cancer receiving standard therapy.

    Science.gov (United States)

    Kataoka, Fumio; Tsuda, Hiroshi; Arao, Tokuzo; Nishimura, Sadako; Tanaka, Hideo; Nomura, Hiroyuki; Chiyoda, Tatsuyuki; Hirasawa, Akira; Akahane, Tomoko; Nishio, Hiroshi; Nishio, Kazuto; Aoki, Daisuke

    2012-03-01

    Stromal components interact with cancer cells to promote growth and metastasis. The purpose of this study was to identify genes expressed in stroma, which could provide prognostic information in epithelial ovarian cancer (EOC). Seventy-four patients were included. We performed gene expression profiling and confirmed array data using RT-PCR and immunohistochemistry. By microarray analysis, 52 candidate genes associated with progression free survival (PFS) were identified (P stroma, and EGR1 expression in cancer are independent prognostic factors in EOC. Immunohistochemically, EGR1 protein is localized in cancer cells and α-smooth muscle actin positive stromal fibroblasts. The EGR1 and FOSB expression in stromal cells and EGR1 expression in cancer cells are prognostic indicators in EOC. Copyright © 2011 Wiley Periodicals, Inc.

  12. Gene expression risk signatures maintain prognostic power in multiple myeloma despite microarray probe set translation

    DEFF Research Database (Denmark)

    Hermansen, N E U; Borup, R; Andersen, M K

    2016-01-01

    INTRODUCTION: Gene expression profiling (GEP) risk models in multiple myeloma are based on 3'-end microarrays. We hypothesized that GEP risk signatures could retain prognostic power despite being translated and applied to whole-transcript microarray data. METHODS: We studied CD138-positive bone marrow plasma cells in a prospective cohort of 59 samples from newly diagnosed patients eligible for high-dose therapy (HDT) and 67 samples from previous HDT patients with progressive disease. We used Affymetrix Human Gene 1.1 ST microarrays for GEP. Nine GEP risk signatures were translated by probe set match and applied to our data in multivariate Cox regression analysis for progression-free survival and overall survival in combination with clinical, cytogenetic and biochemical risk markers, including the International Staging System (ISS). RESULTS: Median follow-up was 66 months (range 42...

  13. Identification of Nocobactin NA Biosynthetic Gene Clusters in Nocardia farcinica

    OpenAIRE

    Hoshino, Yasutaka; Chiba, Kazuhiro; Ishino, Keiko; Fukai, Toshio; Igarashi, Yasuhiro; Yazawa, Katsukiyo; Mikami, Yuzuru; Ishikawa, Jun

    2010-01-01

    We identified the biosynthetic gene clusters of the siderophore nocobactin NA. The nbt clusters, which were discovered as genes highly homologous to the mycobactin biosynthesis genes by the genomic sequencing of Nocardia farcinica IFM 10152, consist of 10 genes separately located at two genomic regions. The gene organization of the nbt clusters and the predicted functions of the nbt genes, particularly the cyclization and epimerization domains, were in good agreement with the chemical structu...

  14. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    Science.gov (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  15. Gene duplication, modularity and adaptation in the evolution of the aflatoxin gene cluster

    Directory of Open Access Journals (Sweden)

    Jakobek Judy L

    2007-07-01

    Background: The biosynthesis of aflatoxin (AF) involves over 20 enzymatic reactions in a complex polyketide pathway that converts acetate and malonate to the intermediates sterigmatocystin (ST) and O-methylsterigmatocystin (OMST), the respective penultimate and ultimate precursors of AF. Although these precursors are chemically and structurally very similar, their accumulation differs at the species level for Aspergilli. Notable examples are A. nidulans that synthesizes only ST, A. flavus that makes predominantly AF, and A. parasiticus that generally produces either AF or OMST. Whether these differences are important in the evolutionary/ecological processes of species adaptation and diversification is unknown. Equally unknown are the specific genomic mechanisms responsible for ordering and clustering of genes in the AF pathway of Aspergillus. Results: To elucidate the mechanisms that have driven formation of these clusters, we performed systematic searches of aflatoxin cluster homologs across five Aspergillus genomes. We found a high level of gene duplication and identified seven modules consisting of highly correlated gene pairs (aflA/aflB, aflR/aflS, aflX/aflY, aflF/aflE, aflT/aflQ, aflC/aflW, and aflG/aflL). With the exception of A. nomius, contrasts of mean Ka/Ks values across all cluster genes showed significant differences in selective pressure between section Flavi and non-section Flavi species. A. nomius mean Ka/Ks values were more similar to partial clusters in A. fumigatus and A. terreus. Overall, mean Ka/Ks values were significantly higher for section Flavi than for non-section Flavi species. Conclusion: Our results implicate several genomic mechanisms in the evolution of ST, OMST and AF cluster genes. Gene modules may arise from duplications of a single gene, whereby the function of the pre-duplication gene is retained in the copy (aflF/aflE) or the copies may partition the ancestral function (aflA/aflB). In some gene modules, the

  16. Loss of Bloom syndrome protein destabilizes human gene cluster architecture.

    Science.gov (United States)

    Killen, Michael W; Stults, Dawn M; Adachi, Noritaka; Hanakahi, Les; Pierce, Andrew J

    2009-09-15

    Bloom syndrome confers strong predisposition to malignancy in multiple tissue types. The Bloom syndrome patient (BLM) protein defective in the disease biochemically functions as a Holliday junction dissolvase and human cells lacking functional BLM show 10-fold elevated rates of sister chromatid exchange. Collectively, these phenomena suggest that dysregulated mitotic recombination drives the genomic instability underpinning the development of cancer in these individuals. Here we use physical analysis of the highly repeated, highly self-similar human ribosomal RNA gene clusters as sentinel biomarkers for dysregulated homologous recombination to demonstrate that loss of BLM protein function causes a striking increase in spontaneous molecular level genomic restructuring. Analysis of single-cell derived sub-clonal populations from wild-type human cell lines shows that gene cluster architecture is ordinarily very faithfully preserved under mitosis, but is so unstable in cell lines derived from BLMs as to make gene cluster architecture in different sub-clonal populations essentially unrecognizable one from another. Human cells defective in a different RecQ helicase, the WRN protein involved in the premature aging Werner syndrome, do not exhibit the gene cluster instability (GCI) phenotype, indicating that the BLM protein specifically, rather than RecQ helicases generally, holds back this recombination-mediated genomic instability. An ataxia-telangiectasia defective cell line also shows elevated rDNA GCI, although not to the extent of BLM defective cells. Genomic restructuring mediated by dysregulated recombination between the abundant low-copy repeats in the human genome may prove to be an important additional mechanism of genomic instability driving the initiation and progression of human cancer.

  17. Prognostic importance of expression of the Wilms' tumor 1 gene in newly diagnosed acute promyelocytic leukemia.

    Science.gov (United States)

    Hecht, Anna; Nolte, Florian; Nowak, Daniel; Nowak, Verena; Reinwald, Mark; Hanfstein, Benjamin; Faldum, Andreas; Büchner, Thomas; Spiekermann, Karsten; Sauerland, Cristina; Weiss, Christel; Hofmann, Wolf-Karsten; Lengfelder, Eva

    2015-01-01

    Wilms' tumor 1 gene (WT1) is known to be highly expressed in acute promyelocytic leukemia (APL) but information on its impact on prognosis is lacking. WT1 expression was analyzed in bone marrow samples of 79 patients with APL at initial diagnosis. Patients had a differing outcome according to their level of WT1 expression. In patients who achieved a complete remission (CR), low or high WT1 expression was significantly associated with inferior overall survival (OS) compared to intermediate WT1 expression (49% for WT1high vs. 63% for WT1low vs. 93% for WT1int; p=0.008). Moreover, there were significant differences in relapse-free survival (RFS) between the three expression groups (42% for WT1high vs. 63% for WT1low vs. 83% for WT1int; p=0.047). In multivariable analysis WT1 expression showed an independent prognostic impact on OS of responders to induction therapy. In conclusion, the level of WT1 expression can add prognostic information in APL risk stratification.

  18. NLRC and NLRX gene family mRNA expression and prognostic value in hepatocellular carcinoma.

    Science.gov (United States)

    Wang, Xiangkun; Yang, Chengkun; Liao, Xiwen; Han, Chuangye; Yu, Tingdong; Huang, Ketuan; Yu, Long; Qin, Wei; Zhu, Guangzhi; Su, Hao; Liu, Xiaoguang; Ye, Xinping; Chen, Bin; Peng, Minhao; Peng, Tao

    2017-11-01

    Nucleotide-binding oligomerization domain (NOD)-like receptor (NLR)C and NLRX family proteins play a key role in the innate immune response. The relationship between these proteins and hepatocellular carcinoma (HCC) remains unclear. This study investigated the prognostic significance of NLRC and NLRX family protein levels in HCC patients. Data from 360 HCC patients in The Cancer Genome Atlas database and 231 patients in the Gene Expression Omnibus database were analyzed. Kaplan-Meier analysis and a Cox regression model were used to determine median survival time (MST) and overall and recurrence-free survival by calculating the hazard ratio (HR) and 95% confidence interval (CI). High NOD2 and low NLRX1 expression in tumor tissue was associated with short MST (P = 0.012 and 0.014, respectively). A joint-effects analysis of NOD2 and NLRX1 combined revealed that groups III and IV had reduced risk of death from HCC as compared to group I (adjusted P = 0.001, adjusted HR = 0.31, 95% CI = 0.16-0.61 and adjusted P = 0.043, adjusted HR = 0.63, 95%CI = 0.41-0.99, respectively). NOD2 and NLRX1 expression levels are potential prognostic markers in HCC following hepatectomy. © 2017 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
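
    As an illustration of the Kaplan-Meier step mentioned in this record, the following is a minimal sketch of the product-limit estimator and of a median survival time read off the resulting curve. It is not the authors' code; the function names and the toy follow-up data are assumptions made for the example.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up


# Hypothetical toy cohort: five patients, all events observed.
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
toy_median = median_survival(toy_curve)
```

    Censored patients leave the risk set without lowering the survival estimate, which is why the median can remain undefined when too few events are observed.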

  19. Prognostically relevant gene signatures of high-grade serous ovarian carcinoma

    Science.gov (United States)

    Verhaak, Roel G.W.; Tamayo, Pablo; Yang, Ji-Yeon; Hubbard, Diana; Zhang, Hailei; Creighton, Chad J.; Fereday, Sian; Lawrence, Michael; Carter, Scott L.; Mermel, Craig H.; Kostic, Aleksandar D.; Etemadmoghadam, Dariush; Saksena, Gordon; Cibulskis, Kristian; Duraisamy, Sekhar; Levanon, Keren; Sougnez, Carrie; Tsherniak, Aviad; Gomez, Sebastian; Onofrio, Robert; Gabriel, Stacey; Chin, Lynda; Zhang, Nianxiang; Spellman, Paul T.; Zhang, Yiqun; Akbani, Rehan; Hoadley, Katherine A.; Kahn, Ari; Köbel, Martin; Huntsman, David; Soslow, Robert A.; Defazio, Anna; Birrer, Michael J.; Gray, Joe W.; Weinstein, John N.; Bowtell, David D.; Drapkin, Ronny; Mesirov, Jill P.; Getz, Gad; Levine, Douglas A.; Meyerson, Matthew

    2012-01-01

    Because of the high risk of recurrence in high-grade serous ovarian carcinoma (HGS-OvCa), the development of outcome predictors could be valuable for patient stratification. Using the catalog of The Cancer Genome Atlas (TCGA), we developed subtype and survival gene expression signatures, which, when combined, provide a prognostic model of HGS-OvCa classification, named “Classification of Ovarian Cancer” (CLOVAR). We validated CLOVAR on an independent dataset consisting of 879 HGS-OvCa expression profiles. The worst outcome group, accounting for 23% of all cases, was associated with a median survival of 23 months and a platinum resistance rate of 63%, versus a median survival of 46 months and platinum resistance rate of 23% in other cases. Associating the outcome prediction model with BRCA1/BRCA2 mutation status, residual disease after surgery, and disease stage further optimized outcome classification. Ovarian cancer is a disease in urgent need of more effective therapies. The spectrum of outcomes observed here and their association with CLOVAR signatures suggests variations in underlying tumor biology. Prospective validation of the CLOVAR model in the context of additional prognostic variables may provide a rationale for optimal combination of patient and treatment regimens. PMID:23257362

  20. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Background: Genes are not randomly distributed on a chromosome, as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and was recently reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results: Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion: Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
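
    The comparison against chance described in this record can be sketched as a permutation test. The version below is only an illustration under stated assumptions: a cluster is taken to be a run of at least two adjacent tissue-specific genes along one chromosome, and the null model shuffles the tissue-specific labels among gene positions. The record does not give the authors' exact definitions, and all names here are hypothetical.

```python
import random


def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive True flags.

    flags -- genes in chromosomal order, True if tissue-specific
    """
    clusters = 0
    run = 0
    for is_specific in flags:
        if is_specific:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the chromosome end
        clusters += 1
    return clusters


def permutation_p_value(flags, n_perm=1000, seed=0):
    """Fraction of label shufflings with at least as many clusters as observed."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

    A small p-value from `permutation_p_value` would indicate more clusters than label shuffling alone tends to produce, which is the kind of excess the record reports.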

  1. Emerging gene-based prognostic tools in early breast cancer: First steps to personalised medicine.

    Science.gov (United States)

    Wazir, Umar; Mokbel, Kefah

    2014-12-10

    Breast cancer remains a major cause of neoplastic disease in much of the developed world. The majority of cases are diagnosed with oestrogen receptor (ER)-positive and human epidermal growth factor receptor-2 negative invasive ductal carcinoma and are treated predominantly by surgery which includes sentinel node biopsy and adjuvant endocrine therapy ± adjuvant radiotherapy. It is believed that an indeterminate subset of the patient population is needlessly incurring chemotherapy related morbidity without attaining any increase in survival due to therapy. Furthermore in the era of extended adjuvant endocrine therapy it is important to identify those patients who can be safely treated with 5 years rather than 10 years of endocrine therapy thus optimising the benefit-risk balance. This perception has propelled the development of more personalised prognostic tools for newly diagnosed cases of ER-positive breast cancer. In this article, we shall review the evidence regarding the currently available gene assays for human breast cancer.

  2. Validation of the 18-gene classifier as a prognostic biomarker of distant metastasis in breast cancer.

    Directory of Open Access Journals (Sweden)

    Skye Hung-Chun Cheng

    We validated an 18-gene classifier (GC) initially developed to predict local/regional recurrence after mastectomy in estimating distant metastasis risk. The 18-gene scoring algorithm defines scores as: <21, low risk; ≥21, high risk. Six hundred eighty-three patients with primary operable breast cancer and fresh frozen tumor tissues available were included. The primary outcome was the 5-year probability of freedom from distant metastasis (DMFP). Two external datasets were used to test the predictive accuracy of 18-GC. The 5-year rates of DMFP for patients classified as low-risk (n = 146, 21.7%) and high-risk (n = 537, 78.6%) were 96.2% (95% CI, 91.1%-98.8%) and 80.9% (74.6%-81.9%), respectively (median follow-up interval, 71.8 months). The 5-year rates of DMFP of the low-risk group in stage I (n = 62, 35.6%), stage II (n = 66, 20.1%), and stage III (n = 18, 10.3%) were 100%, 94.2% (78.5%-98.5%), and 90.9% (50.8%-98.7%), respectively. Multivariate analysis revealed that 18-GC is an independent prognostic factor of distant metastasis (adjusted hazard ratio, 5.1; 95% CI, 1.8-14.1; p = 0.0017) for scores of ≥21. External validation showed that the 5-year rate of DMFP in the low- and high-risk patients was 94.1% (82.9%-100%) and 80.3% (70.7%-89.9%, p = 0.06) in a Singapore dataset, and 89.5% (81.9%-94.1%) and 73.6% (67.2%-79.0%, p = 0.0039) in the GEO-GSE20685 dataset, respectively. In conclusion, 18-GC is a viable prognostic biomarker for breast cancer to estimate distant metastasis risk.

  3. Validation of the 18-gene classifier as a prognostic biomarker of distant metastasis in breast cancer.

    Science.gov (United States)

    Cheng, Skye Hung-Chun; Huang, Tzu-Ting; Cheng, Yu-Hao; Tan, Tee Benita Kiat; Horng, Chen-Fang; Wang, Yong Alison; Brian, Nicholas Shannon; Shih, Li-Sun; Yu, Ben-Long

    2017-01-01

    We validated an 18-gene classifier (GC) initially developed to predict local/regional recurrence after mastectomy in estimating distant metastasis risk. The 18-gene scoring algorithm defines scores as: <21, low risk; ≥21, high risk. Six hundred eighty-three patients with primary operable breast cancer and fresh frozen tumor tissues available were included. The primary outcome was the 5-year probability of freedom from distant metastasis (DMFP). Two external datasets were used to test the predictive accuracy of 18-GC. The 5-year rates of DMFP for patients classified as low-risk (n = 146, 21.7%) and high-risk (n = 537, 78.6%) were 96.2% (95% CI, 91.1%-98.8%) and 80.9% (74.6%-81.9%), respectively (median follow-up interval, 71.8 months). The 5-year rates of DMFP of the low-risk group in stage I (n = 62, 35.6%), stage II (n = 66, 20.1%), and stage III (n = 18, 10.3%) were 100%, 94.2% (78.5%-98.5%), and 90.9% (50.8%-98.7%), respectively. Multivariate analysis revealed that 18-GC is an independent prognostic factor of distant metastasis (adjusted hazard ratio, 5.1; 95% CI, 1.8-14.1; p = 0.0017) for scores of ≥21. External validation showed that the 5-year rate of DMFP in the low- and high-risk patients was 94.1% (82.9%-100%) and 80.3% (70.7%-89.9%, p = 0.06) in a Singapore dataset, and 89.5% (81.9%-94.1%) and 73.6% (67.2%-79.0%, p = 0.0039) in the GEO-GSE20685 dataset, respectively. In conclusion, 18-GC is a viable prognostic biomarker for breast cancer to estimate distant metastasis risk.
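
    The risk grouping stated in this record (a score below 21 is low risk, 21 or above is high risk) amounts to a single threshold. A minimal sketch follows; how the 18-gene score itself is computed is not given here, so the score is simply taken as an input, and the function name is hypothetical.

```python
def classify_18gc(score, cutoff=21):
    """Map an 18-gene classifier score to the risk group given in the record.

    The cutoff of 21 comes from the abstract; the score calculation is not
    described there and is treated as an already-computed input.
    """
    return "low risk" if score < cutoff else "high risk"


# Hypothetical scores, for illustration only.
groups = [classify_18gc(s) for s in (12, 20.9, 21, 35)]
```

    Note that a score of exactly 21 falls in the high-risk group, matching the "≥21" wording.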

  4. A Dexamethasone-regulated Gene Signature Is Prognostic for Poor Survival in Glioblastoma Patients.

    Science.gov (United States)

    Luedi, Markus M; Singh, Sanjay K; Mosley, Jennifer C; Hatami, Masumeh; Gumin, Joy; Sulman, Erik P; Lang, Frederick F; Stueber, Frank; Zinn, Pascal O; Colen, Rivka R

    2017-01-01

    Dexamethasone is reported to induce both tumor-suppressive and tumor-promoting effects. The purpose of this study was to identify the genomic impact of dexamethasone in glioblastoma stem cell (GSC) lines and its prognostic value; furthermore, to identify drugs that can counter these side effects of dexamethasone exposure. We utilized 3 independent GSC lines with tumorigenic potential for this study. Whole-genome expression profiling and pathway analyses were done with dexamethasone-exposed and control cells. GSCs were also co-exposed to dexamethasone and temozolomide. Risk scores were calculated for most affected genes, and their associations with survival in The Cancer Genome Atlas and Repository of Molecular Brain Neoplasia Data databases. In silico Connectivity Map analysis identified camptothecin as antagonist to dexamethasone-induced negative effects. Pathway analyses predicted an activation of dexamethasone network (z-score: 2.908). Top activated canonical pathways included "role of breast cancer 1 in DNA damage response" (P=1.07E-04). GSCs were protected against temozolomide-induced apoptosis when coincubated with dexamethasone. Altered cellular functions included cell movement, cell survival, and apoptosis with z-scores of 2.815, 5.137, and -3.122, respectively. CCAAT/enhancer binding protein beta (CEBPB) was activated in a dose dependent manner specifically in slow-dividing "stem-like" cells. CEBPB was activated in dexamethasone-treated orthotopic tumors. Patients with high risk scores had significantly shorter survival. Camptothecin was validated as potential partial neutralizer of dexamethasone-induced oncogenic effects. Dexamethasone exposure induces a genetic program and CEBPB expression in GSCs that adversely affects key cellular functions and response to therapeutics. High risk scores associated with these genes have negative prognostic value in patients. Our findings further suggest camptothecin as a potential neutralizer of adverse dexamethasone

  5. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo

    2014-01-01

    To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members of a miRNA group always had varying expression levels, and some showed larger expression divergence. Despite the dynamic expression and individual differences, these miRNAs always showed consistent or similar deregulation patterns. The consistent deregulation may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and their specific distribution and expression further enrich and ensure the flexible and robust regulatory network.

  6. Multi-stage filtering for improving confidence level and determining dominant clusters in clustering algorithms of gene expression data.

    Science.gov (United States)

    Kasim, Shahreen; Deris, Safaai; Othman, Razib M

    2013-09-01

    A drastic improvement in the analysis of gene expression has led to new discoveries in bioinformatics research. In order to analyse the gene expression data, fuzzy clustering algorithms are widely used. However, the resulting analyses from these specific types of algorithms may lead to confusion in hypotheses with regard to the suggestion of dominant function for genes of interest. Besides that, the current fuzzy clustering algorithms do not conduct a thorough analysis of genes with low membership values. Therefore, we present a novel computational framework called the "multi-stage filtering-Clustering Functional Annotation" (msf-CluFA) for clustering gene expression data. The framework consists of four components: fuzzy c-means clustering (msf-CluFA-0), achieving dominant cluster (msf-CluFA-1), improving confidence level (msf-CluFA-2) and combination of msf-CluFA-0, msf-CluFA-1 and msf-CluFA-2 (msf-CluFA-3). By employing double filtering in msf-CluFA-1 and apriori algorithms in msf-CluFA-2, our new framework is capable of determining the dominant clusters and improving the confidence level of genes with lower membership values by means of which the unknown genes can be predicted. Copyright © 2013 Elsevier Ltd. All rights reserved.
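
    For readers unfamiliar with the membership values discussed in this record, the sketch below shows the standard fuzzy c-means membership update for fixed cluster centers. It is the generic textbook step, not the msf-CluFA filtering stages, which the record does not specify in detail. The `dominant_cluster` helper and its 0.5 threshold are assumptions added purely for illustration.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update for fixed centers.

    points, centers -- sequences of equal-length coordinate tuples
    m               -- fuzzifier (> 1); m = 2 is a common default
    Returns one row per point; each row sums to 1 across clusters.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point sitting exactly on a center belongs fully to it.
            on_center = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(on_center)
            memberships.append([v / total for v in on_center])
            continue
        row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
               for d_i in dists]
        memberships.append(row)
    return memberships


def dominant_cluster(row, threshold=0.5):
    """Index of the highest-membership cluster, or None below the threshold.

    The threshold is a hypothetical cut-off for 'low membership' genes.
    """
    best = max(range(len(row)), key=row.__getitem__)
    return best if row[best] >= threshold else None
```

    A gene whose best membership falls under the threshold is exactly the kind of low-membership case the record says existing fuzzy clustering approaches leave under-analysed.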

  7. Modeling the Drosophila gene cluster regulation network for muscle development.

    Science.gov (United States)

    Haye, Alexandre; Albert, Jaroslav; Rooman, Marianne

    2014-01-01

    The development of accurate and reliable dynamical modeling procedures that describe the time evolution of gene expression levels is a prerequisite to understanding and controlling the transcription process. We focused on data from DNA microarray time series for 20 Drosophila genes involved in muscle development during the embryonic stage. Genes with similar expression profiles were clustered on the basis of a translation-invariant and scale-invariant distance measure. The time evolution of these clusters was modeled using coupled differential equations. Three model structures involving a transcription term and a degradation term were tested. The parameters were identified in successive steps: network construction, parameter optimization, and parameter reduction. The solutions were evaluated on the basis of the data reproduction and the number of parameters, as well as on two biology-based requirements: the robustness with respect to parameter variations and the values of the expression levels not being unrealistically large upon extrapolation in time. Various solutions were obtained that satisfied all our evaluation criteria. The regulatory networks inferred from these solutions were compared with experimental data. The best solution has half of the experimental connections, which compares favorably with previous approaches. Biasing the network toward the experimental connections led to the identification of a model that is only slightly less good on the basis of the evaluation criteria. The non-uniqueness of the solutions and the variable agreement with experimental connections were discussed in the context of the different hypotheses underlying this type of approach.
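
    The abstract above clusters expression profiles with a translation-invariant and scale-invariant distance but does not give its formula. One common way to obtain both invariances, shown here purely as an assumption, is to center each profile on its mean, rescale it to unit length, and then take the Euclidean distance. The function names are hypothetical.

```python
import math


def standardize(profile):
    """Center a profile on its mean and rescale it to unit Euclidean norm."""
    mean = sum(profile) / len(profile)
    centered = [x - mean for x in profile]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        raise ValueError("a flat profile has no shape to compare")
    return [x / norm for x in centered]


def invariant_distance(a, b):
    """Distance that ignores additive offsets and positive rescaling of either profile."""
    return math.dist(standardize(a), standardize(b))


rising = [1.0, 2.0, 3.0, 4.0]
shifted_and_scaled = [10.0, 30.0, 50.0, 70.0]  # 20 * rising - 10
falling = [4.0, 3.0, 2.0, 1.0]
print(invariant_distance(rising, shifted_and_scaled))  # essentially zero
print(invariant_distance(rising, falling))             # the maximum for this measure, about 2
```

    With this choice, two genes whose time courses have the same shape are treated as identical regardless of baseline or amplitude, while mirror-image profiles are as far apart as possible.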

  8. Nitrobacter winogradskyi cytochrome c oxidase genes are organized in a repeated gene cluster.

    Science.gov (United States)

    Berben, G

    1996-05-01

    Cytochrome c oxidase (EC 1.9.3.1) is one of the components of the electron transport chain by which Nitrobacter, a facultative lithoautotrophic bacterium, recovers energy from nitrite oxidation. The genes encoding the two catalytic core subunits of the enzyme were isolated from a Nitrobacter winogradskyi gene library. Sequencing of one of the 14 cloned DNA segments revealed that the subunit genes are side by side in an operon-like cluster. Remarkably the cluster appears to be present in at least two copies per genome. It extends over a 5-6 kb length including, besides the catalytic core subunit genes, other cytochrome oxidase related genes, especially a heme O synthase gene. Noteworthy is the new kind of gene order identified within the cluster. Deduced sequences for the cytochrome oxidase subunits and for the heme O synthase look closest to their counterparts in other alpha-subdivision Proteobacteria, particularly the Rhizobiaceae. This confirms the phylogenetic relationships established only upon 16S rRNA data. Furthermore, interesting similarities exist between N. winogradskyi and mitochondrial cytochrome oxidase subunits while the heme O synthase sequence gives some new insights about the other similar published alpha-subdivision proteobacterial sequences.

  9. Clinical and Prognostic Profiles of Cardiomyopathies Caused by Mutations in the Troponin T Gene.

    Science.gov (United States)

    Ripoll-Vera, Tomás; Gámez, José María; Govea, Nancy; Gómez, Yolanda; Núñez, Juana; Socías, Lorenzo; Escandell, Ángela; Rosell, Jorge

    2016-02-01

    Mutations in the troponin T gene (TNNT2) have been associated in small studies with the development of hypertrophic cardiomyopathy characterized by a high risk of sudden death and mild hypertrophy. We describe the clinical course of patients carrying mutations in this gene. We analyzed the clinical characteristics and prognosis of patients with mutations in the TNNT2 gene who were seen in an inherited cardiac disease unit. Of 180 families with genetically studied cardiomyopathies, 21 families (11.7%) were identified as having mutations in TNNT2: 10 families had Arg92Gln, 5 had Arg286His, 3 had Arg278Cys, 1 had Arg92Trp, 1 had Arg94His, and 1 had Ile221Thr. Thirty-three additional genetic carriers were identified through family assessment. The study included 54 genetic carriers: 56% were male, and the mean age was 41 ± 17 years. There were 33 cases of hypertrophic cardiomyopathy, 9 of dilated cardiomyopathy, and 1 of noncompaction cardiomyopathy, and maximal myocardial thickness was 18.5 ± 6 mm. Ventricular dysfunction was present in 30% of individuals and a history of sudden death in 62%. During follow-up, 4 patients died and 14 (33%) received a defibrillator (8 probands, 6 relatives). Mean survival was 54 years. Carriers of Arg92Gln had early disease development, high penetrance, a high risk of sudden death, a high rate of defibrillator implantation, and a high frequency of mixed phenotype. Mutations in the TNNT2 gene were more common in this series than in previous studies. The clinical and prognostic profiles depended on the mutation present. Carriers of the Arg92Gln mutation developed hypertrophic or dilated cardiomyopathy and had a significantly worse prognosis than those with other mutations in TNNT2 or other sarcomeric genes. Copyright © 2015 Sociedad Española de Cardiología. Published by Elsevier España, S.L.U. All rights reserved.

  10. Molecular characterization of neurally expressing genes in the para sodium channel gene cluster of Drosophila

    Energy Technology Data Exchange (ETDEWEB)

    Hong, Chang-Sook; Ganetzky, B. [Univ. of Wisconsin, Madison, WI (United States)]

    1996-03-01

    To elucidate the mechanisms regulating expression of para, which encodes the major class of sodium channels in the Drosophila nervous system, we have tried to locate upstream cis-acting regulatory elements by mapping the transcriptional start site and analyzing the region immediately upstream of para in region 14D of the polytene chromosomes. From these studies, we have discovered that the region contains a cluster of neurally expressing genes. Here we report the molecular characterization of the genomic organization of the 14D region and the genes within this region, which are: calnexin (Cnx), actin related protein 14D (Arp14D), calcineurin A 14D (CnnA14D), and chromosome associated protein (Cap). The tight clustering of these genes, their neuronal expression patterns, and their potential functions related to expression, modulation, or regulation of sodium channels raise the possibility that these genes represent a functionally related group sharing some coordinate regulatory mechanism. 76 refs., 11 figs.

  11. Accurate prediction of secondary metabolite gene clusters in filamentous fungi

    DEFF Research Database (Denmark)

    Andersen, Mikael Rørdam; Nielsen, Jakob Blæsbjerg; Klitgaard, Andreas

    2013-01-01

    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify... ...-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.

  12. Gene prioritization and clustering by multi-view text mining

    Directory of Open Access Journals (Sweden)

    De Moor Bart

    2010-01-01

    Full Text Available Background: Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. Results: We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. Conclusions: In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification.

  13. Gene prioritization and clustering by multi-view text mining.

    Science.gov (United States)

    Yu, Shi; Tranchevent, Leon-Charles; De Moor, Bart; Moreau, Yves

    2010-01-14

    Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model. We present a multi-view approach to retrieve biomedical knowledge using different controlled vocabularies. These controlled vocabularies are selected on the basis of nine well-known bio-ontologies and are applied to index the vast amounts of gene-based free-text information available in the MEDLINE repository. The text mining result specified by a vocabulary is considered as a view and the obtained multiple views are integrated by multi-source learning algorithms. We investigate the effect of integration in two fundamental computational disease gene identification tasks: gene prioritization and gene clustering. The performance of the proposed approach is systematically evaluated and compared on real benchmark data sets. In both tasks, the multi-view approach demonstrates significantly better performance than other comparing methods. In practical research, the relevance of specific vocabulary pertaining to the task is usually unknown. In such case, multi-view text mining is a superior and promising strategy for text-based disease gene identification.

  14. Prognostic Significance of Promoter DNA Hypermethylation of cysteine dioxygenase 1 (CDO1) Gene in Primary Breast Cancer.

    Directory of Open Access Journals (Sweden)

    Naoko Minatani

    Full Text Available Using pharmacological unmasking microarray, we identified promoter DNA methylation of the cysteine dioxygenase 1 (CDO1) gene in human cancer. In this study, we assessed the clinicopathological significance of CDO1 methylation in primary breast cancer (BC) with no prior chemotherapy. CDO1 DNA methylation was quantified by TaqMan methylation specific PCR (Q-MSP) in 7 BC cell lines and 172 primary BC patients with no prior chemotherapy. Promoter DNA of the CDO1 gene was hypermethylated in 6 BC cell lines (all except SK-BR3), and CDO1 gene expression was silenced at the mRNA level in all 7 BC cell lines. Quantification of CDO1 methylation was developed using Q-MSP, and assessed in primary BC. Among the clinicopathologic factors, CDO1 methylation level was not statistically significantly associated with any prognostic factors. The log-rank plot analysis showed that the higher the methylation the tumors harbored, the poorer the prognosis the patients exhibited. Using the median value of 58.0 as the cut-off, disease specific survival in BC patients with CDO1 hypermethylation showed significantly poorer prognosis than in those with hypomethylation (p = 0.004). A multivariate Cox proportional hazards model identified CDO1 hypermethylation as a prognostic factor, as well as Ki-67 and hormone receptor status. Most intriguingly, CDO1 hypermethylation was of robust prognostic relevance in triple negative BC (p = 0.007). Promoter DNA methylation of the CDO1 gene was a robust prognostic indicator in primary BC patients with no prior chemotherapy. The prognostic relevance of CDO1 promoter DNA methylation deserves attention in triple negative BC.

  15. Adaptive evolution of the FADS gene cluster within Africa.

    Directory of Open Access Journals (Sweden)

    Rasika A Mathias

    Full Text Available Long chain polyunsaturated fatty acids (LC-PUFAs) are essential for brain structure, development, and function, and adequate dietary quantities of LC-PUFAs are thought to have been necessary for both brain expansion and the increase in brain complexity observed during modern human evolution. Previous studies conducted in largely European populations suggest that humans have limited capacity to synthesize brain LC-PUFAs such as docosahexaenoic acid (DHA) from plant-based medium chain (MC) PUFAs due to limited desaturase activity. Population-based differences in LC-PUFA levels and their product-to-substrate ratios can, in part, be explained by polymorphisms in the fatty acid desaturase (FADS) gene cluster, which have been associated with increased conversion of MC-PUFAs to LC-PUFAs. Here, we show evidence that these high efficiency converter alleles in the FADS gene cluster were likely driven to near fixation in African populations by positive selection ∼85 kya. We hypothesize that selection at FADS variants, which increase LC-PUFA synthesis from plant-based MC-PUFAs, played an important role in allowing African populations obligatorily tethered to marine sources for LC-PUFAs in isolated geographic regions, to rapidly expand throughout the African continent 60-80 kya.

  16. Adaptive evolution of the FADS gene cluster within Africa.

    Science.gov (United States)

    Mathias, Rasika A; Fu, Wenqing; Akey, Joshua M; Ainsworth, Hannah C; Torgerson, Dara G; Ruczinski, Ingo; Sergeant, Susan; Barnes, Kathleen C; Chilton, Floyd H

    2012-01-01

    Long chain polyunsaturated fatty acids (LC-PUFAs) are essential for brain structure, development, and function, and adequate dietary quantities of LC-PUFAs are thought to have been necessary for both brain expansion and the increase in brain complexity observed during modern human evolution. Previous studies conducted in largely European populations suggest that humans have limited capacity to synthesize brain LC-PUFAs such as docosahexaenoic acid (DHA) from plant-based medium chain (MC) PUFAs due to limited desaturase activity. Population-based differences in LC-PUFA levels and their product-to-substrate ratios can, in part, be explained by polymorphisms in the fatty acid desaturase (FADS) gene cluster, which have been associated with increased conversion of MC-PUFAs to LC-PUFAs. Here, we show evidence that these high efficiency converter alleles in the FADS gene cluster were likely driven to near fixation in African populations by positive selection ∼85 kya. We hypothesize that selection at FADS variants, which increase LC-PUFA synthesis from plant-based MC-PUFAs, played an important role in allowing African populations obligatorily tethered to marine sources for LC-PUFAs in isolated geographic regions, to rapidly expand throughout the African continent 60-80 kya.

  17. Gravitation field algorithm and its application in gene cluster.

    Science.gov (United States)

    Zheng, Ming; Liu, Gui-Xia; Zhou, Chun-Guang; Liang, Yan-Chun; Wang, Yan

    2010-09-20

    Searching optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. This paper proposes a novel searching optimization algorithm called Gravitation Field Algorithm (GFA) which is derived from the famous astronomy theory Solar Nebular Disk Model (SNDM) of planetary formation. GFA simulates the Gravitation field and outperforms GA and SA in some multimodal functions optimization problem. And GFA also can be used in the forms of unimodal functions. GFA clusters the dataset well from the Gene Expression Omnibus. The mathematical proof demonstrates that GFA could be convergent in the global optimum by probability 1 in three conditions for one independent variable mass functions. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.

  18. Gravitation field algorithm and its application in gene cluster

    Directory of Open Access Journals (Sweden)

    Zheng Ming

    2010-09-01

    Full Text Available Background: Searching optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel searching optimization algorithm called Gravitation Field Algorithm (GFA) which is derived from the famous astronomy theory Solar Nebular Disk Model (SNDM) of planetary formation. GFA simulates the Gravitation field and outperforms GA and SA in some multimodal functions optimization problem. And GFA also can be used in the forms of unimodal functions. GFA clusters the dataset well from the Gene Expression Omnibus. Conclusions: The mathematical proof demonstrates that GFA could be convergent in the global optimum by probability 1 in three conditions for one independent variable mass functions. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.

  19. Gene Expression Programs in Response to Hypoxia: Cell Type Specificity and Prognostic Significance in Human Cancers.

    Directory of Open Access Journals (Sweden)

    2006-01-01

    Full Text Available BACKGROUND: Inadequate oxygen (hypoxia) triggers a multifaceted cellular response that has important roles in normal physiology and in many human diseases. A transcription factor, hypoxia-inducible factor (HIF), plays a central role in the hypoxia response; its activity is regulated by the oxygen-dependent degradation of the HIF-1alpha protein. Despite the ubiquity and importance of hypoxia responses, little is known about the variation in the global transcriptional response to hypoxia among different cell types or how this variation might relate to tissue- and cell-specific diseases. METHODS AND FINDINGS: We analyzed the temporal changes in global transcript levels in response to hypoxia in primary renal proximal tubule epithelial cells, breast epithelial cells, smooth muscle cells, and endothelial cells with DNA microarrays. The extent of the transcriptional response to hypoxia was greatest in the renal tubule cells. This heightened response was associated with a uniquely high level of HIF-1alpha RNA in renal cells, and it could be diminished by reducing HIF-1alpha expression via RNA interference. A gene-expression signature of the hypoxia response, derived from our studies of cultured mammary and renal tubular epithelial cells, showed coordinated variation in several human cancers, and was a strong predictor of clinical outcomes in breast and ovarian cancers. In an analysis of a large, published gene-expression dataset from breast cancers, we found that the prognostic information in the hypoxia signature was virtually independent of that provided by the previously reported wound signature and more predictive of outcomes than any of the clinical parameters in current use. CONCLUSIONS: The transcriptional response to hypoxia varies among human cells. Some of this variation is traceable to variation in expression of the HIF1A gene. A gene-expression signature of the cellular response to hypoxia is associated with a significantly poorer prognosis

  20. Gene expression programs in response to hypoxia: cell type specificity and prognostic significance in human cancers.

    Directory of Open Access Journals (Sweden)

    Jen-Tsan Chi

    2006-03-01

    Full Text Available Inadequate oxygen (hypoxia) triggers a multifaceted cellular response that has important roles in normal physiology and in many human diseases. A transcription factor, hypoxia-inducible factor (HIF), plays a central role in the hypoxia response; its activity is regulated by the oxygen-dependent degradation of the HIF-1alpha protein. Despite the ubiquity and importance of hypoxia responses, little is known about the variation in the global transcriptional response to hypoxia among different cell types or how this variation might relate to tissue- and cell-specific diseases. We analyzed the temporal changes in global transcript levels in response to hypoxia in primary renal proximal tubule epithelial cells, breast epithelial cells, smooth muscle cells, and endothelial cells with DNA microarrays. The extent of the transcriptional response to hypoxia was greatest in the renal tubule cells. This heightened response was associated with a uniquely high level of HIF-1alpha RNA in renal cells, and it could be diminished by reducing HIF-1alpha expression via RNA interference. A gene-expression signature of the hypoxia response, derived from our studies of cultured mammary and renal tubular epithelial cells, showed coordinated variation in several human cancers, and was a strong predictor of clinical outcomes in breast and ovarian cancers. In an analysis of a large, published gene-expression dataset from breast cancers, we found that the prognostic information in the hypoxia signature was virtually independent of that provided by the previously reported wound signature and more predictive of outcomes than any of the clinical parameters in current use. The transcriptional response to hypoxia varies among human cells. Some of this variation is traceable to variation in expression of the HIF1A gene. A gene-expression signature of the cellular response to hypoxia is associated with a significantly poorer prognosis in breast and ovarian cancer.

  1. Common genetic variants in Wnt signaling pathway genes as potential prognostic biomarkers for colorectal cancer.

    Directory of Open Access Journals (Sweden)

    Wen-Chien Ting

    Full Text Available Compelling evidence has implicated the Wnt signaling pathway in the pathogenesis of colorectal cancer. We assessed the use of tag single nucleotide polymorphisms (tSNPs) in adenomatous polyposis coli (APC)/β-catenin (CTNNB1) genes to predict outcomes in patients with colorectal cancer. We selected and genotyped 10 tSNPs to predict common variants across the entire APC and CTNNB1 genes in 282 colorectal cancer patients. The associations of these tSNPs with distant metastasis-free survival and overall survival were evaluated by Kaplan-Meier analysis, Cox regression model, and survival tree analysis. The 5-year overall survival rate was 68.3%. Survival tree analysis identified a higher-order genetic interaction profile consisting of the APC rs565453, CTNNB1 2293303, and APC rs1816769 that was significantly associated with overall survival. The 5-year overall survival rates were 89.2%, 66.1%, and 58.8% for the low-, medium-, and high-risk genetic profiles, respectively (log-rank P = 0.001). After adjusting for possible confounders, including age, gender, carcinoembryonic antigen levels, tumor differentiation, stage, lymphovascular invasion, perineural invasion, and lymph node involvement, the genetic interaction profile remained significant. None of the studied SNPs were individually associated with distant metastasis-free survival and overall survival. Our results suggest that the genetic interaction profile among Wnt pathway SNPs might potentially increase the prognostic value in outcome prediction for colorectal cancer.
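
    Several records in this list, including the one above, report survival rates obtained by Kaplan-Meier analysis. For readers unfamiliar with the estimator, the following is a minimal sketch of the product-limit calculation on made-up follow-up data. It is not the analysis pipeline of any study listed here; the function name and the toy values are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time of each patient
    events: 1 if the event (e.g. death) was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= removed
    return curve


# Toy data: five patients, two of them censored (event = 0).
for t, s in kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0]):
    print(t, round(s, 3))
```

    Censored patients never lower the curve directly; they only shrink the risk set, which is why the step at time 3 in the toy data is larger than the step at time 1.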

  2. Prognostic signature and clonality pattern of recurrently mutated genes in inactive chronic lymphocytic leukemia.

    Science.gov (United States)

    Hurtado, A M; Chen-Liang, T-H; Przychodzen, B; Hamedi, C; Muñoz-Ballester, J; Dienes, B; García-Malo, M D; Antón, A I; de Arriba, F; Teruel-Montoya, R; Ortuño, F J; Vicente, V; Maciejewski, J P; Jerez, A

    2015-08-28

    An increasing number of patients are being diagnosed with asymptomatic early-stage chronic lymphocytic leukemia (CLL), with no treatment indication at baseline. We applied a high-throughput deep-targeted analysis, specifically designed for wide coverage of the TP53 and ATM genes, in 180 patients with inactive disease at diagnosis, to test the independent prognostic value of CLL somatic recurrent mutations. We found that 40/180 patients harbored at least one acquired variant, with ATM (n=17, 9.4%), NOTCH1 (n=14, 7.7%), TP53 (n=14, 7.7%) and SF3B1 (n=10, 5.5%) as the most prevalent mutated genes. Harboring one 'sub-Sanger' TP53 mutation granted an independent 3.5-fold increase in the probability of needing treatment. Those patients with a double-hit ATM lesion (mutation+11q deletion) had the shortest median time to first treatment (17 months). We found that a genomic variable: TP53 mutations, most of them under the sensitivity of conventional techniques; a cell phenotypic factor: CD38-positive expression; and a classical marker: β2-microglobulin, remained the only independent predictors of outcome. The high-throughput determination of TP53 status, particularly in this set of patients frequently lacking high-risk chromosomal aberrations, emerges as a key step, not only for prediction modeling, but also for exploring mutation-specific therapeutic approaches and minimal residual disease monitoring.

  3. Hox gene cluster of the ascidian, Halocynthia roretzi, reveals multiple ancient steps of cluster disintegration during ascidian evolution.

    Science.gov (United States)

    Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi

    2017-01-01

    Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We investigated the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr), to determine whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci. Nevertheless

  4. Recurrent adenylation domain replacement in the microcystin synthetase gene cluster

    Directory of Open Access Journals (Sweden)

    Laakso Kati

    2007-10-01

    Full Text Available Background: Microcystins are small cyclic heptapeptide toxins produced by a range of distantly related cyanobacteria. Microcystins are synthesized on large NRPS-PKS enzyme complexes. Many structural variants of microcystins are produced simultaneously. A recombination event between the first module of mcyB (mcyB1) and mcyC in the microcystin synthetase gene cluster is linked to the simultaneous production of microcystin variants in strains of the genus Microcystis. Results: Here we undertook a phylogenetic study to investigate the order and timing of recombination between the mcyB1 and mcyC genes in a diverse selection of microcystin-producing cyanobacteria. Our results provide support for complex evolutionary processes taking place at the mcyB1 and mcyC adenylation domains which recognize and activate the amino acids found at X and Z positions. We find evidence for recent recombination between mcyB1 and mcyC in strains of the genera Anabaena, Microcystis, and Hapalosiphon. We also find clear evidence for independent adenylation domain conversion of mcyB1 by unrelated peptide synthetase modules in strains of the genera Nostoc and Microcystis. The recombination events replace only the adenylation domain in each case and the condensation domains of mcyB1 and mcyC are not transferred together with the adenylation domain. Our findings demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster. Conclusion: Recombination is thought to be one of the main mechanisms driving the diversification of NRPSs. However, there is very little information on how recombination takes place in nature. This study demonstrates that functional peptide synthetases are created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.

  5. Time-series clustering of gene expression in irradiated and bystander fibroblasts: an application of FBPA clustering

    Directory of Open Access Journals (Sweden)

    Markatou Marianthi

    2011-01-01

    Full Text Available Abstract Background The radiation bystander effect is an important component of the overall biological response of tissues and organisms to ionizing radiation, but the signaling mechanisms between irradiated and non-irradiated bystander cells are not fully understood. In this study, we measured a time-series of gene expression after α-particle irradiation and applied the Feature Based Partitioning around medoids Algorithm (FBPA), a new clustering method suitable for sparse time series, to identify signaling modules that act in concert in the response to direct irradiation and bystander signaling. We compared our results with those of an alternate clustering method, Short Time series Expression Miner (STEM). Results While computational evaluations of both clustering results were similar, FBPA provided more biological insight. After irradiation, gene clusters were enriched for signal transduction, cell cycle/cell death and inflammation/immunity processes; but only FBPA separated clusters by function. In bystanders, gene clusters were enriched for cell communication/motility, signal transduction and inflammation processes; but biological functions did not separate as clearly with either clustering method as they did in irradiated samples. Network analysis confirmed p53 and NF-κB transcription factor-regulated gene clusters in irradiated and bystander cells and suggested novel regulators, such as KDM5B/JARID1B (lysine (K)-specific demethylase 5B) and HDACs (histone deacetylases), which could epigenetically coordinate gene expression after irradiation. Conclusions In this study, we have shown that a new time series clustering method, FBPA, can provide new leads to the mechanisms regulating the dynamic cellular response to radiation. The findings implicate epigenetic control of gene expression in addition to transcription factor networks.

  6. Collaborative Ocular Oncology Group Report No. 1: Prospective Validation of a Multi-Gene Prognostic Assay in Uveal Melanoma

    Science.gov (United States)

    Onken, Michael D.; Worley, Lori A.; Char, Devron H.; Augsburger, James J.; Correa, Zelia M; Nudleman, Eric; Aaberg, Thomas M.; Altaweel, Michael M.; Bardenstein, David S.; Finger, Paul T.; Gallie, Brenda L.; Harocopos, George J.; Hovland, Peter G.; McGowan, Hugh D.; Milman, Tatyana; Mruthyunjaya, Prithvi; Simpson, E. Rand; Smith, Morton E.; Wilson, David J.; Wirostko, William J.; Harbour, J. William

    2012-01-01

    Purpose This study evaluates the prognostic performance of a 15 gene expression profiling (GEP) assay that assigns primary posterior uveal melanomas to prognostic subgroups: class 1 (low metastatic risk) and class 2 (high metastatic risk). Design Prospective, multicenter study. Participants 459 patients with posterior uveal melanoma were enrolled from 12 independent centers. Testing Tumors were classified by GEP as class 1 or class 2. The first 260 samples were also analyzed for chromosome 3 status using a single nucleotide polymorphism assay. Net reclassification improvement analysis was performed to compare the prognostic accuracy of GEP to the 7th edition clinical Tumor-Node-Metastasis (TNM) classification and to chromosome 3 status. Main Outcome Measures Patients were managed for their primary tumor and monitored for metastasis. Results The GEP assay successfully classified 446/459 (97.2%) cases. The GEP was class 1 in 276 cases (61.9%) and class 2 in 170 cases (38.1%). Median follow-up was 17.4 months (mean, 18.0 months). Metastasis was detected in 3 (1.1%) class 1 cases and 44 (25.9%) class 2 cases (log rank test, P<10^−14). Although there was an association between GEP class 2 and monosomy 3 (Fisher exact test, P<0.0001), 54/260 (20.8%) tumors were discordant for GEP and chromosome 3 status, among which GEP demonstrated superior prognostic accuracy (log rank test, P=0.0001). Using multivariate Cox modeling, GEP class had a stronger independent association with metastasis than any other prognostic factor (P<0.0001). Chromosome 3 status did not contribute additional prognostic information that was independent of GEP (P=0.2). At three years follow-up, the net reclassification improvement of GEP over TNM classification was 0.43 (P=0.001) and 0.38 (P=0.004) over chromosome 3 status. Conclusions The GEP assay had a high technical success rate and was the most accurate prognostic marker among all of the factors analyzed. GEP provided a highly significant

  7. Collaborative Ocular Oncology Group report number 1: prospective validation of a multi-gene prognostic assay in uveal melanoma.

    Science.gov (United States)

    Onken, Michael D; Worley, Lori A; Char, Devron H; Augsburger, James J; Correa, Zelia M; Nudleman, Eric; Aaberg, Thomas M; Altaweel, Michael M; Bardenstein, David S; Finger, Paul T; Gallie, Brenda L; Harocopos, George J; Hovland, Peter G; McGowan, Hugh D; Milman, Tatyana; Mruthyunjaya, Prithvi; Simpson, E Rand; Smith, Morton E; Wilson, David J; Wirostko, William J; Harbour, J William

    2012-08-01

    This study evaluates the prognostic performance of a 15 gene expression profiling (GEP) assay that assigns primary posterior uveal melanomas to prognostic subgroups: class 1 (low metastatic risk) and class 2 (high metastatic risk). Prospective, multicenter study. A total of 459 patients with posterior uveal melanoma were enrolled from 12 independent centers. Tumors were classified by GEP as class 1 or class 2. The first 260 samples were also analyzed for chromosome 3 status using a single nucleotide polymorphism assay. Net reclassification improvement analysis was performed to compare the prognostic accuracy of GEP with the 7th edition clinical Tumor-Node-Metastasis (TNM) classification and chromosome 3 status. Patients were managed for their primary tumor and monitored for metastasis. The GEP assay successfully classified 446 of 459 cases (97.2%). The GEP was class 1 in 276 cases (61.9%) and class 2 in 170 cases (38.1%). Median follow-up was 17.4 months (mean, 18.0 months). Metastasis was detected in 3 class 1 cases (1.1%) and 44 class 2 cases (25.9%) (log-rank test, P<10(-14)). Although there was an association between GEP class 2 and monosomy 3 (Fisher exact test, P<0.0001), 54 of 260 tumors (20.8%) were discordant for GEP and chromosome 3 status, among which GEP demonstrated superior prognostic accuracy (log-rank test, P = 0.0001). By using multivariate Cox modeling, GEP class had a stronger independent association with metastasis than any other prognostic factor (P<0.0001). Chromosome 3 status did not contribute additional prognostic information that was independent of GEP (P = 0.2). At 3 years follow-up, the net reclassification improvement of GEP over TNM classification was 0.43 (P = 0.001) and 0.38 (P = 0.004) over chromosome 3 status. The GEP assay had a high technical success rate and was the most accurate prognostic marker among all of the factors analyzed. The GEP provided a highly significant improvement in prognostic accuracy over clinical TNM
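    The percentages quoted in this abstract can be rechecked from the raw counts it reports. The short Python sketch below does only that arithmetic; the helper name `percent` is hypothetical, and the choice of denominators (446 classified cases for the class split, 276 and 170 for the metastasis rates) is an assumption inferred from the reported figures rather than something the abstract states.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Counts taken from the abstract; denominators are assumptions (see lead-in).
classified = percent(446, 459)      # technical success rate
class1_share = percent(276, 446)    # class 1 among classified cases
class2_share = percent(170, 446)    # class 2 among classified cases
class1_mets = percent(3, 276)       # metastasis among class 1 cases
class2_mets = percent(44, 170)      # metastasis among class 2 cases
discordant = percent(54, 260)       # GEP vs. chromosome 3 discordance

for label, value in [
    ("classified", classified),
    ("class 1", class1_share),
    ("class 2", class2_share),
    ("class 1 metastasis", class1_mets),
    ("class 2 metastasis", class2_mets),
    ("discordant", discordant),
]:
    print(f"{label}: {value}%")
```

    Under these assumed denominators, each computed value agrees with the figure printed in the abstract.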

  8. Mutational analysis of the nor gene cluster which encodes nitric-oxide reductase from Paracoccus denitrificans

    NARCIS (Netherlands)

    de Boer, A P; van der Oost, J.; Reijnders, W N; Westerhoff, H V; Stouthamer, A.H.; van Spanning, R J

    1996-01-01

    The genes that encode the bc-type nitric-oxide reductase from Paracoccus denitrificans have been identified. They are part of a cluster of six genes (norCBQDEF) and are found near the gene cluster that encodes the cd1-type nitrite reductase, which was identified earlier [de Boer, A. P. N.,

  9. Parallel evolutionary events in the haptoglobin gene clusters of rhesus monkey and human

    Energy Technology Data Exchange (ETDEWEB)

    Erickson, L.M.; Maeda, N. [Univ. of North Carolina, Chapel Hill, NC (United States)

    1994-08-01

    Parallel occurrences of evolutionary events in the haptoglobin gene clusters of rhesus monkeys and humans were studied. We found six different haplotypes among 11 individuals from two rhesus monkey families. The six haplotypes include two types of haptoglobin gene clusters: one type with a single gene and the other with two genes. DNA sequence analysis indicates that the one-gene and the two-gene clusters were both formed by unequal homologous crossovers between two genes of an ancestral three-gene cluster, near exon 5, the longest exon of the gene. This exon is also the location where a separate unequal homologous crossover occurred in the human lineage, forming the human two-gene haptoglobin gene cluster from an ancestral three-gene cluster. The occurrence of independent homologous unequal crossovers in rhesus monkey and in human within the same region of DNA suggests that the evolutionary history of the haptoglobin gene cluster in primates is the consequence of frequent homologous pairings facilitated by the longest and most conserved exon of the gene. 27 refs., 7 figs., 1 tab.

  10. Transforming RNA-Seq data to improve the performance of prognostic gene signatures.

    Science.gov (United States)

    Zwiener, Isabella; Frisch, Barbara; Binder, Harald

    2014-01-01

    Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
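    Three of the transformations named in this abstract (standardization, the log transformation, and a rank-based transformation) can be illustrated with a minimal Python sketch. The function names, the pseudocount of 1, and the scaling of ranks by n + 1 are assumptions chosen for illustration; the abstract does not specify the exact variants the authors used.

```python
import math
from statistics import mean, pstdev

def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]

def log_transform(counts, pseudocount=1.0):
    """log2(count + pseudocount); the pseudocount (an assumption) handles zeros."""
    return [math.log2(c + pseudocount) for c in counts]

def rank_transform(values):
    """Replace each value by its average 1-based rank, scaled into (0, 1)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values so they share one average rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n = len(values)
    return [r / (n + 1) for r in ranks]

# Hypothetical read counts for one gene across five samples, with one extreme value.
counts = [0, 3, 3, 15, 1200]
print(standardize(counts))
print(log_transform(counts))
print(rank_transform(counts))
```

    In this toy example the extreme count dominates the standardized values, while the rank-based version depends only on the ordering of the samples, which is one way to see why such transformations are robust to skewness and outliers.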

  11. A strategy for full interrogation of prognostic gene expression patterns: exploring the biology of diffuse large B cell lymphoma.

    Directory of Open Access Journals (Sweden)

    Lisa M Rimsza

    Full Text Available Gene expression profiling yields quantitative data on gene expression used to create prognostic models that accurately predict patient outcome in diffuse large B cell lymphoma (DLBCL). Often, data are analyzed with genes classified by whether they fall above or below the median expression level. We sought to determine whether examining multiple cut-points might be a more powerful technique to investigate the association of gene expression with outcome. We explored gene expression profiling data using variable cut-point analysis for 36 genes with reported prognostic value in DLBCL. We plotted two-group survival logrank test statistics against corresponding cut-points of the gene expression levels and smooth estimates of the hazard ratio of death versus gene expression levels. To facilitate comparisons we also standardized the expression of each of the genes by the fraction of patients that would be identified by any cut-point. A multiple comparison adjusted permutation p-value identified 3 different patterns of significance: 1) genes with significant cut-points below the median, whose loss is associated with poor outcome (e.g. HLA-DR); 2) genes with significant cut-points above the median, whose over-expression is associated with poor outcome (e.g. CCND2); and 3) genes with significant cut-points on either side of the median (e.g. extracellular molecules such as FN1). Variable cut-point analysis with permutation p-value calculation can be used to identify significant genes that would not otherwise be identified with median cut-points and may suggest biological patterns of gene effects.
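    The core of a variable cut-point analysis, computing a two-group logrank statistic at every candidate cut-point of a gene's expression level, can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names and toy data are hypothetical, candidate cut-points are taken as midpoints between adjacent distinct expression values, and the multiple-comparison-adjusted permutation p-value described in the abstract is not implemented.

```python
def logrank_statistic(times, events, in_group1):
    """Two-group logrank chi-square statistic (1 degree of freedom).

    times: follow-up times; events: 1 if death observed, 0 if censored;
    in_group1: booleans marking membership in the first group.
    """
    observed_minus_expected = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group1[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk
                 if times[i] == t and events[i] and in_group1[i])
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of d1 at this event time.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance

def scan_cut_points(expression, times, events):
    """Return (cut_point, fraction_below, statistic) per candidate cut-point."""
    results = []
    levels = sorted(set(expression))
    for low, high in zip(levels, levels[1:]):
        cut = (low + high) / 2
        below = [x < cut for x in expression]
        stat = logrank_statistic(times, events, below)
        results.append((cut, sum(below) / len(expression), stat))
    return results

# Hypothetical toy data: expression level, follow-up time, event indicator.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
times = [2, 3, 4, 8, 9, 10]
events = [1, 1, 1, 1, 0, 0]
for cut, fraction, stat in scan_cut_points(expression, times, events):
    print(f"cut={cut:.1f} fraction_below={fraction:.2f} logrank={stat:.3f}")
```

    Reporting the fraction of patients below each cut-point alongside the statistic mirrors the standardization step mentioned in the abstract, which makes genes with different expression scales comparable.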

  12. MAGE-A gene expression in peripheral blood serves as a poor prognostic marker for patients with lung cancer.

    Science.gov (United States)

    Gu, Lina; Sang, Meixiang; Yin, Danjing; Liu, Fei; Wu, Yunyan; Liu, Shina; Huang, Weina; Shan, Baoen

    2018-02-12

    MAGE-A genes belong to the cancer/testis antigens family. The prognostic significance of MAGE-A expression in the peripheral blood of patients with lung cancer is unknown. Therefore, this study evaluated the expression and possible prognostic significance of MAGE-A in the peripheral blood of patients with lung cancer. In this study, we detected MAGE-A gene expression in the peripheral blood of 150 patients with lung cancer and 30 healthy donors using multiplex semi-nested PCR and analyzed their correlation with clinicopathological risk factors. MAGE-A expression was associated with factors indicating poor prognosis. The expression of MAGE-A and each individual MAGE-A gene were also associated with low overall survival in patients with lung cancer. The expression of MAGE-A genes in peripheral blood may act as a poor prognostic marker in patients with lung cancer. © 2018 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.

  13. Gene mutations in the Ras pathway and the prognostic implication in Korean patients with juvenile myelomonocytic leukemia.

    Science.gov (United States)

    Park, Hyung-Doo; Lee, Soo Hyun; Sung, Ki Woong; Koo, Hong Hoe; Jung, Nak Gyun; Cho, Bin; Kim, Hak Ki; Park, In-Ae; Lee, Ki-O; Ki, Chang-Seok; Kim, Sun-Hee; Yoo, Keon Hee; Kim, Hee-Jin

    2012-04-01

    Juvenile myelomonocytic leukemia (JMML) is a rare hematologic malignancy in children. Hyperactivation of the Ras pathway from gene mutations is known to be the key culprit in the development of JMML. In this study, we investigated Ras pathway mutations and prognostic implication in Korean patients with JMML. A total of 22 Korean patients with JMML were recruited from two institutions (19 boys and three girls; median age, 17 months; range, 1-74 months). Hematologic and cytogenetic findings were reviewed. Mutation analyses involved PTPN11, KRAS, NRAS, and CBL genes by direct sequencing analyses (selected exons except in CBL). Survival analysis was performed by the Kaplan-Meier method. Cytogenetic and/or gene mutations were detected in 18 patients out of 22 (82%). Four patients (18%) had chromosomal abnormalities, with monosomy 7 being the most common. Seventeen (77%) had gene mutations. PTPN11 mutations were detected in 13 patients (59%). The patient heterozygous for c.854T>C had Noonan syndrome. NRAS and KRAS mutations were detected in two patients (9%) and one patient (5%), respectively. A homozygous CBL mutation was detected in one patient (5%; c.1228-2A>G). All mutations detected were previously reported mutations. Survival analyses suggested an unfavorable prognostic implication of PTPN11 mutation, albeit without a statistical significance. Collectively, the results from molecular genetics study and survival analyses suggested a relatively higher frequency and unfavorable prognostic implication of PTPN11 mutations in Korean patients with JMML.

  14. Lampreys Have a Single Gene Cluster for the Fast Skeletal Myosin Heavy Chain Gene Family

    Science.gov (United States)

    Ikeda, Daisuke; Ono, Yosuke; Hirano, Shigeki; Kan-no, Nobuhiro; Watabe, Shugo

    2013-01-01

    Muscle tissues contain the most classic sarcomeric myosin, called myosin II, which consists of 2 heavy chains (MYHs) and 4 light chains. In the case of humans (tetrapod), a total of 6 fast skeletal-type MYH genes (MYHs) are clustered on a single chromosome. In contrast, torafugu (teleost) contains at least 13 fast skeletal MYHs, which are distributed in 5 genomic regions; the MYHs are clustered in 3 of these regions. In the present study, the evolutionary relationship among fast skeletal MYHs is elucidated by comparing the MYHs of teleosts and tetrapods with those of cyclostome lampreys, one of two groups of extant jawless vertebrates (agnathans). We found that lampreys contain at least 3 fast skeletal MYHs, which are clustered in a head-to-tail manner in a single genomic region. Although there was apparent synteny in the corresponding MYH cluster regions between lampreys and tetrapods, phylogenetic analysis indicated that lamprey and tetrapod MYHs have independently duplicated and diversified. Subsequent transgenic approaches showed that the 5′-flanking sequences of Japanese lamprey fast skeletal MYHs function as a regulatory sequence to drive specific reporter gene expression in the fast skeletal muscle of zebrafish embryos. Although zebrafish MYH promoters showed apparent activity to direct reporter gene expression in myogenic cells derived from mice, promoters from Japanese lamprey MYHs had no activity. These results suggest that the muscle-specific regulatory mechanisms are partially conserved between teleosts and tetrapods but not between cyclostomes and tetrapods, despite the conserved synteny. PMID:24376886

  15. Lampreys have a single gene cluster for the fast skeletal myosin heavy chain gene family.

    Directory of Open Access Journals (Sweden)

    Daisuke Ikeda

    Full Text Available Muscle tissues contain the most classic sarcomeric myosin, called myosin II, which consists of 2 heavy chains (MYHs) and 4 light chains. In the case of humans (tetrapod), a total of 6 fast skeletal-type MYH genes (MYHs) are clustered on a single chromosome. In contrast, torafugu (teleost) contains at least 13 fast skeletal MYHs, which are distributed in 5 genomic regions; the MYHs are clustered in 3 of these regions. In the present study, the evolutionary relationship among fast skeletal MYHs is elucidated by comparing the MYHs of teleosts and tetrapods with those of cyclostome lampreys, one of two groups of extant jawless vertebrates (agnathans). We found that lampreys contain at least 3 fast skeletal MYHs, which are clustered in a head-to-tail manner in a single genomic region. Although there was apparent synteny in the corresponding MYH cluster regions between lampreys and tetrapods, phylogenetic analysis indicated that lamprey and tetrapod MYHs have independently duplicated and diversified. Subsequent transgenic approaches showed that the 5'-flanking sequences of Japanese lamprey fast skeletal MYHs function as a regulatory sequence to drive specific reporter gene expression in the fast skeletal muscle of zebrafish embryos. Although zebrafish MYH promoters showed apparent activity to direct reporter gene expression in myogenic cells derived from mice, promoters from Japanese lamprey MYHs had no activity. These results suggest that the muscle-specific regulatory mechanisms are partially conserved between teleosts and tetrapods but not between cyclostomes and tetrapods, despite the conserved synteny.

  16. A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

    Directory of Open Access Journals (Sweden)

    Keng-Hoong Ng

    Full Text Available BACKGROUND: Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt the alignment-free distance measures, where they tend to yield acceptable clustering accuracies with reasonable computational time. Despite the fact that these clustering methods work satisfactorily on a majority of the EST datasets, they have a common weakness. They are prone to deliver unsatisfactory clustering results when dealing with ESTs from the genes derived from the same family. The root cause is the distance measures applied on them are not sensitive enough to separate these closely related genes. METHODOLOGY/PRINCIPAL FINDINGS: We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim to address the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on the ten EST datasets, and the clustering results are compared with the two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. CONCLUSIONS/SIGNIFICANCE: The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement of clustering accuracies on the experimental datasets has supported the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem.

  17. Identification of lethal cluster of genes in the yeast transcription network

    Science.gov (United States)

    Rho, K.; Jeong, H.; Kahng, B.

    2006-05-01

    Identification of essential or lethal genes would be one of the ultimate goals in drug designs. Here we introduce an in silico method to select the cluster with a high population of lethal genes, called lethal cluster, through microarray assay. We construct a gene transcription network based on the microarray expression level. Links are added one by one in the descending order of the Pearson correlation coefficients between two genes. As the link density p increases, two meaningful link densities pm and ps are observed. At pm, which is smaller than the percolation threshold, the number of disconnected clusters is maximum, and the lethal genes are highly concentrated in a certain cluster that needs to be identified. Thus the deletion of all genes in that cluster could efficiently lead to a lethal inviable mutant. This lethal cluster can be identified by an in silico method. As p increases further beyond the percolation threshold, the power law behavior in the degree distribution of a giant cluster appears at ps. We measure the degree of each gene at ps. With the information pertaining to the degrees of each gene at ps, we return to the point pm and calculate the mean degree of genes of each cluster. We find that the lethal cluster has the largest mean degree.
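    The first stage of the procedure described here (adding links in descending order of Pearson correlation and locating the link density pm at which the number of clusters peaks) can be sketched with a union-find structure. The Python below is a minimal illustration under stated assumptions: the function names and toy expression profiles are hypothetical, a "cluster" is taken to mean a connected component with at least two genes, and the later steps (degrees measured at ps, mean degree per cluster) are not covered.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def find(parent, i):
    """Union-find root lookup with path halving."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def cluster_counts(profiles):
    """Add links by descending correlation; record the number of clusters
    (components with >= 2 genes, an assumption) after each added link."""
    genes = list(profiles)
    pairs = sorted(
        combinations(range(len(genes)), 2),
        key=lambda p: pearson(profiles[genes[p[0]]], profiles[genes[p[1]]]),
        reverse=True,
    )
    parent = list(range(len(genes)))
    size = [1] * len(genes)
    n_clusters = 0
    history = []
    for a, b in pairs:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            if size[ra] == 1 and size[rb] == 1:
                n_clusters += 1   # two isolated genes form a new cluster
            elif size[ra] > 1 and size[rb] > 1:
                n_clusters -= 1   # two existing clusters merge
            parent[rb] = ra
            size[ra] += size[rb]
        history.append(n_clusters)
    return history

# Hypothetical expression profiles for four genes.
profiles = {
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3, 5],
    "C": [4, 3, 2, 1],
    "D": [5, 3, 2, 1],
}
history = cluster_counts(profiles)
links_at_peak = history.index(max(history)) + 1
p_m = links_at_peak / len(history)   # link density at the peak
print(history, p_m)
```

    Here the link density is expressed as the fraction of all possible gene pairs that have been linked; the abstract does not define its normalization, so that choice is also an assumption.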

  18. Analysis of a Gibbs sampler method for model-based clustering of gene expression data.

    Science.gov (United States)

    Joshi, Anagha; Van de Peer, Yves; Michoel, Tom

    2008-01-15

    Over the last decade, a large variety of clustering algorithms have been developed to detect coregulatory relationships among genes from microarray gene expression data. Model-based clustering approaches have emerged as statistically well-grounded methods, but the properties of these algorithms when applied to large-scale data sets are not always well understood. An in-depth analysis can reveal important insights about the performance of the algorithm, the expected quality of the output clusters, and the possibilities for extracting more relevant information out of a particular data set. We have extended an existing algorithm for model-based clustering of genes to simultaneously cluster genes and conditions, and used three large compendia of gene expression data for Saccharomyces cerevisiae to analyze its properties. The algorithm uses a Bayesian approach and a Gibbs sampling procedure to iteratively update the cluster assignment of each gene and condition. For large-scale data sets, the posterior distribution is strongly peaked on a limited number of equiprobable clusterings. A GO annotation analysis shows that these local maxima are all biologically equally significant, and that simultaneously clustering genes and conditions performs better than only clustering genes and assuming independent conditions. A collection of distinct equivalent clusterings can be summarized as a weighted graph on the set of genes, from which we extract fuzzy, overlapping clusters using a graph spectral method. The cores of these fuzzy clusters contain tight sets of strongly coexpressed genes, while the overlaps exhibit relations between genes showing only partial coexpression. GaneSh, a Java package for coclustering, is available under the terms of the GNU General Public License from our website at http://bioinformatics.psb.ugent.be/software
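    The idea of summarizing several equivalent clusterings as a weighted graph on the set of genes can be illustrated with a short Python sketch. It assumes, as one plausible reading, that the edge weight between two genes is the fraction of clusterings in which they share a cluster; the abstract does not state the weighting actually used, and the function name and gene labels are hypothetical. The graph spectral step that extracts fuzzy clusters is not shown.

```python
from collections import Counter
from itertools import combinations

def coclustering_weights(clusterings):
    """Map each gene pair to the fraction of clusterings placing both
    genes in the same cluster (pairs never co-clustered are omitted)."""
    counts = Counter()
    for clustering in clusterings:
        for cluster in clustering:
            for pair in combinations(sorted(cluster), 2):
                counts[pair] += 1
    return {pair: c / len(clusterings) for pair, c in counts.items()}

# Two hypothetical clusterings of the same four genes.
clusterings = [
    [{"g1", "g2", "g3"}, {"g4"}],
    [{"g1", "g2"}, {"g3", "g4"}],
]
weights = coclustering_weights(clusterings)
for pair in sorted(weights):
    print(pair, weights[pair])
```

    In this toy case g1 and g2 are linked with weight 1.0 and form a tight core, while g3 has weight 0.5 to several partners, which is the kind of partial overlap the abstract attributes to genes with only partial coexpression.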

  19. Fine Needle Aspiration Biopsies for Gene Expression Ratio-based Diagnostic and Prognostic Tests in Malignant Pleural Mesothelioma

    Science.gov (United States)

    De Rienzo, Assunta; Dong, Lingsheng; Yeap, Beow Y.; Jensen, Roderick V.; Richards, William G.; Gordon, Gavin J.; Sugarbaker, David J.; Bueno, Raphael

    2010-01-01

    Purpose Malignant pleural mesothelioma (MPM) is an aggressive disease associated with median survival between 9 and 12 months. The correct diagnosis of MPM is sometimes challenging and usually requires solid tissue biopsies rather than fine needle aspiration biopsies (FNA). We postulated that the accuracy of FNA-based diagnosis might be improved by the addition of molecular tests using a gene expression ratio-based algorithm and that prognostic tests could be similarly performed. Experimental Design Two MPM and two lung cancer cell lines were used to establish the minimal RNA amount required for ratio tests. Based on these results, 276 ex-vivo FNA biopsies from 63 MPM patients, and 250 ex-vivo FNA samples from 92 lung cancer patients were analyzed using previously described diagnostic and prognostic tests based on gene expression ratios. Results We found that the sensitivity of the diagnostic test for MPM was 100% (95% CI: 95–100%), and the specificity in primary lung adenocarcinoma was 90% (95% CI: 81–95%). The FNA-based prognostic classification was concordant among 76% (95% CI: 65–87%) of patients with the risk assignment in a subset of the matched surgical specimens previously analyzed by the prognostic test. Conclusions Sufficient RNA can be extracted from most FNA biopsies to perform gene expression molecular tests. In particular, we show that the gene expression ratio algorithms performed well when applied to diagnosis and prognosis in MPM. This study provides support for the development of additional RNA molecular tests that may enhance the utility of FNA in the management of other solid cancers. PMID:21088255

  20. Detecting clusters of different geometrical shapes in microarray gene expression data.

    Science.gov (United States)

    Kim, Dae-Won; Lee, Kwang H; Lee, Doheon

    2005-05-01

    Clustering has been used as a popular technique for finding groups of genes that show similar expression patterns under multiple experimental conditions. Many clustering methods have been proposed for clustering gene-expression data, including hierarchical clustering, k-means clustering and the self-organizing map (SOM). However, the conventional methods are limited in identifying different shapes of clusters because they use a fixed distance norm when calculating the distance between genes. The fixed distance norm imposes a fixed geometrical shape on the clusters regardless of the actual data distribution. Thus, different distance norms are required for handling the different shapes of clusters. We present the Gustafson-Kessel (GK) clustering method for microarray gene-expression data. To detect clusters of different shapes in a dataset, we use an adaptive distance norm that is calculated by a fuzzy covariance matrix (F) of each cluster in which the eigenstructure of F is used as an indicator of the shape of the cluster. Moreover, the GK method is less prone to falling into local minima than the k-means and SOM because it makes decisions through the use of membership degrees of a gene to clusters. The algorithmic procedure is accomplished by the alternating optimization technique, which iteratively improves a sequence of sets of clusters until no further improvement is possible. To test the performance of the GK method, we applied the GK method and well-known conventional methods to three recently published yeast datasets, and compared the performance of each method using the Saccharomyces Genome Database annotations. The clustering results of the GK method are more significantly relevant to the biological annotations than those of the other methods, demonstrating its effectiveness and potential for clustering gene-expression data. The software was developed using Java language, and can be executed on the platforms that JVM (Java Virtual Machine) is running. It is
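    The adaptive distance norm at the heart of the Gustafson-Kessel method can be illustrated for two-dimensional data with a small Python sketch. It uses the standard textbook form of the method: a fuzzy covariance matrix F weighted by membership degrees raised to a fuzzifier m, and a norm-inducing matrix A = (rho * det F)^(1/n) * F^(-1). The function names, the fuzzifier m = 2, the volume constant rho = 1 and the toy points are assumptions for illustration and are not taken from the authors' Java software.

```python
def fuzzy_covariance(points, memberships, center, m=2.0):
    """Fuzzy covariance matrix F of one cluster for 2-D points.

    Each point contributes with weight u**m, where u is its membership
    degree in the cluster and m is the fuzzifier.
    """
    weights = [u ** m for u in memberships]
    total = sum(weights)
    sxx = sxy = syy = 0.0
    for (x, y), w in zip(points, weights):
        dx, dy = x - center[0], y - center[1]
        sxx += w * dx * dx
        sxy += w * dx * dy
        syy += w * dy * dy
    return [[sxx / total, sxy / total], [sxy / total, syy / total]]

def gk_distance_sq(point, center, F, rho=1.0):
    """Squared GK distance (x - v)^T A (x - v) for 2-D data,
    with A = (rho * det F)^(1/2) * F^(-1); requires det F > 0."""
    (a, b), (c, d) = F
    det = a * d - b * c
    scale = (rho * det) ** 0.5 / det       # (rho*det)^(1/n) / det with n = 2
    A = [[scale * d, -scale * b], [-scale * c, scale * a]]
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * (A[0][0] * dx + A[0][1] * dy) + dy * (A[1][0] * dx + A[1][1] * dy)

# Hypothetical cluster elongated along the x-axis, centered at the origin.
points = [(-2.0, 0.0), (2.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
memberships = [1.0, 1.0, 1.0, 1.0]
center = (0.0, 0.0)
F = fuzzy_covariance(points, memberships, center)
print(F)
print(gk_distance_sq((2.0, 0.0), center, F))
print(gk_distance_sq((0.0, 1.0), center, F))
```

    Because the norm adapts to the cluster's covariance, a point far out along the elongated axis and a nearer point along the short axis can end up equally distant from the center, which a fixed Euclidean norm cannot express.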

  1. Implementing targeted expectant management in fertility care using prognostic modelling: a cluster randomized trial with a multifaceted strategy.

    Science.gov (United States)

    Kersten, F A M; Nelen, W L D M; van den Boogaard, N M; van Rumste, M M; Koks, C A; IntHout, J; Verhoeve, H R; Pelinck, M J; Boks, D E S; Gianotten, J; Broekmans, F J M; Goddijn, M; Braat, D D M; Mol, B W J; Hermens, R P G M

    2017-08-01

    What is the effectiveness of a multifaceted implementation strategy compared to usual care on improving the adherence to guideline recommendations on expectant management for couples with unexplained infertility? The multifaceted implementation strategy did not significantly increase adherence to guideline recommendations on expectant management compared to care as usual. Intrauterine insemination (IUI) with or without ovarian hyperstimulation has no beneficial effect compared to no treatment for 6 months after the fertility work-up for couples with unexplained infertility and a good prognosis of natural conception. Therefore, various professionals and policy makers have advocated the use of prognostic profiles and expectant management in guideline recommendations. A cluster randomized controlled trial in 25 clinics in the Netherlands was conducted between March 2013 and May 2014. Clinics were randomized between the implementation strategy (intervention, n = 13) and care as usual (control, n = 12). The effect of the implementation strategy was evaluated by comparing baseline and effect measurement data. Data collection was retrospective and obtained from medical record research and a patient questionnaire. A total of 544 couples were included at baseline and 485 at the effect measurement (247 intervention group/238 control group). Guideline adherence increased from 49 to 69% (OR 2.66; 95% CI 1.45-4.89) in the intervention group, and from 49 to 61% (OR 2.03; 95% CI 1.38-3.00) in the control group. Multilevel analysis with case-mix adjustment showed that the difference of 8% was not statistically significant (OR 1.31; 95% CI 0.67-2.59). The ongoing pregnancy rate within six months after fertility work-up did not significantly differ between intervention and control group (25% versus 27%: OR 0.72; 95% CI 0.40-1.27). There is a possible selection bias, couples included in the study had a higher socio-economic status than non-responders. How this affects guideline

  2. Transcriptional organization of the phycocyanin subunit gene clusters of the cyanobacterium Anacystis nidulans UTEX 625.

    Science.gov (United States)

    Kalla, S R; Lind, L K; Lidholm, J; Gustafsson, P

    1988-01-01

    The phycocyanin subunit gene cluster is duplicated on the chromosome of the cyanobacterium Anacystis nidulans UTEX 625. The two gene clusters cpcB1A1 (left) and cpcB2A2 (right) are separated by about 2,500 base pairs, and in each cluster the beta-subunit gene is located upstream from the alpha-subunit gene. Filter hybridizations with phycocyanin-specific probes to total RNA detected at least two major transcripts that were 1,300 to 1,400 nucleotides long. Besides these major mRNA species, two minor transcripts of 3,400 and 3,700 nucleotides covering one of the gene clusters and the region between the clusters were found. No additional minor transcripts were found in the intergenic region between the two phycocyanin gene clusters. The lengths of the major mRNAs indicated that the beta- and alpha-subunit genes were cotranscribed. No apparent homologies were found when the DNA sequences located upstream from the proposed ribosome-binding site of the two phycocyanin beta-subunit genes were compared. Northern hybridizations with gene cluster-specific probes from the regions 5' of the beta-subunit genes, as well as S1 nuclease mapping and mRNA primer extension experiments, showed that both gene clusters were transcribed. The minor transcripts were found to initiate upstream from the left gene cluster. Two mRNA 5' ends were mapped upstream from the cpcB1A1 gene cluster, while only one 5' end was mapped in front of the cpcB2A2 gene cluster. All transcripts were present in RNA preparations from cultures grown under high levels of white light as well as under low levels of red light. The level of phycocyanin-specific mRNA, measured as part of the total RNA, was lower under low levels of red light compared with that under high levels of white light. Conserved sequence motifs were found when the promoter region of the cpcB1A1 gene cluster and promoter regions from other cyanobacterial photosynthesis genes were compared. The DNA sequences covering the proposed transcriptional

  3. Deletion and gene expression analyses define the paxilline biosynthetic gene cluster in Penicillium paxilli.

    Science.gov (United States)

    Scott, Barry; Young, Carolyn A; Saikia, Sanjay; McMillan, Lisa K; Monahan, Brendon J; Koulman, Albert; Astin, Jonathan; Eaton, Carla J; Bryant, Andrea; Wrenn, Ruth E; Finch, Sarah C; Tapper, Brian A; Parker, Emily J; Jameson, Geoffrey B

    2013-08-14

    The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses were used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The relationship of indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B, can be found as two disparate clades, not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis.

  4. The gsdf gene locus harbors evolutionary conserved and clustered genes preferentially expressed in fish previtellogenic oocytes.

    Science.gov (United States)

    Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques

    2011-02-01

    display a different cellular localization compared to that of the gsdf gene, indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1). Copyright © 2010 Elsevier B.V. All rights reserved.

  5. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. Using MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would allow testing whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
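
    The pairwise homology figures quoted in this abstract are percent identities over aligned sequences. As a rough illustration only, the sketch below computes percent identity for two pre-aligned sequences of equal length. The function name, the gap character, and the choice of denominator (columns where at least one sequence has a residue) are assumptions made for this example; they are not taken from the paper, and MEGA 4 or BioEdit may use a different convention.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences (hypothetical helper).

    Columns where both sequences carry a gap are ignored; every other
    column counts toward the denominator. This convention is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue  # gap-only column carries no information
        compared += 1
        # A match can only be residue-to-residue here, because
        # gap-to-gap columns were skipped above.
        matches += a == b

    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared
```

    Applied to every pair among four aligned gene sequences, a helper like this would yield the kind of identity range reported above; the actual values depend on the alignment and on how gaps are treated.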

  6. An Ergot Alkaloid Biosynthesis Gene and Clustered Hypothetical Genes from Aspergillus fumigatus

    Science.gov (United States)

    Coyle, Christine M.; Panaccione, Daniel G.

    2005-01-01

    The ergot alkaloids are a family of indole-derived mycotoxins with a variety of significant biological activities. Aspergillus fumigatus, a common airborne fungus and opportunistic human pathogen, and several fungi in the relatively distant taxon Clavicipitaceae (clavicipitaceous fungi) produce different sets of ergot alkaloids. The ergot alkaloids of these divergent fungi share a four-member ergoline ring but differ in the number, type, and position of the side chains. Several genes required for ergot alkaloid production are known in the clavicipitaceous fungi, and these genes are clustered in the genome of the ergot fungus Claviceps purpurea. We investigated whether the ergot alkaloids of A. fumigatus have a common biosynthetic and genetic origin with those of the clavicipitaceous fungi. A homolog of dmaW, the gene controlling the determinant step in the ergot alkaloid pathway of clavicipitaceous fungi, was identified in the A. fumigatus genome. Knockout of dmaW eliminated all known ergot alkaloids from A. fumigatus, and complementation of the mutation restored ergot alkaloid production. Clustered with dmaW in the A. fumigatus genome are sequences corresponding to five genes previously proposed to encode steps in the ergot alkaloid pathway of C. purpurea, as well as additional sequences whose deduced protein products are consistent with their involvement in the ergot alkaloid pathway. The corresponding genes have similarities in their nucleotide sequences, but the orientations and positions within the cluster of several of these genes differ. The data indicate that the ergot alkaloid biosynthetic capabilities in A. fumigatus and the clavicipitaceous fungi had a common origin. PMID:15933009

  7. Colon cancer and gene alterations: their immunological implications and suggestions for prognostic indices and improvements in biotherapy.

    Science.gov (United States)

    Contasta, Ida; Pellegrini, Patrizia; Berghella, Anna Maria; Del Beato, Tiziana; Adorno, Domenico

    2006-10-01

    Studies have shown that changes occur in c-Ki-ras, p53, and Bcl2 gene structure and function during the various stages of human colon carcinogenesis. Alterations of these genes are responsible for the establishment of a state of continuous stimulus for cell division and apoptotic inhibition at physiological and pharmacological levels. This paper focuses on the results of our research aimed at investigating how these gene alterations influence tumoral mechanisms on an immunological level and how immunological parameters can be used as prognostic markers for the passage of normal tissue to adenoma and adenoma to carcinoma. Overall, our data suggest that an alteration in the c-Ki-ras gene results in a switch to a suppressive type of immune response, determining an impairment of immune cell activation at both antigen-presenting-cell and T-cell levels. c-Ki-ras gene mutations, p53 deletions, and Bcl2 expression, on the other hand, can be used as prognostic markers for the passage of normal tissue to adenoma and adenoma to carcinoma. The p53 oncogene does not appear to impair patients' immunological response further. In conclusion, an evaluation of c-Ki-ras, rather than p53 gene alterations, would seem to be more relevant in colon cancer prevention programs and biotherapy improvement.

  8. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Science.gov (United States)

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
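
    The evaluation described in this abstract scores each clustering against known classes with the adjusted Rand index. The sketch below is a minimal standard-library version of that index (the Hubert and Arabie pair-counting form), included only to make the measure concrete. The function name and the handling of degenerate inputs are choices made for this example, not details from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred) -> float:
    """Adjusted Rand index between two labelings of the same samples."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: treated as trivial agreement

    # Pairs of samples placed together in both labelings (contingency cells).
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values()
    )
    # Pairs placed together within each labeling separately.
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (together_both - expected) / (maximum - expected)
```

    A value of 1 means the clustering reproduces the known classes up to relabeling, values near 0 are what random assignment would give on average, and negative values indicate agreement worse than chance.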

  9. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias

    2010-10-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Result: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that

  10. Is K-ras gene mutation a prognostic factor for colorectal cancer: a systematic review and meta-analysis.

    Science.gov (United States)

    Ren, JiaoJiao; Li, GuangXiao; Ge, Jie; Li, Xia; Zhao, YaShuang

    2012-08-01

    The K-ras gene is one of the commonly mutated oncogenes associated with colorectal cancer. However, its prognostic significance for patients with colorectal cancer remains inconclusive. To derive a more precise estimation of the prognostic significance of K-ras gene mutations, a systematic review and meta-analysis were performed. We searched PubMed, Embase, and the Cochrane databases from January 1992 to November 2011. The prognostic value of K-ras gene mutations was examined in patients with colorectal cancer who did not receive preoperative chemotherapy or radiation. The effect of K-ras gene mutations on the overall survival was measured by the HR and 95% CIs. The pooled HR for the association between K-ras gene mutations and overall survival in patients with colorectal cancer was 1.04 (95% CI: 0.99-1.10, p = 0.11). Subgroup analysis showed significant reductions in the overall survival associated with mutations at K-ras codon 12, the articles that reported HR directly, and the studies published before and after 2005, although publication bias was present. All the associations disappeared after adjustment with the trim-and-fill method. The pooled HR of 3 studies examining mutations at K-ras codon 13 was 1.47 (95% CI: 1.09-1.97, p = 0.02), and no publication bias was observed. No significant association was observed in different study regions. The heterogeneity in the study populations is a potential problem; the use of different staging systems or small groups of different stages may contribute to heterogeneity, and residual confounding may have influenced the results in those studies that did not completely adjust for other factors. Overall, K-ras gene mutations seem not to correlate with the prognosis of patients with colorectal cancer. The association remains to be confirmed with a more precise analysis of a large sample.
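
    A pooled HR with a 95% CI, as reported in this abstract, is typically obtained by combining per-study log hazard ratios. The sketch below shows one common way to do this, fixed-effect inverse-variance pooling, with each study's standard error recovered from its reported confidence interval. Whether the authors used a fixed-effect or a random-effects model is not stated in the abstract, so the model choice, the function name, and the input tuples are all assumptions of this example.

```python
from math import exp, log, sqrt


def pooled_hazard_ratio(studies, z: float = 1.96):
    """Fixed-effect inverse-variance pooling of hazard ratios (illustrative).

    `studies` is an iterable of (hr, ci_lower, ci_upper) tuples, where the
    interval is assumed to be a symmetric 95% CI on the log scale.
    Returns (pooled_hr, pooled_ci_lower, pooled_ci_upper).
    """
    weight_sum = 0.0
    weighted_log_sum = 0.0
    for hr, lower, upper in studies:
        # Standard error of log(HR), recovered from the CI width.
        se = (log(upper) - log(lower)) / (2 * z)
        weight = 1.0 / se**2
        weight_sum += weight
        weighted_log_sum += weight * log(hr)

    pooled_log = weighted_log_sum / weight_sum
    pooled_se = sqrt(1.0 / weight_sum)
    return (
        exp(pooled_log),
        exp(pooled_log - z * pooled_se),
        exp(pooled_log + z * pooled_se),
    )
```

    For instance, pooling two hypothetical studies that each report HR 1.5 (95% CI 1.0-2.25) leaves the point estimate at 1.5 and narrows the interval, since the combined weight is larger than either study's alone.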

  11. A novel volume-age-KPS (VAK) glioblastoma classification identifies a prognostic cognate microRNA-gene signature.

    Science.gov (United States)

    Zinn, Pascal O; Sathyan, Pratheesh; Mahajan, Bhanu; Bruyere, John; Hegi, Monika; Majumder, Sadhan; Colen, Rivka R

    2012-01-01

    Several studies have established Glioblastoma Multiforme (GBM) prognostic and predictive models based on age and Karnofsky Performance Status (KPS), while very few studies evaluated the prognostic and predictive significance of preoperative MR-imaging. However, to date, there is no simple preoperative GBM classification that also correlates with a highly prognostic genomic signature. Thus, we present for the first time a biologically relevant, and clinically applicable tumor Volume, patient Age, and KPS (VAK) GBM classification that can easily and non-invasively be determined upon patient admission. We quantitatively analyzed the volumes of 78 GBM patient MRIs present in The Cancer Imaging Archive (TCIA) corresponding to patients in The Cancer Genome Atlas (TCGA) with VAK annotation. The variables were then combined using a simple 3-point scoring system to form the VAK classification. A validation set (N = 64) from both the TCGA and Rembrandt databases was used to confirm the classification. Transcription factor and genomic correlations were performed using the gene pattern suite and Ingenuity Pathway Analysis. VAK-A and VAK-B classes showed significant median survival differences in discovery (P = 0.007) and validation sets (P = 0.008). VAK-A is significantly associated with P53 activation, while VAK-B shows significant P53 inhibition. Furthermore, a molecular gene signature comprised of a total of 25 genes and microRNAs was significantly associated with the classes and predicted survival in an independent validation set (P = 0.001). A favorable MGMT promoter methylation status resulted in a 10.5 months additional survival benefit for VAK-A compared to VAK-B patients. The non-invasively determined VAK classification with its implication of VAK-specific molecular regulatory networks, can serve as a very robust initial prognostic tool, clinical trial selection criteria, and important step toward the refinement of genomics-based personalized therapy

  12. A novel volume-age-KPS (VAK) glioblastoma classification identifies a prognostic cognate microRNA-gene signature.

    Directory of Open Access Journals (Sweden)

    Pascal O Zinn

    BACKGROUND: Several studies have established Glioblastoma Multiforme (GBM) prognostic and predictive models based on age and Karnofsky Performance Status (KPS), while very few studies evaluated the prognostic and predictive significance of preoperative MR-imaging. However, to date, there is no simple preoperative GBM classification that also correlates with a highly prognostic genomic signature. Thus, we present for the first time a biologically relevant, and clinically applicable tumor Volume, patient Age, and KPS (VAK) GBM classification that can easily and non-invasively be determined upon patient admission. METHODS: We quantitatively analyzed the volumes of 78 GBM patient MRIs present in The Cancer Imaging Archive (TCIA) corresponding to patients in The Cancer Genome Atlas (TCGA) with VAK annotation. The variables were then combined using a simple 3-point scoring system to form the VAK classification. A validation set (N = 64) from both the TCGA and Rembrandt databases was used to confirm the classification. Transcription factor and genomic correlations were performed using the gene pattern suite and Ingenuity Pathway Analysis. RESULTS: VAK-A and VAK-B classes showed significant median survival differences in discovery (P = 0.007) and validation sets (P = 0.008). VAK-A is significantly associated with P53 activation, while VAK-B shows significant P53 inhibition. Furthermore, a molecular gene signature comprised of a total of 25 genes and microRNAs was significantly associated with the classes and predicted survival in an independent validation set (P = 0.001). A favorable MGMT promoter methylation status resulted in a 10.5 months additional survival benefit for VAK-A compared to VAK-B patients. CONCLUSIONS: The non-invasively determined VAK classification with its implication of VAK-specific molecular regulatory networks, can serve as a very robust initial prognostic tool, clinical trial selection criteria, and important step toward

  13. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii.

    OpenAIRE

    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R

    1989-01-01

    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include ...

  14. A tripartite clustering analysis on microRNA, gene and disease model.

    Science.gov (United States)

    Shen, Chengcheng; Liu, Ying

    2012-02-01

    Alteration of gene expression in response to regulatory molecules or mutations could lead to different diseases. MicroRNAs (miRNAs) have been discovered to be involved in regulation of gene expression and a wide variety of diseases. In a tripartite biological network of human miRNAs, their predicted target genes and the diseases caused by altered expressions of these genes, valuable knowledge about the pathogenicity of miRNAs, involved genes and related disease classes can be revealed by co-clustering miRNAs, target genes and diseases simultaneously. Tripartite co-clustering can lead to more informative results than traditional co-clustering with only two kinds of members and pass the hidden relational information along the relation chain by considering multi-type members. Here we report a spectral co-clustering algorithm for k-partite graph to find clusters with heterogeneous members. We use the method to explore the potential relationships among miRNAs, genes and diseases. The clusters obtained from the algorithm have significantly higher density than randomly selected clusters, which means members in the same cluster are more likely to have common connections. Results also show that miRNAs in the same family based on the hairpin sequences tend to belong to the same cluster. We also validate the clustering results by checking the correlation of enriched gene functions and disease classes in the same cluster. Finally, widely studied miR-17-92 and its paralogs are analyzed as a case study to reveal that genes and diseases co-clustered with the miRNAs are in accordance with current research findings.

  15. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C

    2007-05-01

    Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared, poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  16. Recursive cluster elimination (RCE) for classification and feature selection from gene expression data.

    Science.gov (United States)

    Yousef, Malik; Jung, Segun; Showe, Louise C; Showe, Michael K

    2007-05-02

    Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared, poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together provide greater insight into the structure of the
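
    The elimination loop at the heart of recursive cluster elimination can be sketched in a few lines. The example below is not SVM-RCE itself: to stay dependency-free it takes the gene clusters as input instead of producing them with K-means, and it scores each cluster with a leave-one-out nearest-centroid classifier standing in for the SVM. All function names, the scoring rule, and the toy data layout are assumptions made for illustration.

```python
from statistics import mean


def centroid_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    Only the gene indices in `genes` are used. This classifier is a
    simple stand-in for the SVM used in SVM-RCE.
    """
    classes = sorted(set(labels))
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Class centroids computed without the held-out sample i.
        centroids = {}
        for c in classes:
            rows = [s for j, s in enumerate(samples) if j != i and labels[j] == c]
            if rows:
                centroids[c] = [mean(r[g] for r in rows) for g in genes]
        # Predict the class with the closest centroid (squared distance).
        predicted = min(
            centroids,
            key=lambda c: sum((x[g] - m) ** 2 for g, m in zip(genes, centroids[c])),
        )
        correct += predicted == y
    return correct / len(samples)


def recursive_cluster_elimination(samples, labels, clusters, keep=1):
    """Repeatedly drop the lowest-scoring gene cluster until `keep` remain.

    `clusters` maps a cluster name to a list of gene (column) indices.
    """
    remaining = dict(clusters)
    while len(remaining) > keep:
        scores = {
            name: centroid_accuracy(samples, labels, genes)
            for name, genes in remaining.items()
        }
        worst = min(scores, key=scores.get)  # least useful cluster this round
        del remaining[worst]
    return remaining
```

    In the published method the clusters would come from K-means on the expression matrix and the score from SVM performance; swapping those in changes the two helper steps but not the shape of the loop.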

  17. Variation in sequence and location of the fumonisin mycotoxin biosynthetic gene cluster in Fusarium

    NARCIS (Netherlands)

    Proctor, R.H.; Hove, van F.; Susca, A.; Stea, A.; Busman, M.; Lee, van der T.A.J.; Waalwijk, C.; Moretti, A.

    2010-01-01

    In Fusarium, the ability to produce fumonisins is governed by a 17-gene fumonisin biosynthetic gene (FUM) cluster. Here, we examined the cluster in F. oxysporum strain O-1890 and nine other species selected to represent a wide range of the genetic diversity within the GFSC.

  18. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants.

    Science.gov (United States)

    Schläpfer, Pascal; Zhang, Peifen; Wang, Chuan; Kim, Taehyong; Banf, Michael; Chae, Lee; Dreher, Kate; Chavali, Arvind K; Nilo-Poyanco, Ricardo; Bernard, Thomas; Kahn, Daniel; Rhee, Seung Y

    2017-04-01

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. © 2017 American Society of Plant Biologists. All Rights Reserved.

  19. Genome-Wide Prediction of Metabolic Enzymes, Pathways, and Gene Clusters in Plants

    Science.gov (United States)

    Zhang, Peifen; Kim, Taehyong; Banf, Michael; Chavali, Arvind K.; Nilo-Poyanco, Ricardo; Bernard, Thomas

    2017-01-01

    Plant metabolism underpins many traits of ecological and agronomic importance. Plants produce numerous compounds to cope with their environments but the biosynthetic pathways for most of these compounds have not yet been elucidated. To engineer and improve metabolic traits, we need comprehensive and accurate knowledge of the organization and regulation of plant metabolism at the genome scale. Here, we present a computational pipeline to identify metabolic enzymes, pathways, and gene clusters from a sequenced genome. Using this pipeline, we generated metabolic pathway databases for 22 species and identified metabolic gene clusters from 18 species. This unified resource can be used to conduct a wide array of comparative studies of plant metabolism. Using the resource, we discovered a widespread occurrence of metabolic gene clusters in plants: 11,969 clusters from 18 species. The prevalence of metabolic gene clusters offers an intriguing possibility of an untapped source for uncovering new metabolite biosynthesis pathways. For example, more than 1,700 clusters contain enzymes that could generate a specialized metabolite scaffold (signature enzymes) and enzymes that modify the scaffold (tailoring enzymes). In four species with sufficient gene expression data, we identified 43 highly coexpressed clusters that contain signature and tailoring enzymes, of which eight were characterized previously to be functional pathways. Finally, we identified patterns of genome organization that implicate local gene duplication and, to a lesser extent, single gene transposition as having played roles in the evolution of plant metabolic gene clusters. PMID:28228535

  20. Fragmentation of an aflatoxin-like gene cluster in a forest pathogen

    Science.gov (United States)

    Secondary metabolic pathway genes are typically clustered in fungi. An exception to this paradigm is seen for genes required for the production of dothistromin, an aflatoxin-like virulence factor produced by the pine needle pathogen Dothistroma septosporum. In contrast to the tight clustering of gen...

  1. The complete coenzyme B12 biosynthesis gene cluster of Lactobacillus reuteri CRL 1098

    NARCIS (Netherlands)

    Santos, dos F.; Vera, J.L.; Heijden, van der R.; Valdez, G.F.; Vos, de W.M.; Sesma, F.; Hugenholtz, J.

    2008-01-01

    The coenzyme B12 production pathway in Lactobacillus reuteri has been deduced using a combination of genetic, biochemical and bioinformatics approaches. The coenzyme B12 gene cluster of Lb. reuteri CRL1098 has the unique feature of clustering together the cbi, cob and hem genes. It consists of 29

  2. A phylogenomic gene cluster resource: The phylogenetically inferred groups (PhIGs) database

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Boore, Jeffrey L.

    2005-08-25

    We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.

  3. Organization and differential regulation of a cluster of lignin peroxidase genes of Phanerochaete chrysosporium

    Science.gov (United States)

    Philip Stewart; Daniel Cullen

    1999-06-01

    The lignin peroxidases of Phanerochaete chrysosporium are encoded by a minimum of 10 closely related genes. Physical and genetic mapping of a cluster of eight lip genes revealed six genes occurring in pairs and transcriptionally convergent, suggesting that portions of the lip family arose by gene duplication events. The completed sequence of lipG and lipJ, together...

  4. Clusters of adjacent and similarly expressed genes across normal human tissues complicate comparative transcriptomic discovery.

    Science.gov (United States)

    Liu, Chang; Ghosh, Sujoy; Searls, David B; Saunders, Ann M; Cossman, Jeffrey; Roses, Allen D

    2005-01-01

    Transcriptomic techniques are valuable tools with which to validate genetic and biological hypotheses and are now widely available for research. However, with the exception of tumor biology, comparative genomics analyses have been difficult to use as discovery engines to describe biologically relevant expression changes. We propose that physical proximity of human genes correlates with similar mRNA expression, so that increased expression might include a disease-relevant gene and many other genes in the adjacent region. To increase the efficiency of combining susceptibility gene mapping and interpretation of transcriptomics, we developed a method to identify clusters of adjacent and similarly expressed genes. Gene expression profiles for 28,945 genes across 101 normal human tissues were obtained from the Gene Logic BioExpress system. The expression similarity for genes in sliding-windows was measured using average pair-wise Pearson correlation coefficients. We identified 187 clusters (p < 10e-4) of co-regulated genes, including 2648 genes, or 9.1% of all genes considered and termed these "clusters of adjacent and similarly expressed genes" (CASEGs). Genes in 15 (8.2%) of these clusters demonstrate a significant co-expression enrichment (p < 10e-10). This study demonstrates the coordinate expression of neighboring genes and provides a comprehensive view of expression-based compartmentalization of the human genome, which can be overlaid on genetic susceptibility gene maps.
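
    The sliding-window similarity measure described above can be sketched as follows. This is a minimal illustration, assuming genes are already ordered by chromosomal position and each gene has one expression value per tissue. The function names `pearson` and `window_scores` and the toy profiles are hypothetical and are not taken from the study, whose significance testing is not reproduced here.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def window_scores(profiles, window):
    """Average pairwise Pearson correlation for each window of adjacent genes.

    profiles: list of per-gene expression vectors, in chromosomal order.
    window:   number of adjacent genes per window (must be at least 2).
    """
    scores = []
    for start in range(len(profiles) - window + 1):
        block = profiles[start:start + window]
        pairs = list(combinations(block, 2))
        scores.append(sum(pearson(a, b) for a, b in pairs) / len(pairs))
    return scores


# Toy data (hypothetical): four adjacent genes measured in four tissues.
toy_profiles = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],
    [4, 3, 2, 1],
    [8, 6, 4, 2],
]
scores = window_scores(toy_profiles, window=2)
```

    Windows whose average correlation is high would be candidates for clusters of adjacent and similarly expressed genes; how a threshold is chosen is left open in this sketch.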

  5. A modified recombineering protocol for the genetic manipulation of gene clusters in Aspergillus fumigatus.

    Directory of Open Access Journals (Sweden)

    Laura Alcazar-Fuoli

    Genomic analyses of fungal genome structure have revealed the presence of physically-linked groups of genes, termed gene clusters, where collective functionality of encoded gene products serves a common biosynthetic purpose. In multiple fungal pathogens of humans and plants gene clusters have been shown to encode pathways for biosynthesis of secondary metabolites including metabolites required for pathogenicity. In the major mould pathogen of humans Aspergillus fumigatus, multiple clusters of co-ordinately upregulated genes were identified as having heightened transcript abundances, relative to laboratory cultured equivalents, during the early stages of murine infection. The aim of this study was to develop and optimise a methodology for manipulation of gene cluster architecture, thereby providing the means to assess their relevance to fungal pathogenicity. To this end we adapted a recombineering methodology which exploits lambda phage-mediated recombination of DNA in bacteria, for the generation of gene cluster deletion cassettes. By exploiting a pre-existing bacterial artificial chromosome (BAC) library of A. fumigatus genomic clones we were able to implement single or multiple intra-cluster gene replacement events at both subtelomeric and telomere distal chromosomal locations, in both wild type and highly recombinogenic A. fumigatus isolates. We then applied the methodology to address the boundaries of a gene cluster producing a nematocidal secondary metabolite, pseurotin A, and to address the role of this secondary metabolite in insect and mammalian responses to A. fumigatus challenge.

  6. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    Science.gov (United States)

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  7. Improving the computational efficiency of recursive cluster elimination for gene selection.

    Science.gov (United States)

    Luo, Lin-Kai; Huang, Deng-Feng; Ye, Ling-Jun; Zhou, Qi-Feng; Shao, Gui-Fang; Peng, Hong

    2011-01-01

    Gene expression data usually comprise a large number of genes and a relatively small number of samples, which brings many new challenges. Selecting the informative genes becomes the main issue in microarray data analysis. Recursive cluster elimination based on support vector machine (SVM-RCE) has shown better classification accuracy on some microarray data sets than recursive feature elimination based on support vector machine (SVM-RFE). However, SVM-RCE is extremely time-consuming. In this paper, we propose an improved method of SVM-RCE called ISVM-RCE. ISVM-RCE first trains an SVM model with all clusters, then applies the infinity norm of the weight coefficient vector in each cluster to score the cluster, and finally eliminates the gene clusters with the lowest score. In addition, ISVM-RCE eliminates genes within the clusters instead of removing a cluster of genes when the number of clusters is small. We have tested ISVM-RCE on six gene expression data sets and compared its performance with SVM-RCE and linear-discriminant-analysis-based RFE (LDA-RFE). The experimental results on these data sets show that ISVM-RCE greatly reduces the time cost of SVM-RCE while achieving classification performance comparable to SVM-RCE, whereas LDA-RFE is not stable.
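
    The cluster-scoring step described above can be sketched as follows. This is a minimal illustration, assuming the per-gene weights come from a linear SVM trained elsewhere and that each gene already has a cluster label. The function names, the `n_drop` parameter, and the toy numbers are hypothetical, and the abstract does not state how many clusters are removed per round.

```python
def cluster_scores(weights, cluster_of):
    """Score each gene cluster by the infinity norm of its SVM weights.

    weights:    per-gene weight coefficients (assumed to come from a linear SVM).
    cluster_of: per-gene cluster label, aligned with `weights`.
    """
    scores = {}
    for w, c in zip(weights, cluster_of):
        # Infinity norm = largest absolute weight inside the cluster.
        scores[c] = max(scores.get(c, 0.0), abs(w))
    return scores


def eliminate_lowest(weights, cluster_of, n_drop=1):
    """Drop the n_drop lowest-scoring clusters; return kept gene indices."""
    scores = cluster_scores(weights, cluster_of)
    dropped = set(sorted(scores, key=scores.get)[:n_drop])
    kept = [i for i, c in enumerate(cluster_of) if c not in dropped]
    return kept, dropped


# Toy example (hypothetical weights for five genes in three clusters).
toy_weights = [0.1, -0.9, 0.05, 0.2, -0.3]
toy_clusters = [0, 0, 1, 1, 2]
kept, dropped = eliminate_lowest(toy_weights, toy_clusters)
```

    In a full recursive procedure the model would be retrained on the kept genes and the step repeated; that loop is omitted here.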

  8. A robust approach based on Weibull distribution for clustering gene expression data

    Directory of Open Access Journals (Sweden)

    Gong Binsheng

    2011-05-01

    Background: Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results: In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the expression levels of individual genes are considered as random variables following unique Weibull distributions. Our WDCM is based on the concept that genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions: The results demonstrate that our WDCM produces clusters
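
    The Adjusted Rand Index mentioned above as an external validation measure can be computed as in the sketch below. This is a generic implementation of the standard pair-counting formula, not code from the study; the function name and the toy label vectors are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two clusterings of the same items."""
    n = len(labels_a)
    # Pairs of items placed together in both clusterings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each clustering separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both clusterings put every item in one group
    return (together_both - expected) / (maximum - expected)


# Toy label vectors (hypothetical): reference classes vs. two clusterings.
reference = [0, 0, 1, 1]
relabelled = [1, 1, 0, 0]   # same grouping, different label names
crossed = [0, 1, 0, 1]      # splits every reference class
ari_same = adjusted_rand_index(reference, relabelled)
ari_crossed = adjusted_rand_index(reference, crossed)
```

    A value of 1 means the two groupings agree exactly regardless of label names, values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance.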

  9. CTDGFinder: A Novel Homology-Based Algorithm for Identifying Closely Spaced Clusters of Tandemly Duplicated Genes.

    Science.gov (United States)

    Ortiz, Juan F; Rokas, Antonis

    2017-01-01

    Closely spaced clusters of tandemly duplicated genes (CTDGs) contribute to the diversity of many phenotypes, including chemosensation, snake venom, and animal body plans. CTDGs have traditionally been identified subjectively as genomic neighborhoods containing several gene duplicates in close proximity; however, CTDGs are often highly variable with respect to gene number, intergenic distance, and synteny. This lack of formal definition hampers the study of CTDG evolutionary dynamics and the discovery of novel CTDGs in the exponentially growing body of genomic data. To address this gap, we developed a novel homology-based algorithm, CTDGFinder, which formalizes and automates the identification of CTDGs by examining the physical distribution of individual members of families of duplicated genes across chromosomes. Application of CTDGFinder accurately identified CTDGs for many well-known gene clusters (e.g., Hox and beta-globin gene clusters) in the human, mouse and 20 other mammalian genomes. Differences between previously annotated gene clusters and our inferred CTDGs were due to the exclusion of nonhomologs that have historically been considered parts of specific gene clusters, the inclusion or absence of genes between the CTDGs and their corresponding gene clusters, and the splitting of certain gene clusters into distinct CTDGs. Examination of human genes showing tissue-specific enhancement of their expression by CTDGFinder identified members of several well-known gene clusters (e.g., cytochrome P450s and olfactory receptors) and revealed that they were unequally distributed across tissues. By formalizing and automating CTDG identification, CTDGFinder will facilitate understanding of CTDG evolutionary dynamics, their functional implications, and how they are associated with phenotypic diversity. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
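
    The general idea of grouping family members by their physical distribution along a chromosome can be illustrated with the simplified sketch below. It is not the CTDGFinder algorithm: the gap rule, the `max_gap` and `min_members` parameters, the function name, and the toy gene order are all assumptions made for illustration only.

```python
def tandem_clusters(families, target, max_gap=1, min_members=2):
    """Group members of one gene family that lie close together.

    families:    family label of each gene, in chromosomal order.
    target:      the family whose members should be grouped.
    max_gap:     maximum number of intervening non-member genes allowed
                 between consecutive members (illustrative assumption).
    min_members: minimum number of members for a group to be reported.
    Returns a list of clusters, each a list of gene indices.
    """
    positions = [i for i, fam in enumerate(families) if fam == target]
    clusters, current = [], []
    for pos in positions:
        # Close the current group when too many other genes intervene.
        if current and pos - current[-1] - 1 > max_gap:
            if len(current) >= min_members:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_members:
        clusters.append(current)
    return clusters


# Toy chromosome (hypothetical gene order and family labels).
toy_order = ["hox", "hox", "x", "hox", "y", "z", "w", "hox", "hox"]
found = tandem_clusters(toy_order, "hox", max_gap=1)
```

    With this toy input the first three family members form one group and the last two form another, because three unrelated genes separate them.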

  10. CAR gene cluster and transcript levels of carotenogenic genes in Rhodotorula mucilaginosa.

    Science.gov (United States)

    Landolfo, Sara; Ianiri, Giuseppe; Camiolo, Salvatore; Porceddu, Andrea; Mulas, Giuliana; Chessa, Rossella; Zara, Giacomo; Mannazzu, Ilaria

    2018-01-01

    A molecular approach was applied to the study of the carotenoid biosynthetic pathway of Rhodotorula mucilaginosa. At first, functional annotation of the genome of R. mucilaginosa C2.5t1 was carried out and gene ontology categories were assigned to 4033 predicted proteins. Then, a set of genes involved in different steps of carotenogenesis was identified and those coding for phytoene desaturase, phytoene synthase/lycopene cyclase and carotenoid dioxygenase (CAR genes) proved to be clustered within a region of ~10 kb. Quantitative PCR of the genes involved in carotenoid biosynthesis showed that genes coding for 3-hydroxy-3-methylglutaryl-CoA reductase and mevalonate kinase are induced during exponential phase while no clear trend of induction was observed for phytoene synthase/lycopene cyclase and phytoene dehydrogenase encoding genes. Thus, in R. mucilaginosa the induction of genes involved in the early steps of carotenoid biosynthesis is transient and accompanies the onset of carotenoid production, while that of CAR genes does not correlate with the amount of carotenoids produced. The transcript levels of genes coding for carotenoid dioxygenase, superoxide dismutase and catalase A increased during the accumulation of carotenoids, thus suggesting the activation of a mechanism aimed at the protection of cell structures from oxidative stress during carotenoid biosynthesis. The data presented herein, besides being suitable for the elucidation of the mechanisms that underlie carotenoid biosynthesis, will contribute to boosting the biotechnological potential of this yeast by improving the outcome of further research efforts aimed at also exploring other features of interest.

  11. Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

    Directory of Open Access Journals (Sweden)

    Sakellariou Argiris

    2012-10-01

    Background: A feature selection method in microarray gene expression data should be independent of platform, disease and dataset size. Our hypothesis is that among the statistically significant ranked genes in a gene list, there should be clusters of genes that share similar biological functions related to the investigated disease. Thus, instead of keeping N top ranked genes, it would be more appropriate to define and keep a number of gene cluster exemplars. Results: We propose a hybrid FS method (mAP-KL), which combines multiple hypothesis testing and the affinity propagation (AP) clustering algorithm along with the Krzanowski & Lai cluster quality index, to select a small yet informative subset of genes. We applied mAP-KL on real microarray data, as well as on simulated data, and compared its performance against 13 other feature selection approaches. Across a variety of diseases and number of samples, mAP-KL presents competitive classification results, particularly in neuromuscular diseases, where its overall AUC score was 0.91. Furthermore, mAP-KL generates concise yet biologically relevant and informative N-gene expression signatures, which can serve as a valuable tool for diagnostic and prognostic purposes, as well as a source of potential disease biomarkers in a broad range of diseases. Conclusions: mAP-KL is a data-driven and classifier-independent hybrid feature selection method, which applies to any disease classification problem based on microarray data, regardless of the available samples. Combining multiple hypothesis testing and AP leads to subsets of genes, which classify unknown samples from both small and large patient cohorts with high accuracy.

  12. Effects of gene disruptions in the nisin gene cluster of Lactococcus lactis on nisin production and producer immunity

    NARCIS (Netherlands)

    Ra, Runar; Beerthuyzen, Marke M.; Vos, Willem M. de; Saris, Per E.J.; Kuipers, Oscar P.

    1999-01-01

    The lantibiotic nisin is produced by several strains of Lactococcus lactis subsp. lactis. The chromosomally located gene cluster nisABTCIPRKFEG is required for biosynthesis, development of immunity, and regulation of gene expression. In-frame deletions in the nisB and nisT genes, and disruption of

  13. [Cluster ensemble algorithm based on dual neural gas applied to cancer gene expression profiles].

    Science.gov (United States)

    Zhang, Xiaodong; Chen, Hantao

    2015-02-01

    The microarray technology used in biological and medical research provides a new idea for the diagnosis and treatment of cancer. To find different types of cancer and to classify the cancer samples accurately, we propose a new cluster ensemble framework Dual Neural Gas Cluster Ensemble (DNGCE), which is based on neural gas algorithm, to discover the underlying structure of noisy cancer gene expression profiles. This framework DNGCE applies the neural gas algorithm to perform clustering not only on the sample dimension, but also on the attribute dimension. It also adopts the normalized cut algorithm to partition off the consensus matrix constructed from multiple clustering solutions. We obtained the final accurate results. Experiments on cancer gene expression profiles illustrated that the proposed approach could achieve good performance, as it outperforms the single clustering algorithms and most of the existing approaches in the process of clustering gene expression profiles.

  14. A recently transferred cluster of bacterial genes in Trichomonas vaginalis - lateral gene transfer and the fate of acquired genes

    Science.gov (United States)

    2014-01-01

    Background: Lateral Gene Transfer (LGT) has recently gained recognition as an important contributor to some eukaryote proteomes, but the mechanisms of acquisition and fixation in eukaryotic genomes are still uncertain. A previously defined norm for LGTs in microbial eukaryotes states that the majority are genes involved in metabolism, the LGTs are typically localized one by one, surrounded by vertically inherited genes on the chromosome, and phylogenetics shows that a broad collection of bacterial lineages have contributed to the transferome. Results: A unique 34 kbp long fragment with 27 clustered genes (TvLF) of prokaryote origin was identified in the sequenced genome of the protozoan parasite Trichomonas vaginalis. Using a PCR based approach we confirmed the presence of the orthologous fragment in four additional T. vaginalis strains. Detailed sequence analyses unambiguously suggest that TvLF is the result of one single, recent LGT event. The proposed donor is a close relative to the firmicute bacterium Peptoniphilus harei. High nucleotide sequence similarity between T. vaginalis strains, as well as to P. harei, and the absence of homologs in other Trichomonas species, suggests that the transfer event took place after the radiation of the genus Trichomonas. Some genes have undergone pseudogenization and degradation, indicating that they may not be retained in the future. Functional annotations reveal that genes involved in informational processes are particularly prone to degradation. Conclusions: We conclude that, although the majority of eukaryote LGTs are single gene occurrences, they may be acquired in clusters of several genes that are subsequently cleansed of evolutionarily less advantageous genes. PMID:24898731

  15. Clustering of Drosophila melanogaster immune genes in interplay with recombination rate.

    Directory of Open Access Journals (Sweden)

    K Mathias Wegner

    BACKGROUND: Gene order in eukaryotic chromosomes is not random and has been linked to coordination of gene expression, chromatin structure and also recombination rate. The evolution of recombination rate is especially relevant for genes involved in immunity because host-parasite co-evolution could select for increased recombination rate (Red Queen hypothesis). To identify patterns left by the intimate interaction between hosts and parasites, I analysed the genomic parameters of the immune genes from 24 gene families/groups of Drosophila melanogaster. PRINCIPAL FINDINGS: Immune genes that directly interact with the pathogen (i.e. recognition and effector genes) clustered in regions of higher recombination rates. Out of these, clustered effector genes were transcribed fastest indicating that transcriptional control might be one major cause for cluster formation. The relative position of clusters to each other, on the other hand, cannot be explained by transcriptional control per se. Drosophila immune genes that show epistatic interactions can be found at an average distance of 15.44+/-2.98 cM, which is considerably closer than genes that do not interact (30.64+/-1.95 cM). CONCLUSIONS: Epistatically interacting genes rarely belong to the same cluster, which supports recent models of optimal recombination rates between interacting genes in antagonistic host-parasite co-evolution. These patterns suggest that formation of local clusters might be a result of transcriptional control, but that in the condensed genome of D. melanogaster relative position of these clusters may be a result of selection for optimal rather than maximal recombination rates between these clusters.

  16. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

    Directory of Open Access Journals (Sweden)

    Cooper James B

    2010-03-01

    Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results: We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions: By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.

  17. A putative gene cluster from a Lyngbya wollei bloom that encodes paralytic shellfish toxin biosynthesis.

    Directory of Open Access Journals (Sweden)

    Troco K Mihali

    Saxitoxin and its analogs cause the paralytic shellfish-poisoning syndrome, adversely affecting human health and coastal shellfish industries worldwide. Here we report the isolation, sequencing, annotation, and predicted pathway of the saxitoxin biosynthetic gene cluster in the cyanobacterium Lyngbya wollei. The gene cluster spans 36 kb and encodes enzymes for the biosynthesis and export of the toxins. The Lyngbya wollei saxitoxin gene cluster differs from previously identified saxitoxin clusters as it contains genes that are unique to this cluster, whereby the carbamoyltransferase is truncated and replaced by an acyltransferase, explaining the unique toxin profile presented by Lyngbya wollei. These findings will enable the creation of toxin probes, for water monitoring purposes, as well as proof-of-concept for the combinatorial biosynthesis of these naturally occurring alkaloids for the production of novel, biologically active compounds.

  18. An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information

    Directory of Open Access Journals (Sweden)

    Ao Li

    2009-04-01

    Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However, direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations.

  19. A robust prognostic gene expression signature for early stage lung adenocarcinoma

    DEFF Research Database (Denmark)

    Krzystanek, Marcin; Moldvay, Judit; Szüts, David

    2016-01-01

    Stage I lung adenocarcinoma is usually not treated with adjuvant chemotherapy; however, around half of these patients do not survive 5 years. Therefore, a reliable prognostic biomarker for early stage patients would be critical to identify those most likely to benefit from early additional treatm...

  20. Prognostic Significance of Decreased Expression of Six Large Common Fragile Site Genes in Oropharyngeal Squamous Cell Carcinomas

    Directory of Open Access Journals (Sweden)

    Ge Gao

    2014-12-01

    Common fragile sites (CFSs) are large regions with profound genomic instability that often span extremely large genes, a number of which have been found to be important tumor suppressors. RNA sequencing previously revealed that there was a group of six large CFS genes which frequently had decreased expression in oropharyngeal squamous cell carcinomas (OPSCCs), and real-time reverse transcriptase polymerase chain reaction experiments validated that these six large CFS genes (PARK2, DLG2, NBEA, CTNNA3, DMD, and FHIT) had decreased expression in most of the tumor samples. In this study, we investigated whether the decreased expression of these genes has any clinical significance in OPSCCs. We analyzed the six large CFS genes in 45 OPSCC patients and found that 27 (60%) of the OPSCC tumors had decreased expression of these six genes. When we correlated the expression of these six genes to each patient’s clinical records, for 11 patients who had tumor recurrence, 10 of them had decreased expression of almost all 6 genes. When we divided the patients into two groups, one group with decreased expression of the six genes and the other group with either slight changes or increased expression of the six genes, we found that there is a significant difference in the incidence of tumor recurrence between these two groups by Kaplan-Meier plot analysis (P < .05). Our results demonstrated that those OPSCC tumors with decreased expression of this select group of six large CFS genes were much more likely to be associated with tumor recurrence and these genes are potential prognostic markers for predicting tumor recurrence in OPSCC.

  1. Leveraging long sequencing reads to investigate R-gene clustering and variation in sugar beet

    Science.gov (United States)

    Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....

  2. Horizontal transfer of a nitrate assimilation gene cluster and ecological transitions in fungi: a phylogenetic study.

    Directory of Open Access Journals (Sweden)

    Jason C Slot

    High affinity nitrate assimilation genes in fungi occur in a cluster (fHANT-AC) that can be coordinately regulated. The clustered genes include nrt2, which codes for a high affinity nitrate transporter; euknr, which codes for nitrate reductase; and NAD(P)H-nir, which codes for nitrite reductase. Homologs of genes in the fHANT-AC occur in other eukaryotes and prokaryotes, but they have only been found clustered in the oomycete Phytophthora (heterokonts). We performed independent and concatenated phylogenetic analyses of homologs of all three genes in the fHANT-AC. Phylogenetic analyses limited to fungal sequences suggest that the fHANT-AC has been transferred horizontally from a basidiomycete (mushrooms and smuts) to an ancestor of the ascomycetous mold Trichoderma reesei. Phylogenetic analyses of sequences from diverse eukaryotes and eubacteria, and cluster structure, are consistent with a hypothesis that the fHANT-AC was assembled in a lineage leading to the oomycetes and was subsequently transferred to the Dikarya (Ascomycota+Basidiomycota), which is a derived fungal clade that includes the vast majority of terrestrial fungi. We propose that the acquisition of high affinity nitrate assimilation contributed to the success of Dikarya on land by allowing exploitation of nitrate in aerobic soils, and the subsequent transfer of a complete assimilation cluster improved the fitness of T. reesei in a new niche. Horizontal transmission of this cluster of functionally integrated genes supports the "selfish operon" hypothesis for maintenance of gene clusters.

  3. Dual regulation of receptor tyrosine kinase genes EGFR and c-Met by the tumor-suppressive microRNA-23b/27b cluster in bladder cancer.

    Science.gov (United States)

    Chiyomaru, Takeshi; Seki, Naohiko; Inoguchi, Satoru; Ishihara, Tomoaki; Mataki, Hiroko; Matsushita, Ryosuke; Goto, Yusuke; Nishikawa, Rika; Tatarano, Shuichi; Itesako, Toshihiko; Nakagawa, Masayuki; Enokida, Hideki

    2015-02-01

    Recent clinical trials of chemotherapeutics for advanced bladder cancer (BC) have shown limited benefits. Therefore, new prognostic markers and more effective treatment strategies are required. One approach to achieve these goals is through the analysis of RNA networks. Our recent studies of microRNA (miRNA) expression signatures revealed that the microRNA-23b/27b (miR-23b/27b) cluster is frequently downregulated in various types of human cancers. However, the functional role of the miR-23b/27b cluster in BC cells is still unknown. Thus, the aim of the present study was to investigate the functional significance of the miR-23b/27b cluster and its regulated molecular targets, with an emphasis on its contributions to BC oncogenesis and metastasis. The expression levels of the miR-23b/27b cluster were significantly reduced in BC clinical specimens. Restoration of mature miR-23b or miR-27b miRNAs significantly inhibited cancer cell migration and invasion, suggesting that these clustered miRNAs function as tumor suppressors. Gene expression data and in silico analysis demonstrated that the genes coding for the epidermal growth factor receptor (EGFR) and hepatocyte growth factor receptor (c-Met) were potential targets of the miR-23b/27b cluster. Luciferase reporter assays and western blotting demonstrated that EGFR and c-Met receptor tyrosine kinases were directly regulated by these clustered miRNAs. We conclude that the decreased expression of the tumor-suppressive miR-23b/27b cluster enhanced cancer cell proliferation, migration and invasion in BC through direct regulation of EGFR and c-Met signaling pathways. Our data on RNA networks regulated by tumor-suppressive miR-23b/27b provide new insights into the potential mechanisms of BC oncogenesis and metastasis.

  4. MADIBA: A web server toolkit for biological interpretation of Plasmodium and plant gene clusters

    Directory of Open Access Journals (Sweden)

    Louw Abraham I

    2008-02-01

    Full Text Available Abstract Background Microarray technology makes it possible to identify changes in gene expression of an organism, under various conditions. Data mining is thus essential for deducing significant biological information such as the identification of new biological mechanisms or putative drug targets. While many algorithms and software have been developed for analysing gene expression, the extraction of relevant information from experimental data is still a substantial challenge, requiring significant time and skill. Description MADIBA (MicroArray Data Interface for Biological Annotation) facilitates the assignment of biological meaning to gene expression clusters by automating the post-processing stage. A relational database has been designed to store the data from gene to pathway for Plasmodium, rice and Arabidopsis. Tools within the web interface allow rapid analyses for the identification of the Gene Ontology terms relevant to each cluster; visualising the metabolic pathways where the genes are implicated, their genomic localisations, putative common transcriptional regulatory elements in the upstream sequences, and an analysis specific to the organism being studied. Conclusion MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. Functionality of MADIBA was validated by analysing a number of gene clusters from several published experiments – expression profiling of the Plasmodium life cycle, and salt stress treatments of Arabidopsis and rice. In most of the cases, the same conclusions found by the authors were quickly and easily obtained after analysing the gene clusters with MADIBA.

  5. Bacillus cereus-type polyhydroxyalkanoate biosynthetic gene cluster contains R-specific enoyl-CoA hydratase gene.

    Science.gov (United States)

    Kihara, Takahiro; Hiroe, Ayaka; Ishii-Hyakutake, Manami; Mizuno, Kouhei; Tsuge, Takeharu

    2017-08-01

    Bacillus cereus and Bacillus megaterium both accumulate polyhydroxyalkanoate (PHA) but their PHA biosynthetic gene (pha) clusters that code for proteins involved in PHA biosynthesis are different. Namely, a gene encoding MaoC-like protein exists in the B. cereus-type pha cluster but not in the B. megaterium-type pha cluster. MaoC-like protein has an R-specific enoyl-CoA hydratase (R-hydratase) activity and is referred to as PhaJ when involved in PHA metabolism. In this study, the pha cluster of B. cereus YB-4 was characterized in terms of PhaJ's function. In an in vitro assay, PhaJ from B. cereus YB-4 (PhaJYB4) exhibited hydration activity toward crotonyl-CoA. In an in vivo assay using Escherichia coli as a host for PHA accumulation, the recombinant strain expressing PhaJYB4 and PHA synthase led to increased PHA accumulation, suggesting that PhaJYB4 functioned as a monomer supplier. The monomer composition of the accumulated PHA reflected the substrate specificity of PhaJYB4, which appeared to prefer short chain-length substrates. The pha cluster from B. cereus YB-4 functioned to accumulate PHA in E. coli; however, it did not function when the phaJYB4 gene was deleted. The B. cereus-type pha cluster represents a new example of a pha cluster that contains the gene encoding PhaJ.

  6. A Dirichlet process mixture model for clustering longitudinal gene expression data.

    Science.gov (United States)

    Sun, Jiehuan; Herazo-Maya, Jose D; Kaminski, Naftali; Zhao, Hongyu; Warren, Joshua L

    2017-09-30

    Subgroup identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly utilized to define subgroups. Longitudinal gene expression profiles might provide additional information on disease progression than what is captured by baseline profiles alone. Therefore, subgroup identification could be more accurate and effective with the aid of longitudinal gene expression data. However, existing statistical methods are unable to fully utilize these data for patient clustering. In this article, we introduce a novel clustering method in the Bayesian setting based on longitudinal gene expression profiles. This method, called BClustLonG, adopts a linear mixed-effects framework to model the trajectory of genes over time, while clustering is jointly conducted based on the regression coefficients obtained from all genes. In order to account for the correlations among genes and alleviate the high dimensionality challenges, we adopt a factor analysis model for the regression coefficients. The Dirichlet process prior distribution is utilized for the means of the regression coefficients to induce clustering. Through extensive simulation studies, we show that BClustLonG has improved performance over other clustering methods. When applied to a dataset of severely injured (burn or trauma) patients, our model is able to identify interesting subgroups. Copyright © 2017 John Wiley & Sons, Ltd.

  7. Identification of the Viridicatumtoxin and Griseofulvin Gene Clusters from Penicillium aethiopicum

    National Research Council Canada - National Science Library

    Chooi, Yit-Heng; Cacho, Ralph; Tang, Yi

    2010-01-01

    ...: the tetracycline-like viridicatumtoxin 1 and the classic antifungal agent griseofulvin 2. Here, we report the concurrent discovery of the two corresponding biosynthetic gene clusters (vrt and gsf...

  8. Nonlinear Biosynthetic Gene Cluster Dose Effect on Penicillin Production by Penicillium chrysogenum

    NARCIS (Netherlands)

    Nijland, Jeroen G.; Ebbendorf, Bjorg; Woszczynska, Marta; Boer, Remon; Bovenberg, Roel A. L.; Driessen, Arnold J. M.

    2010-01-01

    Industrial penicillin production levels by the filamentous fungus Penicillium chrysogenum increased dramatically by classical strain improvement. High-yielding strains contain multiple copies of the penicillin biosynthetic gene cluster that encodes three key enzymes of the beta-lactam biosynthetic

  9. A systematic computational analysis of biosynthetic gene cluster evolution: lessons for engineering biosynthesis

    NARCIS (Netherlands)

    Medema, Marnix; Cimermancic, P.; Sali, A.; Takano, Eriko; Fischbach, M.A.

    2014-01-01

    Bacterial secondary metabolites are widely used as antibiotics, anticancer drugs, insecticides and food additives. Attempts to engineer their biosynthetic gene clusters (BGCs) to produce unnatural metabolites with improved properties are often frustrated by the unpredictability and complexity of the

  10. AntiSMASH 4.0 - improvements in chemistry prediction and gene cluster boundary identification

    NARCIS (Netherlands)

    Blin, Kai; Wolf, Thomas; Chevrette, Marc G.; Lu, Xiaowen; Schwalen, Christopher J.; Kautsar, Satria A.; Suarez Duran, Hernando G.; Los Santos, De Emmanuel L.C.; Kim, Hyun Uk; Nave, Mariana; Dickschat, Jeroen S.; Mitchell, Douglas A.; Shelest, Ekaterina; Breitling, Rainer; Takano, Eriko; Lee, Sang Yup; Weber, Tilmann; Medema, Marnix H.

    2017-01-01

    Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production

  11. United we stand: big roles for small RNA gene clusters.

    Science.gov (United States)

    Felden, Brice; Paillard, Luc

    2017-02-01

    Prokaryotes and eukaryotes evolved relatively similar RNA-based molecular mechanisms to fight potentially deleterious nucleic acids coming from phages, transposons, or viruses. Short RNAs guide effector complexes toward their targets to be silenced or eliminated. These short immunity RNAs are transcribed from clustered loci. Unexpectedly and strikingly, bacterial and eukaryotic immunity RNA clusters share substantial functional and mechanistic resemblances in fighting nucleic acid intruders. © 2017 Felden and Paillard; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  12. Intranuclear and higher-order chromatin organization of the major histone gene cluster in breast cancer.

    Science.gov (United States)

    Fritz, Andrew J; Ghule, Prachi N; Boyd, Joseph R; Tye, Coralee E; Page, Natalie A; Hong, Deli; Shirley, David J; Weinheimer, Adam S; Barutcu, Ahmet R; Gerrard, Diana L; Frietze, Seth; van Wijnen, Andre J; Zaidi, Sayyed K; Imbalzano, Anthony N; Lian, Jane B; Stein, Janet L; Stein, Gary S

    2018-02-01

    Alterations in nuclear morphology are common in cancer progression. However, the degree to which gross morphological abnormalities translate into compromised higher-order chromatin organization is poorly understood. To explore the functional links between gene expression and chromatin structure in breast cancer, we performed RNA-seq gene expression analysis on the basal breast cancer progression model based on human MCF10A cells. Positional gene enrichment identified the major histone gene cluster at chromosome 6p22 as one of the most significantly upregulated (and not amplified) clusters of genes from the normal-like MCF10A to premalignant MCF10AT1 and metastatic MCF10CA1a cells. This cluster is subdivided into three sub-clusters of histone genes that are organized into hierarchical topologically associating domains (TADs). Interestingly, the sub-clusters of histone genes are located at TAD boundaries and interact more frequently with each other than the regions in-between them, suggesting that the histone sub-clusters form an active chromatin hub. The anchor sites of loops within this hub are occupied by CTCF, a known chromatin organizer. These histone genes are transcribed and processed at a specific sub-nuclear microenvironment termed the major histone locus body (HLB). While the overall chromatin structure of the major HLB is maintained across breast cancer progression, we detected alterations in its structure that may relate to gene expression. Importantly, breast tumor specimens also exhibit a coordinate pattern of upregulation across the major histone gene cluster. Our results provide a novel insight into the connection between the higher-order chromatin organization of the major HLB and its regulation during breast cancer progression. © 2017 Wiley Periodicals, Inc.

  13. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd

    2008-05-12

    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  14. Integrated miRNA-risk gene-pathway pair network analysis provides prognostic biomarkers for gastric cancer

    Directory of Open Access Journals (Sweden)

    Cai H

    2016-05-01

    Full Text Available Hui Cai,1 Jiping Xu,2 Yifang Han,3 Zhengmao Lu,1 Ting Han,1 Yibo Ding,4 Liye Ma1 1Department of General Surgery, Changhai Hospital, Second Military Medical University, Shanghai, 2Department of Medical Administration, Changhai Hospital, Second Military Medical University, Shanghai, 3Department of Epidemiology, Research Institute for Medicine of Nanjing Command, Nanjing, 4Department of Epidemiology, Changhai Hospital, Second Military Medical University, Shanghai, People’s Republic of China Purpose: This study aimed to identify molecular prognostic biomarkers for gastric cancer. Methods: mRNA and miRNA expression profiles of eligible gastric cancer and control samples were downloaded from Gene Expression Omnibus to screen the differentially expressed genes (DEGs) and differentially expressed miRNAs (DEmiRs), using MetaDE and limma packages, respectively. Target genes of the DEmiRs were also collected from both predictive and experimentally validated target databases of miRNAs. The overlapping genes between selected targets and DEGs were identified as risk genes, followed by functional enrichment analysis. Human pathways and their corresponding genes were downloaded from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database for the expression analysis of each pathway in gastric cancer samples. Next, co-pathway pairs were selected according to the Pearson correlation coefficients. Finally, the co-pathway pairs, miRNA–target pairs, and risk gene–pathway pairs were merged into a complex interaction network, the most important nodes (miRNAs/target genes/co-pathway pairs) of which were selected by calculating their degrees. Results: Totally, 1,260 DEGs and 144 DEmiRs were identified. There were 336 risk genes found in the 9,572 miRNA–target pairs. Judging from the pathway expression files, 45 co-pathway pairs were screened out. There were 1,389 interactive pairs and 480 nodes in the integrated network. Among all nodes in the network, focal

  15. PROGNOSTIC VALUE OF BRAIN AND ACUTE LEUKEMIA CYTOPLASMIC GENE EXPRESSION IN EGYPTIAN CHILDREN WITH ACUTE MYELOID LEUKEMIA

    Directory of Open Access Journals (Sweden)

    Adel Abd Elhaleim Hagag

    2015-04-01

    Full Text Available Abstract Background: Acute myeloid leukemia (AML) accounts for 25%-35% of the acute leukemia in children. BAALC (Brain and Acute Leukemia, Cytoplasmic) gene is a recently identified gene on chromosome 8q22.3 that has prognostic significance in AML. The aim of this work was to study the impact of BAALC gene expression on prognosis of AML in Egyptian children. Patients and methods: This study was conducted on 40 patients with newly diagnosed AML who were subjected to the following: full history taking, clinical examination, laboratory investigations including complete blood count, LDH, bone marrow aspiration, cytochemistry and immunophenotyping, and assessment of the BAALC gene by real time PCR in bone marrow aspirate mononuclear cells before the start of chemotherapy. Results: BAALC gene expression was positive in 24 cases (60%) and negative in 16 cases (40%). Patients who showed positive BAALC gene expression included 10 patients who achieved complete remission, 8 patients who died and 6 relapsed patients, while patients who showed negative expression included 12 patients who achieved complete remission, 1 relapsed patient and 3 patients who died. There was a significant association between BAALC gene expression and the FAB classification of AML patients, as positive BAALC expression was predominantly seen in FAB subtypes M1 and M2, whereas negative BAALC gene expression was found more in M3 and M4 (8 cases with M1, 12 cases with M2, 1 case with M3 and 3 cases with M4 in the positive BAALC expression group versus 2 cases with M1, 3 cases with M2, 4 cases with M3 and 7 cases with M4 in the BAALC negative expression group, with a significant difference regarding FAB subtypes). As regards age, sex, splenomegaly, lymphadenopathy, pallor, purpura, platelet count, WBC count, and percentage of blast cells in BM, the present study showed no significant association with BAALC. Conclusion: BAALC expression is an important prognostic factor in AML

  16. A Link-Based Cluster Ensemble Approach For Improved Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    P.Balaji

    2015-01-01

    Full Text Available Abstract Selecting the most suitable clustering algorithm for a given set of gene expression data is difficult because many algorithms are available and the number of gene expression profiles is large. At present many researchers prefer hierarchical clustering in its various forms, but this is no longer considered optimal. Cluster ensemble research can address this problem by automatically merging multiple data partitions from a wide range of different clusterings of any dimension to improve both the quality and the robustness of the clustering result. However, many existing ensemble approaches use an association matrix to condense sample-cluster and co-occurrence statistics, so relations within the ensemble are captured only at a raw level while the relations that exist among clusters are ignored entirely. Finding these missing associations can greatly expand the capability of those ensemble methodologies for microarray data clustering. We propose a general K-means cluster ensemble approach for clustering general categorical data into the required number of partitions.
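
    The association matrix mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common pairwise co-association form (the fraction of base partitions in which two samples share a cluster); the function name co_association and the input layout are hypothetical, and this is not the link-based refinement the record's title refers to.

```python
def co_association(partitions):
    """Build a pairwise co-association matrix from several base clusterings.

    partitions: list of label lists, one list per base clustering, all of
    the same length (one label per sample). This layout is an assumption
    made for illustration.
    """
    n_samples = len(partitions[0])
    n_partitions = len(partitions)
    counts = [[0] * n_samples for _ in range(n_samples)]
    for labels in partitions:
        for i in range(n_samples):
            for j in range(n_samples):
                # Count how often samples i and j land in the same cluster.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalise counts to the fraction of partitions that agree.
    return [[c / n_partitions for c in row] for row in counts]


if __name__ == "__main__":
    base = [[0, 0, 1, 1], [0, 0, 0, 1]]
    for row in co_association(base):
        print(row)
```

    A matrix of this kind records only raw sample-to-sample agreement, which is the limitation the abstract attributes to existing ensemble approaches.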

  17. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Cameron, R A; Rowen, L; Nesbitt, R; Bloom, S; Rast, J P; Berney, K; Arenas-Mena, C; Martinez, P; Lucas, S; Richardson, P M; Davidson, E H; Peterson, K J; Hood, L

    2005-10-11

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  18. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy

    2005-05-10

    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  19. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters

    Science.gov (United States)

    Schorn, Michelle A.; Alanjary, Mohammad M.; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R.; Ziemert, Nadine

    2016-01-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites. PMID:27902408

  20. Sequencing rare marine actinomycete genomes reveals high density of unique natural product biosynthetic gene clusters.

    Science.gov (United States)

    Schorn, Michelle A; Alanjary, Mohammad M; Aguinaldo, Kristen; Korobeynikov, Anton; Podell, Sheila; Patin, Nastassia; Lincecum, Tommie; Jensen, Paul R; Ziemert, Nadine; Moore, Bradley S

    2016-12-01

    Traditional natural product discovery methods have nearly exhausted the accessible diversity of microbial chemicals, making new sources and techniques paramount in the search for new molecules. Marine actinomycete bacteria have recently come into the spotlight as fruitful producers of structurally diverse secondary metabolites, and remain relatively untapped. In this study, we sequenced 21 marine-derived actinomycete strains, rarely studied for their secondary metabolite potential and under-represented in current genomic databases. We found that genome size and phylogeny were good predictors of biosynthetic gene cluster diversity, with larger genomes rivalling the well-known marine producers in the Streptomyces and Salinispora genera. Genomes in the Micrococcineae suborder, however, had consistently the lowest number of biosynthetic gene clusters. By networking individual gene clusters into gene cluster families, we were able to computationally estimate the degree of novelty each genus contributed to the current sequence databases. Based on the similarity measures between all actinobacteria in the Joint Genome Institute's Atlas of Biosynthetic gene Clusters database, rare marine genera show a high degree of novelty and diversity, with Corynebacterium, Gordonia, Nocardiopsis, Saccharomonospora and Pseudonocardia genera representing the highest gene cluster diversity. This research validates that rare marine actinomycetes are important candidates for exploration, as they are relatively unstudied, and their relatives are historically rich in secondary metabolites.

  1. The gene cluster of aureocyclicin 4185: the first cyclic bacteriocin of Staphylococcus aureus.

    Science.gov (United States)

    Potter, Amina; Ceotto, Hilana; Coelho, Marcus Lívio Varella; Guimarães, Allan J; Bastos, Maria do Carmo de Freire

    2014-05-01

    Staphylococcus aureus 4185 was previously shown to produce at least two bacteriocins. One of them is encoded by pRJ101. To detect the bacteriocin-encoding gene cluster, an ~9160 kb region of pRJ101 was sequenced. In silico analyses identified 10 genes (aclX, aclB, aclI, aclT, aclC, aclD, aclA, aclF, aclG and aclH) that might be involved in the production of a novel cyclic bacteriocin named aureocyclicin 4185. The organization of these genes was quite similar to that of the gene cluster responsible for carnocyclin A production and immunity. Four putative proteins encoded by these genes (AclT, AclC, AclD and AclA) also exhibited similarity to proteins encoded by cyclic bacteriocin gene clusters. Mutants derived from insertion of Tn917-lac into aclC, aclF, aclH and aclX were affected in bacteriocin production and growth. AclX is a 205 aa putative protein not encoded by the gene clusters of other cyclic bacteriocins. AclX exhibits 50 % similarity to a permease and has five putative membrane-spanning domains. Transcription analyses suggested that aclX is part of the aureocyclicin 4185 gene cluster, encoding a protein required for bacteriocin production. The aclA gene is the structural gene of aureocyclicin 4185, which shows 65 % similarity to garvicin ML. AclA is proposed to be cleaved off, generating a mature peptide with a predicted Mr of 5607 Da (60 aa). By homology modelling, AclA presents four α-helices, like carnocyclin A. AclA could not be found at detectable levels in the culture supernatant of a strain carrying only pRJ101. To our knowledge, this is the first report of a cyclic bacteriocin gene cluster in the genus Staphylococcus.

  2. Cytokine Gene Polymorphisms Associated With Symptom Clusters in Oncology Patients Undergoing Radiation Therapy.

    Science.gov (United States)

    Miaskowski, Christine; Conley, Yvette P; Mastick, Judy; Paul, Steven M; Cooper, Bruce A; Levine, Jon D; Knisely, Mitchell; Kober, Kord M

    2017-09-01

    Most of the reviews on the biological basis for symptom clusters suggest that inflammatory processes are involved in the development and maintenance of the symptom clusters. However, no studies have evaluated for associations between genetic polymorphisms and common symptom clusters (e.g., mood disturbance, sickness behavior). Examine the associations between cytokine gene polymorphisms and the severity of three distinct symptom clusters (i.e., mood-cognitive, sickness-behavior, treatment-related) in a sample of patients with breast and prostate cancer (n = 157) at the completion of radiation therapy. Symptom severity was assessed using the Memorial Symptom Assessment Scale. Symptom clusters were created using exploratory factor analysis. The associations between cytokine gene polymorphisms and the symptom cluster severity scores were evaluated using regression analyses. Polymorphisms in C-X-C motif chemokine ligand 8 (CXCL8), interleukin 13 (IL13), and nuclear factor kappa beta 2 (NFKB2) were associated with severity scores for the mood-cognitive symptom cluster. In addition to interferon gamma (IFNG1), the same polymorphism in NFKB2 (i.e., rs1056890) that was associated with the mood-cognitive symptom cluster score was associated with the sickness-behavior symptom cluster. Polymorphisms in interleukin 1 receptor 1 (IL1R1), IL6, and NFKB1 were associated with severity factor scores for the treatment-related symptom cluster. Our findings support the hypotheses that symptoms that cluster together have a common underlying mechanism and the most common symptom clusters in oncology patients are associated with polymorphisms in genes involved in a variety of inflammatory processes. Copyright © 2017 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.

  3. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong

    2004-01-01

    Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experiment data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noise, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchic clustering method, the K-means clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).
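
    The idea of clustering data around each local maximum of a magnitude property can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the fixed neighbourhood radius, the hill-climbing rule, and the function name local_maximum_clustering are all hypothetical choices made here.

```python
import math


def local_maximum_clustering(points, magnitude, radius):
    """Label each point with the index of the local maximum it climbs to.

    points: list of coordinate tuples; magnitude: one value per point;
    radius: neighbourhood size (an assumed parameter, not from the abstract).
    """
    n = len(points)
    parent = []
    for i in range(n):
        best = i
        for j in range(n):
            if j == i or math.dist(points[i], points[j]) > radius:
                continue
            # Step toward the neighbour with the strictly highest magnitude.
            if magnitude[j] > magnitude[best]:
                best = j
        parent.append(best)

    labels = []
    for i in range(n):
        node = i
        # Magnitude strictly increases along parent links, so this terminates.
        while parent[node] != node:
            node = parent[node]
        labels.append(node)
    return labels


if __name__ == "__main__":
    pts = [(0,), (1,), (2,), (10,), (11,)]
    mag = [1, 3, 2, 5, 4]
    print(local_maximum_clustering(pts, mag, radius=1.5))
```

    In this toy input the points at 0, 1 and 2 gather around the local maximum at index 1, and the points at 10 and 11 around the one at index 3.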

  4. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba

    2016-07-20

    The inherently high-dimensional, low-sample-size nature of gene expression data makes the classification task more challenging. Therefore, feature (gene) selection becomes an apparent need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. However, most existing feature selection approaches suffer from several problems, such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and percentages; otherwise it is eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology, we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold conditions reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between two classes of leukemia.

  5. Ensemble attribute profile clustering: discovering and characterizing groups of genes with similar patterns of biological features

    Directory of Open Access Journals (Sweden)

    Bissell MJ

    2006-03-01

    Full Text Available Abstract Background Ensemble attribute profile clustering is a novel, text-based strategy for analyzing a user-defined list of genes and/or proteins. The strategy exploits annotation data present in gene-centered corpora and utilizes ideas from statistical information retrieval to discover and characterize properties shared by subsets of the list. The practical utility of this method is demonstrated by employing it in a retrospective study of two non-overlapping sets of genes defined by a published investigation as markers for normal human breast luminal epithelial cells and myoepithelial cells. Results Each genetic locus was characterized using a finite set of biological properties and represented as a vector of features indicating attributes associated with the locus (a gene attribute profile). In this study, the vector space models for a pre-defined list of genes were constructed from the Gene Ontology (GO) terms and the Conserved Domain Database (CDD) protein domain terms assigned to the loci by the gene-centered corpus LocusLink. This data set of GO- and CDD-based gene attribute profiles, vectors of binary random variables, was used to estimate multiple finite mixture models and each ensuing model utilized to partition the profiles into clusters. The resultant partitionings were combined using a unanimous voting scheme to produce consensus clusters, sets of profiles that co-occurred consistently in the same cluster. Attributes that were important in defining the genes assigned to a consensus cluster were identified. The clusters and their attributes were inspected to ascertain the GO and CDD terms most associated with subsets of genes and in conjunction with external knowledge such as chromosomal location, used to gain functional insights into human breast biology. The 52 luminal epithelial cell markers and 89 myoepithelial cell markers are disjoint sets of genes. Ensemble attribute profile clustering-based analysis indicated that both lists
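
    The unanimous voting step described in this record can be illustrated with a minimal sketch. It assumes each partitioning is given as a mapping from a profile name to a cluster label; the function name `unanimous_consensus` and the toy gene names are hypothetical and do not come from the cited work. Two profiles end up in the same consensus cluster only if every partitioning places them together.

```python
from collections import defaultdict


def unanimous_consensus(partitionings):
    """Group items that share a cluster in every partitioning.

    partitionings: list of dicts, each mapping item -> cluster label.
    All dicts are assumed to cover the same items (an assumption of
    this sketch). Returns a list of sets in a deterministic order.
    """
    groups = defaultdict(set)
    for item in sorted(partitionings[0]):
        # The tuple of labels across all partitionings acts as a signature.
        # Two items agree unanimously exactly when their signatures match.
        signature = tuple(p[item] for p in partitionings)
        groups[signature].add(item)
    return sorted(groups.values(), key=lambda s: sorted(s))


# Toy example: two partitionings of four hypothetical profiles.
runs = [
    {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    {"g1": "a", "g2": "a", "g3": "a", "g4": "b"},
]
consensus = unanimous_consensus(runs)
```

    In this toy input only g1 and g2 are grouped together by both partitionings, so they form one consensus cluster and g3 and g4 remain singletons. Whether singletons should be reported as consensus clusters is a choice the record does not specify.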

  6. Context-specific infinite mixtures for clustering gene expression profiles across diverse microarray dataset

    Science.gov (United States)

    Liu, X.; Sivaganesan, S.; Yeung, K.Y.; Guo, J.; Bumgarner, R.E.; Medvedovic, Mario

    2006-01-01

    Motivation Identifying groups of co-regulated genes by monitoring their expression over various experimental conditions is complicated by the fact that such co-regulation is condition-specific. Ignoring the context-specific nature of co-regulation significantly reduces the ability of clustering procedures to detect co-expressed genes due to additional “noise” introduced by non-informative measurements. Results We have developed a novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns. The model is based on the Bayesian infinite mixtures framework and does not require a priori specification of the number of clusters. We demonstrate that explicit modeling of context-specificity results in increased accuracy of the cluster analysis by examining the specificity and sensitivity of clusters in microarray data. We also demonstrate that probabilities of co-expression derived from the posterior distribution of clusterings are valid estimates of statistical significance of created clusters. Availability The open-source package gimm is available at http://eh3.uc.edu/gimm. Contact Mario.Medvedovic@uc.edu Supplementary information http://eh3.uc.edu/gimm/csimm PMID:16709591

  7. Clinical and prognostic value of MET gene copy number gain and chromosome 7 polysomy in primary colorectal cancer patients.

    Science.gov (United States)

    Seo, An Na; Park, Kyoung Un; Choe, Gheeyoung; Kim, Woo Ho; Kim, Duck-Woo; Kang, Sung-Bum; Lee, Hye Seung

    2015-12-01

    We aimed to explore the clinical and prognostic influence of numeric alterations of MET gene copy number (GCN) and chromosome 7 (CEP7) CN in colorectal cancer (CRC) patients. MET GCN and CEP7 CN were investigated in tissue arrayed tumors from 170 CRC patients using silver in situ hybridization (SISH). MET GCN gain was defined as ≥4 copies of MET, and CEP7 polysomy was prespecified as ≥3 copies of CEP7. Additionally, MET messenger RNA (mRNA) transcription was evaluated using mRNA ISH and compared with MET GCN. MET GCN gain was observed in 14.7 % (25/170), which correlated with advanced stage (P = 0.037), presence of distant metastasis (P = 0.006), and short overall survival (OS) (P = 0.009). In contrast, CEP7 polysomy was found in 6.5 % (11/170), which was related to tumor location in the left colon (P = 0.027) and poor OS (P = 0.029). MET GCN positively correlated with CEP7 CN (R = 0.659, P patients (n = 123). In multivariate analysis, CEP7 polysomy was an independent prognostic factor for poor OS in all patients (P = 0.009; hazard ratio [HR], 2.220; 95 % confidence interval [CI], 1.233-3.997) and in stage II/III CRC patients (P patients, especially CEP7 polysomy has the most powerful prognostic impact in stage II/III CRC patients.

  8. Growth arrest DNA damage-inducible gene 45 gamma expression as a prognostic and predictive biomarker in hepatocellular carcinoma.

    Science.gov (United States)

    Ou, Da-Liang; Shyue, Song-Kun; Lin, Liang-In; Feng, Zi-Rui; Liou, Jun-Yang; Fan, Hsiang-Hsuan; Lee, Bin-Shyun; Hsu, Chiun; Cheng, Ann-Lii

    2015-09-29

    Growth arrest DNA damage-inducible gene 45 (GADD45) family proteins play a crucial role in regulating cellular stress responses and apoptosis. The present study explored the prognostic and predictive role of GADD45γ in hepatocellular carcinoma (HCC) treatment. GADD45γ expression in HCC cells was examined using quantitative reverse transcription-PCR (qRT-PCR) and Western blotting. The control of GADD45γ transcription was examined using a luciferase reporter assay and chromatin immunoprecipitation. The in vivo induction of GADD45γ was performed using adenoviral transfer. The expression of GADD45γ in HCC tumor tissues from patients who had undergone curative resection was measured using qRT-PCR. Sorafenib induced expression of GADD45γ mRNA and protein, independent of its RAF kinase inhibitor activity. GADD45γ induction was more prominent in sorafenib-sensitive HCC cells (Huh-7 and HepG2, IC50 6-7 μM) than in sorafenib-resistant HCC cells (Hep3B, Huh-7R, and HepG2R, IC50 12-15 μM). Overexpression of GADD45γ reversed sorafenib resistance in vitro and in vivo, whereas GADD45γ expression knockdown by using siRNA partially abrogated the proapoptotic effects of sorafenib on sorafenib-sensitive cells. Overexpression of survivin in HCC cells abolished the antitumor enhancement between GADD45γ overexpression and sorafenib treatment, suggesting that survivin is a crucial mediator of antitumor effects of GADD45γ. GADD45γ expression decreased in tumors from patients with HCC who had undergone curative surgery, and low GADD45γ expression was an independent prognostic factor for poor survival, in addition to old age and vascular invasion. The preceding data indicate that GADD45γ suppression is a poor prognostic factor in patients with HCC and may help predict sorafenib efficacy in HCC.

  9. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes.

    Science.gov (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko

    2012-07-15

    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step to reveal the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and provide novel insight into the evolutionary aspects of acquiring certain biological functions as well. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of Emetabolism, containing P450 genes restricted to the Brassica family and predicted to be involved in secondary metabolism. Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid
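
    The comparison of neighboring-gene correlations against randomly selected gene pairs can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the record does not name its FDR method, so the Benjamini-Hochberg adjustment is swapped in here, and the function names, the empirical null of 1000 random pairs, and the input layout (one expression profile per gene, listed in chromosomal order) are all hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def neighbor_p_values(expr, n_random=1000, seed=0):
    """Empirical p-value for each adjacent gene pair.

    expr: list of expression profiles, one per gene, in chromosomal order.
    The null distribution is built from randomly chosen pairs of distinct genes.
    """
    rng = random.Random(seed)
    n = len(expr)
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(n), 2)
        null.append(pearson(expr[i], expr[j]))
    p_values = []
    for i in range(n - 1):
        r = pearson(expr[i], expr[i + 1])
        exceed = sum(1 for r0 in null if r0 >= r)
        # Add-one smoothing keeps the p-value strictly positive.
        p_values.append((exceed + 1) / (n_random + 1))
    return p_values


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (an assumed FDR method)."""
    n = len(p_values)
    order = sorted(range(n), key=lambda k: p_values[k])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(n, 0, -1):
        k = order[rank - 1]
        running_min = min(running_min, p_values[k] * n / rank)
        q[k] = running_min
    return q
```

    Adjacent pairs whose adjusted value falls below a chosen cutoff (the record reports using an FDR threshold of 0.1) would then be candidates for merging into runs of co-expressed neighbors.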

  10. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Directory of Open Access Journals (Sweden)

    Li Weizhong

    2008-04-01

    Full Text Available Abstract Background The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all compute that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). Conclusion The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.

  11. Shared gene structures and clusters of mutually exclusive spliced exons within the metazoan muscle myosin heavy chain genes.

    Directory of Open Access Journals (Sweden)

    Martin Kollmar

    Full Text Available Multicellular animals possess two to three different types of muscle tissues. Striated muscles have considerable ultrastructural similarity and contain a core set of proteins including the muscle myosin heavy chain (Mhc) protein. The ATPase activity of this myosin motor protein largely dictates muscle performance at the molecular level. Two different solutions to adjusting myosin properties to different muscle subtypes have been identified so far: Vertebrates and nematodes contain many independent differentially expressed Mhc genes while arthropods have single Mhc genes with clusters of mutually exclusive spliced exons (MXEs). The availability of hundreds of metazoan genomes now allowed us to study whether the ancient bilateria already contained MXEs, how MXE complexity subsequently evolved, and whether additional scenarios to control contractile properties in different muscles could be proposed. By reconstructing the Mhc genes from 116 metazoans we showed that all intron positions within the motor domain coding regions are conserved in all bilateria analysed. The last common ancestor of the bilateria already contained a cluster of MXEs coding for part of the loop-2 actin-binding sequence. Subsequently the protostomes and later the arthropods gained many further clusters while MXEs got completely lost independently in several branches (vertebrates and nematodes) and species (for example the annelid Helobdella robusta and the salmon louse Lepeophtheirus salmonis). Several bilateria have been found to encode multiple Mhc genes that might all or in part contain clusters of MXEs. Notable examples are a cluster of six tandemly arrayed Mhc genes, of which two contain MXEs, in the owl limpet Lottia gigantea and four Mhc genes with three encoding MXEs in the predatory mite Metaseiulus occidentalis. Our analysis showed that similar solutions to provide different myosin isoforms (multiple genes or clusters of MXEs or both) have independently been developed

  12. Combining affinity propagation clustering and mutual information network to investigate key genes in fibroid.

    Science.gov (United States)

    Chen, Qian-Song; Wang, Dan; Liu, Bao-Lian; Gao, Shu-Feng; Gao, Dan-Li; Li, Gui-Rong

    2017-07-01

    The aim of the present study was to investigate key genes in fibroids based on the multiple affinity propagation-Krzanowski and Lai (mAP-KL) method, which included the maxT multiple hypothesis, Krzanowski and Lai (KL) cluster quality index, affinity propagation (AP) clustering algorithm and mutual information network (MIN) constructed by the context likelihood of relatedness (CLR) algorithm. In order to achieve this goal, mAP-KL was initially implemented to investigate exemplars in fibroid, and the maxT function was employed to rank the genes of training and test sets, and the top 200 genes were obtained for further study. In addition, the KL cluster index was applied to determine the quantity of clusters and the AP clustering algorithm was conducted to identify the clusters and their exemplars. Subsequently, the support vector machine (SVM) model was selected to evaluate the classification performance of mAP-KL. Finally, topological properties (degree, closeness, betweenness and transitivity) of exemplars in MIN constructed according to the CLR algorithm were assessed to investigate key genes in fibroid. The SVM model validated that the classification between normal controls and fibroid patients by mAP-KL had a good performance. A total of 9 clusters and exemplars were identified based on mAP-KL, which comprised CALCOCO2, COL4A2, COPS8, SNCG, PA2G4, C17orf70, MARK3, BTNL3 and TBC1D13. By assessing the topological properties of exemplars in MIN, SNCG and COL4A2 were identified as the two most significant genes of four types of methods, and they were denoted as key genes in the progress of fibroid. In conclusion, two key genes (SNCG and COL4A2) and 9 exemplars were successfully investigated, and these may be potential biomarkers for the detection and treatment of fibroid.

  13. Structural variation of the ribosomal gene cluster within the class Insecta

    Energy Technology Data Exchange (ETDEWEB)

    Mukha, D.V.; Sidorenko, A.P.; Lazebnaya, I.V. [Vavilov Institute of General Genetics, Moscow (Russian Federation)] [and others]

    1995-09-01

    General estimation of ribosomal DNA variation within the class Insecta is presented. It is shown that, using blot-hybridization, one can detect differences in the structure of the ribosomal gene cluster not only between genera within an order, but also between species within a genus, including sibling species. Structure of the ribosomal gene cluster of the Coccinellidae family (ladybirds) is analyzed. It is shown that cloned highly conservative regions of ribosomal DNA of Tetrahymena pyriformis can be used as probes for analyzing ribosomal genes in insects. 24 refs., 4 figs.

  14. Prognostic and Predictive Value of RAS Gene Mutations in Colorectal Cancer: Moving Beyond KRAS Exon 2.

    Science.gov (United States)

    Boeckx, Nele; Peeters, Marc; Van Camp, Guy; Pauwels, Patrick; Op de Beeck, Ken; Deschoolmeester, Vanessa

    2015-10-01

    The advent of anti-EGFR (epidermal growth factor receptor) therapy resulted in significant progress in the treatment of metastatic colorectal cancer patients. However, many patients do not respond to this therapy or develop acquired resistance within a few months after the start of treatment. Since 2008, anti-EGFR therapy is restricted to KRAS wild-type patients as it has been shown that KRAS exon 2-mutated patients do not respond to this therapy. Still, up to 60 % of KRAS exon 2 wild-type patients show primary resistance to this treatment. Recently, several studies investigating the predictive and prognostic role of RAS mutations other than in KRAS exon 2 demonstrated that patients with these mutations are not responding to therapy. However, the role of these mutations has long been questioned as The National Comprehensive Cancer Network Guidelines in Oncology and the European Medicines Agency indications had already been changed in order to restrict anti-EGFR therapy to all RAS wild-type colorectal cancer patients, while the Food and Drug Administration guidelines remained unchanged. Recently, the Food and Drug Administration guidelines have also been changed, which implies the importance of RAS mutations beyond KRAS exon 2 in colorectal cancer. In this review, we discuss the most important studies regarding the predictive and prognostic role of RAS mutations other than in KRAS exon 2 in order to demonstrate the importance of these RAS mutations in patients with metastatic colorectal cancer treated with anti-EGFR therapy.

  15. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production

    Science.gov (United States)

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O'Dwyer, Karen; Spence, David W.; Foster, Gary D.

    2016-05-01

    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.

  16. A Robust Manifold Graph Regularized Nonnegative Matrix Factorization Algorithm for Cancer Gene Clustering.

    Science.gov (United States)

    Zhu, Rong; Liu, Jin-Xing; Zhang, Yuan-Ke; Guo, Ying

    2017-12-02

    Detecting genomes with similar expression patterns using clustering techniques plays an important role in gene expression data analysis. Non-negative matrix factorization (NMF) is an effective method for cluster analysis of gene expression data. However, the NMF-based method is performed within the Euclidean space, and it is usually inappropriate for revealing the intrinsic geometric structure of data space. In order to overcome this shortcoming, Cai et al. proposed a novel algorithm, called graph regularized non-negative matrix factorization (GNMF). Motivated by the topological structure of the GNMF-based method, we propose improved graph regularized non-negative matrix factorization (GNMF) to facilitate the display of geometric structure of data space. Robust manifold non-negative matrix factorization (RM-GNMF) is designed for cancer gene clustering, leading to an enhancement of the GNMF-based algorithm in terms of robustness. We combine the l2,1-norm NMF with spectral clustering to conduct wide-ranging experiments on three known datasets. Clustering results indicate that the proposed method outperforms the previous methods, which displays the latest application of the RM-GNMF-based method in cancer gene clustering.

  17. Paradigm of tunable clustering using Binarization of Consensus Partition Matrices (Bi-CoPaM) for gene discovery.

    Directory of Open Access Journals (Sweden)

    Basel Abu-Jamous

    Full Text Available Clustering analysis has a growing role in the study of co-expressed genes for gene discovery. Conventional binary and fuzzy clustering do not embrace the biological reality that some genes may be irrelevant for a problem and not be assigned to a cluster, while other genes may participate in several biological functions and should simultaneously belong to multiple clusters. Also, these algorithms cannot generate tight clusters that focus on their cores or wide clusters that overlap and contain all possibly relevant genes. In this paper, a new clustering paradigm is proposed. In this paradigm, all three eventualities of a gene being exclusively assigned to a single cluster, being assigned to multiple clusters, and being not assigned to any cluster are possible. These possibilities are realised through the primary novelty of the introduction of tunable binarization techniques. Results from multiple clustering experiments are aggregated to generate one fuzzy consensus partition matrix (CoPaM), which is then binarized to obtain the final binary partitions. This is referred to as Binarization of Consensus Partition Matrices (Bi-CoPaM). The method has been tested with a set of synthetic datasets and a set of five real yeast cell-cycle datasets. The results demonstrate its validity in generating relevant tight, wide, and complementary clusters that can meet requirements of different gene discovery studies.
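
    The aggregate-then-binarize idea in this record can be illustrated with a small sketch. It is not the published Bi-CoPaM implementation: it assumes that cluster labels have already been matched across clustering runs, it builds the fuzzy consensus matrix as a simple fraction of runs, and it uses a single membership threshold as a stand-in for the paper's tunable binarization techniques. The function names `consensus_matrix` and `binarize` are hypothetical.

```python
def consensus_matrix(partitions, n_clusters):
    """Average aligned hard partitions into a fuzzy membership matrix.

    partitions: list of runs; partitions[r][g] is the cluster index that
    run r assigns to gene g. Labels are assumed to be aligned across runs.
    Returns a genes x clusters matrix of membership fractions in [0, 1].
    """
    n_runs = len(partitions)
    n_genes = len(partitions[0])
    counts = [[0] * n_clusters for _ in range(n_genes)]
    for run in partitions:
        for gene, cluster in enumerate(run):
            counts[gene][cluster] += 1
    return [[c / n_runs for c in row] for row in counts]


def binarize(copam, threshold):
    """Assign a gene to every cluster whose membership reaches the threshold."""
    return [[1 if m >= threshold else 0 for m in row] for row in copam]


# Toy example: four runs, three genes, two clusters.
runs = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
    [0, 1, 1],
]
copam = consensus_matrix(runs, n_clusters=2)
tight = binarize(copam, threshold=1.0)  # only unanimous assignments survive
wide = binarize(copam, threshold=0.5)   # overlapping assignments allowed
```

    With the strict threshold the second gene, which the runs split evenly between the two clusters, is left unassigned; with the looser threshold it belongs to both. This mirrors the three eventualities the record describes: a gene in exactly one cluster, in several clusters, or in none.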

  18. Increased glycopeptide production after overexpression of shikimate pathway genes being part of the balhimycin biosynthetic gene cluster

    DEFF Research Database (Denmark)

    Thykær, Jette; Nielsen, Jens; Wohlleben, W.

    2010-01-01

    Amycolatopsis balhimycina produces the vancomycin-analogue balhimycin. The strain therefore serves as a model strain for glycopeptide antibiotic production. Previous characterisation of the balhimycin biosynthetic cluster had shown that the border sequences contained both, a putative 3-deoxy......-d-arabino-heptulosonate 7-phosphate synthase (dahp), and a prephenate dehydrogenase (pdh) gene. In a metabolic engineering approach for increasing the precursor supply for balhimycin production, the dahp and pdh genes from the biosynthetic cluster were overexpressed both individually and together and the resulting strains...... production levels similar to the parent strain. Based on these results the relation between primary and secondary metabolism with regards to Dahp and Pdh is discussed....

  19. Prognostic impact of Wilms tumor gene mutations in Egyptian patients with acute myeloid leukemia with normal karyotype.

    Science.gov (United States)

    Zidan, Magda Abdel Aziz; Kamal Shaaban, Howyda M; Elghannam, Doaa M

    2014-07-01

    The Wilms' tumor (WT1) gene mutations were detected in patients with most forms of acute leukemia. However, the biological significance and the prognostic impact of WT1 mutation in Egyptian patients with acute myeloid leukemia with normal karyotype (AML-NK) are still uncertain. We aimed to evaluate the incidence and clinical relevance of WT1 gene mutations in acute myeloid leukemia with normal karyotype (AML-NK). Exons 7 and 9 of WT1 were screened in samples from 216 adult NK-AML using polymerase chain reaction single-strand conformation polymorphism techniques. Twenty-three patients (10.6%) harbored WT1 mutations. Younger ages and higher marrow blasts were significantly associated with WT1 mutations (P = 0.006 and 0.003 respectively). Complete remission rates were significantly lower in patients with WT1 mutations than those with WT1 wild-type (P = 0.015). Resistance, relapse, and mortality rates were significantly higher in patients with WT1 mutations than those without (P = 0.041, 0.016, and 0.008 respectively). WT1 mutations were inversely associated with NPM1 mutations (P = 0.007). Patients with WT1 mutations had worse disease-free survival (P < 0.001) and overall survival (P < 0.001) than patients with WT1 wild-type. In multivariable analyses, WT1 mutations independently predicted worse DFS (P < 0.001; hazard ratio [HR] 0.036) and overall survival (P = 0.001; HR = 0.376) when controlling for age, total leukocytic count (TLC), and NPM1 mutational status. In conclusion, WT1 mutations are a negative prognostic indicator in intensively treated patients with AML-NK, may be a part of molecularly based risk assessment and risk-adapted treatment stratification of patients with AML-NK.

  20. A CLUSTERING OF DJA STOCKS - THE APPLICATION IN FINANCE OF A METHOD FIRST USED IN GENE TRAJECTORY STUDY

    Directory of Open Access Journals (Sweden)

    Silaghi Gheorghe Cosmin

    2009-05-01

    Full Text Available Previously we employed the Gene Trajectory Clustering methodology to search for different associations of the stocks composing the DJA index, with the aim of finding different, logic clusters, supported by economic reasons, preferably different than the

  1. Vascular endothelial growth factor gene (VEGFA) polymorphisms may serve as prognostic factors for recurrent depressive disorder development.

    Science.gov (United States)

    Gałecki, Piotr; Gałecka, Elżbieta; Maes, Michael; Orzechowska, Agata; Berent, Dominika; Talarowska, Monika; Bobińska, Kinga; Lewiński, Andrzej; Bieńkiewicz, Małgorzata; Szemraj, Janusz

    2013-08-01

    Recurrent depressive disorder (rDD) is a multifactorial disease. Vascular endothelial growth factor (VEGF) is one of the factors that have been suggested to play a role in the etiology and/or development of this disease. Limited information related to the role of VEGFA gene polymorphism in depressive disorder is available. The aim of the study was to analyze the association between VEGFA gene polymorphisms (+405G/C; rs2010963, +936C/T; rs3025039), VEGFA gene expression, and its serum protein levels in rDD in the Caucasian population. In the current study, 268 patients and 200 healthy controls of Caucasian origin were involved. Genotyping and gene expression were performed using polymerase chain reaction (PCR)-based methods. Enzyme-linked immunosorbent assay (ELISA) was used for detection of circulating serum VEGF levels. The distribution of VEGFA polymorphism +405G/C differed significantly between rDD patients and healthy subjects. The results of this study indicated that the C allele and CC genotype of VEGFA are risk factors for rDD. Haplotypes CC and TG are important factors for depression development. Further, VEGFA mRNA expression and VEGF levels were higher in rDD patients than in controls. The VEGFA gene polymorphism may serve as a prognostic factor for rDD development. Our study showed higher levels of both VEGFA mRNA in the peripheral blood cells and serum VEGF in patients diagnosed with rDD than in healthy controls. The obtained results suggest that VEGF and the gene encoding the molecule play a role in the etiology of the disease and should be further investigated. Copyright © 2013 Elsevier Inc. All rights reserved.

  2. A Genomics Based Discovery of Secondary Metabolite Biosynthetic Gene Clusters in Aspergillus ustus

    Science.gov (United States)

    Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong

    2015-01-01

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics-based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters that were possibly acquired by horizontal gene transfer were identified in Aspergillus for the first time, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings should help in understanding the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic. PMID:25706180

  3. Reassembly of functionally intact environmental DNA-derived biosynthetic gene clusters.

    Science.gov (United States)

    Kallifidas, Dimitris; Brady, Sean F

    2012-01-01

    Only a small fraction of the bacterial diversity present in natural microbial communities is regularly cultured in the laboratory. Those bacteria that remain recalcitrant to culturing cannot be examined for the production of bioactive secondary metabolites using standard pure-culture approaches. The screening of genomic DNA libraries containing DNA isolated directly from environmental samples (environmental DNA (eDNA)) provides an alternative approach for studying the biosynthetic capacities of these organisms. One drawback of this approach has been that most eDNA isolation procedures do not permit the cloning of DNA fragments of sufficient length to capture large natural product biosynthetic gene clusters in their entirety. Although the construction of eDNA libraries with inserts big enough to capture biosynthetic gene clusters larger than ∼40kb remains challenging, it is possible to access large gene clusters by reassembling them from sets of smaller overlapping fragments using transformation-associated recombination in Saccharomyces cerevisiae. Here, we outline a method for the reassembly of large biosynthetic gene clusters from captured sets of overlapping soil eDNA cosmid clones. Natural product biosynthetic gene clusters reassembled using this approach can then be used directly for functional heterologous expression studies. Copyright © 2012 Elsevier Inc. All rights reserved.

  4. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, but little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were firstly identified in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but much fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in clinic.

  5. MS/MS networking guided analysis of molecule and gene cluster families

    Science.gov (United States)

    Nguyen, Don Duy; Wu, Cheng-Hsuan; Moree, Wilna J.; Lamsa, Anne; Medema, Marnix H.; Zhao, Xiling; Gavilan, Ronnie G.; Aparicio, Marystella; Atencio, Librada; Jackson, Chanaye; Ballesteros, Javier; Sanchez, Joel; Watrous, Jeramie D.; Phelan, Vanessa V.; van de Wiel, Corine; Kersten, Roland D.; Mehnaz, Samina; De Mot, René; Shank, Elizabeth A.; Charusanti, Pep; Nagarajan, Harish; Duggan, Brendan M.; Moore, Bradley S.; Bandeira, Nuno; Palsson, Bernhard Ø.; Pogliano, Kit; Gutiérrez, Marcelino; Dorrestein, Pieter C.

    2013-01-01

    The ability to correlate the production of specialized metabolites to the genetic capacity of the organism that produces such molecules has become an invaluable tool in aiding the discovery of biotechnologically applicable molecules. Here, we accomplish this task by matching molecular families with gene cluster families, making these correlations to 60 microbes at one time instead of connecting one molecule to one organism at a time, such as how it is traditionally done. We can correlate these families through the use of nanospray desorption electrospray ionization MS/MS, an ambient pressure MS technique, in conjunction with MS/MS networking and peptidogenomics. We matched the molecular families of peptide natural products produced by 42 bacilli and 18 pseudomonads through the generation of amino acid sequence tags from MS/MS data of specific clusters found in the MS/MS network. These sequence tags were then linked to biosynthetic gene clusters in publicly accessible genomes, providing us with the ability to link particular molecules with the genes that produced them. As an example of its use, this approach was applied to two unsequenced Pseudoalteromonas species, leading to the discovery of the gene cluster for a molecular family, the bromoalterochromides, in the previously sequenced strain P. piscicida JCM 20779T. The approach itself is not limited to 60 related strains, because spectral networking can be readily adopted to look at molecular family–gene cluster families of hundreds or more diverse organisms in one single MS/MS network. PMID:23798442

  6. Operon and non-operon gene clusters in the C. elegans genome.

    Science.gov (United States)

    Blumenthal, Thomas; Davis, Paul; Garrido-Lecca, Alfonso

    2015-04-28

    Nearly 15% of the ~20,000 C. elegans genes are contained in operons, multigene clusters controlled by a single promoter. The vast majority of these are of a type where the genes in the cluster are ~100 bp apart and the pre-mRNA is processed by 3' end formation accompanied by trans-splicing. A spliced leader, SL2, is specialized for operon processing. Here we summarize current knowledge on several variations on this theme including: (1) hybrid operons, which have additional promoters between genes; (2) operons with exceptionally long (> 1 kb) intercistronic regions; (3) operons with a second 3' end formation site close to the trans-splice site; (4) alternative operons, in which the exons are sometimes spliced as a single gene and sometimes as two genes; (5) SL1-type operons, which use SL1 instead of SL2 to trans-splice and in which there is no intercistronic space; (6) operons that make dicistronic mRNAs; and (7) non-operon gene clusters, in which either two genes use a single exon as the 3' end of one and the 5' end of the next, or the 3' UTR of one gene serves as the outron of the next. Each of these variations is relatively infrequent, but together they show a remarkable variety of tight-linkage gene arrangements in the C. elegans genome.

  7. Loss of Major DNase I Hypersensitive Sites in Duplicated β-globin Gene Cluster Incompletely Silences HBB Gene Expression.

    Science.gov (United States)

    Reading, N Scott; Shooter, Claire; Song, Jihyun; Miller, Robin; Agarwal, Archana; Lanikova, Lucie; Clark, Barnaby; Thein, Swee Lay; Divoky, Vladimir; Prchal, Josef T

    2016-11-01

    We report an infant with sickle cell disease phenotype by biochemical analysis whose β-globin gene (HBB) sequencing showed sickle cell mutation (HBBS) heterozygosity. The proband has a unique head-to-tail duplication of the β-globin gene cluster having wild-type (HBBA) and HBBS alleles inherited from her father, constituting her HBBS/HBBS-HBBA genotype. Further analyses revealed that the proband's duplicated β-globin gene cluster (∼650 kb) encompassing HBBA does not include the immediate upstream locus control region (LCR) or 3' DNase I hypersensitivity (HS) element. The LCR interacts with the β-globin gene cluster through long-range DNA interactions mediated by various transcription factors to drive the regulation of globin gene expression. However, a low level of HBBA transcript was clearly detected by digital PCR. In this patient, the observed transcription from the duplicated, distally displaced HBBA cluster demonstrates that the loss of the LCR and flanking 3' HS sites does not lead to complete silencing of HBB transcription. © 2016 WILEY PERIODICALS, INC.

  8. An effective hybrid approach of gene selection and classification for microarray data based on clustering and particle swarm optimization.

    Science.gov (United States)

    Han, Fei; Yang, Shanxiu; Guan, Jian

    2015-01-01

    In this paper, a hybrid approach based on clustering and Particle Swarm Optimisation (PSO) is proposed to perform gene selection and classification for microarray data. In the new method, firstly, genes are partitioned into a predetermined number of clusters by K-means method. Since the genes in each cluster have much redundancy, Max-Relevance Min-Redundancy (mRMR) strategy is used to reduce redundancy of the clustered genes. Then, PSO is used to perform further gene selection from the remaining clustered genes. Because of its better generalisation performance with much faster convergence rate than other learning algorithms for neural networks, Extreme Learning Machine (ELM) is chosen to evaluate candidate gene subsets selected by PSO and perform samples classification in this study. The proposed method selects less redundant genes as well as increases prediction accuracy and its efficiency and effectiveness are verified by extensive comparisons with other classical methods on three open microarray data.
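
    The first stage described above (partitioning genes into a predetermined number of clusters by K-means) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, the deterministic initialisation from the first k genes, and the toy expression values are all assumptions, and the later mRMR, PSO and ELM stages are not shown.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two expression profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means; returns (labels, centroids).

    Assumption: centroids start at the first k points so the sketch is
    deterministic. Real pipelines usually use random or k-means++ starts.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each gene joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: four genes measured on three samples.
genes = [
    [1.0, 1.1, 0.9],
    [5.0, 5.2, 4.9],
    [1.2, 1.0, 1.0],
    [5.1, 4.8, 5.0],
]
labels, centroids = kmeans(genes, k=2)
```

    Genes that share a label are candidates for redundancy reduction, since they carry similar expression profiles.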

  9. The Eucalyptus grandis NBS-LRR gene family: physical clustering and expression hotspots

    Directory of Open Access Journals (Sweden)

    Nanette Christie

    2016-01-01

    Eucalyptus grandis is a commercially important hardwood species and is known to be susceptible to a number of pests and pathogens. Determining mechanisms of defense is therefore a research priority. The published genome for E. grandis has aided the identification of one important class of resistance (R) genes that incorporate nucleotide binding sites and leucine-rich repeat domains (NBS-LRR). Using an iterative search process we identified NBS-LRR gene models within the E. grandis genome. We characterized the gene models and identified their genomic arrangement. The gene expression patterns were examined in E. grandis clones, challenged with a fungal pathogen (Chrysoporthe austroafricana) and insect pest (Leptocybe invasa). 1215 putative NBS-LRR coding sequences were located which aligned into two large classes, Toll or interleukin-1 receptor (TIR) and coiled-coil (CC), based on NB-ARC domains. NBS-LRR gene-rich regions were identified with 76% organized in clusters of three or more genes. A further 272 putative incomplete resistance genes were also identified. We determined that E. grandis has a higher ratio of TIR to CC classed genes compared to other woody plant species as well as a smaller percentage of single NBS-LRR genes. Transcriptome profiles indicated expression hotspots, within physical clusters, including expression of many incomplete genes. The clustering of putative NBS-LRR genes correlates with differential expression responses in resistant and susceptible plants indicating functional relevance for the physical arrangement of this gene family. This analysis of the repertoire and expression of E. grandis putative NBS-LRR genes provides an important resource for the identification of novel and functional R-genes; a key objective for strategies to enhance resilience.

  10. Polymorphisms within the APOBR gene are highly associated with milk levels of prognostic ketosis biomarkers in dairy cows.

    Science.gov (United States)

    Tetens, Jens; Heuer, Claas; Heyer, Iris; Klein, Matthias S; Gronwald, Wolfram; Junge, Wolfgang; Oefner, Peter J; Thaller, Georg; Krattenmacher, Nina

    2015-04-01

    Essentially all high-yielding dairy cows experience a negative energy balance during early lactation leading to increased lipomobilization, which is a normal physiological response. However, a severe energy deficit may lead to high levels of ketone bodies and, subsequently, to subclinical or clinical ketosis. It has previously been reported that the ratio of glycerophosphocholine to phosphocholine in milk is a prognostic biomarker for the risk of ketosis in dairy cattle. It was hypothesized that this ratio reflects the ability to break down blood phosphatidylcholine as a fatty acid resource. In the current study, 248 animals from a previous study were genotyped with Illumina BovineSNP50 BeadChip, and genome-wide association studies were carried out for the milk levels of phosphocholine, glycerophosphocholine, and the ratio of both metabolites. It was demonstrated that the latter two traits are heritable with h² = 0.43 and h² = 0.34, respectively. A major quantitative trait locus was identified on cattle chromosome 25. The APOBR gene, coding for the apolipoprotein B receptor, is located within this region and was analyzed as a candidate gene. The analysis revealed highly significant associations of polymorphisms within the gene with glycerophosphocholine as well as the metabolite ratio. These findings support the hypothesis that differences in the ability to take up blood phosphatidylcholine from low-density lipoproteins play an important role in early lactation metabolic stability of dairy cows and indicate APOBR to contain a causative variant. Copyright © 2015 the American Physiological Society.

  11. Prognostic relevance of a T-type calcium channels gene signature in solid tumours: A correlation ready for clinical validation.

    Science.gov (United States)

    Fornaro, Lorenzo; Vivaldi, Caterina; Lin, Dong; Xue, Hui; Falcone, Alfredo; Wang, Yuzhuo; Crea, Francesco; Bootman, Martin D

    2017-01-01

    T-type calcium channels (TTCCs) mediate calcium influx across the cell membrane. TTCCs regulate numerous physiological processes including cardiac pacemaking and neuronal activity. In addition, they have been implicated in the proliferation, migration and differentiation of tumour tissues. Although the signalling events downstream of TTCC-mediated calcium influx are not fully elucidated, it is clear that variations in the expression of TTCCs promote tumour formation and hinder response to treatment. We examined the expression of TTCC genes (all three subtypes; CACNA-1G, CACNA-1H and CACNA-1I) and their prognostic value in three major solid tumours (i.e. gastric, lung and ovarian cancers) via a publicly accessible database. In gastric cancer, expression of all the CACNA genes was associated with overall survival (OS) among stage I-IV patients (all p<0.05). By combining the three potential biomarkers, a TTCC signature was developed, which retained a significant association with OS both in stage IV and stage I-III patients. In lung and ovarian cancer, association with OS was also significant when all tumour stages were considered, but was partly lost or inconclusive after splitting cases into localized and metastatic subsets. Alterations in CACNA gene expression are linked to tumour prognosis. Gastric cancer represents the most promising setting for further evaluation.

  12. Sequencing and mapping hemoglobin gene clusters in the Australian model dasyurid marsupial Sminthopsis macroura

    Energy Technology Data Exchange (ETDEWEB)

    De Leo, A.A.; Wheeler, D.; Lefevre, C.; Cheng, Jan-Fang; Hope, R.; Kuliwaba, J.; Nicholas, K.R.; Westermanc, M.; Graves, J.A.M.

    2004-07-26

    Comparing globin genes and their flanking sequences across many species has allowed globin gene evolution to be reconstructed in great detail. Marsupial globin sequences have proved to be of exceptional significance. A previous finding of a beta-like omega gene in the alpha cluster in the tammar wallaby suggested that the alpha and beta cluster evolved via genome duplication and loss rather than tandem duplication. To confirm and extend this important finding we isolated and sequenced BACs containing the alpha and beta loci from the distantly related Australian marsupial Sminthopsis macroura. We report that the alpha gene lies in the same BAC as the beta-like omega gene, implying that the alpha-omega juxtaposition is likely to be conserved in all marsupials. The LUC7L gene was found 3' of the S. macroura alpha locus, a gene order shared with humans but not mouse, chicken or fugu. Sequencing a BAC contig that contained the S. macroura beta globin and epsilon globin loci showed that the globin cluster is flanked by olfactory genes, demonstrating a gene arrangement conserved for over 180 MY. Analysis of the region 5' to the S. macroura epsilon globin gene revealed a region similar to the eutherian LCR, containing sequences and potential transcription factor binding sites with homology to eutherian hypersensitive sites 1 to 5. FISH mapping of BACs containing S. macroura alpha and beta globin genes located the beta globin cluster on chromosome 3q and the alpha locus close to the centromere on 1q, resolving contradictory map locations obtained by previous radioactive in situ hybridization.

  13. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    Directory of Open Access Journals (Sweden)

    Lee Yun-Shien

    2008-03-01

    Background: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results: This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purpose) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion: We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, it also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

  14. Sequencing and transcriptional analysis of the biosynthesis gene cluster of abscisic acid-producing Botrytis cinerea.

    Science.gov (United States)

    Gong, Tao; Shu, Dan; Yang, Jie; Ding, Zhong-Tao; Tan, Hong

    2014-09-29

    Botrytis cinerea is a model species with great importance as a pathogen of plants and has become used for biotechnological production of ABA. The ABA cluster of B. cinerea is composed of an open reading frame without significant similarities (bcaba3), followed by the genes (bcaba1 and bcaba2) encoding P450 monooxygenases and a gene probably coding for a short-chain dehydrogenase/reductase (bcaba4). In B. cinerea ATCC58025, targeted inactivation of the genes in the cluster suggested at least three genes responsible for the hydroxylation at carbon atom C-1' and C-4' or oxidation at C-4' of ABA. Our group has identified an ABA-overproducing strain, B. cinerea TB-3-H8. To differentiate TB-3-H8 from other B. cinerea strains with the functional ABA cluster, the DNA sequence of the 12.11-kb region containing the cluster of B. cinerea TB-3-H8 was determined. Full-length cDNAs were also isolated for bcaba1, bcaba2, bcaba3 and bcaba4 from B. cinerea TB-3-H8. Sequence comparison of the four genes and their flanking regions respectively derived from B. cinerea TB-3-H8, B05.10 and T4 revealed that major variations were located in intergenic sequences. In B. cinerea TB-3-H8, the expression profiles of the four function genes under ABA high-yield conditions were also analyzed by real-time PCR.

  15. Identification of biosynthetic gene clusters from metagenomic libraries using PPTase complementation in a Streptomyces host.

    Science.gov (United States)

    Bitok, J Kipchirchir; Lemetre, Christophe; Ternei, Melinda A; Brady, Sean F

    2017-09-01

    The majority of environmental bacteria are not readily cultured in the lab, leaving the natural products they make inaccessible using culture-dependent discovery methods. Cloning and heterologous expression of DNA extracted from environmental samples (environmental DNA, eDNA) provides a means of circumventing this discovery bottleneck. To facilitate the identification of clones containing biosynthetic gene clusters, we developed a model heterologous expression reporter strain Streptomyces albus::bpsA ΔPPTase. This strain carries a 4'-phosphopantetheinyl transferase (PPTase)-dependent blue pigment synthase A gene, bpsA, in a PPTase deletion background. eDNA clones that express a functional PPTase restore production of the blue pigment, indigoidine. As PPTase genes often occur in biosynthetic gene clusters (BGCs), indigoidine production can be used to identify eDNA clones containing BGCs. We screened a soil eDNA library hosted in S. albus::bpsA ΔPPTase and identified clones containing non-ribosomal peptide synthetase (NRPS), polyketide synthase (PKS) and mixed NRPS/PKS biosynthetic gene clusters. One NRPS gene cluster was shown to confer the production of myxochelin A to S. albus::bpsA ΔPPTase. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  16. Exploring tomato gene functions based on coexpression modules using graph clustering and differential coexpression approaches.

    Science.gov (United States)

    Fukushima, Atsushi; Nishizawa, Tomoko; Hayakumo, Mariko; Hikosaka, Shoko; Saito, Kazuki; Goto, Eiji; Kusano, Miyako

    2012-04-01

    Gene-to-gene coexpression analysis provides fundamental information and is a promising approach for predicting unknown gene functions in plants. We investigated various associations in the gene expression of tomato (Solanum lycopersicum) to predict unknown gene functions in an unbiased manner. We obtained more than 300 microarrays from publicly available databases and our own hybridizations, and here, we present tomato coexpression networks and coexpression modules. The topological characteristics of the networks were highly heterogenous. We extracted 465 total coexpression modules from the data set by graph clustering, which allows users to divide a graph effectively into a set of clusters. Of these, 88% were assigned systematically by Gene Ontology terms. Our approaches revealed functional modules in the tomato transcriptome data; the predominant functions of coexpression modules were biologically relevant. We also investigated differential coexpression among data sets consisting of leaf, fruit, and root samples to gain further insights into the tomato transcriptome. We now demonstrate that (1) duplicated genes, as well as metabolic genes, exhibit a small but significant number of differential coexpressions, and (2) a reversal of gene coexpression occurred in two metabolic pathways involved in lycopene and flavonoid biosynthesis. Independent experimental verification of the findings for six selected genes was done using quantitative real-time polymerase chain reaction. Our findings suggest that differential coexpression may assist in the investigation of key regulatory steps in metabolic pathways. The approaches and results reported here will be useful to prioritize candidate genes for further functional genomics studies of tomato metabolism.
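
    The idea of extracting coexpression modules from a network can be sketched as follows. The abstract does not name the correlation measure or the graph clustering algorithm, so this sketch makes two loudly stated assumptions: Pearson correlation with a hypothetical cutoff of 0.9 defines the edges, and connected components stand in for the graph clustering step. The gene names and expression values are invented for illustration.

```python
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def coexpression_modules(profiles, threshold=0.9):
    """Link genes whose correlation meets the threshold, then return
    the connected components of that graph as modules."""
    neighbours = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        if pearson(profiles[g], profiles[h]) >= threshold:
            neighbours[g].add(h)
            neighbours[h].add(g)

    modules, seen = [], set()
    for gene in profiles:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, module = [gene], set()
        while stack:
            node = stack.pop()
            if node in module:
                continue
            module.add(node)
            stack.extend(neighbours[node] - module)
        seen |= module
        modules.append(sorted(module))
    return modules


# Hypothetical profiles across four arrays (not data from the study).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}
modules = coexpression_modules(profiles)
```

    Differential coexpression could then be explored by running the same function on tissue-specific subsets (for example leaf, fruit, and root arrays) and comparing the resulting modules.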

  17. Form gene clustering method about pan-ethnic-group products based on emotional semantic

    Science.gov (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui

    2016-09-01

    The use of pan-ethnic-group products form knowledge primarily depends on a designer's subjective experience without user participation. The majority of studies primarily focus on the detection of the perceptual demands of consumers from the target product category. A pan-ethnic-group products form gene clustering method based on emotional semantic is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer aided product form clustering technology. A case of form gene clustering about the typical pan-ethnic-group products is investigated which indicates that the method is feasible. This paper opens up a new direction for the future development of product form design which improves the agility of product design process in the era of Industry 4.0.

  18. Clustering Time-Series Gene Expression Data Using Smoothing Spline Derivatives

    Directory of Open Access Journals (Sweden)

    S. Déjean

    2007-06-01

    Full Text Available Microarray data acquired during time-course experiments allow the temporal variations in gene expression to be monitored. An original postprandial fasting experiment was conducted in the mouse and the expression of 200 genes was monitored with a dedicated macroarray at 11 time points between 0 and 72 hours of fasting. The aim of this study was to provide a relevant clustering of gene expression temporal profiles. This was achieved by focusing on the shapes of the curves rather than on the absolute level of expression. Actually, we combined spline smoothing and first derivative computation with hierarchical and partitioning clustering. A heuristic approach was proposed to tune the spline smoothing parameter using both statistical and biological considerations. Clusters are illustrated a posteriori through principal component analysis and heatmap visualization. Most results were found to be in agreement with the literature on the effects of fasting on the mouse liver and provide promising directions for future biological investigations.
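
    The principle of comparing curve shapes rather than absolute expression levels can be sketched as follows. This is a simplified illustration under stated assumptions: plain finite differences replace the smoothing-spline derivatives used in the study, and the time points and expression values are hypothetical.

```python
def first_differences(times, values):
    """Slope between consecutive time points (a finite-difference
    stand-in for the derivative of a fitted smoothing spline)."""
    return [(values[i + 1] - values[i]) / (times[i + 1] - times[i])
            for i in range(len(values) - 1)]


def shape_distance(times, a, b):
    """Euclidean distance between the derivative profiles of two genes.
    A constant offset between the curves does not affect the result."""
    da = first_differences(times, a)
    db = first_differences(times, b)
    return sum((x - y) ** 2 for x, y in zip(da, db)) ** 0.5


# Hypothetical fasting time points (hours) and expression profiles.
times = [0, 6, 24, 72]
gene_a = [1, 2, 4, 3]
gene_b = [11, 12, 14, 13]   # same shape as gene_a, higher baseline
gene_c = [4, 3, 1, 2]       # mirrored shape

d_ab = shape_distance(times, gene_a, gene_b)
d_ac = shape_distance(times, gene_a, gene_c)
```

    Here gene_a and gene_b are at distance zero despite their different baselines, while gene_c is separated from both. A matrix of such distances can be passed to any hierarchical or partitioning clustering routine.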

  19. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Kautsar, Satria A.; Suarez Duran, Hernando G.; Blin, Kai

    2017-01-01

    Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied on 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental...

  20. Prognostic significance and gene expression profiles of p53 mutations in microsatellite-stable stage III colorectal adenocarcinomas.

    Directory of Open Access Journals (Sweden)

    Venkat R Katkoori

    Although the prognostic value of p53 abnormalities in Stage III microsatellite stable (MSS) colorectal cancers (CRCs) is known, the gene expression profiles specific to the p53 status in the MSS background are not known. Therefore, the current investigation has focused on identification and validation of the gene expression profiles associated with p53 mutant phenotypes in MSS Stage III CRCs. Genomic DNA extracted from 135 formalin-fixed paraffin-embedded tissues was analyzed for microsatellite instability (MSI) and p53 mutations. Further, mRNA samples extracted from five p53-mutant and five p53-wild-type MSS-CRC snap-frozen tissues were profiled for differential gene expression by Affymetrix Human Genome U133 Plus 2.0 arrays. Differentially expressed genes were further validated by the high-throughput quantitative nuclease protection assay (qNPA), and confirmed by quantitative real-time polymerase chain reaction (qRT-PCR) and by immunohistochemistry (IHC). Survival rates were estimated by Kaplan-Meier and Cox regression analyses. A higher incidence of p53 mutations was found in MSS (58%) than in MSI (30%) phenotypes. Both univariate (log-rank, P = 0.025) and multivariate (hazard ratio, 2.52; 95% confidence interval, 1.25-5.08) analyses have demonstrated that patients with MSS-p53 mutant phenotypes had poor CRC-specific survival when compared to MSS-p53 wild-type phenotypes. Gene expression analyses identified 84 differentially expressed genes. Of 49 down-regulated genes, LPAR6, PDLIM3, and PLAT, and, of 35 up-regulated genes, TRIM29, FUT3, IQGAP3, and SLC6A8 were confirmed by qNPA, qRT-PCR, and IHC platforms. p53 mutations are associated with poor survival of patients with Stage III MSS CRCs and p53-mutant and wild-type phenotypes have distinct gene expression profiles that might be helpful in identifying aggressive subsets.
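
    The Kaplan-Meier estimate mentioned above multiplies, at each observed event time, the fraction of at-risk patients who survive that time. A minimal sketch follows; the function name and the toy cohort are hypothetical and are not taken from the study, and the log-rank test and Cox regression are not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort: months of follow-up and event indicator.
times = [2, 3, 3, 5, 8, 8, 12]
events = [1, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
```

    Censored patients (event flag 0) never trigger a drop in the curve, but they count in the at-risk denominator until their follow-up ends. Computing one curve per group (for example p53-mutant versus wild-type) gives the curves that a log-rank test compares.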

  1. Evolutionary ecology of beta-lactam gene clusters in animals

    NARCIS (Netherlands)

    Suring, Wouter; Meusemann, Karen; Blanke, Alexander; Mariën, Janine; Schol, Tim; Agamennone, Valeria; Faddeeva-Vakhrusheva, Anna; Berg, Matty P; Brouwer, Abraham; van Straalen, Nico M; Roelofs, Dick

    Beta-lactam biosynthesis was thought to occur only in fungi and bacteria, but we recently reported the presence of isopenicillin N synthase in a soil-dwelling animal, Folsomia candida. However, it has remained unclear whether this gene is part of a larger beta-lactam biosynthesis pathway and how

  2. Gene microarray data analysis using parallel point-symmetry-based clustering.

    Science.gov (United States)

    Sarkar, Anasua; Maulik, Ujjwal

    2015-01-01

    Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed time-efficient scalable approach for point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also satisfies linear speedup in timing without sacrificing the quality of clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority, in both timing and validity. The statistical analysis is also performed to establish the significance of this message-passing-interface based point-symmetry K-Means implementation. We also analysed the biological relevance of clustering solutions.

  3. Contributions of vertical descent, horizontal transfer and gene loss to the distribution of mycotoxin biosynthetic gene clusters in Fusarium

    Science.gov (United States)

    The genus Fusarium produces a diverse array of mycotoxins and other secondary metabolites, but individual species contribute to only a small fraction of this diversity. Here, we employed comparative genomic and phylogenetic analyses to investigate the distribution and evolution of gene clusters resp...

  4. Study of integrated heterogeneous data reveals prognostic power of gene expression for breast cancer survival.

    Directory of Open Access Journals (Sweden)

    Richard E Neapolitan

    Studies show that thousands of genes are associated with prognosis of breast cancer. Towards utilizing available genetic data, efforts have been made to predict outcomes using gene expression data, and a number of commercial products have been developed. These products have the following shortcomings: (1) They use the Cox model for prediction. However, the random survival forest (RSF) model has been shown to significantly outperform the Cox model. (2) Testing was not done to see if a complete set of clinical predictors could predict as well as the gene expression signatures. We address these shortcomings. The METABRIC data set concerns 1981 breast cancer tumors. Features include 21 clinical features, expression levels for 16,384 genes, and survival. We compare the survival prediction performance of the Cox model and the RSF model using the clinical data and the gene expression data to their performance using only the clinical data. We obtain significantly better results when we use both clinical data and gene expression data for 5 year, 10 year, and 15 year survival prediction. When we replace the gene expression data by PAM50 subtype, our results are significant only for 5 year and 15 year prediction. We obtain significantly better results using the RSF model over the Cox model. Finally, our results indicate that gene expression data alone may predict long-term survival. Our results indicate that we can obtain improved survival prediction using clinical data and gene expression data compared to prediction using only clinical data. We further conclude that we can obtain improved survival prediction using the RSF model instead of the Cox model. These results are significant because by incorporating more gene expression data with clinical features and using the RSF model, we could develop decision support systems that better utilize heterogeneous information to improve outcome prediction and decision making.

  5. ToppCluster: a multiple gene list feature analyzer for comparative enrichment clustering and network-based dissection of biological systems.

    Science.gov (United States)

    Kaimal, Vivek; Bardes, Eric E; Tabar, Scott C; Jegga, Anil G; Aronow, Bruce J

    2010-07-01

    ToppCluster is a web server application that leverages a powerful enrichment analysis and underlying data environment for comparative analyses of multiple gene lists. It generates heatmaps or connectivity networks that reveal functional features shared or specific to multiple gene lists. ToppCluster uses hypergeometric tests to obtain list-specific feature enrichment P-values for currently 17 categories of annotations of human-ortholog genes, and provides user-selectable cutoffs and multiple testing correction methods to control false discovery. Each nameable gene list represents a column input to a resulting matrix whose rows are overrepresented features and whose individual cells hold per-list P-values and the corresponding genes per feature. ToppCluster provides users with choices of tabular outputs, hierarchical clustering and heatmap generation, or the ability to interactively select features from the functional enrichment matrix to be transformed into XGMML or GEXF network format documents for use in Cytoscape or Gephi applications, respectively. Here, as an example, we demonstrate the ability of ToppCluster to enable identification of list-specific phenotypic and regulatory element features (both cis-elements and 3'UTR microRNA binding sites) among tissue-specific gene lists. ToppCluster's functionalities enable the identification of specialized biological functions and regulatory networks and systems biology-based dissection of biological states. ToppCluster can be accessed freely at http://toppcluster.cchmc.org.
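
    The hypergeometric enrichment test mentioned above can be sketched in a few lines. This is a generic upper-tail test written for illustration; the function name and argument names are hypothetical, and nothing here reflects ToppCluster's actual code, background sets, or multiple testing correction.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, list_size, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the feature
    list_size:  genes in the query list
    overlap:    query genes carrying the feature
    """
    total = comb(population, list_size)
    upper = min(annotated, list_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

    A small P-value suggests the feature is overrepresented in the list relative to the background; with many features tested per list, a correction such as those the abstract mentions would still be needed.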

  6. Global analysis of biosynthetic gene clusters reveals vast potential of secondary metabolite production in Penicillium species

    DEFF Research Database (Denmark)

    Nielsen, Jens Christian; Grijseels, Sietske; Prigent, Sylvain

    2017-01-01

    sequenced the genomes of 9 Penicillium species and, together with 15 published genomes, we investigated the secondary metabolism of Penicillium and identified an immense, unexploited potential for producing secondary metabolites by this genus. A total of 1,317 putative biosynthetic gene clusters (BGCs) were...... identified, and polyketide synthase and non-ribosomal peptide synthetase based BGCs were grouped into gene cluster families and mapped to known pathways. The grouping of BGCs allowed us to study the evolutionary trajectory of pathways based on 6-methylsalicylic acid (6-MSA) synthases. Finally, we cross...... diversity of Penicillia and highlights the potential of these species as a source of new antibiotics and other pharmaceuticals....

  7. Reconstructing protein and gene phylogenies using reconciliation and soft-clustering.

    Science.gov (United States)

    Kuitche, Esaie; Lafond, Manuel; Ouangraoua, Aïda

    2017-12-01

    The architecture of eukaryotic coding genes allows the production of several different protein isoforms by genes. Current gene phylogeny reconstruction methods make use of a single protein product per gene, ignoring information on alternative protein isoforms. These methods often lead to inaccurate gene tree reconstructions that require to be corrected before phylogenetic analyses. Here, we propose a new approach for the reconstruction of gene trees and protein trees accounting for alternative protein isoforms. We extend the concept of reconciliation to protein trees, and we define a new reconciliation problem called MinDRGT that consists in finding a gene tree that minimizes a double reconciliation cost with a given protein tree and a given species tree. We define a second problem called MinDRPGT that consists in finding a protein supertree and a gene tree minimizing a double reconciliation cost, given a species tree and a set of protein subtrees. We propose a shift from the traditional view of protein ortholog groups as hard-clusters to soft-clusters and we study the MinDRPGT problem under this assumption. We provide algorithmic exact and heuristic solutions for versions of the problems, and we present the results of applications on protein and gene trees from the Ensembl database. The implementations of the methods are available at https://github.com/UdeS-CoBIUS/Protein2GeneTree and https://github.com/UdeS-CoBIUS/SuperProteinTree .

  8. Cloning of the biosynthetic gene cluster for naphthoxanthene antibiotic FD-594 from Streptomyces sp. TA-0256.

    Science.gov (United States)

    Kudo, Fumitaka; Yonezawa, Takanori; Komatsubara, Akiko; Mizoue, Kazutoshi; Eguchi, Tadashi

    2011-01-01

    FD-594 is a unique pyrano[4',3':6,7]naphtho[1,2-b]xanthene polyketide with a trisaccharide of 2,6-dideoxysugars. In this study, we cloned the FD-594 biosynthetic gene cluster from the producer strain Streptomyces sp. TA-0256 to investigate its biosynthesis. The identified pnx gene cluster was 38143 bp, consisting of 40 open reading frames, including a minimal PKS gene, TDP-olivose biosynthetic genes, two glycosyltransferase genes, two methyltransferase genes and many oxygenase/reductase genes. Most of the enzymes encoded in the pnx cluster were reasonably assigned to a plausible biosynthetic pathway for FD-594, in which a unique ring opening process via Baeyer-Villiger-type oxidation, catalyzed by a putative flavin adenine dinucleotide (FAD)-dependent monooxygenase, is speculated to lead to the unique xanthene structure. To clarify the involvement of pnx genes in the FD-594 biosynthesis, a glycosyltransferase, PnxGT2, and a methyltransferase, PnxMT2, were characterized enzymatically with the recombinant proteins expressed in Escherichia coli. As a result, PnxGT2 catalyzed the triple olivose transfers to the FD-594 aglycon with TDP-olivose as the glycosyl donor to afford triolivoside. Surprisingly, in the PnxGT2 enzymatic reaction, tetraolivoside and pentaolivoside were significantly detected along with the expected triolivoside. To our knowledge, PnxGT2 is the first contiguous oligosaccharide-forming glycosyltransferase in secondary metabolism. Furthermore, addition of PnxMT2 and S-adenosyl-L-methionine into the PnxGT2 reaction mixture afforded natural FD-594 to confirm that the PnxGT2 reaction product was the expected regiospecifically glycosylated compound. Consequently, the identified pnx gene cluster appears to be involved in FD-594 biosynthesis.

  9. Gene copy number gain of EGFR is a poor prognostic biomarker in gastric cancer: evaluation of 855 patients with bright-field dual in situ hybridization (DISH) method.

    Science.gov (United States)

    Higaki, Eiji; Kuwata, Takeshi; Nagatsuma, Akiko Kawano; Nishida, Yasunori; Kinoshita, Takahiro; Aizawa, Masaki; Nitta, Hiroaki; Nagino, Masato; Ochiai, Atsushi

    2016-01-01

    EGFR overexpression is a prognostic biomarker and is expected to be a predictive biomarker for anti-EGFR therapies in gastric cancer. However, few studies have reported the clinical impact of EGFR gene copy number (GCN) and its correlation with EGFR overexpression. We used dual in situ hybridization (DISH) to detect EGFR GCN and chromosome 7 centromere (CEN7) in a set of tissue microarrays representing 855 patients with gastric cancer. These data were compared with those of immunohistochemical (IHC) analysis of EGFR expression to evaluate prognostic value. EGFR GCN gain (≥ 2.5 EGFR signals per cell) was detected in 194 patients (22.7%) and indicated poor prognosis. Among these 194 patients, EGFR amplification (EGFR/CEN7 ≥ 2.0) was observed in 29 patients (14.9%), a group almost identical to the IHC 3+ subgroup and the worst prognostic subgroup. Patients with EGFR GCN gain but not amplification, including those exhibiting polysomy, also exhibited poorer prognosis than GCN non-gain patients and were distributed between IHC 0/1+ and 2+ subgroups. GCN gain was frequently observed in patients with more advanced disease, but served as an independent prognostic factor regardless of the pathological stage. EGFR GCN gain is a more accurate prognostic biomarker than EGFR overexpression in patients with gastric cancer.
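
    The two cutoffs quoted in the abstract, and the percentages derived from them, can be checked with a short sketch. The function names and category labels are hypothetical, and treating amplification as a subset of GCN gain follows the way the abstract reports its counts; how the study handled a high ratio with fewer than 2.5 signals per cell is not stated, so the sketch simply labels that case as no gain.

```python
GCN_GAIN_MIN_SIGNALS = 2.5      # EGFR signals per cell, from the abstract
AMPLIFICATION_MIN_RATIO = 2.0   # EGFR/CEN7 ratio, from the abstract

def classify_egfr(egfr_signals_per_cell, cen7_signals_per_cell):
    """Assign one of three illustrative categories from DISH signal counts."""
    if egfr_signals_per_cell < GCN_GAIN_MIN_SIGNALS:
        return "no GCN gain"  # assumption: the ratio is not consulted here
    if (cen7_signals_per_cell > 0
            and egfr_signals_per_cell / cen7_signals_per_cell >= AMPLIFICATION_MIN_RATIO):
        return "amplification"
    return "GCN gain without amplification"

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

gain_rate = percent(194, 855)            # share of all patients with GCN gain
amplified_rate = percent(29, 194)        # share of GCN-gain patients amplified
```

    Under these assumptions, a tumour with 3.0 EGFR and 3.0 CEN7 signals per cell (a polysomy-like pattern) falls into the gain-without-amplification group.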

  10. Sequencing, physical organization and kinetic expression of the patulin biosynthetic gene cluster from Penicillium expansum.

    Science.gov (United States)

    Tannous, Joanna; El Khoury, Rhoda; Snini, Selma P; Lippi, Yannick; El Khoury, André; Atoui, Ali; Lteif, Roger; Oswald, Isabelle P; Puel, Olivier

    2014-10-17

    Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well-characterized, but its genetic bases remain largely unknown, with only a few genes characterized in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60-70% identity with orthologous genes grouped differently, within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The expression kinetics of the patulin cluster genes were studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products.

  11. Organization of the human keratin type II gene cluster at 12q13

    Energy Technology Data Exchange (ETDEWEB)

    Yoon, S.J.; LeBlanc-Straceski, J.; Krauter, K. [Albert Einstein College of Medicine, Bronx, NY (United States)] [and others

    1994-12-01

    Keratin proteins constitute intermediate filaments and are the major differentiation products of mammalian epithelial cells. The epithelial keratins are classified into two groups, type I and type II, and one member of each group is expressed in a given epithelial cell differentiation stage. Mutations in type I and type II keratin genes have now been implicated in three different human genetic disorders, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, and epidermolytic palmoplantar keratoderma. Members of the type I keratins are mapped to human chromosome 17, and the type II keratin genes are mapped to chromosome 12. To understand the organization of the type II keratin genes on chromosome 12, we isolated several yeast artificial chromosomes carrying these keratin genes and examined them in detail. We show that eight already known type II keratin genes are located in a cluster at 12q13, and their relative organization reflects their evolutionary relationship. We also determined that a type I keratin gene, KRT8, is located next to its partner, KRT18, in this cluster. Careful examination of the cluster also revealed that there may be a number of additional keratin genes at this locus that have not been described previously. 41 refs., 3 figs., 1 tab.

  12. Comprehensive annotation of secondary metabolite biosynthetic genes and gene clusters of Aspergillus nidulans, A. fumigatus, A. niger and A. oryzae

    Science.gov (United States)

    2013-01-01

    Background: Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results: We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions: This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571

  13. A genome-wide analysis of nonribosomal peptide synthetase gene clusters and their peptides in a Planktothrix rubescens strain

    Directory of Open Access Journals (Sweden)

    Nederbragt Alexander J

    2009-08-01

    Background: Cyanobacteria often produce several different oligopeptides, with unknown biological functions, by nonribosomal peptide synthetases (NRPS). Although some cyanobacterial NRPS gene cluster types are well described, the entire NRPS genomic content within a single cyanobacterial strain has never been investigated. Here we have combined a genome-wide analysis using massive parallel pyrosequencing ("454") and mass spectrometry screening of oligopeptides produced in the strain Planktothrix rubescens NIVA CYA 98 in order to identify all putative gene clusters for oligopeptides. Results: Thirteen types of oligopeptides were uncovered by mass spectrometry (MS) analyses. Microcystin, cyanopeptolin and aeruginosin synthetases, highly similar to already characterized NRPS, were present in the genome. Two novel NRPS gene clusters were associated with production of anabaenopeptins and microginins, respectively. Sequence-depth of the genome and real-time PCR data revealed three copies of the microginin gene cluster. Since NRPS gene cluster candidates for microviridin and oscillatorin synthesis could not be found, putative (gene encoded) precursor peptide sequences to microviridin and oscillatorin were found in the genes mdnA and oscA, respectively. The genes flanking the microviridin and oscillatorin precursor genes encode putative modifying enzymes of the precursor oligopeptides. We therefore propose ribosomal pathways involving modifications and cyclisation for microviridin and oscillatorin. The microviridin, anabaenopeptin and cyanopeptolin gene clusters are situated in close proximity to each other, constituting an oligopeptide island. Conclusion: Altogether seven nonribosomal peptide synthetase (NRPS) gene clusters and two gene clusters putatively encoding ribosomal oligopeptide biosynthetic pathways were revealed. Our results demonstrate that whole genome shotgun sequencing combined with MS-directed determination of oligopeptides successfully

  14. SC2ATmd: a tool for integration of the figure of merit with cluster analysis for gene expression data

    Science.gov (United States)

    Olex, Amy L.; Fetrow, Jacquelyn S.

    2011-01-01

    Summary: Standard and Consensus Clustering Analysis Tool for Microarray Data (SC2ATmd) is a MATLAB-implemented application specifically designed for the exploration of microarray gene expression data via clustering. Implementation of two versions of the clustering validation method figure of merit allows for performance comparisons between different clustering algorithms, and tailors the cluster analysis process to the varying characteristics of each dataset. Along with standard clustering algorithms this application also offers a consensus clustering method that can generate reproducible clusters across replicate experiments or different clustering algorithms. This application was designed specifically for the analysis of gene expression data, but may be used with any numerical data as long as it is in the right format. Availability: SC2ATmd may be freely downloaded from http://www.compbiosci.wfu.edu/tools.htm. Contact: olexal@wfu.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21372084
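
    The abstract does not define the figure of merit (FOM) it implements. As an illustration only, the sketch below follows the widely used leave-one-condition-out idea: cluster the genes on all conditions but one, then measure how tightly the held-out condition's values group within those clusters. The function name and inputs are hypothetical, and this is not a description of SC2ATmd's two FOM variants.

```python
import math
from collections import defaultdict

def figure_of_merit(left_out_values, labels):
    """Root mean squared deviation of a held-out condition within clusters.

    left_out_values: expression of each gene in the condition that was
                     excluded from clustering
    labels:          cluster label assigned to each gene using the other
                     conditions (same order as left_out_values)
    Lower values suggest the clustering predicts the held-out condition well.
    """
    groups = defaultdict(list)
    for value, label in zip(left_out_values, labels):
        groups[label].append(value)
    squared_error = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        squared_error += sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_error / len(left_out_values))
```

    Summing this quantity over every condition left out in turn would give an aggregate score that can be compared across clustering algorithms or cluster counts.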

  15. Intact cluster and chordate-like expression of ParaHox genes in a sea star.

    Science.gov (United States)

    Annunziata, Rossella; Martinez, Pedro; Arnone, Maria Ina

    2013-06-27

    The ParaHox genes are thought to be major players in patterning the gut of several bilaterian taxa. Though this is a fundamental role that these transcription factors play, their activities are not limited to the endoderm and extend to both ectodermal and mesodermal tissues. Three genes compose the ParaHox group: Gsx, Xlox and Cdx. In some taxa (mostly chordates but to some degree also in protostomes) the three genes are arranged into a genomic cluster, in a similar fashion to what has been shown for the better-known Hox genes. Sea urchins possess the full complement of ParaHox genes but they are all dispersed throughout the genome, an arrangement that, perhaps, represented the primitive condition for all echinoderms. In order to understand the evolutionary history of this group of genes we cloned and characterized all ParaHox genes, studied their expression patterns and identified their genomic loci in a member of an earlier branching group of echinoderms, the asteroid Patiria miniata. We identified the three ParaHox orthologs in the genome of P. miniata. While one of them, PmGsx, is provided as maternal message, with no zygotic activation afterwards, the other two, PmLox and PmCdx, are expressed during embryogenesis, within restricted domains of both endoderm and ectoderm. Screening of a Patiria bacterial artificial chromosome (BAC) library led to the identification of a clone containing the three genes. The transcriptional directions of PmGsx and PmLox are opposed to that of the PmCdx gene within the cluster. The identification of P. miniata ParaHox genes has revealed the fact that these genes are clustered in the genome, in contrast to what has been reported for echinoids. Since the presence of an intact cluster, or at least a partial cluster, has been reported in chordates and polychaetes respectively, it becomes clear that within echinoderms, sea urchins have modified the original bilaterian arrangement. Moreover, the sea star ParaHox domains of expression show

  16. Mapping of the α4 subunit gene (GABRA4) to human chromosome 4 defines an α2-α4-β1-γ1 gene cluster: Further evidence that modern GABA-A receptor gene clusters are derived from an ancestral cluster

    Energy Technology Data Exchange (ETDEWEB)

    McLean, P.J.; Farb, D.H.; Russek, S.J. [Boston Univ. School of Medicine, MA (United States)] [and others

    1995-04-10

    We demonstrated previously that an α1-β2-γ2 gene cluster of the γ-aminobutyric acid (GABA-A) receptor is located on human chromosome 5q34-q35 and that an ancestral α-β-γ gene cluster probably spawned clusters on chromosomes 4, 5, and 15. Here, we report that the α4 gene (GABRA4) maps to human chromosome 4p14-q12, defining a cluster comprising the α2, α4, β1, and γ1 genes. The existence of an α2-α4-β1-γ1 cluster on chromosome 4 and an α1-α6-β2-γ2 cluster on chromosome 5 provides further evidence that the number of ancestral GABA-A receptor subunit genes has been expanded by duplication within an ancestral gene cluster. Moreover, if duplication of the α gene occurred before duplication of the ancestral gene cluster, then a heretofore undiscovered subtype of α subunit should be located on human chromosome 15q11-q13 within an α5-αx-β3-γ3 gene cluster at the locus for Angelman and Prader-Willi syndromes. 34 refs., 6 figs., 1 tab.

  17. antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification.

    Science.gov (United States)

    Blin, Kai; Wolf, Thomas; Chevrette, Marc G; Lu, Xiaowen; Schwalen, Christopher J; Kautsar, Satria A; Suarez Duran, Hernando G; de Los Santos, Emmanuel L C; Kim, Hyun Uk; Nave, Mariana; Dickschat, Jeroen S; Mitchell, Douglas A; Shelest, Ekaterina; Breitling, Rainer; Takano, Eriko; Lee, Sang Yup; Weber, Tilmann; Medema, Marnix H

    2017-04-28

    Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell' (antiSMASH) has assisted researchers in efficiently performing this, both as a web server and a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features, including prediction of gene cluster boundaries using the ClusterFinder method or the newly integrated CASSIS algorithm, improved substrate specificity prediction for non-ribosomal peptide synthetase adenylation domains based on the new SANDPUMA algorithm, improved predictions for terpene and ribosomally synthesized and post-translationally modified peptides cluster products, reporting of sequence similarity to proteins encoded in experimentally characterized gene clusters on a per-protein basis and a domain-level alignment tool for comparative analysis of trans-AT polyketide synthase assembly line architectures. Additionally, several usability features have been updated and improved. Together, these improvements make antiSMASH up-to-date with the latest developments in natural product research and will further facilitate computational genome mining for the discovery of novel bioactive molecules. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Calcitonin gene-related peptide antagonism and cluster headache: an emerging new treatment.

    Science.gov (United States)

    Ashina, Håkan; Newman, Lawrence; Ashina, Sait

    2017-08-30

    Calcitonin gene-related peptide (CGRP) is a key signaling molecule involved in migraine pathophysiology. Efficacy of CGRP monoclonal antibodies and antagonists in migraine treatment has fueled an increasing interest in the prospect of treating cluster headache (CH) with CGRP antagonism. The exact role of CGRP and its mechanism of action in CH have not been fully clarified. A search for original studies and randomized controlled trials (RCTs) published in English was performed in PubMed and in ClinicalTrials.gov. The search terms used were "cluster headache and calcitonin gene related peptide" and "primary headaches and calcitonin gene related peptide." Reference lists of identified articles were also searched for additional relevant papers. Human experimental studies have reported elevated plasma CGRP levels during both spontaneous and glyceryl trinitrate-induced cluster attacks. CGRP may play an important role in cluster headache pathophysiology. More refined human studies are warranted with regard to assay validation and using larger sample sizes. The results from RCTs may reveal the therapeutic potential of CGRP monoclonal antibodies and antagonists for cluster headache treatment.

  19. ERRγ target genes are poor prognostic factors in Tamoxifen-treated breast cancer.

    Science.gov (United States)

    Madhavan, Subha; Gusev, Yuriy; Singh, Salendra; Riggins, Rebecca B

    2015-05-15

    One-third of estrogen (ER+) and/or progesterone receptor-positive (PGR+) breast tumors treated with Tamoxifen (TAM) do not respond to initial treatment, and the remaining 70% are at risk to relapse in the future. Estrogen-related receptor gamma (ESRRG, ERRγ) is an orphan nuclear receptor with broad, structural similarities to classical ER that is widely implicated in the transcriptional regulation of energy homeostasis. We have previously demonstrated that ERRγ induces resistance to TAM in ER+ breast cancer models, and that the receptor's transcriptional activity is modified by activation of the ERK/MAPK pathway. We hypothesize that hyper-activation or over-expression of ERRγ induces a pro-survival transcriptional program that impairs the ability of TAM to inhibit the growth of ER+ breast cancer. The goal of the present study is to determine whether ERRγ target genes are associated with reduced distant metastasis-free survival (DMFS) in ER+ breast cancer treated with TAM. Raw gene expression data was obtained from 3 publicly available breast cancer clinical studies of women with ER+ breast cancer who received TAM as their sole endocrine therapy. ERRγ target genes were selected from 2 studies that published validated chromatin immunoprecipitation (ChIP) analyses of ERRγ promoter occupancy. Kaplan-Meier estimation was used to determine the association of ERRγ target genes with DMFS, and selected genes were validated in ER+, MCF7 breast cancer cells that express exogenous ERRγ. Thirty-seven validated receptor target genes were statistically significantly altered in women who experienced a DM within 5 years, and could classify several independent studies into poor vs. good DMFS. Two genes (EEF1A2 and PPIF) could similarly separate ER+, TAM-treated breast tumors by DMFS, and their protein levels were measured in an ER+ breast cancer cell line model with exogenous ERRγ. Finally, expression of ERRγ and these two target genes are elevated in models of ER+ breast

  20. Characterization of a Major Cluster of nif, fix, and Associated Genes in a Sugarcane Endophyte, Acetobacter diazotrophicus

    Science.gov (United States)

    Lee, Sunhee; Reth, Alexander; Meletzus, Dietmar; Sevilla, Myrna; Kennedy, Christina

    2000-01-01

    A major 30.5-kb cluster of nif and associated genes of Acetobacter diazotrophicus (syn. Gluconacetobacter diazotrophicus), a nitrogen-fixing endophyte of sugarcane, was sequenced and analyzed. This cluster represents the largest assembly of contiguous nif-fix and associated genes so far characterized in any diazotrophic bacterial species. Northern blots and promoter sequence analysis indicated that the genes are organized into eight transcriptional units. The overall arrangement of genes is most like that of the nif-fix cluster in Azospirillum brasilense, while the individual gene products are more similar to those in species of Rhizobiaceae or in Rhodobacter capsulatus. PMID:11092875

  1. Gene-expression Classifier in Papillary Thyroid Carcinoma: Validation and Application of a Classifier for Prognostication

    DEFF Research Database (Denmark)

    Londero, Stefano Christian; Jespersen, Marie Louise; Krogdahl, Annelise

    2016-01-01

    BACKGROUND: No reliable biomarker for metastatic potential in the risk stratification of papillary thyroid carcinoma exists. We aimed to develop a gene-expression classifier for metastatic potential. MATERIALS AND METHODS: Genome-wide expression analyses were used. Development cohort: freshly...... frozen tissue from 38 patients was collected between the years 1986 and 2009. Validation cohort: formalin-fixed paraffin-embedded tissues were collected from 183 consecutively treated patients. RESULTS: A 17-gene classifier was identified based on the expression values in patients with and without...... metastasis in the development cohort. The 17-gene classifier for regional/distant metastasis identified was tested against the clinical status in the validation cohort. Sensitivity for detection of metastases was 51.5% and specificity 61.6%. Log-rank testing failed to identify any significance (p=0...

  2. The Serratia gene cluster encoding biosynthesis of the red antibiotic, prodigiosin, shows species- and strain-dependent genome context variation

    DEFF Research Database (Denmark)

    Harris, Abigail K P; Williamson, Neil R; Slater, Holly

    2004-01-01

    The prodigiosin biosynthesis gene cluster (pig cluster) from two strains of Serratia (S. marcescens ATCC 274 and Serratia sp. ATCC 39006) has been cloned, sequenced and expressed in heterologous hosts. Sequence analysis of the respective pig clusters revealed 14 ORFs in S. marcescens ATCC 274 and...

  3. Identification of the Viridicatumtoxin and Griseofulvin Gene Clusters from Penicillium aethiopicum

    Science.gov (United States)

    Chooi, Yit-Heng; Cacho, Ralph; Tang, Yi

    2010-01-01

    SUMMARY Penicillium aethiopicum produces two structurally interesting and biologically active polyketides: the tetracycline-like viridicatumtoxin 1 and the classic antifungal agent griseofulvin 2. Here, we report the concurrent discovery of the two corresponding biosynthetic gene clusters (vrt and gsf) by 454 shotgun sequencing. Gene deletions confirmed that two nonreducing PKSs (NRPKS), vrtA and gsfA, are required for the biosynthesis of 1 and 2, respectively. Both PKSs share similar domain architectures and lack a C-terminal thioesterase domain. We identified gsfI as the chlorinase involved in the biosynthesis of 2, as deletion of gsfI resulted in the accumulation of dechlorogriseofulvin 3. Comparative analysis with the P. chrysogenum genome revealed that both clusters are embedded within conserved syntenic regions of P. aethiopicum chromosomes. Discovery of the vrt and gsf clusters provided the basis for genetic and biochemical studies of the pathways. PMID:20534346

  4. Prevalence and prognostic role of mismatch repair gene defect in endometrial cancer patients.

    Science.gov (United States)

    Tangjitgamol, Siriwan; Kittisiam, Thannaporn; Tanvanich, Sujitra

    2017-09-01

    This study aimed to evaluate the prevalence of mismatch repair gene defects among Thai patients with endometrial cancer (EMC) and their association with clinico-pathological features and survival. Formalin-fixed paraffin-embedded blocks of EMC tissue from hysterectomy specimens of patients who had surgery in our institution between 1 January 1995 and 31 December 2016 were assessed for the immunohistochemical expression of 4 mismatch repair proteins (MLH1, PMS2, MSH2, MSH6). Mismatch repair gene defect was determined by a negative expression of at least 1 protein. Among 385 EMC patients included in the study, mean age was 57.3 ± 10.8 years with 62.3% aged ⩽ 60 years. The most frequent mismatch repair gene defect was MSH6 (38.7%), followed by PMS2 (34.3%), MLH1 (33.2%), and MSH2 (16.4%). Overall, 55.1% showed negative expression of at least one protein. We found significantly higher rates of mismatch repair gene defect in patients aged ⩽ 60 years, with early stage disease, and with negative lymph node status than in the respective comparison groups: 59.2% vs 48.3% for age (p = 0.037), 58.2% vs 45.2% (p = 0.027) for stage, and 58.1% vs 44.6% (p = 0.048) for nodal status. The 5-year progression-free survival, overall survival, and endometrial cancer-specific survival of patients with mismatch repair gene defect were higher than those of patients without gene defect. The differences were statistically significant for only progression-free survival and endometrial cancer-specific survival: 87.7% (95% confidence interval = 83.0%-92.4%) vs 81.5% (95% confidence interval = 75.4%-87.6%) (p = 0.049) for progression-free survival and 91.0% (95% confidence interval = 86.9%-95.1%) vs 85.5% (95% confidence interval = 80.0%-91.0%) (p = 0.044) for endometrial cancer-specific survival, respectively. In conclusion, more than half of Thai endometrial cancer patients had mismatch repair gene defect. The patients with mismatch repair gene defect had significantly younger age (⩽ 60 years) and better prognosis in terms of

  5. Gene cluster responsible for validamycin biosynthesis in Streptomyces hygroscopicus subsp. jinggangensis 5008.

    Science.gov (United States)

    Yu, Yi; Bai, Linquan; Minagawa, Kazuyuki; Jian, Xiaohong; Li, Lei; Li, Jialiang; Chen, Shuangya; Cao, Erhu; Mahmud, Taifo; Floss, Heinz G; Zhou, Xiufen; Deng, Zixin

    2005-09-01

    A gene cluster responsible for the biosynthesis of validamycin, an aminocyclitol antibiotic widely used as a control agent for sheath blight disease of rice plants, was identified from Streptomyces hygroscopicus subsp. jinggangensis 5008 using the heterologous probe acbC, a gene from the acarbose biosynthetic gene cluster of Actinoplanes sp. strain SE50/110 that is involved in the cyclization of D-sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone. Deletion of a 30-kb DNA fragment from this cluster in the chromosome resulted in loss of validamycin production, confirming a direct involvement of the gene cluster in the biosynthesis of this important plant protectant. A sequenced 6-kb fragment contained valA (an acbC homologue encoding a putative cyclase) as well as two additional complete open reading frames (valB and valC, encoding a putative adenyltransferase and a kinase, respectively), which are organized as an operon. The function of ValA was genetically demonstrated to be essential for validamycin production and biochemically shown to be responsible specifically for the cyclization of D-sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone in vitro using the ValA protein heterologously overexpressed in E. coli. The information obtained should pave the way for further detailed analysis of the complete biosynthetic pathway, which would lead to a complete understanding of validamycin biosynthesis.

  6. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters

    NARCIS (Netherlands)

    Cimermancic, P.; Medema, Marnix; Claesen, J.; Kurika, K.; Wieland Brown, L.C.; Mavrommatis, K.; Pati, A.; Godfrey, P.A.; Koehrsen, M.; Clardy, J.; Birren, B. W.; Takano, Eriko; Sali, A.; Linington, R.G.; Fischbach, M.A.

    2014-01-01

    Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the

  7. Evolutionary history of the phl gene cluster in the plant-associated bacterium Pseudomonas fluorescens

    NARCIS (Netherlands)

    Moynihan, J.A.; Morrissey, J.P.; Coppoolse, E.; Stiekema, W.J.; O'Gara, F.; Boyd, E.F.

    2009-01-01

    Pseudomonas fluorescens is of agricultural and economic importance as a biological control agent largely because of its plant-association and production of secondary metabolites, in particular 2, 4-diacetylphloroglucinol (2, 4-DAPG). This polyketide, which is encoded by the eight gene phl cluster,

  8. Diversity and depth-specific distribution of SAR11 cluster rRNA genes from marine planktonic bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Field, K.G.; Gordon, D.; Wright, T. [Oregon State Univ., Corvallis, OR (United States)] [and others

    1997-01-01

    Small-subunit (SSU) ribosomal DNA (rDNA) gene clusters are phylogenetically related sets of SSU rRNA genes, commonly encountered in genes amplified from natural populations. Genetic variability in gene clusters could result from artifacts (polymerase error or PCR chimera formation), microevolution (variation among rrn copies within strains), or macroevolution (genetic divergence correlated with long-term evolutionary divergence). To better understand gene clusters, this study assessed genetic diversity and distribution of a single environmental SSU rDNA gene cluster, the SAR11 cluster. SAR11 cluster genes, from an uncultured group of the α subclass of the class Proteobacteria, have been recovered from coastal and midoceanic waters of the North Atlantic and Pacific. We cloned and bidirectionally sequenced 23 new SAR11 cluster 16S rRNA genes, from 80 and 250 m in the Sargasso Sea and from surface coastal waters of the Atlantic and Pacific, and analyzed them with previously published sequences. Two SAR11 genes were obviously PCR chimeras, but the biological (nonchimeric) origins of most subgroups within the cluster were confirmed by independent recovery from separate gene libraries. Using group-specific oligonucleotide probes, we analyzed depth profiles of nucleic acids, targeting both amplified rDNAs and bulk RNAs. Two subgroups within the SAR11 cluster showed different highly depth-specific distributions. We conclude that some of the genetic diversity within the SAR11 gene cluster represents macroevolutionary divergence correlated with niche specialization. Furthermore, we demonstrate the utility for marine microbial ecology of oligonucleotide probes based on gene sequences amplified from natural populations and show that a detailed knowledge of sequence variability may be needed to effectively design these probes. 48 refs., 7 figs., 3 tabs.

  9. Supra-operonic clusters of functionally related genes (SOCs) are a source of horizontal gene co-transfers.

    Science.gov (United States)

    Pang, Tin Yau; Lercher, Martin J

    2017-01-09

    Adaptation of bacteria occurs predominantly via horizontal gene transfer (HGT). While it is widely recognized that horizontal acquisitions frequently encompass multiple genes, it is unclear what the size distribution of successfully transferred DNA segments looks like and what evolutionary forces shape this distribution. Here, we identified 1790 gene family pairs that were consistently co-gained on the same branches across a phylogeny of 53 E. coli strains. We estimated a lower limit of their genomic distances at the time they were transferred to their host genomes; this distribution shows a sharp upper bound at 30 kb. The same gene-pairs can have larger distances (up to 70 kb) in other genomes. These more distant pairs likely represent recent acquisitions via transduction that involve the co-transfer of excised prophage genes, as they are almost always associated with intervening phage-associated genes. The observed distribution of genomic distances of co-transferred genes is much broader than expected from a model based on the co-transfer of genes within operons; instead, this distribution is highly consistent with the size distribution of supra-operonic clusters (SOCs), groups of co-occurring and co-functioning genes that extend beyond operons. Thus, we propose that SOCs form a basic unit of horizontal gene transfer.

  10. Identification of Subtype-Specific Prognostic Genes for Early-Stage Lung Adenocarcinoma and Squamous Cell Carcinoma Patients Using an Embedded Feature Selection Algorithm.

    Directory of Open Access Journals (Sweden)

    Suyan Tian

    The existence of fundamental differences between lung adenocarcinoma (AC) and squamous cell carcinoma (SCC) in their underlying mechanisms motivated us to postulate that specific genes might exist that are relevant to the prognosis of each histology subtype. To test this research hypothesis, we previously proposed a simple Cox-regression model based feature selection algorithm and successfully identified some subtype-specific prognostic genes when applying this method to real-world data. In this article, we continue our effort on identification of subtype-specific prognostic genes for AC and SCC, and propose a novel embedded feature selection method by extending the Threshold Gradient Descent Regularization (TGDR) algorithm and minimizing a corresponding negative partial likelihood function. Using real-world datasets and simulated ones, we show that these two proposed methods have comparable performance, whereas the new proposal is superior in terms of model parsimony. Our analysis provides some evidence for the existence of such subtype-specific prognostic genes; more investigation is warranted.

  11. Expression of multi-drug resistance-related genes MDR3 and MRP as prognostic factors in clinical liver cancer patients.

    Science.gov (United States)

    Yu, Zheng; Peng, Sun; Hong-Ming, Pan; Kai-Feng, Wang

    2012-01-01

    To investigate the expression of multi-drug resistance-related genes, MDR3 and MRP, in clinical specimens of primary liver cancer and their potential as prognostic factors in liver cancer patients. A total of 26 patients with primary liver cancer were enrolled. The expression of MDR3 and MRP genes was measured by real-time PCR and the association between gene expression and the prognosis of patients was analyzed by the Kaplan-Meier method and Cox regression model. This study showed that increases in MDR3 gene expression were identified in cholangiocellular carcinoma, cirrhosis and HBsAg-positive patients, while MRP expression increased in hepatocellular carcinoma, non-cirrhosis and HBsAg-negative patients. Moreover, conjugated bilirubin and total bile acid in the serum were significantly reduced in patients with high MRP expression compared to patients with low expression. The overall survival tended to be longer in patients with high MDR3 and MRP expression compared to the control group. MRP might be an independent prognostic factor in patients with liver cancer by Cox regression analysis. MDR3 and MRP may play important roles in liver cancer patients as prognostic factors and their underlying mechanisms in liver cancer are worthy of further investigation.
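Several records in this listing rely on the Kaplan-Meier method for survival analysis. As a rough illustration of what that estimator computes, here is a minimal product-limit sketch in plain Python. The function name and the toy follow-up data are assumptions made for this example and do not come from any of the studies, which used established statistical packages.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove deaths and censored cases
    return curve


# Toy data for illustration only: five patients, two censored.
km_curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
for t, s in km_curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients contribute to the risk set until they leave it but never trigger a drop in the curve, which is why the estimate only changes at observed event times.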

  12. Distinct patterns of novel gene mutations in poor-prognostic stereotyped subsets of chronic lymphocytic leukemia

    DEFF Research Database (Denmark)

    Strefford, J C; Sutton, L-A; Baliakas, P

    2013-01-01

    Recent studies have revealed recurrent mutations of the NOTCH1, SF3B1 and BIRC3 genes in chronic lymphocytic leukemia (CLL), especially among aggressive, chemorefractory cases. Nevertheless, it is currently unknown whether their presence may differ in subsets of patients carrying stereotyped B...

  13. Identification, characterization and metagenome analysis of oocyte-specific genes organized in clusters in the mouse genome

    Directory of Open Access Journals (Sweden)

    Vaiman Daniel

    2005-05-01

    Background: Genes specifically expressed in the oocyte play key roles in oogenesis, ovarian folliculogenesis, fertilization and/or early embryonic development. In an attempt to identify novel oocyte-specific genes in the mouse, we have used an in silico subtraction methodology, and we have focused our attention on genes that are organized in genomic clusters. Results: In the present work, five clusters have been studied: a cluster of thirteen genes characterized by an F-box domain localized on chromosome 9, a cluster of six genes related to T-cell leukaemia/lymphoma protein 1 (Tcl1) on chromosome 12, a cluster composed of a SPErm-associated glutamate (E)-Rich (Speer) protein expressed in the oocyte in the vicinity of four unknown genes specifically expressed in the testis on chromosome 14, a cluster composed of the oocyte secreted protein-1 (Oosp-1) gene and two Oosp-related genes on chromosome 19, all three being characterized by a partial N-terminal zona pellucida-like domain, and another small cluster of two genes on chromosome 19 as well, composed of a TWIK-Related spinal cord K+ channel encoding-gene, and an unknown gene predicted in silico to be testis-specific. The specificity of expression was confirmed by RT-PCR and in situ hybridization for eight and five of them, respectively. Finally, we showed by comparing all of the isolated and clustered oocyte-specific genes identified so far in the mouse genome, that the oocyte-specific clusters are significantly closer to telomeres than isolated oocyte-specific genes are. Conclusion: We have studied five clusters of genes specifically expressed in female germ cells, some of which are also expressed in male germ cells. Moreover, in contrast to non-clustered oocyte-specific genes, those that are organized in clusters tend to map near chromosome ends, suggesting that this specific near-telomere position of oocyte-clusters in rodents could constitute an evolutionary advantage.
Understanding the biological

  14. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages

    Science.gov (United States)

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...

  15. Evolutionary dynamics of rRNA gene clusters in cichlid fish

    Directory of Open Access Journals (Sweden)

    Nakajima Rafael T

    2012-10-01

    Background: Among multigene families, ribosomal RNA (rRNA) genes are the most frequently studied and have been explored as cytogenetic markers to study the evolutionary history of karyotypes among animals and plants. In this report, we applied cytogenetic and genomic methods to investigate the organization of rRNA genes among cichlid fishes. Cichlids are a group of fishes that are of increasing scientific interest due to their rapid and convergent adaptive radiation, which has led to extensive ecological diversity. Results: The present paper reports the cytogenetic mapping of the 5S rRNA genes from 18 South American, 22 African and one Asian species and the 18S rRNA genes from 3 African species. The data obtained were comparatively analyzed with previously published information related to the mapping of rRNA genes in cichlids. The number of 5S rRNA clusters per diploid genome ranged from 2 to 15, with the most common pattern being the presence of 2 chromosomes bearing a 5S rDNA cluster. Regarding 18S rDNA mapping, the number of sites ranged from 2 to 6, with the most common pattern being the presence of 2 sites per diploid genome. Furthermore, searching the Oreochromis niloticus genome database led to the identification of a total of 59 copies of 5S rRNA and 38 copies of 18S rRNA genes that were distributed in several genomic scaffolds. The rRNA genes were frequently flanked by transposable elements (TEs) and spread throughout the genome, complementing the FISH analysis, which detects only clustered copies of rRNA genes. Conclusions: The organization of rRNA gene clusters seems to reflect their intense and particular evolutionary pathway and not the evolutionary history of the associated taxa. The possible role of TEs as one source of rRNA gene movement, which could generate the spreading of ribosomal clusters/copies, is discussed.
The present paper reinforces the notion that the integration of cytogenetic data and genomic analysis provides a

  16. Gene clustering analysis in human osteoporosis disease and modifications of the jawbone.

    Science.gov (United States)

    Toti, Paolo; Sbordone, Carolina; Martuscelli, Ranieri; Califano, Luigi; Ramaglia, Luca; Sbordone, Ludovico

    2013-08-01

    This study analyzed the genes involved in both osteoporosis and modifications of the jawbone by text mining gene/protein interaction information with a web search tool. The final set of genes involved in the present phenomenon was obtained by an expansion-filtering loop. Using web-available software (STRING), interactions among all genes were searched for, and a clustering procedure was performed in which only high-confidence predicted associations were considered. Two hundred forty-two genes potentially involved in osteoporosis and in modifications of the jawbone were recorded. Seven "leader genes" were identified (CTNNB1, IL1B, IL6, JUN, RUNX2, SPP1, TGFB1), while another 10 genes formed the cluster B group (BMP2, BMP7, COL1A1, ICAM1, IGF1, IL10, MMP9, NFKB1, TNFSF11, VEGFA). Ninety-eight genes had no interactions, and were defined as "orphan genes". The expansion of knowledge regarding the molecular basis causing osteoporotic traits has been brought about with the help of a de novo identification, based on the data mining of genes involved in osteoporosis and in modification of the jawbone. A comparison of the present data, in which no role was verified for 98 genes that had been previously supposed to have a role, with that of the literature, in which another 81 genes, as obtained from GWAS reviews and meta-analyses, appeared to be strongly associated with osteoporosis, probably attests to a lack of information on osteoporotic disease. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Identification and analysis of the paulomycin biosynthetic gene cluster and titer improvement of the paulomycins in Streptomyces paulus NRRL 8115.

    Directory of Open Access Journals (Sweden)

    Jine Li

    The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations.

  18. cluster

    Indian Academy of Sciences (India)

    electron transfer chains involved in a number of biological systems including respiration and photosynthesis.1. The most common iron–sulphur clusters found as active centres in iron–sulphur proteins are [Fe2S2], [Fe3S4] and [Fe4S4], in which Fe(III) ions are coordinated to cysteines from the peptide and are linked to each ...

  19. Genomics-based Approach and Prognostic Stratification Significance of Gene Mutations in Intermediate-risk Acute Myeloid Leukemia

    Directory of Open Access Journals (Sweden)

    Bian-Hong Wang

    2015-01-01

    Conclusions: NGS represents a pioneering and helpful approach to prognostic risk stratification of IR-AML patients. Further large-scale studies for comprehensive molecular analysis are needed to provide guidance and a theoretical basis for IR-AML prognostic stratification and clinical management.

  20. Soybean bacterial artificial chromosome contigs anchored with RFLPs: insights into genome duplication and gene clustering.

    Science.gov (United States)

    Mudge, Joann; Huihuang, Yan; Denny, Roxanne L; Howe, Dana K; Danesh, Dariush; Marek, Laura F; Retzel, Ernie; Shoemaker, Randy C; Young, Nevin D

    2004-04-01

    Surveying the soybean genome with 683 bacterial artificial chromosome (BAC) contiguous groups (contigs) anchored by restriction fragment length polymorphisms (RFLPs) enabled us to explore microsyntenic relationships among duplicated regions and also to examine the physical organization of hypomethylated (and presumably gene-rich) genomic regions. Numerous cases where nonhomologous RFLPs hybridized to common BAC clones indicated that RFLPs were physically clustered in soybean, apparently in less than 25% of the genome. By extension, we speculate that most of the genes are clustered in less than 275 M of the soybean genome. Approximately 40%-45% of this gene-rich portion is associated with the RFLP-anchored contigs described in this study. Similarities in genome organization among BAC contigs from duplicate genomic regions were also examined. Homoeologous BAC contigs often exhibited extensive microsynteny. Furthermore, paralogs recovered from duplicate contigs shared 86%-100% sequence identity.

  1. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

    DEFF Research Database (Denmark)

    Ryge, J.; Winther, Ole; Wienecke, J.

    2010-01-01

    Background: Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results: Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time...

  2. Cloning and heterologous expression of the penicillin biosynthetic gene cluster from Penicillium chrysogenum.

    Science.gov (United States)

    Smith, D J; Burnham, M K; Edwards, J; Earl, A J; Turner, G

    1990-01-01

    A cosmid clone containing the putative penicillin biosynthetic gene cluster from Penicillium chrysogenum was used to transform the related filamentous fungi Neurospora crassa and Aspergillus niger, which do not produce beta-lactam antibiotics. Both of the transformed hosts contained intact P. chrysogenum DNA derived from the cosmid clone and produced authentic penicillin V. Assays of penicillin biosynthetic enzyme activity additionally demonstrated that they possessed delta-(L-alpha-amino-adipyl)-L-cysteinyl-D-valine synthetase (ACVS), isopenicillin N synthetase (IPNS) and acyl coenzyme A:6-aminopenicillanic acid acyltransferase (ACT) activity. The data suggest that genes encoding all the enzymes necessary for the biosynthesis of penicillin from amino acid precursors are closely linked in P. chrysogenum and constitute a gene cluster.

  3. Establishment of the Inducible Tet-On System for the Activation of the Silent Trichosetin Gene Cluster in Fusarium fujikuroi

    Directory of Open Access Journals (Sweden)

    Slavica Janevska

    2017-04-01

    The PKS-NRPS-derived tetramic acid equisetin and its N-desmethyl derivative trichosetin exhibit remarkable biological activities against a variety of organisms, including plants and bacteria, e.g., Staphylococcus aureus. The equisetin biosynthetic gene cluster was first described in Fusarium heterosporum, a species distantly related to the notorious rice pathogen Fusarium fujikuroi. Here we present the activation and characterization of a homologous, but silent, gene cluster in F. fujikuroi. Bioinformatic analysis revealed that this cluster does not contain the equisetin N-methyltransferase gene eqxD and, consequently, trichosetin was isolated as the final product. The adaptation of the inducible, tetracycline-dependent Tet-on promoter system from Aspergillus niger achieved a controlled overproduction of this toxic metabolite and a functional characterization of each cluster gene in F. fujikuroi. Overexpression of one of the two cluster-specific transcription factor (TF) genes, TF22, led to an activation of the three biosynthetic cluster genes, including the PKS-NRPS key gene. In contrast, overexpression of TF23, encoding a second Zn(II)2Cys6 TF, did not activate adjacent cluster genes. Instead, TF23 was induced by the final product trichosetin and was required for expression of the transporter-encoding gene MFS-T. TF23 and MFS-T likely act in concert and contribute to detoxification of trichosetin and, therefore, self-protection of the producing fungus.

  4. Characterisation of the paralytic shellfish toxin biosynthesis gene clusters in Anabaena circinalis AWQC131C and Aphanizomenon sp. NH-5

    Directory of Open Access Journals (Sweden)

    Neilan Brett A

    2009-03-01

    Background: Saxitoxin and its analogues, collectively known as the paralytic shellfish toxins (PSTs), are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. Results: We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. Conclusion: The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seem to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event.
The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved

  5. Two Gene Clusters Coordinate Galactose and Lactose Metabolism in Streptococcus gordonii

    Science.gov (United States)

    Zeng, Lin; Martino, Nicole C.

    2012-01-01

    Streptococcus gordonii is an early colonizer of the human oral cavity and an abundant constituent of oral biofilms. Two tandemly arranged gene clusters, designated lac and gal, were identified in the S. gordonii DL1 genome, which encode genes of the tagatose pathway (lacABCD) and sugar phosphotransferase system (PTS) enzyme II permeases. Genes encoding a predicted phospho-β-galactosidase (LacG), a DeoR family transcriptional regulator (LacR), and a transcriptional antiterminator (LacT) were also present in the clusters. Growth and PTS assays supported that the permease designated EIILac transports lactose and galactose, whereas EIIGal transports galactose. The expression of the gene for EIIGal was markedly upregulated in cells growing on galactose. Using promoter-cat fusions, a role for LacR in the regulation of the expressions of both gene clusters was demonstrated, and the gal cluster was also shown to be sensitive to repression by CcpA. The deletion of lacT caused an inability to grow on lactose, apparently because of its role in the regulation of the expression of the genes for EIILac, but had little effect on galactose utilization. S. gordonii maintained a selective advantage over Streptococcus mutans in a mixed-species competition assay, associated with its possession of a high-affinity galactose PTS, although S. mutans could persist better at low pHs. Collectively, these results support the concept that the galactose and lactose systems of S. gordonii are subject to complex regulation and that a high-affinity galactose PTS may be advantageous when S. gordonii is competing against the caries pathogen S. mutans in oral biofilms. PMID:22660715

  6. Comprehensive immune transcriptomic analysis in bladder cancer reveals subtype specific immune gene expression patterns of prognostic relevance.

    Science.gov (United States)

    Ren, Runhan; Tyryshkin, Kathrin; Graham, Charles H; Koti, Madhuri; Siemens, D Robert

    2017-09-19

    Recent efforts on genome-wide profiling of muscle invasive bladder cancer (MIBC) have led to its classification into distinct genomic and transcriptomic molecular subtypes that exhibit variability in prognosis. Evolving evidence from recent immunotherapy trials has demonstrated the significance of pre-existing tumour immune profiles that could guide treatment decisions. To identify immune gene expression patterns associated with the molecular subtypes, we performed a comprehensive in silico immune transcriptomic profiling, utilizing transcriptomic data from 347 MIBC cases from The Cancer Genome Atlas (TCGA). To investigate subtype-associated immune gene expression patterns, we assembled 924 immune response genes and specifically those involved in T-cell cytotoxicity and the Type I/II interferon pathways. A set of 157 ranked genes was able to distinguish the four subtypes in an unsupervised analysis in an original training cohort (n=122) and an expanded validation cohort (n=225). The most common overrepresented pathways distinguishing the four molecular subtypes included JAK/STAT signaling, Toll-like receptor signaling, interleukin signaling, and T-cell activation. Some of the most enriched biological processes were responses to IFN-γ, antigen processing and presentation, cytokine mediated signaling, hemopoiesis, cell proliferation and cellular defense response in the TCGA cluster IV. Our novel findings provide further insights into the association between genomic subtypes and immune activation in MIBC and may open novel opportunities for their exploitation towards precise treatment with immunotherapy.

  7. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data

    Directory of Open Access Journals (Sweden)

    Scherer Stephen W

    2011-05-01

    Background: Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. Results: We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. Conclusions: The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  8. A Telomeric Cluster of Antimony Resistance Genes on Chromosome 34 of Leishmania infantum.

    Science.gov (United States)

    Tejera Nevado, Paloma; Bifeld, Eugenia; Höhn, Katharina; Clos, Joachim

    2016-09-01

    The mechanisms underlying the drug resistance of Leishmania spp. are manifold and not completely identified. Apart from the highly conserved multidrug resistance gene family known from higher eukaryotes, Leishmania spp. also possess genus-specific resistance marker genes. One of them, ARM58, was first identified in Leishmania braziliensis using a functional cloning approach, and its domain structure was characterized in L. infantum. Here we report that L. infantum ARM58 is part of a gene cluster at the telomeric end of chromosome 34 also comprising the neighboring genes ARM56 and HSP23. We show that overexpression of all three genes can confer antimony resistance to intracellular amastigotes. Upon overexpression in L. donovani, ARM58 and ARM56 are secreted via exosomes, suggesting a scavenger/secretion mechanism of action. Using a combination of functional cloning and next-generation sequencing, we found that the gene cluster was selected only under antimonyl tartrate challenge and weakly under Cu(2+) challenge but not under sodium arsenite, Cd(2+), or miltefosine challenge. The selective advantage is less pronounced in intracellular amastigotes treated with sodium stibogluconate, possibly due to the known macrophage-stimulatory activity of this drug, against which these resistance markers may not be active. Our data point to the specificity of these three genes for antimony resistance. Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  9. Novel linkage disequilibrium clustering algorithm identifies new lupus genes on meta-analysis of GWAS datasets.

    Science.gov (United States)

    Saeed, Mohammad

    2017-05-01

    Systemic lupus erythematosus (SLE) is a complex disorder. Genetic association studies of complex disorders suffer from the following three major issues: phenotypic heterogeneity, false positive (type I error), and false negative (type II error) results. Hence, genes with low to moderate effects are missed in standard analyses, especially after statistical corrections. OASIS is a novel linkage disequilibrium clustering algorithm that can potentially address false positives and negatives in genome-wide association studies (GWAS) of complex disorders such as SLE. OASIS was applied to two SLE dbGAP GWAS datasets (6077 subjects; ∼0.75 million single-nucleotide polymorphisms). OASIS identified three known SLE genes viz. IFIH1, TNIP1, and CD44, not previously reported using these GWAS datasets. In addition, 22 novel loci for SLE were identified and the 5 SLE genes previously reported using these datasets were verified. OASIS methodology was validated using single-variant replication and gene-based analysis with GATES. This led to the verification of 60% of OASIS loci. New SLE genes that OASIS identified and were further verified include TNFAIP6, DNAJB3, TTF1, GRIN2B, MON2, LATS2, SNX6, RBFOX1, NCOA3, and CHAF1B. This study presents the OASIS algorithm, software, and the meta-analyses of two publicly available SLE GWAS datasets along with the novel SLE genes. Hence, OASIS is a novel linkage disequilibrium clustering method that can be universally applied to existing GWAS datasets for the identification of new genes.

  10. An improved Pearson's correlation proximity-based hierarchical clustering for mining biological association between genes.

    Science.gov (United States)

    Booma, P M; Prabhakaran, S; Dhanalakshmi, R

    2014-01-01

    Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process has not been efficiently addressed. To monitor higher rates of expression levels between genes, a hierarchical clustering model is proposed in which the biological association between genes is measured simultaneously using a proximity measure based on an improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analysis, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.
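
    The general idea of correlation-based hierarchical clustering with average linkage can be sketched as follows. This is a minimal illustration only, not the PCPHC or GL-PCPHC method of the abstract: it assumes the plain Pearson correlation (not the "improved" variant, whose definition is not given here), uses 1 - r as the distance, and runs on made-up expression profiles. The names `pearson`, `average_linkage`, and `profiles` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters whose
    mean pairwise distance (1 - r) is smallest, until n_clusters remain."""
    names = list(profiles)
    d_table = {
        (a, b): 1.0 - pearson(profiles[a], profiles[b])
        for a in names for b in names if a != b
    }
    clusters = [[g] for g in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(d_table[(a, b)]
                            for a in clusters[i] for b in clusters[j])
                mean_d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or mean_d < best[0]:
                    best = (mean_d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy data: geneA/geneB rise together, geneC/geneD fall together.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 6.0, 8.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 4.0, 2.1],
}
clusters = average_linkage(profiles, 2)
```

    Using 1 - r as the distance groups genes by the shape of their profiles rather than by absolute expression level, which is why geneA and geneB end up together despite differing in magnitude.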

  11. An association study of established breast cancer reproductive and lifestyle risk factors with tumour subtype defined by the prognostic 70-gene expression signature (MammaPrint®).

    Science.gov (United States)

    Makama, M; Drukker, C A; Rutgers, E J Th; Slaets, L; Cardoso, F; Rookus, M A; Tryfonidis, K; Van't Veer, L J; Schmidt, M K

    2017-04-01

    Reproductive and lifestyle factors influence both breast cancer risk and prognosis; this might be through breast cancer subtype. Subtypes defined by immunohistochemical hormone receptor markers and gene expression signatures are used to predict prognosis of breast cancer patients based on their tumour biology. We investigated the association between established breast cancer risk factors and the 70-gene prognostication signature in breast cancer patients. Standardised questionnaires were used to obtain information on established risk factors of breast cancer from the Dutch patients of the MINDACT trial. Clinical-pathological and genomic information were obtained from the trial database. Logistic regression analyses were used to estimate the associations between lifestyle risk factors and tumour prognostic subtypes, measured by the 70-gene MammaPrint® signature (i.e. low-risk or high-risk tumours). Of the 1555 breast cancer patients included, 910 had low-risk and 645 had high-risk tumours. Current body mass index (BMI), age at menarche, age at first birth, age at menopause, hormonal contraceptive use and hormone replacement therapy use were not associated with MammaPrint®. In parous women, higher parity was associated with a lower risk (OR: 0.75, [95% confidence interval {CI}: 0.59-0.95] P = 0.018) and longer breastfeeding duration with a higher risk (OR: 1.03, [95% CI: 1.01-1.05] P = 0.005) of developing high-risk tumours; risk estimates were similar within oestrogen receptor-positive disease. After stratifying by menopausal status, the associations remained present in post-menopausal women. Using prognostic gene expression profiles, we have indications that specific reproductive factors may be associated with prognostic tumour subtypes beyond hormone receptor status. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. A novel method incorporating gene ontology information for unsupervised clustering and feature selection.

    Directory of Open Access Journals (Sweden)

    Shireesh Srivastava

    Among the primary goals of microarray analysis is the identification of genes that could distinguish between different phenotypes (feature selection). Previous studies indicate that incorporating prior information of the genes' function could help identify physiologically relevant features. However, current methods that incorporate prior functional information do not provide a relative estimate of the effect of different genes on the biological processes of interest. Here, we present a method that integrates gene ontology (GO) information and expression data using Bayesian regression mixture models to perform unsupervised clustering of the samples and identify physiologically relevant discriminating features. As a model application, the method was applied to identify the genes that play a role in the cytotoxic responses of human hepatoblastoma cell line (HepG2) to saturated fatty acid (SFA) and tumor necrosis factor (TNF)-alpha, as compared to the non-toxic response to the unsaturated FFAs (UFA) and TNF-alpha. Incorporation of prior knowledge led to a better discrimination of the toxic phenotypes from the others. The model identified roles of lysosomal ATPases and adenylate cyclase (AC9) in the toxicity of palmitate. To validate the role of AC in palmitate-treated cells, we measured the intracellular levels of cyclic AMP (cAMP). The cAMP levels were found to be significantly reduced by palmitate treatment and not by the other FFAs, in accordance with the model selection of AC9. A framework is presented that incorporates prior ontology information, which helped to (a) perform unsupervised clustering of the phenotypes, and (b) identify the genes relevant to each cluster of phenotypes. We demonstrate the proposed framework by applying it to identify physiologically-relevant feature genes that conferred differential toxicity to saturated vs. unsaturated FFAs. The framework can be applied to other problems to efficiently integrate ontology information and

  13. Non-ribosomal peptide synthetases: Identifying the cryptic gene clusters and decoding the natural product.

    Science.gov (United States)

    Singh, Mangal; Chaudhary, Sandeep; Sareen, Dipti

    2017-03-01

    Non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) present in bacteria and fungi are the major multi-modular enzyme complexes which synthesize secondary metabolites like the pharmacologically important antibiotics and siderophores. Each of the multiple modules of an NRPS activates a different amino or aryl acid, followed by their condensation to synthesize a linear or cyclic natural product. The studies on NRPS domains, the knowledge of their gene cluster architecture and tailoring enzymes have helped in the in silico genetic screening of the ever-expanding sequenced microbial genomic data for the identification of novel NRPS/PKS clusters and thus deciphering novel non-ribosomal peptides (NRPs). The adenylation domain is an integral part of the NRPSs and is the substrate-selecting unit for the final assembled NRP. In some cases, it also requires a small protein, the MbtH homolog, for its optimum activity. The presence of putative adenylation domains and MbtH homologs in a sequenced genome can help identify novel secondary metabolite producers. The role of the adenylation domain in the NRPS gene clusters and its characterization as a tool for the discovery of novel cryptic NRPS gene clusters are discussed.

  14. Clustering gene expression data with a penalized graph-based metric

    Directory of Open Access Journals (Sweden)

    Granitto Pablo M

    2011-01-01

    Background: The search for cluster structure in microarray datasets is a base problem for the so-called "-omic sciences". A difficult problem in clustering is how to handle data with a manifold structure, i.e. data that is not shaped in the form of compact clouds of points, forming arbitrary shapes or paths embedded in a high-dimensional space, as could be the case of some gene expression datasets. Results: In this work we introduce the Penalized k-Nearest-Neighbor-Graph (PKNNG) based metric, a new tool for evaluating distances in such cases. The new metric can be used in combination with most clustering algorithms. The PKNNG metric is based on a two-step procedure: first it constructs the k-Nearest-Neighbor-Graph of the dataset of interest using a low k-value and then it adds edges with a highly penalized weight for connecting the subgraphs produced by the first step. We discuss several possible schemes for connecting the different sub-graphs as well as penalization functions. We show clustering results on several public gene expression datasets and simulated artificial problems to evaluate the behavior of the new metric. Conclusions: In all cases the PKNNG metric shows promising clustering results. The use of the PKNNG metric can improve the performance of commonly used pairwise-distance based clustering methods, to the level of more advanced algorithms. A great advantage of the new procedure is that researchers do not need to learn a new method, they can simply compute distances with the PKNNG metric and then, for example, use hierarchical clustering to produce an accurate and highly interpretable dendrogram of their high-dimensional data.

  15. Clustering gene expression data with a penalized graph-based metric.

    Science.gov (United States)

    Bayá, Ariel E; Granitto, Pablo M

    2011-01-04

    The search for cluster structure in microarray datasets is a base problem for the so-called "-omic sciences". A difficult problem in clustering is how to handle data with a manifold structure, i.e. data that is not shaped in the form of compact clouds of points, forming arbitrary shapes or paths embedded in a high-dimensional space, as could be the case of some gene expression datasets. In this work we introduce the Penalized k-Nearest-Neighbor-Graph (PKNNG) based metric, a new tool for evaluating distances in such cases. The new metric can be used in combination with most clustering algorithms. The PKNNG metric is based on a two-step procedure: first it constructs the k-Nearest-Neighbor-Graph of the dataset of interest using a low k-value and then it adds edges with a highly penalized weight for connecting the subgraphs produced by the first step. We discuss several possible schemes for connecting the different sub-graphs as well as penalization functions. We show clustering results on several public gene expression datasets and simulated artificial problems to evaluate the behavior of the new metric. In all cases the PKNNG metric shows promising clustering results. The use of the PKNNG metric can improve the performance of commonly used pairwise-distance based clustering methods, to the level of more advanced algorithms. A great advantage of the new procedure is that researchers do not need to learn a new method, they can simply compute distances with the PKNNG metric and then, for example, use hierarchical clustering to produce an accurate and highly interpretable dendrogram of their high-dimensional data.
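
    The two-step procedure described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the abstract does not specify the connection scheme or the penalization function, so the sketch assumes (a) components are joined by repeatedly adding the shortest edge between two different components and (b) an exponential penalty, w = d * exp(d / mean_w), where mean_w is the mean edge weight of the k-nearest-neighbor graph. The function name `pknng_distances` and the toy points are hypothetical.

```python
import heapq
from math import dist, exp


def pknng_distances(points, k=2):
    """Return a matrix of graph (shortest-path) distances between points."""
    n = len(points)
    graph = {i: {} for i in range(n)}

    # Step 1: symmetric k-nearest-neighbor graph with Euclidean edge weights.
    for i in range(n):
        others = [j for j in range(n) if j != i]
        nearest = sorted(others, key=lambda j: dist(points[i], points[j]))[:k]
        for j in nearest:
            w = dist(points[i], points[j])
            graph[i][j] = w
            graph[j][i] = w
    weights = [w for nbrs in graph.values() for w in nbrs.values()]
    mean_w = sum(weights) / len(weights)

    def components():
        """Label each node with a representative of its connected component."""
        label = {}
        for start in range(n):
            if start in label:
                continue
            label[start] = start
            stack = [start]
            while stack:
                u = stack.pop()
                for v in graph[u]:
                    if v not in label:
                        label[v] = start
                        stack.append(v)
        return label

    # Step 2: join disconnected subgraphs with heavily penalized edges
    # (assumed scheme and assumed penalty, see the lead-in).
    label = components()
    while len(set(label.values())) > 1:
        d, a, b = min(
            (dist(points[a], points[b]), a, b)
            for a in range(n) for b in range(a + 1, n)
            if label[a] != label[b]
        )
        graph[a][b] = graph[b][a] = d * exp(d / mean_w)
        label = components()

    def dijkstra(src):
        """Shortest-path lengths from src to every node."""
        best = {src: 0.0}
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > best[u]:
                continue
            for v, w in graph[u].items():
                nd = d + w
                if nd < best.get(v, float("inf")):
                    best[v] = nd
                    heapq.heappush(heap, (nd, v))
        return [best[v] for v in range(n)]

    return [dijkstra(i) for i in range(n)]


# Hypothetical toy data: two well-separated groups on a line.
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
D = pknng_distances(points, k=2)
```

    The resulting matrix `D` keeps within-group distances small and inflates between-group distances, so it could be passed to any pairwise-distance clustering method, such as hierarchical clustering.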

  16. Genetic Variants of the FADS Gene Cluster and ELOVL Gene Family, Colostrums LC-PUFA Levels, Breastfeeding, and Child Cognition

    OpenAIRE

    Morales, Eva; Bustamante, Mariona; Gonzalez, Juan Ramon; Guxens, Monica; Torrent, Maties; Mendez, Michelle; Garcia-Esteban, Raquel; Julvez, Jordi; Forns, Joan; Vrijheid, Martine; Molto-Puigmarti, Carolina; Lopez-Sabater, Carmen; Estivill, Xavier; Sunyer, Jordi

    2011-01-01

    Introduction Breastfeeding effects on cognition are attributed to long-chain polyunsaturated fatty acids (LC-PUFAs), but controversy persists. Genetic variation in fatty acid desaturase (FADS) and elongase (ELOVL) enzymes has been overlooked when studying the effects of LC-PUFAs supply on cognition. We aimed to: 1) to determine whether maternal genetic variants in the FADS cluster and ELOVL genes contribute to differences in LC-PUFA levels in colostrum; 2) to analyze whether these maternal va...

  17. Genetic variants of the FADS gene cluster and ELOVL gene family, colostrums LC-PUFA levels, breastfeeding, and child cognition

    OpenAIRE

    Eva Morales; Mariona Bustamante; Juan Ramon Gonzalez; Monica Guxens; Maties Torrent; Michelle Mendez; Raquel Garcia-Esteban; Jordi Julvez; Joan Forns; Martine Vrijheid; Carolina Molto-Puigmarti; Carmen Lopez-Sabater; Xavier Estivill; Jordi Sunyer

    2011-01-01

    INTRODUCTION: Breastfeeding effects on cognition are attributed to long-chain polyunsaturated fatty acids (LC-PUFAs), but controversy persists. Genetic variation in fatty acid desaturase (FADS) and elongase (ELOVL) enzymes has been overlooked when studying the effects of LC-PUFAs supply on cognition. We aimed to: 1) to determine whether maternal genetic variants in the FADS cluster and ELOVL genes contribute to differences in LC-PUFA levels in colostrum; 2) to analyze whether these maternal v...

  18. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
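
    For readers unfamiliar with the quantities mentioned, allele frequency and pairwise linkage disequilibrium can be computed as in the sketch below. It uses the standard definitions D = p_AB - p_A * p_B and r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)), and it assumes already-phased haplotypes coded 0/1 per site; the study itself worked from genotype data and inferred haplotypes, a step not shown here. The haplotypes and function names are made up for illustration.

```python
def allele_freq(haplotypes, site):
    """Frequency of the allele coded 1 at the given site."""
    return sum(h[site] for h in haplotypes) / len(haplotypes)


def ld_r2(haplotypes, i, j):
    """Squared correlation (r^2) between the alleles at sites i and j."""
    p_i = allele_freq(haplotypes, i)
    p_j = allele_freq(haplotypes, j)
    # Frequency of haplotypes carrying allele 1 at both sites.
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / len(haplotypes)
    d = p_ij - p_i * p_j
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    # r^2 is undefined when either site is monomorphic.
    return d * d / denom if denom > 0 else float("nan")


# Hypothetical phased haplotypes over three sites (1 = minor allele).
haplotypes = [
    (1, 1, 0),
    (1, 1, 1),
    (0, 0, 0),
    (0, 0, 1),
]
freq_site0 = allele_freq(haplotypes, 0)
r2_linked = ld_r2(haplotypes, 0, 1)      # sites 0 and 1 always co-occur
r2_unlinked = ld_r2(haplotypes, 0, 2)    # site 2 is independent of site 0
```

    Sites with r^2 near 1 carry redundant information, which is the rationale for grouping them into linkage blocks and selecting tagSNPs.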

  19. A new method for rapid identification of ansamycin compounds by inactivating KLM gene clusters in potential ansamycin-producing actinomyces.

    Science.gov (United States)

    Wang, G; Zhang, H; Sun, G; Wu, L; Zhang, J; Wang, Y

    2012-02-01

    In this study, we explored the possibility of constructing a 'universal targeting vector' by Red/ET recombination to inactivate the L gene encoding 3-amino-5-hydroxybenzoic acid (AHBA)-oxidoreductase in the AHBA biosynthetic gene cluster to facilitate the detection of ansamycin production in actinomycetes. Based on the conserved regions of linked AHBA synthase (K), oxidoreductase (L) and phosphatase (M) gene clusters, degenerate primers were designed and PCR was performed to detect KLM gene clusters within 33 AHBA synthase gene-positive actinomycetes strains. Among them, 22 KLM gene cluster-positive strains were identified. A 'universal targeting vector' was further constructed through Red/ET recombination using 50-nt homologous sequences chosen from the internal L gene of the KLM gene clusters of four strains. The L gene from nine of the KLM gene cluster-positive actinomycetes strains was inactivated by insertion of a kanamycin (Km) resistance marker into its internal region from the 'universal targeting vector'. By comparing the metabolites produced in parent strains with those in L gene-inactivated mutants, we demonstrated possible ansamycin production by these strains. One strain (4089) was shown to be a geldanamycin producer. Three strains (3-20, 7-32 and 8-32) were identified as potential triene-ansamycin producers. Another strain (3-27) was possibly a streptovaricin C producer. Strains 24-100 and 4-124 might be ansamitocin-like producers. The results confirmed the feasibility of constructing a 'universal targeting vector' through Red/ET recombination using the conserved regions of KLM gene clusters to detect ansamycin production in actinomycetes. The 'universal targeting vector' provides, to a certain degree, a rapid approach to detect the potential ansamycin producers among the 22 KLM gene cluster-positive actinomycetes strains. © 2011 The Authors. Journal of Applied Microbiology © 2011 The Society for Applied Microbiology.

  20. Circulating tumor cell clusters-associated gene plakoglobin and breast cancer survival.

    Science.gov (United States)

    Lu, Lingeng; Zeng, Hongmei; Gu, Xinsheng; Ma, Wenxue

    2015-06-01

    Breast cancer recurrence is a major cause of disease-specific death. Circulating tumor cells (CTCs) are negatively associated with breast cancer survival. Plakoglobin, a cell adhesion protein, was recently reported as a determinant of CTC types, single or clustered ones. Here, we aim to summarize the studies on the roles of plakoglobin and evaluate the association of plakoglobin and breast cancer survival. Plakoglobin as a key component in both cell adhesion and the signaling pathways was briefly reviewed first. Then the double-edged functions of plakoglobin in tumors and its association with CTCs and breast cancer metastasis were introduced. Finally, based on an open-access database, the association between plakoglobin and breast cancer survival was investigated using univariate and multivariate survival analyses. Plakoglobin may be a molecule functioning as a double-edged sword. Loss of plakoglobin expression leads to increased motility of epithelial cells, thereby promoting epithelial-mesenchymal transition and further metastasis of cancer. However, studies also show that plakoglobin can function as an oncogene. High expression of plakoglobin results in clustered tumor cells in circulation with high metastatic potential in breast cancer and shortened patient survival. Plakoglobin may be a potential prognostic biomarker that can be developed as a therapeutic target for breast cancer.

  1. Sequencing and transcriptional analysis of the Streptococcus thermophilus histamine biosynthesis gene cluster: factors that affect differential hdcA expression

    DEFF Research Database (Denmark)

    Calles-Enríquez, Marina; Hjort, Benjamin Benn; Andersen, Pia Skov

    2010-01-01

    acquisition through a horizontal transfer mechanism. Transcriptional analysis of the hdc cluster revealed the existence of a polycistronic mRNA covering the three genes. The histidine-decarboxylating gene (hdcA) of S. thermophilus demonstrated maximum expression during the stationary growth phase, with high...... to produce histamine. The hdc clusters of S. thermophilus CHCC1524 and CHCC6483 were sequenced, and the factors that affect histamine biosynthesis and histidine-decarboxylating gene (hdcA) expression were studied. The hdc cluster began with the hdcA gene, was followed by a transporter (hdcP), and ended...... with the hdcB gene, which is of unknown function. The three genes were orientated in the same direction. The genetic organization of the hdc cluster showed a unique organization among the lactic acid bacterial group and resembled those of Staphylococcus and Clostridium species, thus indicating possible...

  2. MeSH key terms for validation and annotation of gene expression clusters

    Energy Technology Data Exchange (ETDEWEB)

    Rechtsteiner, A. (Andreas); Rocha, L. M. (Luis Mateus)

    2004-01-01

    Integration of different sources of information is a great challenge for the analysis of gene expression data, and for the field of Functional Genomics in general. As the availability of numerical data from high-throughput methods increases, so does the need for technologies that assist in the validation and evaluation of the biological significance of results extracted from these data. In mRNA assaying with microarrays, for example, numerical analysis often attempts to identify clusters of co-expressed genes. The important task to find the biological significance of the results and validate them has so far mostly fallen to the biological expert who had to perform this task manually. One of the most promising avenues to develop automated and integrative technology for such tasks lies in the application of modern Information Retrieval (IR) and Knowledge Management (KM) algorithms to databases with biomedical publications and data. Examples of databases available for the field are bibliographic databases containing scientific publications (e.g. MEDLINE/PUBMED), databases containing sequence data (e.g. GenBank) and databases of semantic annotations (e.g. the Gene Ontology Consortium and Medical Subject Headings (MeSH)). We present here an approach that uses the MeSH terms and their concept hierarchies to validate and obtain functional information for gene expression clusters. The controlled and hierarchical MeSH vocabulary is used by the National Library of Medicine (NLM) to index all the articles cited in MEDLINE. Such indexing with a controlled vocabulary eliminates some of the ambiguity due to polysemy (terms that have multiple meanings) and synonymy (multiple terms have similar meaning) that would be encountered if terms would be extracted directly from the articles due to differing article contexts or author preferences and background. Further, the hierarchical organization of the MeSH terms can illustrate the conceptual/functional relationships of genes

  3. IL2-IL21 gene cluster polymorphism is not associated with allograft function after kidney transplantation.

    Science.gov (United States)

    Kwiatkowska, Ewa; Domanski, Leszek; Kłoda, Karolina; Pawlik, Andrzej; Safranow, Krzysztof; Ciechanowski, Kazimierz

    2014-12-01

    Cytokines are key mediators of the immune response after transplantation. The interleukin (IL)-2 cytokine family, which includes IL-2, IL-4, IL-7, IL-9, IL-15, and IL-21, is of particular interest because of its importance in the allogenic response. The aim of this study was to examine the association between the rs6822844 gene polymorphism in the IL2-IL21 region and allograft function after kidney transplantation. The study enrolled 270 Caucasian kidney allograft recipients (166 males and 104 females, mean age 47.63 ± 12.96 years). The following parameters were recorded in each case: recipient's age, delayed graft function (DGF), occurrence and number of episodes of acute rejection (AR), and chronic allograft dysfunction (CAD). Genotyping of the rs6822844 IL2-IL21 cluster gene polymorphism was performed using a real-time PCR assay. There were no statistically significant differences in the genotypes and alleles of the rs6822844 IL2-IL21 cluster gene polymorphism among patients with DGF (p = 0.72), AR (p = 0.69) and CAD (p = 0.77), or in creatinine concentrations 1, 3, 6, 12, 24 or 36 months after transplantation (p = 0.46, p = 0.58, p = 0.6, p = 0.72, p = 0.7, p = 0.76, respectively). It seems that the rs6822844 IL2-IL21 gene cluster polymorphism is of little importance in allograft function after kidney transplantation.

  4. Identification of a 24-gene prognostic signature that improves the European LeukemiaNet risk classification of acute myeloid leukemia: an international collaborative study.

    Science.gov (United States)

    Li, Zejuan; Herold, Tobias; He, Chunjiang; Valk, Peter J M; Chen, Ping; Jurinovic, Vindi; Mansmann, Ulrich; Radmacher, Michael D; Maharry, Kati S; Sun, Miao; Yang, Xinan; Huang, Hao; Jiang, Xi; Sauerland, Maria-Cristina; Büchner, Thomas; Hiddemann, Wolfgang; Elkahloun, Abdel; Neilly, Mary Beth; Zhang, Yanming; Larson, Richard A; Le Beau, Michelle M; Caligiuri, Michael A; Döhner, Konstanze; Bullinger, Lars; Liu, Paul P; Delwel, Ruud; Marcucci, Guido; Lowenberg, Bob; Bloomfield, Clara D; Rowley, Janet D; Bohlander, Stefan K; Chen, Jianjun

    2013-03-20

    To identify a robust prognostic gene expression signature as an independent predictor of survival of patients with acute myeloid leukemia (AML) and use it to improve established risk classification. Four independent sets totaling 499 patients with AML carrying various cytogenetic and molecular abnormalities were used as training sets. Two independent patient sets composed of 825 patients were used as validation sets. Notably, patients from different sets were treated with different protocols, and their gene expression profiles were derived using different microarray platforms. Cox regression and Kaplan-Meier methods were used for survival analyses. A prognostic signature composed of 24 genes was derived from a meta-analysis of Cox regression values of each gene across the four training sets. In multivariable models, a higher sum value of the 24-gene signature was an independent predictor of shorter overall (OS) and event-free survival (EFS) in both training and validation sets (P classification of AML, and patients in three new risk groups classified by the integrated risk classification showed significantly (P classification incorporating this gene signature provides a better framework for risk stratification and outcome prediction than the ELN classification.

  5. Genome mining for radical SAM protein determinants reveals multiple sactibiotic-like gene clusters.

    Directory of Open Access Journals (Sweden)

    Kiera Murphy

    Thuricin CD is a two-component bacteriocin produced by Bacillus thuringiensis that kills a wide range of clinically significant Clostridium difficile. This bacteriocin has recently been characterized and consists of two distinct peptides, Trnβ and Trnα, which both possess 3 intrapeptide sulphur to α-carbon bridges and act synergistically. Indeed, thuricin CD and subtilosin A are the only antimicrobials known to possess these unusual structures and are known as the sactibiotics (sulphur to alpha carbon-containing antibiotics). Analysis of the thuricin CD-associated gene cluster revealed the presence of genes encoding two highly unusual SAM proteins (TrnC and TrnD), which are proposed to be responsible for these unusual post-translational modifications. On the basis of the frequently high conservation among enzymes responsible for the post-translational modification of specific antimicrobials, we performed an in silico screen for novel thuricin CD-like gene clusters using the TrnC and TrnD radical SAM proteins as driver sequences to perform an initial homology search against the complete non-redundant database. Fifteen novel thuricin CD-like gene clusters were identified, based on the presence of TrnC and TrnD homologues in the context of neighbouring genes encoding potential bacteriocin structural peptides. Moreover, metagenomic analysis revealed that TrnC or TrnD homologs are present in a variety of metagenomic environments, suggesting a widespread distribution of thuricin-like operons in a variety of environments. In silico analysis of radical SAM proteins is sufficient to identify novel putative sactibiotic clusters.

  6. Molecular analysis of SCARECROW genes expressed in white lupin cluster roots

    Science.gov (United States)

    Sbabou, Laila; Bucciarelli, Bruna; Miller, Susan; Liu, Junqi; Berhada, Fatiha; Filali-Maltouf, Abdelkarim; Allan, Deborah; Vance, Carroll

    2010-01-01

    The Scarecrow (SCR) transcription factor plays a crucial role in root cell radial patterning and is required for maintenance of the quiescent centre and differentiation of the endodermis. In response to phosphorus (P) deficiency, white lupin (Lupinus albus L.) root surface area increases some 50-fold to 70-fold due to the development of cluster (proteoid) roots. Previously it was reported that SCR-like expressed sequence tags (ESTs) were expressed during early cluster root development. Here the cloning of two white lupin SCR genes, LaSCR1 and LaSCR2, is reported. The predicted amino acid sequences of both LaSCR gene products are highly similar to AtSCR and contain C-terminal conserved GRAS family domains. LaSCR1 and LaSCR2 transcript accumulation localized to the endodermis of both normal and cluster roots as shown by in situ hybridization and gene promoter::reporter staining. Transcript analysis as evaluated by quantitative real-time-PCR (qRT-PCR) and RNA gel hybridization indicated that the two LaSCR genes are expressed predominantly in roots. Expression of LaSCR genes was not directly responsive to the P status of the plant but was a function of cluster root development. Suppression of LaSCR1 in transformed roots of lupin and Medicago via RNAi (RNA interference) delivered through Agrobacterium rhizogenes resulted in decreased root numbers, reflecting the potential role of LaSCR1 in maintaining root growth in these species. The results suggest that the functional orthologues of AtSCR have been characterized. PMID:20167612

  7. A temporal precedence based clustering method for gene expression microarray data

    Directory of Open Access Journals (Sweden)

    Buchanan-Wollaston Vicky

    2010-01-01

    Background: Time-course microarray experiments can produce useful data which can help in understanding the underlying dynamics of the system. Clustering is an important stage in microarray data analysis where the data is grouped together according to certain characteristics. The majority of clustering techniques are based on distance or visual similarity measures which may not be suitable for clustering of temporal microarray data where the sequential nature of time is important. We present a Granger causality based technique to cluster temporal microarray gene expression data, which measures the interdependence between two time-series by statistically testing whether one time-series can be used for forecasting the other. Results: A gene-association matrix is constructed by testing temporal relationships between pairs of genes using the Granger causality test. The association matrix is further analyzed using a graph-theoretic technique to detect highly connected components representing interesting biological modules. We test our approach on synthesized datasets and real biological datasets obtained for Arabidopsis thaliana. We show the effectiveness of our approach by analyzing the results using the existing biological literature. We also report interesting structural properties of the association network commonly desired in any biological system. Conclusions: Our experiments on synthesized and real microarray datasets show that our approach produces encouraging results. The method is simple in implementation and is statistically traceable at each step. The method can produce sets of functionally related genes which can be further used for reverse-engineering of gene circuits.
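The pipeline in this abstract (pairwise Granger tests, an association matrix, then connected modules) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single lag, compares raw F statistics against a hypothetical threshold instead of computing p-values, and substitutes plain connected components for the paper's graph-theoretic step. All function names and the synthetic series are invented for the example.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(X, y))


def granger_f(x, y):
    """Lag-1 F statistic for 'the past of x helps forecast y'."""
    target = y[1:]
    restricted = [[1.0, y[t]] for t in range(len(y) - 1)]   # y's own past only
    full = [[1.0, y[t], x[t]] for t in range(len(y) - 1)]   # plus x's past
    rss_r, rss_u = rss(restricted, target), rss(full, target)
    dof = len(target) - 3
    return (rss_r - rss_u) / (rss_u / dof)


def association_components(series, threshold):
    """Link genes i and j if either direction exceeds the F threshold,
    then return the connected components of the resulting graph."""
    n = len(series)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(n):
            if i != j and granger_f(series[i], series[j]) > threshold:
                adj[i].add(j)
                adj[j].add(i)
    seen, comps = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node] - seen:
                seen.add(nb)
                stack.append(nb)
        comps.append(sorted(comp))
    return comps


# Synthetic example: y follows x with a one-step delay, z is unrelated noise.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(60)]
y = [0.0] + [0.9 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 60)]
z = [rng.gauss(0, 1) for _ in range(60)]
modules = association_components([x, y, z], threshold=50.0)
```

A real analysis would choose the lag order from the data, convert F statistics to p-values, and correct for the large number of gene pairs tested.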

  8. A eukaryotic nicotinate-inducible gene cluster: convergent evolution in fungi and bacteria

    Science.gov (United States)

    Ámon, Judit; Fernández-Martín, Rafael; Bokor, Eszter; Cultrone, Antonietta; Kelly, Joan M.; Flipphi, Michel; Scazzocchio, Claudio

    2017-01-01

    Nicotinate degradation has hitherto been elucidated only in bacteria. In the ascomycete Aspergillus nidulans, six loci, hxnS/AN9178 encoding the molybdenum cofactor-containing nicotinate hydroxylase, AN11197 encoding a Cys2/His2 zinc finger regulator HxnR, together with AN11196/hxnZ, AN11188/hxnY, AN11189/hxnP and AN9177/hxnT, are clustered and stringently co-induced by a nicotinate derivative and subject to nitrogen metabolite repression mediated by the GATA factor AreA. These genes are strictly co-regulated by HxnR. Within the hxnR gene, constitutive mutations map in two discrete regions. Aspergillus nidulans is capable of using nicotinate and its oxidation products 6-hydroxynicotinic acid and 2,5-dihydroxypyridine as sole nitrogen sources in an HxnR-dependent way. HxnS is highly similar to HxA, the canonical xanthine dehydrogenase (XDH), and has originated by gene duplication, preceding the origin of the Pezizomycotina. This cluster is conserved with some variations throughout the Aspergillaceae. Our results imply that a fungal pathway has arisen independently from bacterial ones. Significantly, the neo-functionalization of XDH into nicotinate hydroxylase has occurred independently from analogous events in bacteria. This work describes for the first time a gene cluster involved in nicotinate catabolism in a eukaryote and has relevance for the formation and evolution of co-regulated primary metabolic gene clusters and the microbial degradation of N-heterocyclic compounds. PMID:29212709

  9. The prognostic and predictive value of TMPRSS2-ERG gene fusion and ERG protein expression in prostate cancer biopsies.

    Science.gov (United States)

    Berg, Kasper Drimer

    2016-12-01

    The clinical course of prostate carcinoma (PCa) is very heterogeneous. Consequently, a personalised approach for risk stratification and treatment planning is important. Recently, it has become evident that PCa, also at the genomic level, is heterogeneous. An early and common alteration is the gene fusion between the transmembrane protease serine 2 (TMPRSS2) gene and the v-ets avian erythroblastosis virus E26 oncogene homolog (ERG) gene resulting in expression of the oncoprotein ERG. The gene fusion is present in approximately half of PCa patients and the resultant two subgroups demonstrate marked differences in their genomic signatures. It has been hypothesised that genomic alterations can explain some of the observed heterogeneity in the clinical course of PCa. In order to conduct an analysis of the prognostic and predictive value of ERG protein expression in PCa biopsies, the thesis sought to evaluate: 1) the concordance in ERG expression between biopsies and radical prostatectomies; 2) the association between expression of ERG protein and the risk of PCa progression during active surveillance (AS); and 3) the association between ERG protein expression and response to primary castration-based treatment for advanced PCa. The included patients derived from the institutional AS cohort and an institutional cohort of advanced PCa patients undergoing first line castration-based androgen deprivation therapy (ADT). The 265 patients in the AS cohort were enrolled prospectively between October 2002 and October 2012 and were followed with regular digital rectal examinations, PSA measurements, and repeated biopsies. The advanced PCa cohort comprised 194 patients diagnosed between January 2000 and December 2011 and was established retrospectively by a standardised extraction of patient data. Immunohistochemical (IHC) assessment for ERG protein expression was performed in all tumour-containing diagnostic specimens (AS cohort: n = 459; advanced PCa cohort: n = 968), re

  10. Genomic organization, tissue distribution and functional characterization of the rat Pate gene cluster.

    Directory of Open Access Journals (Sweden)

    Angireddy Rajesh

    The cysteine-rich prostate and testis expressed (Pate) proteins identified to date are thought to resemble the three-fingered protein/urokinase-type plasminogen activator receptor proteins. In this study, for the first time, we report the identification, cloning and characterization of the rat Pate gene cluster and also determine its expression pattern. The rat Pate genes are clustered on chromosome 8 and their predicted proteins retain the ten-cysteine signature characteristic of the TFP/Ly-6 protein family. The three-dimensional protein structure of PATE and PATE-F was found to be similar to that of the toxin bucandin. Though Pate gene expression is thought to be prostate and testis specific, we observed that rat Pate genes are also expressed in seminal vesicle and epididymis and in tissues beyond the male reproductive tract. In developing rats (20-60 days old), expression of Pate genes seems to be androgen dependent in the epididymis and testis. In the adult rat, androgen ablation resulted in down regulation of the majority of Pate genes in the epididymides. PATE and PATE-F proteins were found to be expressed abundantly in the male reproductive tract of rats and on the sperm. Recombinant PATE protein exhibited potent antibacterial activity, whereas PATE-F did not exhibit any antibacterial activity. Pate expression was induced in the epididymides when challenged with LPS. Based on our results, we conclude that rat PATE proteins may contribute to the reproductive and defense functions.

  11. Spatial expression of Hox cluster genes in the ontogeny of a sea urchin

    Science.gov (United States)

    Arenas-Mena, C.; Cameron, A. R.; Davidson, E. H.

    2000-01-01

    The Hox cluster of the sea urchin Strongylocentrotus purpuratus contains ten genes in a 500 kb span of the genome. Only two of these genes are expressed during embryogenesis, while all of eight genes tested are expressed during development of the adult body plan in the larval stage. We report the spatial expression during larval development of the five 'posterior' genes of the cluster: SpHox7, SpHox8, SpHox9/10, SpHox11/13a and SpHox11/13b. The five genes exhibit a dynamic, largely mesodermal program of expression. Only SpHox7 displays extensive expression within the pentameral rudiment itself. A spatially sequential and colinear arrangement of expression domains is found in the somatocoels, the paired posterior mesodermal structures that will become the adult perivisceral coeloms. No such sequential expression pattern is observed in endodermal, epidermal or neural tissues of either the larva or the presumptive juvenile sea urchin. The spatial expression patterns of the Hox genes illuminate the evolutionary process by which the pentameral echinoderm body plan emerged from a bilateral ancestor.

  12. Validation of a prognostic multi-gene signature in high-risk neuroblastoma using the high throughput digital NanoString nCounter™ system.

    Science.gov (United States)

    Stricker, Thomas P; Morales La Madrid, Andres; Chlenski, Alexandre; Guerrero, Lisa; Salwen, Helen R; Gosiengfiao, Yasmin; Perlman, Elizabeth J; Furman, Wayne; Bahrami, Armita; Shohet, Jason M; Zage, Peter E; Hicks, M John; Shimada, Hiroyuki; Suganuma, Rie; Park, Julie R; So, Sara; London, Wendy B; Pytel, Peter; Maclean, Kirsteen H; Cohn, Susan L

    2014-05-01

    Microarray-based molecular signatures have not been widely integrated into neuroblastoma diagnostic classification systems due to the complexities of the assay and requirement for high-quality RNA. New digital technologies that accurately quantify gene expression using RNA isolated from formalin-fixed paraffin embedded (FFPE) tissues are now available. In this study, we describe the first use of a high-throughput digital system to assay the expression of genes in an "ultra-high risk" microarray classifier in FFPE high-risk neuroblastoma tumors. Customized probes corresponding to the 42 genes in a published multi-gene neuroblastoma signature were hybridized to RNA isolated from 107 FFPE high-risk neuroblastoma samples using the NanoString nCounter™ Analysis System. For classification of each patient, the Pearson's correlation coefficient was calculated between the standardized nCounter™ data and the molecular signature from the microarray data. We demonstrate that the nCounter™ 42-gene panel sub-stratified the high-risk cohort into two subsets with statistically significantly different overall survival (p = 0.0027) and event-free survival (p = 0.028). In contrast, none of the established prognostic risk markers (age, stage, tumor histology, MYCN status, and ploidy) were significantly associated with survival. We conclude that the nCounter™ System can reproducibly quantify expression levels of signature genes in FFPE tumor samples. Validation of this microarray signature in our high-risk patient cohort using a completely different technology emphasizes the prognostic relevance of this classifier. Prospective studies testing the prognostic value of molecular signatures in high-risk neuroblastoma patients using FFPE tumor samples and the nCounter™ System are warranted. Copyright © 2014 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
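The classification step described above (correlating each patient's standardized expression profile with a reference signature) can be illustrated with a short sketch. The cutoff of zero, the group labels, the function names and the four-gene toy data are all assumptions for illustration; the study used a 42-gene panel and its own decision rule.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def standardize_genes(samples):
    """Z-score each gene (column) across all samples (rows)."""
    n = len(samples)
    columns = list(zip(*samples))
    z_columns = []
    for col in columns:
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        z_columns.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*z_columns)]


def classify(samples, signature, cutoff=0.0):
    """Label each sample by its correlation with the signature.
    The cutoff is a hypothetical choice, not the published one."""
    labels = []
    for profile in standardize_genes(samples):
        r = pearson(profile, signature)
        labels.append("signature-like" if r > cutoff else "not signature-like")
    return labels


# Toy data: three patients (rows) by four genes (columns).
toy_samples = [
    [10, 2, 9, 1],
    [3, 8, 2, 9],
    [9, 3, 10, 2],
]
toy_signature = [1, -1, 1, -1]  # hypothetical up/down pattern
toy_labels = classify(toy_samples, toy_signature)
```

The two resulting groups would then be compared for overall and event-free survival, as the abstract reports.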

  13. MGMT Gene Promoter Methylation as a Potent Prognostic Factor in Glioblastoma Treated With Temozolomide-Based Chemoradiotherapy: A Single-Institution Study

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Young Suk [Department of Radiation Oncology, Yonsei University College of Medicine, Yonsei University Health System, Seoul (Korea, Republic of); Kim, Se Hoon [Department of Pathology, Yonsei University College of Medicine, Yonsei University Health System, Seoul (Korea, Republic of); Cho, Jaeho; Kim, Jun Won [Department of Radiation Oncology, Yonsei University College of Medicine, Yonsei University Health System, Seoul (Korea, Republic of); Chang, Jong Hee; Kim, Dong Suk; Lee, Kyu Sung [Department of Neurosurgery, Yonsei University College of Medicine, Yonsei University Health System, Seoul (Korea, Republic of); Suh, Chang-Ok, E-mail: cosuh317@yuhs.ac [Department of Radiation Oncology, Yonsei University College of Medicine, Yonsei University Health System, Seoul (Korea, Republic of)

    2012-11-01

    Purpose: Recently, cells deficient in O6-methylguanine-DNA methyltransferase (MGMT) were found to show increased sensitivity to temozolomide (TMZ). We evaluated whether hypermethylation of MGMT was associated with survival in patients with glioblastoma multiforme (GBM). Methods and Materials: We retrospectively analyzed 93 patients with histologically confirmed GBM who received involved-field radiotherapy with TMZ from 2001 to 2008. The median age was 58 years (range, 24-78 years). Surgical resection was total in 39 patients (42%), subtotal in 30 patients (32%), and partial in 17 patients (18%); only a biopsy was performed in 7 patients (8%). Postoperative radiotherapy began within 3 weeks of surgery in 87% of the patients. Radiotherapy doses ranged from 50 to 74 Gy (median, 70 Gy). MGMT gene methylation was determined in 78 patients; MGMT was unmethylated in 43 patients (55%) and methylated in 35 patients (45%). The median follow-up period was 22 months (range, 3-88 months) for all patients. Results: The median overall survival (OS) was 22 months, and progression-free survival (PFS) was 11 months. MGMT gene methylation was an independently significant prognostic factor for both OS (p = 0.002) and PFS (p = 0.008) in multivariate analysis. The median OS was 29 months for the methylated group and 20 months for the unmethylated group. In 35 patients with methylated MGMT genes, the 2-year and 5-year OS rates were 54% and 31%, respectively. Six patients with combined prognostic factors of methylated MGMT genes, age ≤50 years, and total/subtotal resections are all alive 38 to 77 months after operation, whereas the median OS in 8 patients with unmethylated MGMT genes, age >50 years, and less than subtotal resection was 13.2 months. Conclusion: We confirmed that MGMT gene methylation is a potent prognostic factor in patients with GBM. Our results suggest that early postoperative radiotherapy and a high total/subtotal resection rate might further improve the

  14. Multi-class clustering of cancer subtypes through SVM based ensemble of pareto-optimal solutions for gene marker identification.

    Science.gov (United States)

    Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra; Maulik, Ujjwal

    2010-11-12

    With the advancement of microarray technology, it is now possible to study the expression profiles of thousands of genes across different experimental conditions or tissue samples simultaneously. Microarray cancer datasets, organized in a samples-versus-genes fashion, are being used for classification of tissue samples into benign and malignant or their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in successful diagnosis of particular cancer types. In this article, we present an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. In this regard, a real-coded encoding of the cluster centers is used and cluster compactness and separation are simultaneously optimized. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. A novel approach is proposed to combine the clustering information possessed by the non-dominated solutions through a Support Vector Machine (SVM) classifier. Final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method has been compared with that of several other microarray clustering algorithms for three publicly available benchmark cancer datasets. Moreover, statistical significance tests have been conducted to establish the statistical superiority of the proposed clustering method. Furthermore, relevant gene markers have been identified using the clustering result produced by the proposed clustering method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results obtained are found to be promising and can possibly have important impact in the area of unsupervised cancer classification as well as gene marker identification for multiple cancer subtypes.
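The notion of non-dominated (Pareto-optimal) clustering solutions used in this abstract can be made concrete with a small sketch. It covers only the dominance filter, not the genetic search or the SVM ensemble; the objective pairs are hypothetical, with separation negated so that both objectives are minimized.

```python
def dominates(a, b):
    """True if solution a is no worse than b in every objective and
    strictly better in at least one (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]


# Hypothetical candidate clusterings scored as
# (compactness, -separation): smaller is better for both.
candidates = [
    (1.0, -2.0),
    (2.0, -3.0),
    (2.5, -2.5),
    (3.0, -1.0),
]
front = pareto_front(candidates)
```

Each surviving solution represents a different trade-off between tight clusters and well-separated clusters, which is why the abstract goes on to combine them instead of picking one.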

  15. Identification of novel mureidomycin analogues via rational activation of a cryptic gene cluster in Streptomyces roseosporus NRRL 15998.

    Science.gov (United States)

    Jiang, Lingjuan; Wang, Lu; Zhang, Jihui; Liu, Hao; Hong, Bin; Tan, Huarong; Niu, Guoqing

    2015-09-15

    Antimicrobial agents are urgently needed to tackle the growing threat of antibiotic-resistant pathogens. An important source of new antimicrobials is the large repertoire of cryptic gene clusters embedded in microbial genomes. Genome mining revealed a napsamycin/mureidomycin biosynthetic gene cluster in the chromosome of Streptomyces roseosporus NRRL 15998. The cryptic gene cluster was activated by constitutive expression of a foreign activator gene, ssaA, from the sansanmycin biosynthetic gene cluster of Streptomyces sp. strain SS. Expression of the gene cluster was verified by RT-PCR analysis of key biosynthetic genes. The activated metabolites demonstrated potent inhibitory activity against the highly refractory pathogen Pseudomonas aeruginosa, and characterization of the metabolites led to the discovery of eight acetylated mureidomycin analogues. To our surprise, constitutive expression of the native activator gene SSGG_02995, an ssaA homologue in S. roseosporus NRRL 15998, has no beneficial effect on mureidomycin stimulation. This study provides a new way to activate cryptic gene clusters for the acquisition of novel antibiotics and will accelerate the exploitation of prodigious natural products in Streptomyces.

  16. Prognostic Value of a CYP2B6 Gene Polymorphism in Patients with Acute Myeloid Leukemia.

    Science.gov (United States)

    Alazhary, Nevin M; Shafik, Roxan E; Shafik, Hanan E; Kamel, Mahmoud M

    2015-01-01

    This study aimed to detect a CYP2B6 polymorphism in de novo cases of acute myeloid leukemia and to identify any role in disease progression and outcome. DNA was isolated from peripheral blood of 82 newly diagnosed acute myeloid leukemia cases and the CYP2B6 G15631T gene polymorphism was assayed by PCR restriction fragment length polymorphism (PCR-RFLP). The frequency of the GG genotype (wild type) was 48 (58.5%) and that of the mutant type T allele was 34 (41.9%). GT genotype heterozygous variants were found in 28 (34%), and TT genotype homozygous variants in 6 (7.3%) cases. We found no significant association between the CYP2B6 G15631T polymorphism and complete response (CR) (p-value=0.768), FAB classification (p-value=0.51), cytogenetic analysis (p-value=0.673), and overall survival (p-value=0.325). Also, there were no significant links with early toxic death (p-value=0.92) or progression-free survival (PFS) (p-value=0.245). Our results suggest that the CYP2B6 polymorphism has no role in disease progression, therapeutic outcome, patient free survival, early toxic death and overall survival in acute myeloid leukemia patients.

  17. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  18. In silico clustering of Salmonella global gene expression data reveals novel genes co-regulated with the SPI-1 virulence genes through HilD.

    Science.gov (United States)

    Martínez-Flores, Irma; Pérez-Morales, Deyanira; Sánchez-Pérez, Mishael; Paredes, Claudia C; Collado-Vides, Julio; Salgado, Heladia; Bustamante, Víctor H

    2016-11-25

    A wide variety of Salmonella enterica serovars cause intestinal and systemic infections in humans and animals. Salmonella Pathogenicity Island 1 (SPI-1) is a chromosomal region containing 39 genes that have crucial virulence roles. The AraC-like transcriptional regulator HilD, encoded in SPI-1, positively controls the expression of the SPI-1 genes, as well as of several other virulence genes located outside SPI-1. In this study, we applied a clustering method to the global gene expression data of S. enterica serovar Typhimurium from the COLOMBOS database; thus genes that show an expression pattern similar to that of SPI-1 genes were selected. This analysis revealed nine novel genes that are co-expressed with SPI-1, which are located in different chromosomal regions. Expression analyses and protein-DNA interaction assays showed regulation by HilD for six of these genes: gtgE, phoH, sinR, SL1263 (lpxR) and SL4247 were regulated directly, whereas SL1896 was regulated indirectly. Interestingly, phoH is an ancestral gene conserved in most bacteria, whereas the other genes show characteristics of genes acquired by Salmonella. A role in virulence has been previously demonstrated for gtgE, lpxR and sinR. Our results further expand the regulon of HilD and thus identify novel possible Salmonella virulence genes.

  19. Can K-ras gene mutation be utilized as prognostic biomarker for colorectal cancer patients receiving chemotherapy? A meta-analysis and systematic review.

    Science.gov (United States)

    Rui, Yuan-Yi; Zhang, Dan; Zhou, Zong-Guang; Wang, Cun; Yang, Lie; Yu, Yong-Yang; Chen, Hai-Ning

    2013-01-01

    K-ras gene mutations were common in colorectal cancer patients, but their relationship with prognosis was unclear. The aim was to verify prognostic differences between patients with and without mutant K-ras genes by reviewing the published evidence. Systematic reviews and databases were searched for cohort/case-control studies of the prognosis of colorectal cancer patients with detected K-ras mutations versus those without mutant K-ras genes, both of whom received chemotherapy. The number of patients, chemotherapy regimens, and short-term or long-term survival rates (disease-free or overall) were extracted. The quality of the studies was also evaluated. Seven studies with comparisons against a control group were identified. No association was indicated between K-ras gene status and either short-term disease-free survival (OR=1.01, 95% CI, 0.73-1.38, P=0.97) or overall survival (OR=1.06, 95% CI, 0.82-1.36, P=0.66) in CRC patients who received chemotherapy. Comparison of long-term survival between the two groups also indicated no significant difference after heterogeneity was eliminated (OR=1.09, 95% CI, 0.85-1.40, P=0.49). K-ras gene mutations may not be a prognostic index for colorectal cancer patients who received chemotherapy.
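The pooled odds ratios and confidence intervals quoted above are the kind of quantity a standard inverse-variance calculation produces. The abstract does not state which pooling model was used, so the fixed-effect approach below is an assumption, and the 2x2 counts are invented solely to make the sketch runnable.

```python
from math import exp, log, sqrt


def log_odds_ratio(a, b, c, d):
    """Log odds ratio and its standard error for one 2x2 table.

    a, b : survivors / non-survivors among K-ras mutant patients
    c, d : survivors / non-survivors among K-ras wild-type patients
    """
    return log((a * d) / (b * c)), sqrt(1 / a + 1 / b + 1 / c + 1 / d)


def pooled_odds_ratio(tables, z=1.96):
    """Fixed-effect (inverse-variance) pooled OR with a 95% CI."""
    weighted_sum = 0.0
    total_weight = 0.0
    for table in tables:
        lor, se = log_odds_ratio(*table)
        weight = 1.0 / se ** 2
        weighted_sum += weight * lor
        total_weight += weight
    pooled = weighted_sum / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return exp(pooled), exp(pooled - z * pooled_se), exp(pooled + z * pooled_se)


# Hypothetical studies in which survival odds are identical in both groups.
toy_tables = [(10, 20, 10, 20), (30, 15, 30, 15)]
or_hat, ci_low, ci_high = pooled_odds_ratio(toy_tables)
```

A confidence interval that spans 1, as in each result reported in the abstract, indicates no detectable difference in survival odds between the two groups.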

  20. Gene cluster analysis for the biosynthesis of elgicins, novel lantibiotics produced by Paenibacillus elgii B69

    Directory of Open Access Journals (Sweden)

    Teng Yi

    2012-03-01

    Background: The recent increase in bacterial resistance to antibiotics has promoted the exploration of novel antibacterial materials. As a result, many researchers are undertaking work to identify new lantibiotics because of their potent antimicrobial activities. The objective of this study was to provide details of a lantibiotic-like gene cluster in Paenibacillus elgii B69 and to produce the antibacterial substances coded by this gene cluster based on culture screening. Results: Analysis of the P. elgii B69 genome sequence revealed the presence of a lantibiotic-like gene cluster composed of five open reading frames (elgT1, elgC, elgT2, elgB, and elgA). Screening of culture extracts for active substances possessing the predicted properties of the encoded product led to the isolation of four novel peptides (elgicins AI, AII, B, and C) with a broad inhibitory spectrum. The molecular weights of these peptides were 4536, 4593, 4706, and 4820 Da, respectively. The N-terminal sequence of elgicin B was Leu-Gly-Asp-Tyr, which corresponded to the partial sequence of the peptide ElgA encoded by elgA. Edman degradation suggested that the product elgicin B is derived from ElgA. By correlating the results of electrospray ionization-mass spectrometry analyses of elgicins AI, AII, and C, these peptides are deduced to have originated from the same precursor, ElgA. Conclusions: A novel lantibiotic-like gene cluster was shown to be present in P. elgii B69. Four new lantibiotics with a broad inhibitory spectrum were isolated, and these appear to be promising antibacterial agents.

  1. The Lineage-Specific Evolution of Aquaporin Gene Clusters Facilitated Tetrapod Terrestrial Adaptation

    OpenAIRE

    Finn, Roderick Nigel; Chauvigné, François; Hlidberg, Jón Baldur; Cutler, Christopher P.; Cerdà, Joan

    2014-01-01

    A major physiological barrier for aquatic organisms adapting to terrestrial life is desiccation in the aerial environment. This barrier was nevertheless overcome by the Devonian ancestors of extant Tetrapoda, but the origin of specific molecular mechanisms that solved this water problem remains largely unknown. Here we show that an ancient aquaporin gene cluster evolved specifically in the sarcopterygian lineage, and subsequently diverged into paralogous forms of AQP2, -5, or -6 to mediate wa...

  2. antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification

    DEFF Research Database (Denmark)

    Blin, Kai; Wolf, Thomas; Chevrette, Marc G.

    2017-01-01

    Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding... architectures. Additionally, several usability features have been updated and improved. Together, these improvements make antiSMASH up-to-date with the latest developments in natural product research and will further facilitate computational genome mining for the discovery of novel bioactive molecules.

  3. Blood and tissue neuroendocrine tumor gene cluster analysis correlate, define hallmarks and predict disease status.

    Science.gov (United States)

    Kidd, Mark; Drozdov, Ignat; Modlin, Irvin

    2015-08-01

    A multianalyte algorithmic assay (MAAA) identifies circulating neuroendocrine tumor (NET) transcripts (n=51) with a sensitivity/specificity of 98%/97%. We evaluated whether blood measurements correlated with tumor tissue transcript analysis. The latter were segregated into gene clusters (GC) that defined clinical 'hallmarks' of neoplasia. A MAAA/cluster integrated algorithm (CIA) was developed as a predictive activity index to define tumor behavior and outcome. We evaluated three groups. Group 1: publicly available NET transcriptome databases (n=15; GeneProfiler). Group 2: prospectively collected tumors and matched blood samples (n=22; qRT-PCR). Group 3: prospective clinical blood samples, n=159: stable disease (SD): n=111 and progressive disease (PD): n=48. Regulatory network analysis, linear modeling, principal component analysis (PCA), and receiver operating characteristic analyses were used to delineate neoplasia 'hallmarks' and assess GC predictive utility. Our results demonstrated: group 1: NET transcriptomes identified (92%) genes elevated. Group 2: 98% genes elevated by qPCR (fold change >2, Pgenes defined nine omic clusters (SSTRome, proliferome, signalome, metabolome, secretome, epigenome, plurome, and apoptome). Group 3: six clusters (SSTRome, proliferome, metabolome, secretome, epigenome, and plurome) differentiated SD from PD (area under the curve (AUC)=0.81). Integration with blood-algorithm amplified the AUC to 0.92±0.02 for differentiating PD and SD. The CIA defined a significantly lower SD score (34.1±2.6%) than in PD (84±2.8%, P92%. Blood transcript measurement predicts NET activity. © 2015 Society for Endocrinology.
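The area under the ROC curve (AUC) reported above measures how well a score separates progressive from stable disease. One standard way to compute it is as the fraction of (progressive, stable) patient pairs that the score ranks correctly, with ties counted as half. The sketch below uses that pairwise definition; the scores are made up for illustration and the function name is hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties = 0.5)."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical activity scores on a 0-1 scale.
progressive = [0.9, 0.8, 0.4]
stable = [0.5, 0.3, 0.2]
toy_auc = roc_auc(progressive, stable)
```

An AUC of 0.5 corresponds to a score with no discriminating power and 1.0 to perfect separation, which gives context for the 0.81 and 0.92 values in the abstract.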

  4. Regulatory feedback loop of two phz gene clusters through 5'-untranslated regions in Pseudomonas sp. M18.

    Directory of Open Access Journals (Sweden)

    Yaqian Li

    Full Text Available BACKGROUND: Phenazines are important compounds produced by pseudomonads and other bacteria. Two phz gene clusters, phzA1-G1 and phzA2-G2, were found in the genome of Pseudomonas sp. M18, an effective biocontrol agent that is highly homologous to the opportunistic human pathogen P. aeruginosa PAO1. However, little is known about the correlation between the expression of the two phz gene clusters. METHODOLOGY/PRINCIPAL FINDINGS: Chromosomal insertion-inactivated mutants of each of the two gene clusters were constructed, and the correlation between the expression of the two phz gene clusters was investigated in strain M18. Phenazine-1-carboxylic acid (PCA) molecules produced from the phzA2-G2 gene cluster are able to auto-regulate its own expression and activate the expression of the phzA1-G1 gene cluster in a circulated amplification pattern. However, the post-transcriptional expression of the phzA1-G1 transcript was blocked principally through the 5'-untranslated region (UTR). In contrast, the phzA2-G2 gene cluster was transcribed to a lesser extent and translated efficiently and was negatively regulated by the GacA signal transduction pathway, mainly at a post-transcriptional level. CONCLUSIONS/SIGNIFICANCE: A single molecule, PCA, produced in different quantities by the two phz gene clusters acted as the functional mediator, and the two phz gene clusters developed a specific regulatory mechanism which acts through the 5'-UTR to transfer a single, but complex bacterial signaling event in Pseudomonas sp. strain M18.

  5. Clustering gene expression time series data using an infinite Gaussian process mixture model.

    Science.gov (United States)

    McDowell, Ian C; Manandhar, Dinesh; Vockley, Christopher M; Schmid, Amy K; Reddy, Timothy E; Engelhardt, Barbara E

    2018-01-01

    Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step to analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.
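The two ingredients named in this record, a Dirichlet process over cluster membership and a Gaussian process over time, can be illustrated with their textbook building blocks. This is a minimal sketch under stated assumptions, not the DP_GP_cluster implementation: the function names, the squared-exponential kernel choice, and all parameter values are hypothetical.

```python
import math


def crp_assignment_probs(cluster_sizes, alpha):
    """Chinese restaurant process prior for one new gene: the probability of
    joining each existing cluster is proportional to its size, and the
    probability of opening a new cluster is proportional to alpha.
    The last entry of the returned list is the new-cluster probability."""
    total = sum(cluster_sizes) + alpha
    probs = [n / total for n in cluster_sizes]
    probs.append(alpha / total)
    return probs


def squared_exponential(t1, t2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two time points: nearby time
    points get high covariance, distant ones get covariance near zero."""
    return variance * math.exp(-((t1 - t2) ** 2) / (2.0 * length_scale ** 2))


# Hypothetical state: two clusters holding 3 and 1 genes, concentration 1.0.
probs = crp_assignment_probs([3, 1], alpha=1.0)
print(probs)

# Covariance of expression at the same time point and at distant time points.
print(squared_exponential(0.0, 0.0), squared_exponential(0.0, 5.0))
```

Because the new-cluster probability never reaches zero, the number of clusters does not need to be fixed in advance, which is the sense in which the abstract speaks of jointly modeling cluster number and temporal dependencies.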

  6. [Advances on biosynthetic gene clusters of natural product from marine symbiotic microbe--a review].

    Science.gov (United States)

    Xu, Jing; Xu, Jun

    2008-07-01

    Previous research has suggested that the true producers of numerous natural products isolated from marine invertebrates are microbial epibionts and symbionts, which are regarded as not-yet-cultivated microbes. Cloning the biosynthetic genes responsible for a specific natural product not only provides direct genetic evidence of the origin of the compound but also establishes the feasibility of mass production of the compound by heterologous expression. This paper reviews progress on the biosynthetic gene clusters of natural products from symbiotic bacteria, including those of marine sponges, ascidians, bryozoans, deep-sea tube worms and deep-sea sediments.

  7. Structure and expression of a pyrimidine gene cluster from the extreme thermophile Thermus strain ZO5.

    OpenAIRE

    Van de Casteele, M; Chen, P.; Roovers, M.; Legrain, C.; Glansdorff, N

    1997-01-01

    On a 4.7-kbp HindIII clone of Thermus strain ZO5 DNA, complementing an aspartate carbamoyltransferase mutation in Escherichia coli, we identified a cluster of four potential open reading frames corresponding to genes pyrR, and pyrB, an unidentified open reading frame named bbc, and gene pyrC. The transcription initiation site was mapped at about 115 nucleotides upstream of the pyrR translation start codon. The cognate Thermus pyr promoter also functions in heterologous expression of Thermus p...

  8. Gene Clusters for Insecticidal Loline Alkaloids in the Grass-Endophytic Fungus Neotyphodium uncinatum

    OpenAIRE

    Spiering, Martin J.; Moon, Christina D.; Wilkinson, Heather H.; Schardl, Christopher L.

    2005-01-01

    Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same ...

  9. Comparison of expression of secondary metabolite biosynthesis cluster genes in Aspergillus flavus, A. parasiticus, and A. oryzae.

    Science.gov (United States)

    Ehrlich, Kenneth C; Mack, Brian M

    2014-06-23

    Fifty-six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.

  10. Burkholderia thailandensis harbors two identical rhl gene clusters responsible for the biosynthesis of rhamnolipids

    Directory of Open Access Journals (Sweden)

    Woods Donald E

    2009-12-01

    Full Text Available Background: Rhamnolipids are surface active molecules composed of rhamnose and β-hydroxydecanoic acid. These biosurfactants are produced mainly by Pseudomonas aeruginosa and have been thoroughly investigated since their early discovery. Recently, they have attracted renewed attention because of their involvement in various multicellular behaviors. Despite this high interest, only very few studies have focused on the production of rhamnolipids by Burkholderia species. Results: Orthologs of rhlA, rhlB and rhlC, which are responsible for the biosynthesis of rhamnolipids in P. aeruginosa, have been found in the non-infectious Burkholderia thailandensis, as well as in the genetically similar important pathogen B. pseudomallei. In contrast to P. aeruginosa, both Burkholderia species contain these three genes necessary for rhamnolipid production within a single gene cluster. Furthermore, two identical, paralogous copies of this gene cluster are found on the second chromosome of these bacteria. Both Burkholderia spp. produce rhamnolipids containing 3-hydroxy fatty acid moieties with longer side chains than those described for P. aeruginosa. Additionally, the rhamnolipids produced by B. thailandensis contain a much larger proportion of dirhamnolipids versus monorhamnolipids when compared to P. aeruginosa. The rhamnolipids produced by B. thailandensis reduce the surface tension of water to 42 mN/m while displaying a critical micelle concentration value of 225 mg/L. Separate mutations in both rhlA alleles, which are responsible for the synthesis of the rhamnolipid precursor 3-(3-hydroxyalkanoyloxy)alkanoic acid, prove that both copies of the rhl gene cluster are functional, but one contributes more to the total production than the other. Finally, a double ΔrhlA mutant that is completely devoid of rhamnolipid production is incapable of swarming motility, showing that both gene clusters contribute to this phenotype. Conclusions: Collectively, these

  11. Characterization of the biosynthetic gene cluster for cryptic phthoxazolin A in Streptomyces avermitilis.

    Directory of Open Access Journals (Sweden)

    Dian Anggraini Suroto

    Full Text Available Phthoxazolin A, an oxazole-containing polyketide, has a broad spectrum of anti-oomycete activity and herbicidal activity. We recently identified phthoxazolin A as a cryptic metabolite of Streptomyces avermitilis, which produces the important anthelmintic agent avermectin. Even though genome data of S. avermitilis are publicly available, no plausible biosynthetic gene cluster for phthoxazolin A is apparent in the sequence data. Here, we identified and characterized the phthoxazolin A (ptx) biosynthetic gene cluster through genome sequencing, comparative genomic analysis, and gene disruption. Sequence analysis uncovered that the putative ptx biosynthetic genes are located in an extra genomic region that is not found in the public database, and 8 open reading frames in the extra genomic region could be assigned roles in the biosynthesis of the oxazole ring, triene polyketide and carbamoyl moieties. Disruption of the ptxA gene encoding a discrete acyltransferase resulted in a complete loss of phthoxazolin A production, confirming that the trans-AT type I PKS system is responsible for phthoxazolin A biosynthesis. Based on the predicted functional domains in the ptx assembly line, we propose the biosynthetic pathway of phthoxazolin A.

  12. Functional dissection of HOXD cluster genes in regulation of neuroblastoma cell proliferation and differentiation.

    Directory of Open Access Journals (Sweden)

    Yunhong Zha

    Full Text Available Retinoic acid (RA) can induce growth arrest and neuronal differentiation of neuroblastoma cells and has been used clinically for treatment of neuroblastoma. It has been reported that RA induces the expression of several HOXD genes in human neuroblastoma cell lines, but their roles in RA action are largely unknown. The HOXD cluster contains nine genes (HOXD1, HOXD3, HOXD4, and HOXD8-13) that are positioned sequentially from 3' to 5', with HOXD1 at the 3' end and HOXD13 at the 5' end. Here we show that all HOXD genes are induced by RA in human neuroblastoma BE(2)-C cells, with the genes located at the 3' end being activated generally earlier than those positioned more 5' within the cluster. Individual induction of HOXD8, HOXD9, HOXD10 or HOXD12 is sufficient to induce both growth arrest and neuronal differentiation, which is associated with downregulation of cell cycle-promoting genes and upregulation of neuronal differentiation genes. However, induction of other HOXD genes either has no effect (HOXD1) or has partial effects (HOXD3, HOXD4, HOXD11 and HOXD13) on BE(2)-C cell proliferation or differentiation. We further show that knockdown of HOXD8 expression, but not that of HOXD9 expression, significantly inhibits the differentiation-inducing activity of RA. HOXD8 directly activates the transcription of HOXC9, a key effector of RA action in neuroblastoma cells. These findings highlight the distinct functions of HOXD genes in RA induction of neuroblastoma cell differentiation.

  13. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    Science.gov (United States)

    Makarova, Kira S; Sorokin, Alexander V; Novichkov, Pavel S; Wolf, Yuri I; Koonin, Eugene V

    2007-01-01

    Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the

  14. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2007-11-01

    Full Text Available Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile

  15. Association of paraoxonase gene cluster polymorphisms with ALS in France, Quebec, and Sweden.

    Science.gov (United States)

    Valdmanis, P N; Kabashi, E; Dyck, A; Hince, P; Lee, J; Dion, P; D'Amour, M; Souchon, F; Bouchard, J-P; Salachas, F; Meininger, V; Andersen, P M; Camu, W; Dupré, N; Rouleau, G A

    2008-08-12

    The paraoxonase gene cluster on chromosome 7 comprising the PON1-3 genes is an attractive candidate for association in amyotrophic lateral sclerosis (ALS) given the role of paraoxonase genes during the response to oxidative stress and their contribution to the enzymatic breakdown of nerve toxins. Oxidative stress is considered one of the mechanisms involved in ALS pathogenesis. Evidence for this includes the fact that mutations of SOD1, which normally reduces the production of toxic superoxide anion, account for 12% to 23% of familial cases in ALS. In addition, PON variants were shown to be associated with susceptibility to ALS in several North American and European populations. We extended this analysis to examine 20 single nucleotide polymorphisms (SNPs) across the PON gene cluster in a set of patients from France (480 cases, 475 controls), Quebec (159 cases, 95 controls), and Sweden (558 cases, 506 controls). Although individual SNPs were not considered associated on their own, a haplotype of SNPs at the C-terminal portion of PON2 that includes the PON2 C311S amino acid change was significant in the French (p value 0.0075) and Quebec (p value 0.026) populations as well as all three populations combined (p value 1.69 × 10^-6). Stratification of the samples showed that this variation was pertinent to ALS susceptibility as a whole, and not to a particular subset of patients. These findings contribute to the increasing weight of evidence that genetic variants in the paraoxonase gene cluster are associated with amyotrophic lateral sclerosis.

  16. The lineage-specific evolution of aquaporin gene clusters facilitated tetrapod terrestrial adaptation.

    Science.gov (United States)

    Finn, Roderick Nigel; Chauvigné, François; Hlidberg, Jón Baldur; Cutler, Christopher P; Cerdà, Joan

    2014-01-01

    A major physiological barrier for aquatic organisms adapting to terrestrial life is desiccation in the aerial environment. This barrier was nevertheless overcome by the Devonian ancestors of extant Tetrapoda, but the origin of specific molecular mechanisms that solved this water problem remains largely unknown. Here we show that an ancient aquaporin gene cluster evolved specifically in the sarcopterygian lineage, and subsequently diverged into paralogous forms of AQP2, -5, or -6 to mediate water conservation in extant Tetrapoda. To determine the origin of these apomorphic genomic traits, we combined aquaporin sequencing from jawless and jawed vertebrates with broad taxon assembly of >2,000 transcripts amongst 131 deuterostome genomes and developed a model based upon Bayesian inference that traces their convergent roots to stem subfamilies in basal Metazoa and Prokaryota. This approach uncovered an unexpected diversity of aquaporins in every lineage investigated, and revealed that the vertebrate superfamily consists of 17 classes of aquaporins (Aqp0 - Aqp16). The oldest orthologs associated with water conservation in modern Tetrapoda are traced to a cluster of three aqp2-like genes in Actinistia that likely arose >500 Ma through duplication of an aqp0-like gene present in a jawless ancestor. In sea lamprey, we show that aqp0 first arose in a protocluster comprised of a novel aqp14 paralog and a fused aqp01 gene. To corroborate these findings, we conducted phylogenetic analyses of five syntenic nuclear receptor subfamilies, which, together with observations of extensive genome rearrangements, support the coincident loss of ancestral aqp2-like orthologs in Actinopterygii. We thus conclude that the divergence of sarcopterygian-specific aquaporin gene clusters was permissive for the evolution of water conservation mechanisms that facilitated tetrapod terrestrial adaptation.

  17. The lineage-specific evolution of aquaporin gene clusters facilitated tetrapod terrestrial adaptation.

    Directory of Open Access Journals (Sweden)

    Roderick Nigel Finn

    Full Text Available A major physiological barrier for aquatic organisms adapting to terrestrial life is desiccation in the aerial environment. This barrier was nevertheless overcome by the Devonian ancestors of extant Tetrapoda, but the origin of specific molecular mechanisms that solved this water problem remains largely unknown. Here we show that an ancient aquaporin gene cluster evolved specifically in the sarcopterygian lineage, and subsequently diverged into paralogous forms of AQP2, -5, or -6 to mediate water conservation in extant Tetrapoda. To determine the origin of these apomorphic genomic traits, we combined aquaporin sequencing from jawless and jawed vertebrates with broad taxon assembly of >2,000 transcripts amongst 131 deuterostome genomes and developed a model based upon Bayesian inference that traces their convergent roots to stem subfamilies in basal Metazoa and Prokaryota. This approach uncovered an unexpected diversity of aquaporins in every lineage investigated, and revealed that the vertebrate superfamily consists of 17 classes of aquaporins (Aqp0 - Aqp16). The oldest orthologs associated with water conservation in modern Tetrapoda are traced to a cluster of three aqp2-like genes in Actinistia that likely arose >500 Ma through duplication of an aqp0-like gene present in a jawless ancestor. In sea lamprey, we show that aqp0 first arose in a protocluster comprised of a novel aqp14 paralog and a fused aqp01 gene. To corroborate these findings, we conducted phylogenetic analyses of five syntenic nuclear receptor subfamilies, which, together with observations of extensive genome rearrangements, support the coincident loss of ancestral aqp2-like orthologs in Actinopterygii. We thus conclude that the divergence of sarcopterygian-specific aquaporin gene clusters was permissive for the evolution of water conservation mechanisms that facilitated tetrapod terrestrial adaptation.

  18. A conserved cluster of three PRD-class homeobox genes (homeobrain, rx and orthopedia) in the Cnidaria and Protostomia

    Directory of Open Access Journals (Sweden)

    Mazza Maureen E

    2010-07-01

    Full Text Available Background: Homeobox genes are a superclass of transcription factors with diverse developmental regulatory functions, which are found in plants, fungi and animals. In animals, several Antennapedia (ANTP)-class homeobox genes reside in extremely ancient gene clusters (for example, the Hox, ParaHox, and NKL clusters) and the evolution of these clusters has been implicated in the morphological diversification of animal bodyplans. By contrast, similarly ancient gene clusters have not been reported among the other classes of homeobox genes (that is, the LIM, POU, PRD and SIX classes). Results: Using a combination of in silico queries and phylogenetic analyses, we found that a cluster of three PRD-class homeobox genes (Homeobrain (hbn), Rax (rx) and Orthopedia (otp)) is present in cnidarians, insects and mollusks (a partial cluster comprising hbn and rx is present in the placozoan Trichoplax adhaerens). We failed to identify this 'HRO' cluster in deuterostomes; in fact, the Homeobrain gene appears to be missing from the chordate genomes we examined, although it is present in hemichordates and echinoderms. To illuminate the ancestral organization and function of this ancient cluster, we mapped the constituent genes against the assembled genome of a model cnidarian, the sea anemone Nematostella vectensis, and characterized their spatiotemporal expression using in situ hybridization. In N. vectensis, these genes reside in a span of 33 kb with the same gene order as previously reported in insects. Comparisons of genomic sequences and expressed sequence tags revealed the presence of alternative transcripts of Nv-otp and two highly unusual protein-coding polymorphisms in the terminal helix of the Nv-rx homeodomain. A population genetic survey revealed the Rx polymorphisms to be widespread in natural populations. During larval development, all three genes are expressed in the ectoderm, in non-overlapping territories along the oral-aboral axis, with distinct

  19. Identification and functional analysis of gene cluster involvement in biosynthesis of the cyclic lipopeptide antibiotic pelgipeptin produced by Paenibacillus elgii

    Directory of Open Access Journals (Sweden)

    Qian Chao-Dong

    2012-09-01

    Full Text Available Background: Pelgipeptin, a potent antibacterial and antifungal agent, is a non-ribosomally synthesised lipopeptide antibiotic. This compound consists of a β-hydroxy fatty acid and nine amino acids. To date, there is no information about its biosynthetic pathway. Results: A potential pelgipeptin synthetase gene cluster (plp) was identified from Paenibacillus elgii B69 through genome analysis. The gene cluster spans 40.8 kb with eight open reading frames. Among the genes in this cluster, three large genes, plpD, plpE, and plpF, were shown to encode non-ribosomal peptide synthetases (NRPSs), with one, seven, and one module(s), respectively. Bioinformatic analysis of the substrate specificity of all nine adenylation domains indicated that the sequence of the NRPS modules is collinear with the order of amino acids in pelgipeptin. Additional biochemical analysis of four recombinant adenylation domains (PlpD A1, PlpE A1, PlpE A3, and PlpF A1) provided further evidence that the plp gene cluster is involved in pelgipeptin biosynthesis. Conclusions: In this study, a gene cluster (plp) responsible for the biosynthesis of pelgipeptin was identified from the genome sequence of Paenibacillus elgii B69. The identification of the plp gene cluster provides an opportunity to develop novel lipopeptide antibiotics by genetic engineering.

  20. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages.

    Science.gov (United States)

    Elmore, M Holly; McGary, Kriston L; Wisecaver, Jennifer H; Slot, Jason C; Geiser, David M; Sink, Stacy; O'Donnell, Kerry; Rokas, Antonis

    2015-02-06

    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trace its evolution across Ascomycetes, and examine the evolutionary dynamics of its spread among lineages of the Fusarium oxysporum species complex (hereafter referred to as the FOSC), a cosmopolitan clade of purportedly clonal vascular wilt plant pathogens. Phylogenetic analysis of fungal cyanase and carbonic anhydrase genes reveals that the CCA gene cluster arose independently at least twice and is now present in three lineages, namely Cochliobolus lunatus, Oidiodendron maius, and the FOSC. Genome-wide surveys within the FOSC indicate that the CCA gene cluster varies in copy number across isolates, is always located on accessory chromosomes, and is absent in FOSC's closest relatives. Phylogenetic reconstruction of the CCA gene cluster in 163 FOSC strains from a wide variety of hosts suggests a recent history of rampant transfers between isolates. We hypothesize that the independent formation of the CCA gene cluster in different fungal lineages and its spread across FOSC strains may be associated with resistance to plant-produced cyanates or to use of cyanate fungicides in agriculture. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. Highly repetitive tRNA(Pro)-tRNA(His) gene cluster from Photobacterium phosphoreum.

    Science.gov (United States)

    Giroux, S; Beaudet, J; Cedergren, R

    1988-01-01

    A DNA fragment comprising the four tRNA gene sequences of the Escherichia coli argT locus hybridized with two Sau3A-generated DNA fragments from the vibrio Photobacterium phosphoreum (ATCC 11040). Detailed sequence analysis of the longer fragment shows the following gene organization: 5'-promoter-tRNA(Pro)-tRNA(Pro)-tRNA(Pro)-tRNA(His)-tRNA(Pro)-tRNA(Pro)-tRNA(His)-tRNA(Pro)-five pseudogenes derived from the upstream tRNA(Pro) interspersed by putative Rho-independent terminators. This sequence demonstrates the presence of highly repetitive, tandem tRNA genes in a bacterial genome. Furthermore, a stretch of 304 nucleotides from this cluster was found virtually unchanged in the other (shorter) fragment which was previously sequenced. The two clusters together contain eight tRNA(Pro) pseudogenes and eight fully intact tRNA(Pro) genes, an unusually high number for a single eubacterial isoacceptor tRNA. These results show that the organization of some tRNA operons is highly variable in eubacteria. PMID:3056906

  2. Human paraoxonase gene cluster overexpression alleviates angiotensin II-induced cardiac hypertrophy in mice.

    Science.gov (United States)

    Pei, Jian-Fei; Yan, Yun-Fei; Tang, Xiaoqiang; Zhang, Yang; Cui, Shen-Shen; Zhang, Zhu-Qin; Chen, Hou-Zao; Liu, De-Pei

    2016-11-01

    Cardiac hypertrophy is the strongest predictor of the development of heart failure, and anti-hypertrophic treatment holds the key to improving the clinical syndrome and increasing the survival rates for heart failure. The paraoxonase (PON) gene cluster (PC) protects against atherosclerosis and coronary artery diseases. However, the role of PC in the heart is largely unknown. To evaluate the roles of PC in cardiac hypertrophy, transgenic mice carrying the intact human PON1, PON2, and PON3 genes and their flanking sequences were studied. We demonstrated that the PC transgene (PC-Tg) protected mice from cardiac hypertrophy induced by Ang II; these mice had reduced heart weight/body weight ratios, decreased left ventricular wall thicknesses and increased fractional shortening compared with wild-type (WT) controls. The same protective tendency was also observed with an Apoe -/- background. Mechanistically, PC-Tg normalized the disequilibrium of matrix metalloproteinases (MMPs)/tissue inhibitors of MMPs (TIMPs) in hypertrophic hearts, which might contribute to the protective role of PC-Tg in cardiac fibrosis and, thus, protect against cardiac remodeling. Taken together, our results identify a novel anti-hypertrophic role for the PON gene cluster, suggesting a possible strategy for the treatment of cardiac hypertrophy through elevating the levels of the PON gene family.

  3. RNase 1 genes from the family Sciuridae define a novel rodent ribonuclease cluster.

    Science.gov (United States)

    Siegel, Steven J; Percopo, Caroline M; Dyer, Kimberly D; Zhao, Wei; Roth, V Louise; Mercer, John M; Rosenberg, Helene F

    2009-01-01

    The RNase A ribonucleases are a complex group of functionally diverse secretory proteins with conserved enzymatic activity. We have identified novel RNase 1 genes from four species of squirrel (order Rodentia, family Sciuridae). Squirrel RNase 1 genes encode typical RNase A ribonucleases, each with eight cysteines, a conserved CKXXNTF signature motif, and a canonical His(12)-Lys(41)-His(119) catalytic triad. Two alleles encode Callosciurus prevostii RNase 1, which include a Ser(18)Pro, analogous to the sequence polymorphisms found among the RNase 1 duplications in the genome of Rattus exulans. Interestingly, although the squirrel RNase 1 genes are closely related to one another (77-95% amino acid sequence identity), the cluster as a whole is distinct and divergent from the clusters including RNase 1 genes from other rodent species. We examined the specific sites at which Sciuridae RNase 1s diverge from Muridae/Cricetidae RNase 1s and determined that the divergent sites are located on the external surface, with complete sparing of the catalytic crevice. The full significance of these findings awaits a more complete understanding of the biological role of mammalian RNase 1s.

  4. Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in Xanthomonas.

    Directory of Open Access Journals (Sweden)

    Hong Lu

    Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small

  5. Measurement of circulating transcripts and gene cluster analysis predicts and defines therapeutic efficacy of peptide receptor radionuclide therapy (PRRT) in neuroendocrine tumors

    Energy Technology Data Exchange (ETDEWEB)

    Bodei, L. [European Institute of Oncology, Division of Nuclear Medicine, Milan (Italy); LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Kidd, M. [Wren Laboratories, Branford, CT (United States); Modlin, I.M. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Yale School of Medicine, New Haven, CT (United States); Severi, S.; Nicolini, S.; Paganelli, G. [Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS, Nuclear Medicine and Radiometabolic Units, Meldola (Italy); Drozdov, I. [Bering Limited, London (United Kingdom); Kwekkeboom, D.J.; Krenning, E.P. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Erasmus Medical Center, Nuclear Medicine Department, Rotterdam (Netherlands); Baum, R.P. [LuGenIum Consortium, Milan, Rotterdam, Bad Berka, London, Italy, Netherlands, Germany (Country Unknown); Zentralklinik Bad Berka, Theranostics Center for Molecular Radiotherapy and Imaging, Bad Berka (Germany)

    2016-05-15

    Peptide receptor radionuclide therapy (PRRT) is an effective method for treating neuroendocrine tumors (NETs). It is limited, however, in the prediction of individual tumor response and the precise and early identification of changes in tumor size. Currently, response prediction is based on somatostatin receptor expression and efficacy by morphological imaging and/or chromogranin A (CgA) measurement. The aim of this study was to assess the accuracy of circulating NET transcripts as a measure of PRRT efficacy, and moreover to identify prognostic gene clusters in pretreatment blood that could be interpolated with relevant clinical features in order to define a biological index for the tumor and a predictive quotient for PRRT efficacy. NET patients (n = 54), M:F 37:17, median age 66, bronchial: n = 13, GEP-NET: n = 35, CUP: n = 6 were treated with ¹⁷⁷Lu-based PRRT (cumulative activity: 6.5-27.8 GBq, median 18.5). At baseline: 47/54 low-grade (G1/G2; bronchial typical/atypical), 31/49 ¹⁸FDG positive and 39/54 progressive. Disease status was assessed by RECIST1.1. Transcripts were measured by real-time quantitative reverse transcription PCR (qRT-PCR) and multianalyte algorithmic analysis (NETest); CgA by enzyme-linked immunosorbent assay (ELISA). Gene cluster (GC) derivations: regulatory network, protein:protein interactome analyses. Statistical analyses: chi-square, non-parametric measurements, multiple regression, receiver operating characteristic and Kaplan-Meier survival. The disease control rate was 72%. Median PFS was not achieved (follow-up: 1-33 months, median: 16). Only grading was associated with response (p < 0.01). At baseline, 94% of patients were NETest-positive, while CgA was elevated in 59%. NETest accurately (89%, χ² = 27.4; p = 1.2 × 10⁻⁷) correlated with treatment response, while CgA was 24% accurate. Gene cluster expression (growth-factor signalome and metabolome) had an AUC of 0.74 ± 0.08 (z-statistic = 2.92, p < 0

  6. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer.

    Science.gov (United States)

    Wolf, Yuri I; Makarova, Kira S; Yutin, Natalya; Koonin, Eugene V

    2012-12-14

    Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major 'highways' of horizontal gene transfer. The updated collection of arCOGs is expected to become a key resource for

  7. A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data

    Directory of Open Access Journals (Sweden)

    Yli-Harja Olli

    2009-05-01

    Background: Cluster analysis has become a standard computational method for gene function discovery as well as for more general explanatory data analysis. A number of different approaches have been proposed for that purpose, out of which different mixture models provide a principled probabilistic framework. Cluster analysis is increasingly often supplemented with multiple data sources nowadays, and these heterogeneous information sources should be made as efficient use of as possible. Results: This paper presents a novel Beta-Gaussian mixture model (BGMM) for clustering genes based on Gaussian distributed and beta distributed data. The proposed BGMM can be viewed as a natural extension of the beta mixture model (BMM) and the Gaussian mixture model (GMM). The proposed BGMM method differs from other mixture model based methods in its integration of two different data types into a single and unified probabilistic modeling framework, which provides a more efficient use of multiple data sources than methods that analyze different data sources separately. Moreover, BGMM provides an exceedingly flexible modeling framework since many data sources can be modeled as Gaussian or beta distributed random variables, and it can also be extended to integrate data that have other parametric distributions as well, which adds even more flexibility to this model-based clustering framework. We developed three types of estimation algorithms for BGMM, the standard expectation maximization (EM) algorithm, an approximated EM and a hybrid EM, and propose to tackle the model selection problem by well-known model selection criteria, for which we test the Akaike information criterion (AIC), a modified AIC (AIC3), the Bayesian information criterion (BIC), and the integrated classification likelihood-BIC (ICL-BIC). Conclusion: Performance tests with simulated data show that combining two different data sources into a single mixture joint model greatly improves the clustering

  8. A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data

    Science.gov (United States)

    Dai, Xiaofeng; Erkkilä, Timo; Yli-Harja, Olli; Lähdesmäki, Harri

    2009-01-01

    Background: Cluster analysis has become a standard computational method for gene function discovery as well as for more general explanatory data analysis. A number of different approaches have been proposed for that purpose, out of which different mixture models provide a principled probabilistic framework. Cluster analysis is increasingly often supplemented with multiple data sources nowadays, and these heterogeneous information sources should be made as efficient use of as possible. Results: This paper presents a novel Beta-Gaussian mixture model (BGMM) for clustering genes based on Gaussian distributed and beta distributed data. The proposed BGMM can be viewed as a natural extension of the beta mixture model (BMM) and the Gaussian mixture model (GMM). The proposed BGMM method differs from other mixture model based methods in its integration of two different data types into a single and unified probabilistic modeling framework, which provides a more efficient use of multiple data sources than methods that analyze different data sources separately. Moreover, BGMM provides an exceedingly flexible modeling framework since many data sources can be modeled as Gaussian or beta distributed random variables, and it can also be extended to integrate data that have other parametric distributions as well, which adds even more flexibility to this model-based clustering framework. We developed three types of estimation algorithms for BGMM, the standard expectation maximization (EM) algorithm, an approximated EM and a hybrid EM, and propose to tackle the model selection problem by well-known model selection criteria, for which we test the Akaike information criterion (AIC), a modified AIC (AIC3), the Bayesian information criterion (BIC), and the integrated classification likelihood-BIC (ICL-BIC). Conclusion: Performance tests with simulated data show that combining two different data sources into a single mixture joint model greatly improves the clustering accuracy compared with
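
    The four model selection criteria named in this abstract have common textbook forms, which can be sketched in Python as below. This is a hedged illustration only: the abstract does not give the paper's exact definitions, and the function name, argument names, and the "lower is better" sign convention are assumptions made for the example.

```python
import math
from typing import Dict, List


def selection_criteria(log_lik: float, n_params: int, n_obs: int,
                       posteriors: List[List[float]]) -> Dict[str, float]:
    """Textbook penalized-likelihood criteria for a fitted mixture model.

    log_lik     maximized log-likelihood of the fitted model
    n_params    number of free parameters
    n_obs       number of clustered objects (e.g. genes)
    posteriors  per-object cluster membership probabilities from EM
    Lower values indicate a preferred model under this convention.
    """
    aic = -2.0 * log_lik + 2.0 * n_params
    aic3 = -2.0 * log_lik + 3.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    # Classification entropy: zero when every object is assigned with certainty.
    entropy = -sum(p * math.log(p) for row in posteriors for p in row if p > 0.0)
    icl_bic = bic + 2.0 * entropy
    return {"AIC": aic, "AIC3": aic3, "BIC": bic, "ICL-BIC": icl_bic}


# Hypothetical numbers, chosen only to exercise the formulas.
scores = selection_criteria(-100.0, 5, 100, [[1.0, 0.0], [0.5, 0.5]])
print(scores)
```

    In practice one would fit the mixture for several candidate numbers of clusters and keep the model with the smallest value of the chosen criterion; ICL-BIC additionally penalizes fuzzy cluster assignments through the entropy term.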

  9. Noise Resistant Generalized Parametric Validity Index of Clustering for Gene Expression Data.

    Science.gov (United States)

    Fa, Rui; Nandi, Asoke K

    2014-01-01

    Validity indices have been investigated for decades. However, since there is no study of noise-resistance performance of these indices in the literature, there is no guideline for determining the best clustering in noisy data sets, especially microarray data sets. In this paper, we propose a generalized parametric validity (GPV) index which employs two tunable parameters α and β to control the proportions of objects being considered to calculate the dissimilarities. The greatest advantage of the proposed GPV index is its noise-resistance ability, which results from the flexibility of tuning the parameters. Several rules are set to guide the selection of parameter values. To illustrate the noise-resistance performance of the proposed index, we evaluate the GPV index for assessing five clustering algorithms in two gene expression data simulation models with different noise levels and compare the ability of determining the number of clusters with eight existing indices. We also test the GPV in three groups of real gene expression data sets. The experimental results suggest that the proposed GPV index has superior noise-resistance ability and provides fairly accurate judgements.

  10. Hierarchical Control of Nitrite Respiration by Transcription Factors Encoded within Mobile Gene Clusters of Thermus thermophilus.

    Science.gov (United States)

    Alvarez, Laura; Quintáns, Nieves G; Blesa, Alba; Baquedano, Ignacio; Mencía, Mario; Bricio, Carlos; Berenguer, José

    2017-12-01

    Denitrification in Thermus thermophilus is encoded by the nitrate respiration conjugative element (NCE) and nitrite and nitric oxide respiration (nic) gene clusters. A tight coordination of each cluster's expression is required to maximize anaerobic growth, and to avoid toxicity by intermediates, especially nitric oxides (NO). Here, we study the control of the nitrite reductases (Nir) and NO reductases (Nor) upon horizontal acquisition of the NCE and nic clusters by a formerly aerobic host. Expression of the nic promoters PnirS, PnirJ, and PnorC, depends on the oxygen sensor DnrS and on the DnrT protein, both NCE-encoded. NsrR, a nic-encoded transcription factor with an iron-sulfur cluster, is also involved in Nir and Nor control. Deletion of nsrR decreased PnorC and PnirJ transcription, and activated PnirS under denitrification conditions, exhibiting a dual regulatory role never described before for members of the NsrR family. On the basis of these results, a regulatory hierarchy is proposed, in which under anoxia, there is a pre-activation of the nic promoters by DnrS and DnrT, and then NsrR leads to Nor induction and Nir repression, likely as a second stage of regulation that would require NO detection, thus avoiding accumulation of toxic levels of NO. The whole system appears to work in remarkable coordination to function only when the relevant nitrogen species are present inside the cell.

  11. The role of HPV RNA transcription, immune response-related gene expression and disruptive TP53 mutations in diagnostic and prognostic profiling of head and neck cancer.

    Science.gov (United States)

    Wichmann, Gunnar; Rosolowski, Maciej; Krohn, Knut; Kreuz, Markus; Boehm, Andreas; Reiche, Anett; Scharrer, Ulrike; Halama, Dirk; Bertolini, Julia; Bauer, Ulrike; Holzinger, Dana; Pawlita, Michael; Hess, Jochen; Engel, Christoph; Hasenclever, Dirk; Scholz, Markus; Ahnert, Peter; Kirsten, Holger; Hemprich, Alexander; Wittekind, Christian; Herbarth, Olf; Horn, Friedemann; Dietz, Andreas; Loeffler, Markus

    2015-12-15

    Stratification of head and neck squamous cell carcinomas (HNSCC) based on HPV16 DNA and RNA status, gene expression patterns, and mutated candidate genes may facilitate patient treatment decisions. We characterize HNSCC with different HPV16 DNA and RNA (E6*I) status from 290 consecutively recruited patients by gene expression profiling and targeted sequencing of 50 genes. We show that tumors with transcriptionally inactive HPV16 (DNA+ RNA-) are similar to HPV-negative (DNA-) tumors regarding gene expression and frequency of TP53 mutations (47%, 8/17 and 43%, 72/167, respectively). We also find that an immune response-related gene expression cluster is associated with lymph node metastasis, independent of HPV16 status, and that disruptive TP53 mutations are associated with lymph node metastasis in HPV16 DNA- tumors. We validate each of these associations in another large data set. The four gene expression clusters that we identify differ moderately but significantly in overall survival. Our findings underscore the importance of measuring the HPV16 RNA (E6*I) and TP53-mutation status for patient stratification and identify associations of an immune response-related gene expression cluster and TP53 mutations with lymph node metastasis in HNSCC. © 2015 UICC.
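
    The TP53 mutation frequencies quoted above (8/17 and 72/167) can be checked with a few lines of Python. The sketch also runs a two-sided Fisher exact test on the corresponding 2x2 table as one possible way to compare the two groups; that choice of test is an assumption for illustration, since the abstract does not say which test the authors used.

```python
from math import comb


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Counts from the abstract: mutated vs. not mutated in each group.
print(percent(8, 17), percent(72, 167))        # 47 43
print(fisher_two_sided(8, 17 - 8, 72, 167 - 72))
```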

  12. Genetic variation of the Apo AI-CIII-AIV gene cluster in hypertriglyceridemic patients with chronic renal failure undergoing hemodialysis.

    OpenAIRE

    Choi, G. R.; Suh, S. P.; Song, J W; Kee, S. J.; Shin, J H; Ryang, D. W.

    2000-01-01

    Many patients with chronic renal failure (CRF) requiring hemodialysis present with hypertriglyceridemia (HTG). But the exact cause of HTG in CRF is still unknown. Genetic variation of the apo AI-CIII-AIV gene cluster was reported to be associated with primary HTG, atherosclerosis and coronary artery disease. This study was designed to evaluate the association between the restriction fragment length polymorphism (RFLP) of the apo AI-CIII-AIV gene cluster and HTG in patients with CRF undergoing...

  13. Investigating key genes associated with ovarian cancer by integrating affinity propagation clustering and mutual information network analysis.

    Science.gov (United States)

    Wang, J; Chen, C; Li, H-F; Jiang, X-L; Zhang, L

    2016-06-01

    The objective of the present work was to investigate key genes in ovarian cancer based on the mAP-KL method, which comprises the maxT multiple hypothesis (m), the Krzanowski and Lai (KL) cluster quality index, and the affinity propagation (AP) clustering algorithm, and on a mutual information network (MIN) constructed by the context likelihood of relatedness (CLR) algorithm. The mAP-KL method was employed to identify exemplars in ovarian cancer: the maxT function ranked the genes of the train set and test set and obtained the top 200 genes; the KL cluster index was utilized to determine the number of clusters; and then the AP clustering algorithm was applied to identify the clusters and their exemplars. We also assessed the classification performance of mAP-KL with a support vector machine (SVM) model. Subsequently, the MIN for exemplars and cluster genes was constructed according to the CLR algorithm. Finally, topological centrality properties of exemplars in the MIN were assessed to investigate key genes for ovarian cancer. The SVM model validated that the classification between normal controls and ovarian cancer patients by mAP-KL performed well. A total of 22 clusters and exemplars were detected by performing the mAP-KL method. Based on the topological centrality analyses for exemplars in the MIN, we considered C9orf16, COX5B and ACTB to be key genes in the progression of ovarian cancer. We have obtained three key genes (C9orf16, COX5B and ACTB) for ovarian cancer on the basis of the mAP-KL method and MIN analysis. These genes might be potential biomarkers for treatment of ovarian cancer, and give insight into the underlying mechanism of this tumor.

  14. Evaluation of potential prognostic value of Bmi-1 gene product and selected markers of proliferation (Ki-67) and apoptosis (p53) in the neuroblastoma group of tumors

    Directory of Open Access Journals (Sweden)

    Katarzyna Taran

    2016-02-01

    Introduction: Cancer in children is a very important issue in pediatrics. The least satisfactory treatment outcome occurs among patients with clinically advanced neuroblastomas. Despite much research, the biology of this tumor still remains unclear, and new prognostic factors are sought. The Bmi-1 gene product is a currently highly investigated protein which belongs to the Polycomb group (PcG) and has been identified as a regulator of primary neural crest cells. It is believed that Bmi-1 and N-myc act together and are both involved in the pathogenesis of neuroblastoma. The aim of the study was to assess the potential prognostic value of Bmi-1 protein and its relations with mechanisms of proliferation and apoptosis in the neuroblastoma group of tumors. Material/Methods: 29 formalin-fixed and paraffin-embedded neuroblastoma tissue sections were examined using mouse monoclonal antibodies anti-Bmi-1, anti-p53 and anti-Ki-67 according to the manufacturer's instructions. Results: Statistically significant correlations were found between Bmi-1 expression and tumor histology and age of patients. Conclusions: Bmi-1 seems to be a promising marker in the neuroblastoma group of tumors whose expression correlates with widely accepted prognostic parameters. The pattern of Bmi-1 expression may indicate that the examined protein is also involved in maturation processes in tumor tissue.

  15. eMBI: Boosting Gene Expression-based Clustering for Cancer Subtypes.

    Science.gov (United States)

    Chang, Zheng; Wang, Zhenjia; Ashby, Cody; Zhou, Chuan; Li, Guojun; Zhang, Shuzhong; Huang, Xiuzhen

    2014-01-01

    Identifying clinically relevant subtypes of a cancer using gene expression data is a challenging and important problem in medicine, and is a necessary premise to provide specific and efficient treatments for patients of different subtypes. Matrix factorization provides a solution by finding checkerboard patterns in the matrices of gene expression data. In the context of gene expression profiles of cancer patients, these checkerboard patterns correspond to genes that are up- or down-regulated in patients with particular cancer subtypes. Recently, a new matrix factorization framework for biclustering called Maximum Block Improvement (MBI) was proposed; however, it still suffers from several problems when applied to cancer gene expression data analysis. In this study, we developed many effective strategies to improve MBI and designed a new program called enhanced MBI (eMBI), which is more effective and efficient at identifying cancer subtypes. Our tests on several gene expression profiling datasets of cancer patients consistently indicate that eMBI achieves significant improvements in comparison with MBI, in terms of cancer subtype prediction accuracy, robustness, and running time. In addition, the performance of eMBI is much better than that of another widely used matrix factorization method called nonnegative matrix factorization (NMF) and that of hierarchical clustering, which is often the first choice of clinical analysts in practice.

  16. Active chromatin hub of the mouse alpha-globin locus forms in a transcription factory of clustered housekeeping genes.

    Science.gov (United States)

    Zhou, Guo-Ling; Xin, Li; Song, Wei; Di, Li-Jun; Liu, Guang; Wu, Xue-Song; Liu, De-Pei; Liang, Chih-Chuan

    2006-07-01

    RNA polymerases can be shared by a particular group of genes in a transcription "factory" in nuclei, where transcription may be coordinated in concert with the distribution of coexpressed genes in higher-eukaryote genomes. Moreover, gene expression can be modulated by regulatory elements working over a long distance. Here, we compared the conformation of a 130-kb chromatin region containing the mouse alpha-globin cluster and its flanking housekeeping genes in 14.5-day-postcoitum fetal liver and brain cells. The analysis of chromatin conformation showed that the active alpha1 and alpha2 globin genes and upstream regulatory elements are in close spatial proximity, indicating that looping may function in the transcriptional regulation of the mouse alpha-globin cluster. In fetal liver cells, the active alpha1 and alpha2 genes, but not the inactive zeta gene, colocalize with neighboring housekeeping genes C16orf33, C16orf8, MPG, and C16orf35. This is in sharp contrast with the mouse alpha-globin genes in nonexpressing cells, which are separated from the congregated housekeeping genes. A comparison of RNA polymerase II (Pol II) occupancies showed that active alpha1 and alpha2 gene promoters have a much higher RNA Pol II enrichment in liver than in brain. The RNA Pol II occupancy at the zeta gene promoter, which is specifically repressed during development, is much lower than that at the alpha1 and alpha2 promoters. Thus, the mouse alpha-globin gene cluster may be regulated by moving active globin gene promoters and regulatory elements into or out of a preexisting transcription factory in the nucleus, which is maintained by the flanking clustered housekeeping genes, to activate or inactivate alpha-globin gene expression.

  17. Targeted insertion of the neomycin phosphotransferase gene into the tubulin gene cluster of Trypanosoma brucei

    NARCIS (Netherlands)

    ten Asbroek, A. L.; Ouellette, M.; Borst, P.

    1990-01-01

    Kinetoplastids are unicellular eukaryotes that include important parasites of man, such as trypanosomes and leishmanias. The study of these organisms received a recent boost from the development of transient transformation allowing the short-term expression of genes reintroduced into parasites like

  18. Teaching Gene Technology in an Outreach Lab: Students' Assigned Cognitive Load Clusters and the Clusters' Relationships to Learner Characteristics, Laboratory Variables, and Cognitive Achievement

    Science.gov (United States)

    Scharfenberg, Franz-Josef; Bogner, Franz X.

    2013-02-01

    This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and cognitive achievement were examined using a pre-post-follow-up design. Participants of our day-long module Genetic Fingerprinting were 409 twelfth-graders. During the module instructional phases (pre-lab, theoretical, experimental, and interpretation phases), we measured the students' mental effort (ME) as an index of CL. By clustering the students' module-phase-specific ME pattern, we found three student CL clusters which were independent of the module instructional phases, labeled as low-level, average-level, and high-level loaded clusters. Additionally, we found two student CL clusters that were each particular to a specific module phase. Their members reported especially high ME invested in one phase each: within the pre-lab phase and within the interpretation phase. Differentiating the clusters, we identified uncertainty tolerance, prior experience in experimentation, epistemic interest, and prior knowledge as relevant learner characteristics. We found relationships to cognitive achievement, but no relationships to the examined laboratory variables. Our results underscore the importance of pre-lab and interpretation phases in hands-on teaching in science education and the need for teachers to pay attention to these phases, both inside and outside of outreach laboratory learning settings.

  19. Architectural roles of multiple chromatin insulators at the human apolipoprotein gene cluster

    Science.gov (United States)

    Mishiro, Tsuyoshi; Ishihara, Ko; Hino, Shinjiro; Tsutsumi, Shuichi; Aburatani, Hiroyuki; Shirahige, Katsuhiko; Kinoshita, Yoshikazu; Nakao, Mitsuyoshi

    2009-01-01

    Long-range regulatory elements and higher-order chromatin structure coordinate the expression of multiple genes in a cluster, and CTCF/cohesin-mediated chromatin insulators may be key in this regulation. The human apolipoprotein (APO) A1/C3/A4/A5 gene region, whose alterations increase the risk of dyslipidemia and atherosclerosis, is partitioned by at least three CTCF-enriched sites and three cohesin protein RAD21-enriched sites (two overlap with the CTCF sites), resulting in the formation of two transcribed chromatin loops by interactions between insulators. The C3 enhancer and APOC3/A4/A5 promoters reside in the same loop, where the APOC3/A4 promoters are pointed towards the C3 enhancer, whereas the APOA1 promoter is present in the other loop. The depletion of either CTCF or RAD21 disrupts the chromatin loop structure, together with significant changes in the APO expression and the localization of transcription factor hepatocyte nuclear factor (HNF)-4α and the transcriptionally active form of RNA polymerase II at the APO promoters. Thus, CTCF/cohesin-mediated insulators maintain the chromatin loop formation and the localization of transcriptional apparatus at the promoters, suggesting an essential role of chromatin insulation in controlling the expression of clustered genes. PMID:19322193

  20. Genetic engineering and heterologous expression of the disorazol biosynthetic gene cluster via Red/ET recombineering.

    Science.gov (United States)

    Tu, Qiang; Herrmann, Jennifer; Hu, Shengbiao; Raju, Ritesh; Bian, Xiaoying; Zhang, Youming; Müller, Rolf

    2016-02-15

    Disorazol, a macrocyclic polyketide produced by the myxobacterium Sorangium cellulosum So ce12, is reported to have potential cytotoxic activity towards several cancer cell lines, including multi-drug resistant cells. The disorazol biosynthetic gene cluster (dis) from Sorangium cellulosum (So ce12) was identified by transposon mutagenesis and cloned in a bacterial artificial chromosome (BAC) library. The 58-kb dis core gene cluster was reconstituted from BACs via Red/ET recombineering and expressed in Myxococcus xanthus DK1622. This study reports the first heterologous expression of a myxobacterial trans-AT polyketide synthase. Expression in M. xanthus allowed us to optimize the yield of several biosynthetic products using promoter engineering. The insertion of an artificial synthetic promoter upstream of the disD gene encoding a discrete acyl transferase (AT), together with an oxidoreductase (Or), resulted in a 7-fold increase in disorazol production. The successful reconstitution and expression of the genetic sequences encoding these promising cytotoxic compounds will allow combinatorial biosynthesis to generate novel disorazol derivatives for further bioactivity evaluation.

  1. The Wilhelmine E. Key 1992 Invitational lecture. Phenotypic analysis of the Dopa decarboxylase gene cluster mutants in Drosophila melanogaster.

    Science.gov (United States)

    Wright, T R

    1996-01-01

    Mutations in a majority of the 18 loci of the Dopa decarboxylase (Ddc) gene cluster effect similar morphological defects of the cuticle and/or catecholamine-related abnormalities. Mutations in 14 loci affect cuticle formation, cuticle sclerotization, or cuticle melanization, with mutations in 11 of these same loci (including Ddc and amd) producing melanotic pseudotumors, symptomatic, perhaps, of abnormal catecholamine metabolism. Mutations in seven of the genes perturb catecholamine pool levels during prepupal and pupal development, all of which also form melanotic pseudotumors, suggesting several of these genes may encode proteins involved in catecholamine metabolism. Thus, the Ddc gene cluster represents in higher eukaryotes an unusual example of a large cluster of functionally related genes involved in a common physiological process.

  2. The paternal gene of the DDK syndrome maps to the Schlafen gene cluster on mouse chromosome 11.

    Science.gov (United States)

    Bell, Timothy A; de la Casa-Esperón, Elena; Doherty, Heather E; Ideraabdullah, Folami; Kim, Kuikwon; Wang, Yunfei; Lange, Leslie A; Wilhemsen, Kirk; Lange, Ethan M; Sapienza, Carmen; de Villena, Fernando Pardo-Manuel

    2006-01-01

    The DDK syndrome is an early embryonic lethal phenotype observed in crosses between females of the DDK inbred mouse strain and many non-DDK males. Lethality results from an incompatibility between a maternal DDK factor and a non-DDK paternal gene, both of which have been mapped to the Ovum mutant (Om) locus on mouse chromosome 11. Here we define a 465-kb candidate interval for the paternal gene by recombinant progeny testing. To further refine the candidate interval we determined whether males from 17 classical and wild-derived inbred strains are interfertile with DDK females. We conclude that the incompatible paternal allele arose in the Mus musculus domesticus lineage and that incompatible strains should share a common haplotype spanning the paternal gene. We tested for association between paternal allele compatibility/incompatibility and 167 genetic variants located in the candidate interval. Two diallelic SNPs, located in the Schlafen gene cluster, are completely predictive of the polar-lethal phenotype. These SNPs also predict the compatible or incompatible status of males of five additional strains.

  3. Generating in vivo cloning vectors for parallel cloning of large gene clusters by homologous recombination.

    Directory of Open Access Journals (Sweden)

    Jeongmin Lee

    A robust method for the in vivo cloning of large gene clusters was developed based on homologous recombination (HR), requiring only the transformation of PCR products into Escherichia coli cells harboring a receiver plasmid. Positive clones were selected by an acquired antibiotic resistance, which was activated by the recruitment of a short ribosome-binding site plus start codon sequence from the PCR products to the upstream position of a silent antibiotic resistance gene in receiver plasmids. This selection was highly stringent and thus the cloning efficiency of the GFPuv gene (size: 0.7 kb) was comparable to that of the conventional restriction-ligation method, reaching up to 4.3 × 10^4 positive clones per μg of DNA. When we attempted parallel cloning of GFPuv fusion genes (size: 2.0 kb) and carotenoid biosynthesis pathway clusters (sizes: 4 kb, 6 kb, and 10 kb), the cloning efficiency was similarly high regardless of the DNA size, demonstrating that this would be useful for the cloning of large DNA sequences carrying multiple open reading frames. However, restriction analyses of the obtained plasmids showed that the selected cells may contain significant amounts of receiver plasmids without the inserts. To minimize the amount of empty plasmid in the positive selections, the sacB gene encoding a levansucrase was introduced as a counter selection marker in receiver plasmid as it converts sucrose to a toxic levan in the E. coli cells. Consequently, this method yielded completely homogeneous plasmids containing the inserts via the direct transformation of PCR products into E. coli cells.

  4. Motif-Independent De Novo Detection of Secondary Metabolite Gene Clusters – Towards Identification of Novel Secondary Metabolisms from Filamentous Fungi -

    Directory of Open Access Journals (Sweden)

    Myco Umemura

    2015-05-01

    Secondary metabolites are produced mostly by clustered genes that are essential to their biosynthesis. The transcriptional expression of these genes is often cooperatively regulated by a transcription factor located inside or close to a cluster. Most of the secondary metabolism biosynthesis (SMB) gene clusters identified to date contain so-called core genes with distinctive sequence features, such as polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Recent efforts in sequencing fungal genomes have revealed far more SMB gene clusters than expected based on the number of core genes in the genomes. Several bioinformatics tools have been developed to survey SMB gene clusters using the sequence motif information of the core genes, including SMURF and antiSMASH. More recently, accompanied by the development of sequencing techniques that yield large-scale genomic and transcriptomic data, motif-independent prediction methods for SMB gene clusters, including MIDDAS-M, have been developed. Most of these methods detect the clusters in which the genes are cooperatively regulated at transcriptional levels, thus allowing the identification of novel SMB gene clusters regardless of the presence of the core genes. Another type of method, MIPS-CG, uses the characteristics of SMB genes, which are highly enriched in non-syntenic blocks (NSBs), enabling prediction even without transcriptome data, although the results have not been evaluated in detail. Considering that a large portion of SMB gene clusters might be sufficiently expressed only in limited uncommon conditions, it seems that prediction of SMB gene clusters by bioinformatics followed by experimental validation is the only way to efficiently uncover hidden SMB gene clusters. Here, we describe and discuss possible novel approaches for the determination of SMB gene clusters that have not been identified using conventional methods.

  5. Glutamic acid promotes monacolin K production and monacolin K biosynthetic gene cluster expression in Monascus

    OpenAIRE

    Zhang, Chan; Liang, Jian; Yang, Le; Chai, Shiyuan; Zhang, Chenxi; Sun, Baoguo; Wang, Chengtao

    2017-01-01

    This study investigated the effects of glutamic acid on production of monacolin K and expression of the monacolin K biosynthetic gene cluster. When Monascus M1 was grown in glutamic medium instead of in the original medium, monacolin K production increased from 48.4 to 215.4 mg l⁻¹, an increase of 3.5 times. Glutamic acid enhanced monacolin K production by upregulating the expression of mokB-mokI; on day 8, the expression level of mokA tended to decrease by Reverse Transc...

  6. Analyzing Microarray Data of Alzheimer's Using Cluster Analysis to Identify the Biomarker Genes

    Directory of Open Access Journals (Sweden)

    Satya vani Guttula

    2012-01-01

    Alzheimer's disease is characterized by the presence of senile plaques and neurofibrillary tangles in cortical regions of the brain. The experimental data were taken from Gene Expression Omnibus. Hierarchical cluster analysis and TreeView were used to group genes on the basis of their expression pattern. The dynamic change of expression over time and diverse patterns of expression support the concept of a complex local milieu. TreeView allows the organized data to be visualized. A list of 24 genes with high expression levels was obtained. Three genes, SORL1, APP, and APOE, are suspected to cause Alzheimer’s whereas the other 21 genes are related to other diseases but may also be found to be associated with Alzheimer’s, and these are TMEM59, CCT4, IGF2R, SFPQ, PRDX3, RNF14, IDS, SSBP1, SYNE2, TXNL4A, STXBP3, SMARCB1, ULK2, AGTPBP1, FABP7, CALB1, H2AFY, COPA, SAP18, ATIC and SYNCRIP.

  7. Conserved syntenic clusters of protein coding genes are missing in birds.

    Science.gov (United States)

    Lovell, Peter V; Wirthlin, Morgan; Wilhelm, Larry; Minx, Patrick; Lazar, Nathan H; Carbone, Lucia; Warren, Wesley C; Mello, Claudio V

    2014-01-01

    Birds are one of the most highly successful and diverse groups of vertebrates, having evolved a number of distinct characteristics, including feathers and wings, a sturdy lightweight skeleton and unique respiratory and urinary/excretion systems. However, the genetic basis of these traits is poorly understood. Using comparative genomics based on extensive searches of 60 avian genomes, we have found that birds lack approximately 274 protein coding genes that are present in the genomes of most vertebrate lineages and are for the most part organized in conserved syntenic clusters in non-avian sauropsids and in humans. These genes are located in regions associated with chromosomal rearrangements, and are largely present in crocodiles, suggesting that their loss occurred subsequent to the split of dinosaurs/birds from crocodilians. Many of these genes are associated with lethality in rodents, human genetic disorders, or biological functions targeting various tissues. Functional enrichment analysis combined with orthogroup analysis and paralog searches revealed enrichments that were shared by non-avian species, present only in birds, or shared between all species. Together these results provide a clearer definition of the genetic background of extant birds, extend the findings of previous studies on missing avian genes, and provide clues about molecular events that shaped avian evolution. They also have implications for fields that largely benefit from avian studies, including development, immune system, oncogenesis, and brain function and cognition. With regards to the missing genes, birds can be considered ‘natural knockouts’ that may become invaluable model organisms for several human diseases.

  8. Interactions of Environmental Factors and APOA1-APOC3-APOA4-APOA5 Gene Cluster Gene Polymorphisms with Metabolic Syndrome

    Science.gov (United States)

    Wu, Yanhua; Yu, Yaqin; Zhao, Tiancheng; Wang, Shibin; Fu, Yingli; Qi, Yue; Yang, Guang; Yao, Wenwang; Su, Yingying; Ma, Yue; Shi, Jieping; Jiang, Jing; Kou, Changgui

    2016-01-01

    Objective: The present study investigated the prevalence and risk factors for metabolic syndrome (MetS). We evaluated the association between single nucleotide polymorphisms (SNPs) in the apolipoprotein APOA1/C3/A4/A5 gene cluster and the MetS risk and analyzed the interactions of environmental factors and APOA1/C3/A4/A5 gene cluster polymorphisms with MetS. Methods: A study on the prevalence and risk factors for MetS was conducted using data from a large cross-sectional survey representative of the population of Jilin Province situated in northeastern China. A total of 16,831 participants were randomly chosen by multistage stratified cluster sampling of residents aged from 18 to 79 years in all nine administrative areas of the province. Environmental factors associated with MetS were examined using univariate and multivariate logistic regression analyses based on the weighted sample data. A sub-sample of 1813 survey subjects who met the criteria for MetS patients and 2037 controls from this case-control study were used to evaluate the association between SNPs and MetS risk. Genomic DNA was extracted from peripheral blood lymphocytes, and SNP genotyping was determined by MALDI-TOF-MS. The associations between SNPs and MetS were examined using a case-control study design. The interactions of environmental factors and APOA1/C3/A4/A5 gene cluster polymorphisms with MetS were assessed using multivariate logistic regression analysis. Results: The overall adjusted prevalence of MetS was 32.86% in Jilin province. The prevalence of MetS in men was 36.64%, which was significantly higher than the prevalence in women (29.66%). MetS was more common in urban areas (33.86%) than in rural areas (31.80%). The prevalence of MetS significantly increased with age (OR = 8.621, 95%CI = 6.594–11.272). Mental labor (OR = 1.098, 95%CI = 1.008–1.195), current smoking (OR = 1.259, 95%CI = 1.108–1.429), excess salt intake (OR = 1.252, 95%CI = 1.149–1.363), and a fruit and dairy intake less

  9. Genome-wide identification of physically clustered genes suggests chromatin-level co-regulation in male reproductive development in Arabidopsis thaliana.

    Science.gov (United States)

    Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F; Shaw, Peter; Nakayama, Naomi; Sundström, Jens F; Emanuelsson, Olof

    2017-04-07

    Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
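
    The abstract above mentions an in-silico tool that identifies physical clusters of co-regulated genes but does not describe its algorithm. The following is only a minimal sketch of one plausible approach, not the authors' method: genes are taken in chromosomal order, and co-regulated genes are merged into a cluster when few other genes separate them. The function name and the parameters `min_genes` and `max_gap` are hypothetical.

```python
from typing import List, Sequence


def find_physical_clusters(
    coregulated: Sequence[bool],
    min_genes: int = 3,   # hypothetical minimum cluster size
    max_gap: int = 1,     # hypothetical number of non-hit genes tolerated between hits
) -> List[List[int]]:
    """Group co-regulated genes that lie close together on a chromosome.

    coregulated[i] is True when the i-th gene (in positional order along
    one chromosome) belongs to the co-regulated set. Returns lists of gene
    indices, one list per cluster.
    """
    clusters: List[List[int]] = []
    current: List[int] = []
    for idx, hit in enumerate(coregulated):
        if not hit:
            continue
        # Close the running cluster if too many non-hit genes separate
        # this hit from the previous one.
        if current and idx - current[-1] - 1 > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)
            current = []
        current.append(idx)
    # Flush the last running cluster.
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters


# Toy input, not data from the study.
flags = [True, True, False, True, False, False, False, True, True, False, False, True]
print(find_physical_clusters(flags))
```

    A real analysis would also need a statistical test (for example, permutation of gene positions) to decide whether the observed clusters exceed chance expectation; that step is omitted here.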

  10. Modeling the asymmetric evolution of a mouse and rat-specific microRNA gene cluster intron 10 of the Sfmbt2 gene.

    Science.gov (United States)

    Lehnert, Stefan; Kapitonov, Vladimir; Thilakarathne, Pushpike J; Schuit, Frans C

    2011-05-23

    The total number of miRNA genes in a genome, expression of which is responsible for the miRNA repertoire of an organism, is not precisely known. Moreover, the question of how new miRNA genes arise during evolution is incompletely understood. Recent data in humans and opossum indicate that retrotransposons of the class of short interspersed nuclear elements have contributed to the growth of microRNA gene clusters. We studied a large miRNA gene cluster in intron 10 of the mouse Sfmbt2 gene using bioinformatic tools. Mice and rats are unique in harboring a 55-65 Kb DNA sequence in intron 10 of the Sfmbt2 gene. This intronic region is rich in regularly repeated B1 retrotransposons together with inverted self-complementary CA/TG microsatellites. The smallest repeat unit, called MSHORT1 in the mouse, was duplicated 9 times in a tandem head-to-tail array to form 2.5 Kb MLONG1 units. The center of the mouse miRNA gene cluster consists of 13 copies of MLONG1. BLAST analysis of MSHORT1 in the mouse shows that the repeat unit is unique to intron 10 of the Sfmbt2 gene and suggests a dual-phase model for growth of the miRNA gene cluster: arrangement of 10 MSHORT1 units into MLONG1 and further duplication of 13 head-to-tail MLONG1 units in the center of the miRNA gene cluster. Rats have a similar arrangement of repeat units in intron 10 of the Sfmbt2 gene. The discrepancy between 65 miRNA genes in the mouse cluster as compared to only 1 miRNA gene in the corresponding rat repeat cluster is ascribed to sequence differences between MSHORT1 and RSHORT1 that result in lateral-shifted, less-stable miRNA precursor hairpins for RSHORT1. Our data provide new evidence for the emerging concept that lineage-specific retroposons have played an important role in the birth of new miRNA genes during evolution. The large difference in the number of miRNA genes in two closely related species (65 versus 1, mice versus rats) indicates that this species-specific evolution can be

  11. The biosynthetic gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor contains its co-expressed vacuolar MATE transporter

    OpenAIRE

    Behrooz Darbani; Mohammed Saddik Motawia; Carl Erik Olsen; Nour-Eldin, Hussam H.; Birger Lindberg Møller; Fred Rook

    2016-01-01

    Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expresse...

  12. Splenomegaly, elevated alkaline phosphatase and mutations in the SRSF2/ASXL1/RUNX1 gene panel are strong adverse prognostic markers in patients with systemic mastocytosis.

    Science.gov (United States)

    Jawhar, M; Schwaab, J; Hausmann, D; Clemens, J; Naumann, N; Henzler, T; Horny, H-P; Sotlar, K; Schoenberg, S O; Cross, N C P; Fabarius, A; Hofmann, W-K; Valent, P; Metzgeroth, G; Reiter, A

    2016-12-01

    We evaluated the impact of clinical and molecular characteristics on overall survival (OS) in 108 patients with indolent (n=41) and advanced systemic mastocytosis (SM) (advSM, n=67). Organomegaly was measured by magnetic resonance imaging-based volumetry of the liver and spleen. In multivariate analysis of all patients, an increased spleen volume ⩾450 ml (hazard ratio (HR), 5.2; 95% confidence interval (CI), (2.1-13.0); P=0.003) and an elevated alkaline phosphatase (AP; HR 5.0 (1.1-22.2); P=0.02) were associated with adverse OS. The 3-year OS was 100, 77, and 39%, respectively (P<0.0001), for patients with 0 (low risk, n=37), 1 (intermediate risk, n=32) or 2 (high risk, n=39) parameters. For advSM patients with fully available clinical and molecular data (n=60), univariate analysis identified splenomegaly ⩾1200 ml, elevated AP and mutations in the SRSF2/ASXL1/RUNX1 (S/A/R) gene panel as significant prognostic markers. In multivariate analysis, mutations in S/A/R (HR 3.2 (1.1-9.6); P=0.01) and elevated AP (HR 2.6 (1.0-7.1); P=0.03) remained predictive adverse prognostic markers for OS. The 3-year OS was 76 and 38%, respectively (P=0.0003), for patients with 0-1 (intermediate risk, n=28) or 2 (high risk, n=32) parameters. We conclude that splenomegaly, elevated AP and mutations in the S/A/R gene panel are independent of the World Health Organization classification and provide the most relevant prognostic information in SM patients.
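
    The three-tier risk grouping reported for all patients in the abstract above (0, 1 or 2 adverse parameters) can be sketched as a small function. This is an illustrative sketch only: the 450 ml spleen-volume cut-off comes from the abstract, but the abstract gives no numeric threshold for "elevated" alkaline phosphatase, so that input is assumed to be a pre-computed boolean. The function and parameter names are hypothetical.

```python
def sm_risk_group(spleen_volume_ml: float, ap_elevated: bool) -> str:
    """Count adverse parameters and map the count to a risk group.

    Adverse parameters, as stated in the abstract:
      - spleen volume >= 450 ml
      - elevated alkaline phosphatase (threshold not given, so passed
        in as a boolean by the caller)
    """
    adverse = int(spleen_volume_ml >= 450) + int(ap_elevated)
    # 0 parameters: low risk, 1: intermediate risk, 2: high risk.
    return ("low", "intermediate", "high")[adverse]


print(sm_risk_group(300.0, False))
print(sm_risk_group(450.0, False))
print(sm_risk_group(900.0, True))
```

    The separate score for advanced SM patients (splenomegaly ≥1200 ml, elevated AP, S/A/R mutations) is grouped differently in the abstract and is not modeled here.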

  13. Simultaneous analysis of the expression of 14 genes with individual prognostic value in myelodysplastic syndrome patients at diagnosis: WT1 detection in peripheral blood adversely affects survival.

    Science.gov (United States)

    Santamaría, Carlos; Ramos, Fernando; Puig, Noemi; Barragán, Eva; de Paz, Raquel; Pedro, Carme; Insunza, Andrés; Tormo, Mar; Del Cañizo, Consuelo; Diez-Campelo, María; Xicoy, Blanca; Salido, Eduardo; Sánchez del Real, Javier; Hernández, Montserrat; Chillón, Carmen; Sanz, Guillermo F; García-Sanz, Ramón; San Miguel, Jesús F; González, Marcos

    2012-12-01

    Several studies have evaluated the prognostic value of the individual expression of certain genes in patients with myelodysplastic syndromes (MDS). However, none of them includes their simultaneous analysis by quantitative polymerase chain reaction (PCR). We evaluated relative expression levels of 14 molecular markers in 193 peripheral blood samples from untreated MDS patients using real-time PCR. Detectable WT1 expression levels, low TET2, and low IER3 gene expression were the only markers showing in univariate analysis a poor prognostic value for all of treatment-free (TFS), progression-free (PFS), and overall survival (OS). In multivariate analysis, molecular parameters associated with a shorter TFS were: WT1 detection (p = 0.014), low TET2 (p = 0.002), and low IER3 expression (p = 0.025). WT1 detection (p = 0.006) and low TET2 (p = 0.006) expression were associated with a shorter PFS when multivariate analysis was carried out by including only molecular markers. Molecular values with an independent value in OS were: WT1 detection (p = 0.003), high EVI1 expression (p = 0.001), and undetectable p15-CDKN2B (p = 0.037). WT1 expressers were associated with adverse clinical-biological features, high IPSS and WPSS scoring, and unfavorable molecular expression profile. In summary, detectable WT1 expression levels, and low TET2 and low IER3 expression in peripheral blood showed a strong association with adverse prognosis in MDS patients at diagnosis. However, WT1 was the only molecular marker displaying an independent prognostic value in both OS and TFS.

  14. K-Ras gene mutation status as a prognostic and predictive factor in patients with colorectal cancer undergoing irinotecan- or oxaliplatin-based chemotherapy.

    Science.gov (United States)

    Stec, Rafał; Bodnar, Lubomir; Charkiewicz, Radosław; Korniluk, Jan; Rokita, Marta; Smoter, Marta; Ciechowicz, Marzena; Chyczewski, Lech; Nikliński, Jacek; Kozłowski, Wojciech; Szczylik, Cezary

    2012-11-01

    CRC caused more than 600,000 estimated deaths in 2008. Dysregulated signaling through the RAS/RAF/mitogen-activated protein kinase (MEK)/extracellular signal-regulated kinase (ERK) signaling pathway due to mutations in K-Ras and B-Raf are common events in CRC. Incidence of mutations in codons 12 and 13 of K-Ras and exons 11 and 15 of B-Raf were analyzed in amplified PCR products from primary tumors of 273 patients with CRC, and their prognostic and predictive significance was assessed. The prognostic role of clinical and pathological factors was also examined. K-Ras mutations were present in 89 patients (32.6%), of whom 76 (85.4%) had mutations in codon 12 and 10 (11.2%) had mutations in codon 13. B-Raf gene mutations were present in 17 patients (6.9%), of whom 6 (35.3%) had mutations in exon 15. Multivariate analysis revealed a predictive significance for K-Ras mutations with respect to time to progression in patients treated with irinotecan and oxaliplatin as first-line chemotherapy. There was no predictive significance for B-Raf gene mutation status in these patients. The following risk factors were found to affect overall survival (OS) rates: primary tumor location, lymph node involvement grade, carcinoembryonic antigen (CEA) level before treatment, and performance status according to WHO criteria. Based on the results of this study, K-Ras mutation status may be a suitable indicator of patient eligibility and a prognostic indicator for responsiveness to anti-EGFR therapy alone, or in combination with chemotherapy. Also, K-Ras mutation status may predict time to progression in patients treated with irinotecan and oxaliplatin.

  15. Soft Topographic Maps for Clustering and Classifying Bacteria Using Housekeeping Genes

    Directory of Open Access Journals (Sweden)

    Massimo La Rosa

    2011-01-01

    The Self-Organizing Map (SOM) algorithm is widely used for building topographic maps of data represented in a vectorial space, but it does not operate with dissimilarity data. The Soft Topographic Map (STM) algorithm is an extension of SOM to arbitrary distance measures, and it creates a map using a set of units, organized in a rectangular lattice, defining data neighbourhood relationships. In recent years, a new standard for identifying bacteria using genotypic information has begun to be developed. In this new approach, phylogenetic relationships of bacteria can be determined by comparing a stable part of the bacterial genetic code, the so-called “housekeeping genes.” The goal of this work is to build a topographic representation of bacteria clusters, by means of self-organizing maps, starting from genotypic features regarding housekeeping genes.

  16. Identification of a trichothecene gene cluster and description of the harzianum A biosynthesis pathway in the fungus Trichoderma arundinaceum

    Science.gov (United States)

    Trichothecenes are sesquiterpenes that act like mycotoxins. Their biosynthesis has been mainly studied in the fungal genus Fusarium, where most of the biosynthetic genes (tri) are grouped in a cluster regulated by ambient conditions and regulatory genes. Unexpectedly, few studies are available abou...

  17. The entire β-globin gene cluster is deleted in a form of γδβ-thalassemia.

    NARCIS (Netherlands)

    E.R. Fearon; H.H.Jr. Kazazian; P.G. Waber (Pamela); J.I. Lee (Joseph); S.E. Antonarakis; S.H. Orkin (Stuart); E.F. Vanin; P.S. Henthorn; F.G. Grosveld (Frank); A.F. Scott; G.R. Buchanan

    1983-01-01

    We have used restriction endonuclease mapping to study a deletion involving the beta-globin gene cluster in a Mexican-American family with gamma delta beta-thalassemia. Analysis of DNA polymorphisms demonstrated deletion of the beta-globin gene from the affected chromosome. Using a DNA

  18. Genetic variability in the regulation of the expression cluster of MDR genes in patients with breast cancer.

    Science.gov (United States)

    Tsyganov, Matvey M; Freidin, Maxim B; Ibragimova, Marina K; Deryusheva, Irina V; Kazantseva, Polina V; Slonimskaya, Elena M; Cherdyntseva, Nadezhda V; Litviakov, Nikolai V

    2017-08-01

    We aimed to investigate the association between the polymorphism and expression patterns of multiple drug resistance genes (MDR) in breast cancer (BC). The MDR gene expression levels were measured in tumor tissues of 106 breast cancer patients using quantitative real-time PCR. Affymetrix CytoScan™ HD Array chips were used to assess genotypes. Pairwise correlation analysis for ABCB1, ABCC1, ABCC2 and ABCG2 gene expression levels was carried out to reveal co-expression clusters. Associations between SNPs of MDR genes and their preoperative expression levels were assessed using analysis of covariance adjusting for covariates. The SNPs associated with the expression of the ABCB1, ABCC1, ABCC2 and ABCG2 genes before NAC were detected. In addition, 21 SNPs associated with the expression of four ABC-transporter genes and involved in the expression regulation were identified. Validation in an independent sample confirmed the association between the MDR cluster genes and 11 SNPs. Four MDR genes: ABCB1, ABCC1, ABCC2 and ABCG2 were shown to form the functional expression cluster in breast tumor. Further studies are required to discover precise mechanisms of the cluster regulation, thereby providing new approaches and targets to combat the development of the MDR phenotype during chemotherapy.
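
    The pairwise correlation analysis mentioned in the abstract above can be illustrated with a short sketch. It computes Pearson correlation for every pair of genes and reports pairs above a cut-off. The abstract does not state which correlation coefficient or threshold the authors used, so both are assumptions here, and the gene names and expression values are toy placeholders rather than study data.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed_pairs(
    expression: Dict[str, List[float]],
    threshold: float = 0.8,  # assumed cut-off, not taken from the abstract
) -> List[Tuple[str, str, float]]:
    """Return gene pairs whose expression correlates at or above threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r >= threshold:
            pairs.append((g1, g2, r))
    return pairs


# Toy expression levels across four samples (placeholders, not study data).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
for g1, g2, r in coexpressed_pairs(toy):
    print(g1, g2, round(r, 3))
```

    Genes that are mutually connected by high-correlation pairs would then be read as a candidate co-expression cluster; a significance test on each coefficient is omitted for brevity.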

  19. The biosynthetic gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor contains its co-expressed vacuolar MATE transporter.

    Science.gov (United States)

    Darbani, Behrooz; Motawia, Mohammed Saddik; Olsen, Carl Erik; Nour-Eldin, Hussam H; Møller, Birger Lindberg; Rook, Fred

    2016-11-14

    Genomic gene clusters for the biosynthesis of chemical defence compounds are increasingly identified in plant genomes. We previously reported the independent evolution of biosynthetic gene clusters for cyanogenic glucoside biosynthesis in three plant lineages. Here we report that the gene cluster for the cyanogenic glucoside dhurrin in Sorghum bicolor additionally contains a gene, SbMATE2, encoding a transporter of the multidrug and toxic compound extrusion (MATE) family, which is co-expressed with the biosynthetic genes. The predicted localisation of SbMATE2 to the vacuolar membrane was demonstrated experimentally by transient expression of a SbMATE2-YFP fusion protein and confocal microscopy. Transport studies in Xenopus laevis oocytes demonstrate that SbMATE2 is able to transport dhurrin. In addition, SbMATE2 was able to transport non-endogenous cyanogenic glucosides, but not the anthocyanin cyanidin 3-O-glucoside or the glucosinolate indol-3-yl-methyl glucosinolate. The genomic co-localisation of a transporter gene with the biosynthetic genes producing the transported compound is discussed in relation to the role self-toxicity of chemical defence compounds may play in the formation of gene clusters.

  20. [Prognostic differences of phenotypes in pT1-2N0 invasive breast cancer: a large cohort study with cluster analysis].

    Science.gov (United States)

    Wang, Z; Wang, W H; Wang, S L; Jin, J; Song, Y W; Liu, Y P; Ren, H; Fang, H; Tang, Y; Chen, B; Qi, S N; Lu, N N; Li, N; Tang, Y; Liu, X F; Yu, Z H; Li, Y X

    2016-06-23

    To find phenotypic subgroups of patients with pT1-2N0 invasive breast cancer by means of cluster analysis and estimate the prognosis and clinicopathological features of these subgroups. From 1999 to 2013, 4979 patients with pT1-2N0 invasive breast cancer were recruited for hierarchical clustering analysis. Age (≤40, 41-70, 70+ years), size of primary tumor, pathological type, grade of differentiation, microvascular invasion, estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER-2) were chosen as the variables for the distance metric between patients. Hierarchical cluster analysis was performed using Ward's method. Cophenetic correlation coefficient (CPCC) and Spearman correlation coefficient were used to validate clustering structures. The CPCC was 0.603. The Spearman correlation coefficient was 0.617 (P40 years, smaller primary tumor, lower histologic grade, positive ER and PR status, and mainly negative HER-2. Patients in clusters 1 and 11 had the worst prognosis. Cluster 1 was characterized by a larger tumor, higher grade and negative ER and PR status, while cluster 11 was characterized by positive microvascular invasion. Patients in the other 7 clusters had a moderate prognosis, and patients in each cluster had distinctive clinicopathological features and recurrence patterns. This study identified distinctive clinicopathologic phenotypes in a large cohort of patients with pT1-2N0 breast cancer through hierarchical clustering and revealed differences in prognosis. This integrative model may help physicians to make more personalized decisions regarding adjuvant therapy.

  1. Transcriptional interference networks coordinate the expression of functionally-related genes clustered in the same genomic loci

    Directory of Open Access Journals (Sweden)

    Zsolt Boldogkoi

    2012-07-01

    The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organisation, transcription, various post-transcriptional processes and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighbouring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally-linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly-arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely-oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronised cascade of gene expression in functionally-linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular

  2. Digital gene expression analysis of two life cycle stages of the human-infective parasite, Trypanosoma brucei gambiense reveals differentially expressed clusters of co-regulated genes

    Directory of Open Access Journals (Sweden)

    Wildridge David

    2010-02-01

    Background: The evolutionarily ancient parasite, Trypanosoma brucei, is unusual in that the majority of its genes are regulated post-transcriptionally, leading to the suggestion that transcript abundance of most genes does not vary significantly between different life cycle stages despite the fact that the parasite undergoes substantial cellular remodelling and metabolic changes throughout its complex life cycle. To investigate this in the clinically relevant sub-species, Trypanosoma brucei gambiense, which is the causative agent of the fatal human disease African sleeping sickness, we have compared the transcriptome of two different life cycle stages, the potentially human-infective bloodstream forms with the non-human-infective procyclic stage, using digital gene expression (DGE) analysis. Results: Over eleven million unique tags were generated, producing expression data for 7360 genes, covering 81% of the genes in the genome. Compared to microarray analysis of the related T. b. brucei parasite, approximately 10 times more genes with a 2.5-fold change in expression levels were detected. The transcriptome analysis revealed the existence of several differentially expressed gene clusters within the genome, indicating that contiguous genes, presumably from the same polycistronic unit, are co-regulated either at the level of transcription or transcript stability. Conclusions: DGE analysis is extremely sensitive for detecting gene expression differences, revealing firstly that a far greater number of genes are stage-regulated than had previously been identified and, secondly and more importantly, the existence of several differentially expressed clusters of genes present on what appear to be the same polycistronic units, a phenomenon which had not previously been observed in microarray studies. These differentially regulated clusters of genes are in addition to the previously identified RNA polymerase I polycistronic

  3. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    KAUST Repository

    Li, Yongxin

    2015-03-24

    Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.

  4. Microbial communication leading to the activation of silent fungal secondary metabolite gene clusters

    Directory of Open Access Journals (Sweden)

    Tina Netzker

    2015-04-01

    Microorganisms form diverse multispecies communities in various ecosystems. The high abundance of fungal and bacterial species in these consortia results in specific communication between the microorganisms. A key role in this communication is played by secondary metabolites (SMs), which are also called natural products. Recently, it was shown that interspecies 'talk' between microorganisms represents a physiological trigger to activate silent gene clusters leading to the formation of novel SMs by the involved species. This review focuses on mixed microbial cultivation, mainly between bacteria and fungi, with a special emphasis on the induced formation of fungal SMs in co-cultures. In addition, the role of chromatin remodeling in the induction is examined, and methodical perspectives for the analysis of natural products are presented. As an example of an intermicrobial interaction elucidated at the molecular level, we discuss the specific interaction between the filamentous fungi Aspergillus nidulans and Aspergillus fumigatus with the soil bacterium Streptomyces rapamycinicus, which provides an excellent model system to illuminate molecular concepts behind regulatory mechanisms and will pave the way to a novel avenue of drug discovery through targeted activation of silent SM gene clusters by co-cultivation of microorganisms.

  5. The dppBCDF gene cluster of Haemophilus influenzae: Role in heme utilization

    Directory of Open Access Journals (Sweden)

    Morton Daniel J

    2009-08-01

    Background: Haemophilus influenzae requires a porphyrin source for aerobic growth and possesses multiple mechanisms to obtain this essential nutrient. This porphyrin requirement may be satisfied by either heme alone, or protoporphyrin IX in the presence of an iron source. One protein involved in heme acquisition by H. influenzae is the periplasmic heme binding protein HbpA. HbpA exhibits significant homology to the dipeptide and heme binding protein DppA of Escherichia coli. DppA is a component of the DppABCDF peptide-heme permease of E. coli. H. influenzae homologs of dppBCDF are located in the genome at a point distant from hbpA. The object of this study was to investigate the potential role of the H. influenzae dppBCDF locus in heme utilization. Findings: An insertional mutation in dppC was constructed and the impact of the mutation on the utilization of both free heme and various proteinaceous heme sources as well as utilization of protoporphyrin IX was determined in growth curve studies. The dppC insertion mutant strain was significantly impaired in utilization of all tested heme sources and protoporphyrin IX. Complementation of the dppC mutation with an intact dppBCDF gene cluster in trans corrected the growth defects seen in the dppC mutant strain. Conclusion: The dppBCDF gene cluster constitutes part of the periplasmic heme-acquisition systems of H. influenzae.

  6. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    Science.gov (United States)

    Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan

    2015-03-01

    Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.

  7. CTCF Is Required for Neural Development and Stochastic Expression of Clustered Pcdh Genes in Neurons

    Directory of Open Access Journals (Sweden)

    Teruyoshi Hirayama

    2012-08-01

    The CCCTC-binding factor (CTCF) is a key molecule for chromatin conformational changes that promote cellular diversity, but nothing is known about its role in neurons. Here, we produced mice with a conditional knockout (cKO) of CTCF in postmitotic projection neurons, mostly in the dorsal telencephalon. The CTCF-cKO mice exhibited postnatal growth retardation and abnormal behavior and had defects in functional somatosensory mapping in the brain. In terms of gene expression, 390 transcripts were expressed at significantly different levels between CTCF-deficient and control cortex and hippocampus. In particular, the levels of 53 isoforms of the clustered protocadherin (Pcdh) genes, which are stochastically expressed in each neuron, declined markedly. Each CTCF-deficient neuron showed defects in dendritic arborization and spine density during brain development. Their excitatory postsynaptic currents showed normal amplitude but occurred with low frequency. Our results indicate that CTCF regulates functional neural development and neuronal diversity by controlling clustered Pcdh expression.

  8. Linking Biosynthetic Gene Clusters to their Metabolites via Pathway-Targeted Molecular Networking

    Science.gov (United States)

    Trautman, Eric P.; Crawford, Jason M.

    2016-01-01

    The connection of microbial biosynthetic gene clusters to the small molecule metabolites they encode is central to the discovery and characterization of new metabolic pathways with ecological and pharmacological potential. With increasing microbial genome sequence information being deposited into publicly available databases, it is clear that microbes have the coding capacity for many more biologically active small molecules than previously realized. Of increasing interest are the small molecules encoded by the human microbiome, as these metabolites likely mediate a variety of currently uncharacterized human-microbe interactions that influence health and disease. In this mini-review, we describe the ongoing biosynthetic, structural, and functional characterizations of the genotoxic colibactin pathway in gut bacteria as a thematic example of linking biosynthetic gene clusters to their metabolites. We also highlight other natural products that are produced through analogous biosynthetic logic and comment on some current disconnects between bioinformatics predictions and experimental structural characterizations. Lastly, we describe the use of pathway-targeted molecular networking as a tool to characterize secondary metabolic pathways within complex metabolomes and to aid in downstream metabolite structural elucidation efforts. PMID:26456470

  9. Porting Large HPC Applications to GPU Clusters: The Codes GENE and VERTEX

    CERN Document Server

    Dannert, Tilman; Rampp, Markus

    2013-01-01

    We have developed GPU versions for two major high-performance-computing (HPC) applications originating from two different scientific domains. GENE is a plasma microturbulence code which is employed for simulations of nuclear fusion plasmas. VERTEX is a neutrino-radiation hydrodynamics code for "first principles" simulations of core-collapse supernova explosions. The codes are considered state of the art in their respective scientific domains, both concerning their scientific scope and functionality as well as the achievable compute performance, in particular parallel scalability on all relevant HPC platforms. GENE and VERTEX were ported by us to HPC cluster architectures with two NVIDIA Kepler GPUs mounted in each node in addition to two Intel Xeon CPUs of the Sandy Bridge family. On such platforms we achieve up to twofold gains in the overall application performance in the sense of a reduction of the time to solution for a given setup with respect to a pure CPU cluster. The paper describes our basic porting ...

  10. Apicidin F: characterization and genetic manipulation of a new secondary metabolite gene cluster in the rice pathogen Fusarium fujikuroi.

    Directory of Open Access Journals (Sweden)

    Eva-Maria Niehaus

    The fungus F. fujikuroi is well known for its production of gibberellins causing the 'bakanae' disease of rice. Besides these plant hormones, it is able to produce other secondary metabolites (SMs), such as pigments and mycotoxins. Genome sequencing revealed altogether 45 potential SM gene clusters, most of which are cryptic and silent. In this study we characterize a new non-ribosomal peptide synthetase (NRPS) gene cluster that is responsible for the production of the cyclic tetrapeptide apicidin F (APF). This new SM has structural similarities to the known histone deacetylase inhibitor apicidin. To gain insight into the biosynthetic pathway, most of the 11 cluster genes were deleted, and the mutants were analyzed by HPLC-DAD and HPLC-HRMS for their ability to produce APF or new derivatives. Structure elucidation was carried out by HPLC-HRMS and NMR analysis. We identified two new derivatives of APF named apicidin J and K. Furthermore, we studied the regulation of APF biosynthesis and showed that the cluster genes are expressed under conditions of high nitrogen and acidic pH in a manner dependent on the nitrogen regulator AreB and the pH regulator PacC. In addition, over-expression of the atypical pathway-specific transcription factor (TF)-encoding gene APF2 led to elevated expression of the cluster genes under inducing and even repressing conditions and to significantly increased product yields. Bioinformatic analyses allowed the identification of a putative Apf2 DNA-binding ("Api-box") motif in the promoters of the APF genes. Point mutations in this sequence motif caused a drastic decrease of APF production, indicating that this motif is essential for activating the cluster genes. Finally, we provide a model of the APF biosynthetic pathway based on chemical identification of derivatives in the cultures of deletion mutants.

  11. Interleukin-1 gene cluster polymorphisms and gingival recessions after orthodontic treatment.

    Science.gov (United States)

    Bergandi, Loredana; Rubiano, Rachele; Brunazzo, Matteo; Aldieri, Elisabetta; Alfieri, Elisabetta; Dalmasso, Paola; Dal Masso, Paola; Cardaropoli, Giuseppe; Bracco, Pietro; Debernardi, Cesare; Ghigo, Dario

    2008-01-01

    Genetic polymorphisms in the interleukin-1 gene cluster have been associated with the severity of periodontal diseases featuring a variable degree of destruction of connective tissue and bone, such as periodontitis and periimplantitis. This study aimed to investigate whether a link exists between such interleukin-1 gene polymorphisms and the development of gingival recessions during orthodontic treatment in Italian children. We evaluated, in 74 young Italian patients of both sexes, the -889 C/T polymorphism of the interleukin-1alpha gene and the -511 C/T and +3954 C/T polymorphisms of the interleukin-1beta gene by the polymerase chain reaction-restriction fragment length polymorphism method using NcoI, AvaI and TaqI as restriction enzymes. No association between the interleukin-1 genotypes investigated and gingival recession occurring during orthodontic treatment was identified. In the population studied, specific interleukin-1 genotypes (linked to a higher susceptibility to bone resorption in periodontal disease) do not appear to be associated with the development of gingival recessions during orthodontic treatment.

  12. Type VI secretion system-associated gene clusters contribute to pathogenesis of Salmonella enterica serovar Typhimurium.

    Science.gov (United States)

    Mulder, David T; Cooper, Colin A; Coombes, Brian K

    2012-06-01

    The enteropathogen Salmonella enterica serovar Typhimurium employs a suite of tightly regulated virulence factors within the intracellular compartment of phagocytic host cells resulting in systemic dissemination in mice. A type VI secretion system (T6SS) within Salmonella pathogenicity island 6 (SPI-6) has been implicated in this process; however, the regulatory inputs and the roles of noncore genes in this system are not well understood. Here we describe four clusters of noncore T6SS genes in SPI-6 based on a comparative relationship with the T6SS-3 of Burkholderia mallei and report that the disruption of these genes results in defects in intracellular replication and systemic dissemination in mice. In addition, we show that the expression of the SPI-6-encoded Hcp and VgrG orthologs is enhanced during late stages of macrophage infection. We identify six regions that are transcriptionally active during cell infections and that have regulatory contributions from the regulators of virulence SsrB, PhoP, and SlyA. We show that levels of protein expression are very weak under in vitro conditions and that expression is not enhanced upon the deletion of ssrB, phoP, slyA, qseC, ompR, or hfq, suggesting an unknown activating factor. These data suggest that the SPI-6 T6SS has been integrated into the Salmonella Typhimurium virulence network and customized for host-pathogen interactions through the action of noncore genes.

  13. Fungal metabolic gene clusters – caravans traveling across genomes and environments

    Directory of Open Access Journals (Sweden)

    Jennifer Hughes Wisecaver

    2015-03-01

    Metabolic gene clusters (MGCs), physically co-localized genes participating in the same metabolic pathway, are signature features of fungal genomes. MGCs are most often observed in specialized metabolism, having evolved in individual fungal lineages in response to specific ecological needs, such as the utilization of uncommon nutrients (e.g., galactose and allantoin) or the production of secondary metabolic antimicrobial compounds and virulence factors (e.g., aflatoxin and melanin). A flurry of recent studies has shown that several MGCs, whose functions are often associated with fungal virulence as well as with the evolutionary arms race between fungi and their competitors, have experienced horizontal gene transfer (HGT). In this minireview, after briefly introducing HGT as a source of gene innovation, we examine the evidence for HGT's involvement in the evolution of MGCs and, more generally, of fungal metabolism, enumerate the molecular mechanisms that mediate such transfers and the ecological circumstances that favor them, and discuss the types of evidence required for inferring the presence of HGT in MGCs. The currently available examples indicate that transfers of entire MGCs have taken place between closely related fungal species as well as distant ones and that they sometimes involve large chromosomal segments. These results suggest that the HGT-mediated acquisition of novel metabolism is an ongoing and successful ecological strategy for many fungal species.

  14. MIDDAS-M: motif-independent de novo detection of secondary metabolite gene clusters through the integration of genome sequencing and transcriptome data.

    Directory of Open Access Journals (Sweden)

    Myco Umemura

    Many bioactive natural products are produced as "secondary metabolites" by plants, bacteria, and fungi. During the middle of the 20th century, several secondary metabolites from fungi revolutionized the pharmaceutical industry, for example, penicillin, lovastatin, and cyclosporine. They are generally biosynthesized by enzymes encoded by clusters of coordinately regulated genes, and several motif-based methods have been developed to detect secondary metabolite biosynthetic (SMB) gene clusters using the sequence information of typical SMB core genes such as polyketide synthases (PKS) and non-ribosomal peptide synthetases (NRPS). However, at present no detection method exists for SMB gene clusters that are functional but do not include core SMB genes. To advance the exploration of SMB gene clusters, especially those without known core genes, we developed MIDDAS-M, a motif-independent de novo detection algorithm for SMB gene clusters. We integrated virtual gene cluster generation in an annotated genome sequence with highly sensitive scoring of the cooperative transcriptional regulation of cluster member genes. MIDDAS-M accurately predicted 38 SMB gene clusters that have been experimentally confirmed and/or predicted by other motif-based methods in 3 fungal strains. MIDDAS-M further identified a new SMB gene cluster for ustiloxin B, which was experimentally validated. Sequence analysis of the cluster genes indicated a novel mechanism for peptide biosynthesis independent of NRPS. Because it is fully computational and independent of empirical knowledge about SMB core genes, MIDDAS-M allows a large-scale, comprehensive analysis of SMB gene clusters, including those with novel biosynthetic mechanisms that do not contain any functionally characterized genes.
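
    The idea of scoring contiguous runs of genes by their joint transcriptional response can be illustrated with a minimal sketch. This is not the MIDDAS-M scoring scheme, which the abstract does not specify; it assumes a hypothetical list of per-gene log expression ratios ordered by chromosomal position, standardizes them, and reports the fixed-size window with the highest mean z-score. The names zscores and best_window are illustrative only.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All genes respond identically, so no window can stand out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def best_window(log_ratios, size):
    """Return (start_index, score) of the contiguous gene window
    with the highest mean z-score.

    log_ratios: hypothetical per-gene log induction ratios, in genome order.
    size: number of adjacent genes per candidate ("virtual") cluster.
    """
    if size < 1 or size > len(log_ratios):
        raise ValueError("window size must be between 1 and the gene count")
    z = zscores(log_ratios)
    scores = [mean(z[i:i + size]) for i in range(len(z) - size + 1)]
    start = max(range(len(scores)), key=scores.__getitem__)
    return start, scores[start]


# Toy example: genes 3..5 are strongly co-induced.
ratios = [0.1, -0.2, 0.0, 3.1, 2.8, 3.4, 0.2, -0.1]
start, score = best_window(ratios, size=3)
print(f"best window starts at gene index {start}, mean z-score {score:.2f}")
```

    A real analysis would scan several window sizes and assess significance against a background distribution; the sketch fixes one size to keep the idea visible.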

  15. The Prognostic Role of Androgen Receptor in Patients with Early-Stage Breast Cancer: A Meta-analysis of Clinical and Gene Expression Data.

    Science.gov (United States)

    Bozovic-Spasojevic, Ivana; Zardavas, Dimitrios; Brohée, Sylvain; Ameye, Lieveke; Fumagalli, Debora; Ades, Felipe; de Azambuja, Evandro; Bareche, Yacine; Piccart, Martine; Paesmans, Marianne; Sotiriou, Christos

    2017-06-01

    Purpose: Androgen receptor (AR) expression has been observed in about 70% of patients with breast cancer, but its prognostic role remains uncertain. Experimental Design: To assess the prognostic role of AR expression in early-stage breast cancer, we performed a meta-analysis of studies that evaluated the impact of AR at the protein and gene expression level on disease-free survival (DFS) and/or overall survival (OS). Eligible studies were identified by systematic review of electronic databases using the MeSH-terms "breast neoplasm" and "androgen receptor" and were selected after a qualitative assessment based on the REMARK criteria. A pooled gene expression analysis of 35 publicly available microarray data sets was also performed from patients with early-stage breast cancer with available gene expression and clinical outcome data. Results: Twenty-two of 33 eligible studies for the clinical meta-analysis, including 10,004 patients, were considered as evaluable for the current study after the qualitative assessment. AR positivity defined by IHC was associated with improved DFS in all patients with breast cancer [multivariate (M) analysis, HR 0.46; 95% confidence interval (CI) 0.37-0.58, P analysis. High AR mRNA levels were found to confer positive prognosis overall in terms of DFS (HR 0.82; 95% CI 0.72-0.92; P = 0.0007) and OS (HR 0.84; 95% CI, 0.75-0.94; P = 0.02) only in univariate analysis. Conclusions: Our analysis, conducted among more than 17,000 women with early-stage breast cancer included in clinical and gene expression analysis, demonstrates that AR positivity is associated with favorable clinical outcome. Clin Cancer Res; 23(11); 2702-12. ©2016 AACR.

  16. Identification and functional analysis of the gene cluster for L-arabinose utilization in Corynebacterium glutamicum.

    Science.gov (United States)

    Kawaguchi, Hideo; Sasaki, Miho; Vertès, Alain A; Inui, Masayuki; Yukawa, Hideaki

    2009-06-01

    Corynebacterium glutamicum ATCC 31831 grew on L-arabinose as the sole carbon source at a specific growth rate that was twice that on D-glucose. The gene cluster responsible for L-arabinose utilization comprised a six-cistron transcriptional unit with a total length of 7.8 kb. Three L-arabinose-catabolizing genes, araA (encoding L-arabinose isomerase), araB (L-ribulokinase), and araD (L-ribulose-5-phosphate 4-epimerase), comprised the araBDA operon, upstream of which three other genes, araR (LacI-type transcriptional regulator), araE (L-arabinose transporter), and galM (putative aldose 1-epimerase), were present in the opposite direction. Inactivation of the araA, araB, or araD gene eliminated growth on L-arabinose, and each of the gene products was functionally homologous to its Escherichia coli counterpart. Moreover, compared to the wild-type strain, an araE disruptant exhibited a >80% decrease in the growth rate at a lower concentration of L-arabinose (3.6 g/liter) but not at a higher concentration of L-arabinose (40 g/liter). The expression of the araBDA operon and the araE gene was L-arabinose inducible and negatively regulated by the transcriptional regulator AraR. Disruption of araR eliminated the repression in the absence of L-arabinose. Expression of the regulon was not repressed by D-glucose, and simultaneous utilization of L-arabinose and D-glucose was observed in aerobically growing wild-type and araR deletion mutant cells. The regulatory mechanism of the L-arabinose regulon is, therefore, distinct from the carbon catabolite repression mechanism in other bacteria.

  17. The ribulose-1,5-bisphosphate carboxylase/oxygenase gene cluster of Methylococcus capsulatus (Bath).

    Science.gov (United States)

    Baxter, Nardia J; Hirt, Robert P; Bodrossy, Levente; Kovacs, Kornel L; Embley, T Martin; Prosser, James I; Murrell, J Colin

    2002-04-01

    The genes encoding the ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) from Methylococcus capsulatus (Bath) were localised to an 8.3-kb EcoRI fragment of the genome. Genes encoding the large subunit (cbbL), small subunit (cbbS) and putative regulatory gene (cbbQ) were shown to be located on one cluster. Surprisingly, cbbO, a second putative regulatory gene, was not located in the remaining 1.2 kb downstream (3') of cbbQ. However, probing of the M. capsulatus (Bath) genome with cbbO from Nitrosomonas europaea demonstrated that a cbbO homologue was contained within a separate 3.0-kb EcoRI fragment. Instead of a cbbR ORF being located upstream (5') of cbbL, there was a moxR-like ORF that was transcribed in the opposite direction to cbbL. There were three additional ORFs within the large 8.3-kb EcoRI fragment: a pyrE-like ORF, an rnr-like ORF and an incomplete ORF with no sequence similarity to any known protein. Phylogenetic analysis of cbbL from M. capsulatus (Bath) placed it within clade A of the green-type Form 1 Rubisco. cbbL was expressed in M. capsulatus (Bath) when grown with methane as a sole carbon and energy source under both copper-replete and copper-limited conditions. M. capsulatus (Bath) was capable of autotrophic growth on solid medium but not in liquid medium. Preliminary investigations suggested that other methanotrophs may also be capable of autotrophic growth. Rubisco genes were also identified, by PCR, in Methylococcus-like strains and Methylocaldum species; however, no Rubisco genes were found in Methylomicrobium album BG8, Methylomonas methanica S1, Methylomonas rubra, Methylosinus trichosporium OB3b or Methylocystis parvus OBBP.

  18. Interactions of Environmental Factors and APOA1-APOC3-APOA4-APOA5 Gene Cluster Gene Polymorphisms with Metabolic Syndrome.

    Directory of Open Access Journals (Sweden)

    Yanhua Wu

    The present study investigated the prevalence of and risk factors for metabolic syndrome (MetS). We evaluated the association between single nucleotide polymorphisms (SNPs) in the apolipoprotein APOA1/C3/A4/A5 gene cluster and the MetS risk and analyzed the interactions of environmental factors and APOA1/C3/A4/A5 gene cluster polymorphisms with MetS. A study on the prevalence of and risk factors for MetS was conducted using data from a large cross-sectional survey representative of the population of Jilin Province, situated in northeastern China. A total of 16,831 participants were randomly chosen by multistage stratified cluster sampling of residents aged from 18 to 79 years in all nine administrative areas of the province. Environmental factors associated with MetS were examined using univariate and multivariate logistic regression analyses based on the weighted sample data. A sub-sample of 1813 survey subjects who met the criteria for MetS patients and 2037 controls from this case-control study were used to evaluate the association between SNPs and MetS risk. Genomic DNA was extracted from peripheral blood lymphocytes, and SNP genotyping was determined by MALDI-TOF-MS. The associations between SNPs and MetS were examined using a case-control study design. The interactions of environmental factors and APOA1/C3/A4/A5 gene cluster polymorphisms with MetS were assessed using multivariate logistic regression analysis. The overall adjusted prevalence of MetS was 32.86% in Jilin province. The prevalence of MetS in men was 36.64%, which was significantly higher than the prevalence in women (29.66%). MetS was more common in urban areas (33.86%) than in rural areas (31.80%). The prevalence of MetS significantly increased with age (OR = 8.621, 95%CI = 6.594-11.272). Mental labor (OR = 1.098, 95%CI = 1.008-1.195), current smoking (OR = 1.259, 95%CI = 1.108-1.429), excess salt intake (OR = 1.252, 95%CI = 1.149-1.363), and a fruit and dairy intake less than 2 servings a week

  19. Genomic characterization of a new endophytic Streptomyces kebangsaanensis identifies biosynthetic pathway gene clusters for novel phenazine antibiotic production

    Directory of Open Access Journals (Sweden)

    Juwairiah Remali

    2017-11-01

    Background: Streptomyces are well known for their capability to produce many bioactive secondary metabolites with medical and industrial importance. Here we report a novel bioactive phenazine compound, 6-((2-hydroxy-4-methoxyphenoxy)carbonyl)phenazine-1-carboxylic acid (HCPCA), extracted from Streptomyces kebangsaanensis, an endophyte isolated from the ethnomedicinal Portulaca oleracea. Methods: The HCPCA chemical structure was determined using nuclear magnetic resonance spectroscopy. We conducted whole genome sequencing for the identification of the gene cluster(s) believed to be responsible for phenazine biosynthesis in order to map its corresponding pathway, in addition to bioinformatics analysis to assess the potential of S. kebangsaanensis in producing other useful secondary metabolites. Results: The S. kebangsaanensis genome comprises an 8,328,719 bp linear chromosome with high GC content (71.35%) consisting of 12 rRNA operons, 81 tRNA, and 7,558 protein coding genes. We identified 24 gene clusters involved in polyketide, nonribosomal peptide, terpene, bacteriocin, and siderophore biosynthesis, as well as a gene cluster predicted to be responsible for phenazine biosynthesis. Discussion: The HCPCA phenazine structure was hypothesized to derive from the combination of two biosynthetic pathways, phenazine-1,6-dicarboxylic acid and 4-methoxybenzene-1,2-diol, originating from the shikimic acid pathway. The identification of a biosynthesis pathway gene cluster for phenazine antibiotics might facilitate future genetic engineering design of new synthetic phenazine antibiotics. Additionally, these findings confirm the potential of S. kebangsaanensis for producing various antibiotics and secondary metabolites.

  20. Binary state pattern clustering: a digital paradigm for class and biomarker discovery in gene microarray studies of cancer.

    Science.gov (United States)

    Beattie, Bradley J; Robinson, Peter N

    2006-06-01

    Class and biomarker discovery continue to be among the preeminent goals in gene microarray studies of cancer. We have developed a new data mining technique, which we call Binary State Pattern Clustering (BSPC) that is specifically adapted for these purposes, with cancer and other categorical datasets. BSPC is capable of uncovering statistically significant sample subclasses and associated marker genes in a completely unsupervised manner. This is accomplished through the application of a digital paradigm, where the expression level of each potential marker gene is treated as being representative of its discrete functional state. Multiple genes that divide samples into states along the same boundaries form a kind of gene-cluster that has an associated sample-cluster. BSPC is an extremely fast deterministic algorithm that scales well to large datasets. Here we describe results of its application to three publicly available oligonucleotide microarray datasets. Using an alpha-level of 0.05, clusters reproducing many of the known sample classifications were identified along with associated biomarkers. In addition, a number of simulations were conducted using shuffled versions of each of the original datasets, noise-added datasets, as well as completely artificial datasets. The robustness of BSPC was compared to that of three other publicly available clustering methods: ISIS, CTWC and SAMBA. The simulations demonstrate BSPC's substantially greater noise tolerance and confirm the accuracy of our calculations of statistical significance.
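
    The "digital paradigm" described above, in which each gene's expression is reduced to a discrete state and genes that split the samples along the same boundary are grouped, can be sketched in a few lines. This is an illustration under stated assumptions, not the published BSPC algorithm: the per-gene thresholds are supplied by hand, the function names are hypothetical, and the statistical significance testing mentioned in the abstract is omitted.

```python
from collections import defaultdict


def binarize(values, threshold):
    """Map each expression value to a discrete state: 1 if above threshold, else 0."""
    return tuple(1 if v > threshold else 0 for v in values)


def canonical(pattern):
    """Treat a pattern and its complement as the same split of the samples."""
    complement = tuple(1 - b for b in pattern)
    return min(pattern, complement)


def group_genes_by_pattern(expression, thresholds):
    """Group genes whose binary state patterns divide samples along the same boundary.

    expression: dict mapping gene name -> list of expression values (one per sample).
    thresholds: dict mapping gene name -> cutoff (an assumption; chosen by hand here).
    Returns a dict mapping each shared pattern to the sorted list of genes showing it.
    """
    clusters = defaultdict(list)
    for gene, values in expression.items():
        pattern = canonical(binarize(values, thresholds[gene]))
        # A gene in the same state in every sample does not split the samples.
        if len(set(pattern)) < 2:
            continue
        clusters[pattern].append(gene)
    # Keep only patterns supported by more than one gene.
    return {p: sorted(genes) for p, genes in clusters.items() if len(genes) > 1}


# Toy data: four samples, five hypothetical genes.
expression = {
    "geneA": [1.0, 1.2, 8.0, 9.1],
    "geneB": [7.5, 8.2, 0.9, 1.1],  # mirror image of geneA: same sample split
    "geneC": [0.5, 0.4, 6.0, 7.0],
    "geneD": [5.0, 0.2, 5.5, 0.3],  # a different split, supported by one gene only
    "geneE": [3.0, 3.1, 3.2, 2.9],  # never crosses the threshold
}
thresholds = {gene: 4.0 for gene in expression}
gene_clusters = group_genes_by_pattern(expression, thresholds)
print(gene_clusters)
```

    In this toy case geneA, geneB and geneC form one gene-cluster, and the associated sample-cluster is the split between the first two and the last two samples.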

  1. Array-based gene expression, CGH and tissue data defines a 12q24 gain in neuroblastic tumors with prognostic implication

    Directory of Open Access Journals (Sweden)

    Kilpinen Sami

    2010-05-01

    Full Text Available Abstract Background Neuroblastoma has successfully served as a model system for the identification of neuroectoderm-derived oncogenes. However, in spite of various efforts, only a few clinically useful prognostic markers have been found. Here, we present a framework, which integrates DNA, RNA and tissue data to identify and prioritize genetic events that represent clinically relevant new therapeutic targets and prognostic biomarkers for neuroblastoma. Methods A single-gene resolution aCGH profiling was integrated with microarray-based gene expression profiling data to distinguish genetic copy number alterations that were strongly associated with transcriptional changes in two neuroblastoma cell lines. FISH analysis using a hotspot tumor tissue microarray of 37 paraffin-embedded neuroblastoma samples and in silico data mining for gene expression information obtained from previously published studies including up to 445 healthy nervous system samples and 123 neuroblastoma samples were used to evaluate the clinical significance and transcriptional consequences of the detected alterations and to identify subsequently activated gene(s). Results In addition to the anticipated high-level amplification and subsequent overexpression of MYCN, MEIS1, CDK4 and MDM2 oncogenes, the aCGH analysis revealed numerous other genetic alterations, including microamplifications at 2p and 12q24.11. Most interestingly, we identified and investigated the clinical relevance of a previously poorly characterized amplicon at 12q24.31. FISH analysis showed low-level gain of 12q24.31 in 14 of 33 (42%) neuroblastomas. Patients with the low-level gain had an intermediate prognosis in comparison to patients with MYCN amplification (poor prognosis) and to those with no MYCN amplification or 12q24.31 gain (good prognosis) (P = 0.001).
Using the in silico data mining approach, we identified elevated expression of five genes located at the 12q24.31 amplicon in neuroblastoma (DIABLO, ZCCHC

  2. Mouse Nkrp1-Clr gene cluster sequence and expression analyses reveal conservation of tissue-specific MHC-independent immunosurveillance.

    Science.gov (United States)

    Zhang, Qiang; Rahim, Mir Munir A; Allan, David S J; Tu, Megan M; Belanger, Simon; Abou-Samra, Elias; Ma, Jaehun; Sekhon, Harman S; Fairhead, Todd; Zein, Haggag S; Carlyle, James R; Anderson, Stephen K; Makrigiannis, Andrew P

    2012-01-01

    The Nkrp1 (Klrb1)-Clr (Clec2) genes encode a receptor-ligand system utilized by NK cells as an MHC-independent immunosurveillance strategy for innate immune responses. The related Ly49 family of MHC-I receptors displays extreme allelic polymorphism and haplotype plasticity. In contrast, previous BAC-mapping and aCGH studies in the mouse suggest the neighboring and related Nkrp1-Clr cluster is evolutionarily stable. To definitively compare the relative evolutionary rate of Nkrp1-Clr vs. Ly49 gene clusters, the Nkrp1-Clr gene clusters from two Ly49 haplotype-disparate inbred mouse strains, BALB/c and 129S6, were sequenced. Both Nkrp1-Clr gene cluster sequences are highly similar to the C57BL/6 reference sequence, displaying the same gene numbers and order, complete pseudogenes, and gene fragments. The Nkrp1-Clr clusters contain a strikingly dissimilar proportion of repetitive elements compared to the Ly49 clusters, suggesting that certain elements may be partly responsible for the highly disparate Ly49 vs. Nkrp1 evolutionary rate. Focused allelic polymorphisms were found within the Nkrp1b/d (Klrb1b), Nkrp1c (Klrb1c), and Clr-c (Clec2f) genes, suggestive of possible immune selection. Cell-type specific transcription of Nkrp1-Clr genes in a large panel of tissues/organs was determined. Clr-b (Clec2d) and Clr-g (Clec2i) showed wide expression, while other Clr genes showed more tissue-specific expression patterns. In situ hybridization revealed specific expression of various members of the Clr family in leukocytes/hematopoietic cells of immune organs, various tissue-restricted epithelial cells (including intestinal, kidney tubular, lung, and corneal progenitor epithelial cells), as well as myocytes. In summary, the Nkrp1-Clr gene cluster appears to evolve more slowly relative to the related Ly49 cluster, and likely regulates innate immunosurveillance in a tissue-specific manner.

  3. Mouse Nkrp1-Clr gene cluster sequence and expression analyses reveal conservation of tissue-specific MHC-independent immunosurveillance.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Full Text Available The Nkrp1 (Klrb1)-Clr (Clec2) genes encode a receptor-ligand system utilized by NK cells as an MHC-independent immunosurveillance strategy for innate immune responses. The related Ly49 family of MHC-I receptors displays extreme allelic polymorphism and haplotype plasticity. In contrast, previous BAC-mapping and aCGH studies in the mouse suggest the neighboring and related Nkrp1-Clr cluster is evolutionarily stable. To definitively compare the relative evolutionary rate of Nkrp1-Clr vs. Ly49 gene clusters, the Nkrp1-Clr gene clusters from two Ly49 haplotype-disparate inbred mouse strains, BALB/c and 129S6, were sequenced. Both Nkrp1-Clr gene cluster sequences are highly similar to the C57BL/6 reference sequence, displaying the same gene numbers and order, complete pseudogenes, and gene fragments. The Nkrp1-Clr clusters contain a strikingly dissimilar proportion of repetitive elements compared to the Ly49 clusters, suggesting that certain elements may be partly responsible for the highly disparate Ly49 vs. Nkrp1 evolutionary rate. Focused allelic polymorphisms were found within the Nkrp1b/d (Klrb1b), Nkrp1c (Klrb1c), and Clr-c (Clec2f) genes, suggestive of possible immune selection. Cell-type specific transcription of Nkrp1-Clr genes in a large panel of tissues/organs was determined. Clr-b (Clec2d) and Clr-g (Clec2i) showed wide expression, while other Clr genes showed more tissue-specific expression patterns. In situ hybridization revealed specific expression of various members of the Clr family in leukocytes/hematopoietic cells of immune organs, various tissue-restricted epithelial cells (including intestinal, kidney tubular, lung, and corneal progenitor epithelial cells), as well as myocytes. In summary, the Nkrp1-Clr gene cluster appears to evolve more slowly relative to the related Ly49 cluster, and likely regulates innate immunosurveillance in a tissue-specific manner.

  4. Degradation of Benzene by Pseudomonas veronii 1YdBTEX2 and 1YB2 Is Catalyzed by Enzymes Encoded in Distinct Catabolism Gene Clusters.

    Science.gov (United States)

    de Lima-Morales, Daiana; Chaves-Moreno, Diego; Wos-Oxley, Melissa L; Jáuregui, Ruy; Vilchez-Vargas, Ramiro; Pieper, Dietmar H

    2015-10-16

    Pseudomonas veronii 1YdBTEX2, a benzene and toluene degrader, and Pseudomonas veronii 1YB2, a benzene degrader, have previously been shown to be key players in a benzene-contaminated site. These strains harbor unique catabolic pathways for the degradation of benzene comprising a gene cluster encoding an isopropylbenzene dioxygenase where genes encoding downstream enzymes were interrupted by stop codons. Extradiol dioxygenases were recruited from gene clusters comprising genes encoding a 2-hydroxymuconic semialdehyde dehydrogenase necessary for benzene degradation but typically absent from isopropylbenzene dioxygenase-encoding gene clusters. The benzene dihydrodiol dehydrogenase-encoding gene was not clustered with any other aromatic degradation genes, and the encoded protein was only distantly related to dehydrogenases of aromatic degradation pathways. The involvement of the different gene clusters in the degradation pathways was suggested by real-time quantitative reverse transcription PCR. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  5. Identification of the Regulator Gene Responsible for the Acetone-Responsive Expression of the Binuclear Iron Monooxygenase Gene Cluster in Mycobacteria

    Science.gov (United States)

    Furuya, Toshiki; Hirose, Satomi; Semba, Hisashi; Kino, Kuniki

    2011-01-01

    The mimABCD gene cluster encodes the binuclear iron monooxygenase that oxidizes propane and phenol in Mycobacterium smegmatis strain MC2 155 and Mycobacterium goodii strain 12523. Interestingly, expression of the mimABCD gene cluster is induced by acetone. In this study, we investigated the regulator gene responsible for this acetone-responsive expression. In the genome sequence of M. smegmatis strain MC2 155, the mimABCD gene cluster is preceded by a gene designated mimR, which is divergently transcribed. Sequence analysis revealed that MimR exhibits amino acid similarity with the NtrC family of transcriptional activators, including AcxR and AcoR, which are involved in acetone and acetoin metabolism, respectively. Unexpectedly, many homologs of the mimR gene were also found in the sequenced genomes of actinomycetes. A plasmid carrying a transcriptional fusion of the intergenic region between the mimR and mimA genes with a promoterless green fluorescent protein (GFP) gene was constructed and introduced into M. smegmatis strain MC2 155. Using a GFP reporter system, we confirmed by deletion and complementation analyses that the mimR gene product is the positive regulator of the mimABCD gene cluster expression that is responsive to acetone. M. goodii strain 12523 also utilized the same regulatory system as M. smegmatis strain MC2 155. Although transcriptional activators of the NtrC family generally control transcription using the σ54 factor, a gene encoding the σ54 factor was absent from the genome sequence of M. smegmatis strain MC2 155. These results suggest the presence of a novel regulatory system in actinomycetes, including mycobacteria. PMID:21856847

  6. Crg, a gene required for Ur-3-mediated rust resistance in common bean, maps to a resistance gene analog cluster.

    Science.gov (United States)

    Kalavacharia, V; Stavely, J R; Myers, J R; McClean, P E

    2000-11-01

    Race-specific resistance to the bean rust pathogen (Uromyces appendiculatus) is provided by a number of loci in common bean (Phaseolus vulgaris). The Ur-3 locus controls hypersensitive resistance (HR) to 44 of the 89 races curated in the United States. To better understand resistance mediated by this locus, we developed new genetic material for analysis. We developed a population of mutagenized seed of cv. Sierra (genotype = Ur-3 ur-4 ur-6) that was screened with a bean rust race that is normally incompatible (HR response) on Ur-3 genotypes. We discovered two mutants of common bean, crg and ur3-delta3, in which uredinia formed on leaves (a compatible interaction) following infection. The F1 generation from a cross of these two mutants expressed the HR response, and the F2 generation segregated in a ratio of 9:7 (HR/uredinia formation). Therefore, the two genes are unlinked. Further genetic analysis determined that the mutation in ur3-delta3 was in the Ur-3 locus, and the mutation in crg was in a newly discovered gene given the symbol Crg (Complements resistance gene). Each mutation was inherited in a recessive manner. Unlike ur3-delta3, crg expressed reduced compatibility to bean rust races 49 and 47 that are normally fully compatible on genotypes, such as Sierra, that are homozygous recessive at the Ur-4 and Ur-6 loci. This suggests a gene mutated in crg is normally a positive compatibility factor for the bean-bean rust interaction. Polymerase chain reaction analysis of crg with primers to common bean resistance gene analogs (RGA) that contain a nucleotide-binding site sequence similar to those found in a number of plant disease resistance genes revealed that crg is missing the SB1 RGA, but not the linked SB3 and SB5 RGAs. Genetic analyses revealed that Crg cosegregates with the SB1 RGA. These results demonstrate that Crg is located near a RGA cluster in the common bean genome.
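
    The 9:7 F2 ratio reported above is what two unlinked loci produce when a functional allele is needed at both of them for the HR response. A minimal sketch, assuming simple Mendelian segregation with complete dominance; the allele symbols U/u and C/c are illustrative stand-ins for the Ur-3 and Crg loci, not notation from the study:

```python
from itertools import product


def f2_phenotype_counts():
    """Enumerate the 16 equally likely gamete pairings from a U/u ; C/c double heterozygote.

    A plant is scored HR (resistant) only if it carries at least one functional
    (uppercase) allele at BOTH loci; otherwise uredinia form.
    """
    gametes = [u + c for u, c in product("Uu", "Cc")]  # UC, Uc, uC, uc
    hr = uredinia = 0
    for egg, pollen in product(gametes, repeat=2):
        has_u = "U" in (egg[0], pollen[0])
        has_c = "C" in (egg[1], pollen[1])
        if has_u and has_c:
            hr += 1
        else:
            uredinia += 1
    return hr, uredinia


counts = f2_phenotype_counts()
print(counts)  # (9, 7)
```

    Linked loci would not assort independently, so the four gamete types would no longer be equally frequent and the ratio would depart from 9:7.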

  7. Prognostic Significance of Diffuse Large B-Cell Lymphoma Cell of Origin Determined by Digital Gene Expression in Formalin-Fixed Paraffin-Embedded Tissue Biopsies

    Science.gov (United States)

    Scott, David W.; Mottok, Anja; Ennishi, Daisuke; Wright, George W.; Farinha, Pedro; Ben-Neriah, Susana; Kridel, Robert; Barry, Garrett S.; Hother, Christoffer; Abrisqueta, Pau; Boyle, Merrill; Meissner, Barbara; Telenius, Adele; Savage, Kerry J.; Sehn, Laurie H.; Slack, Graham W.; Steidl, Christian; Staudt, Louis M.; Connors, Joseph M.; Rimsza, Lisa M.; Gascoyne, Randy D.

    2015-01-01

    Purpose To evaluate the prognostic impact of cell-of-origin (COO) subgroups, assigned using the recently described gene expression–based Lymph2Cx assay in comparison with International Prognostic Index (IPI) score and MYC/BCL2 coexpression status (dual expressers). Patients and Methods Reproducibility of COO assignment using the Lymph2Cx assay was tested employing repeated sampling within tumor biopsies and changes in reagent lots. The assay was then applied to pretreatment formalin-fixed paraffin-embedded tissue (FFPET) biopsies from 344 patients with de novo diffuse large B-cell lymphoma (DLBCL) uniformly treated with rituximab plus cyclophosphamide, doxorubicin, vincristine, and prednisone (R-CHOP) at the British Columbia Cancer Agency. MYC and BCL2 protein expression was assessed using immunohistochemistry on tissue microarrays. Results The Lymph2Cx assay provided concordant COO calls in 96% of 49 repeatedly sampled tumor biopsies and in 100% of 83 FFPET biopsies tested across reagent lots. Critically, no frank misclassification (activated B-cell–like DLBCL to germinal center B-cell–like DLBCL or vice versa) was observed. Patients with activated B-cell–like DLBCL had significantly inferior outcomes compared with patients with germinal center B-cell–like DLBCL (log-rank P < .001 for time to progression, progression-free survival, disease-specific survival, and overall survival). In pairwise multivariable analyses, COO was associated with outcomes independent of IPI score and MYC/BCL2 immunohistochemistry. The prognostic significance of COO was particularly evident in patients with intermediate IPI scores and the non–MYC-positive/BCL2-positive subgroup (log-rank P < .001 for time to progression). Conclusion Assignment of DLBCL COO by the Lymph2Cx assay using FFPET biopsies identifies patient groups with significantly different outcomes after R-CHOP, independent of IPI score and MYC/BCL2 dual expression. PMID:26240231

  8. [Hierarchical clustering analysis to detect associations between clinical and pathological features of gastric tumors and hypermethylation of suppressor genes].

    Science.gov (United States)

    Zavala G, Luis; Luengo J, Víctor; Ossandón C, Francisco; Riquelme S, Erick; Backhouse E, Claudia; Palma V, Mariana; Argandoña C, Jorge; Cumsille, Miguel Angel; Corvalán R, Alejandro

    2007-01-01

    Methylation is an inactivation mechanism for tumor suppressor genes that can have important clinical implications. The aim was to analyze the methylation status of 11 tumor suppressor genes in pathological samples of diffuse gastric cancer. Eighty-three patients with diffuse gastric cancer, with information about survival and Epstein-Barr virus infection, were studied. DNA was extracted from pathological slides, and the methylation status of the genes p14, p15, p16, APC, p73, FHIT, E-cadherin, SEMA3B, BRCA-1, MINT-2, and MGMT was studied using sodium bisulphite modification and polymerase chain reaction. Results were grouped according to the methylation index or by hierarchical clustering (TIGR MultiExperiment Viewer). Three genes had a high frequency of methylation (FHIT, BRCA1, APC), four had an intermediate frequency (p15, MGMT, p14, MINT2), and four had a low frequency (p16, p73, E-cadherin, SEMA3B). The methylation index had no association with clinical or pathological features of tumors or with patient survival. Hierarchical clustering generated two clusters. One grouped clinical and pathological features with FHIT, BRCA1, and APC, and the other grouped the other eight genes and Epstein-Barr virus infection. Two significant associations were found: between APC and survival, and between p16/p14 and Epstein-Barr virus infection. Hierarchical clustering is a tool that identifies associations between clinical and pathological features of tumors and methylation of tumor suppressor genes.
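
    The quantities behind the two groupings can be sketched briefly. The methylation index is assumed here to be the fraction of analyzed genes that are methylated in a sample (the abstract does not define it), and the 0/1 calls below are invented for illustration; they are not study data. The Hamming distance between gene profiles is one plausible input to a hierarchical clustering routine, not necessarily the metric used in the study.

```python
def methylation_index(calls):
    """Fraction of genes called methylated (1) in one sample."""
    return sum(calls.values()) / len(calls)


def gene_frequency(samples, gene):
    """Fraction of samples in which `gene` is methylated."""
    return sum(s[gene] for s in samples) / len(samples)


def hamming(samples, gene_a, gene_b):
    """Number of samples in which two genes have different methylation calls."""
    return sum(s[gene_a] != s[gene_b] for s in samples)


# Invented calls for four samples and three of the genes named above.
samples = [
    {"FHIT": 1, "APC": 1, "p16": 0},
    {"FHIT": 1, "APC": 1, "p16": 0},
    {"FHIT": 1, "APC": 0, "p16": 1},
    {"FHIT": 0, "APC": 0, "p16": 0},
]

indices = [methylation_index(s) for s in samples]
fhit_freq = gene_frequency(samples, "FHIT")
d_fhit_apc = hamming(samples, "FHIT", "APC")  # genes with similar profiles are close
d_fhit_p16 = hamming(samples, "FHIT", "p16")  # genes with dissimilar profiles are far
```

    An agglomerative procedure would merge the closest gene pairs first, so in this toy case FHIT and APC would join before either joins p16.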

  9. Regulatory role of tetR gene in a novel gene cluster of Acidovorax avenae subsp. avenae RS-1 under oxidative stress

    Directory of Open Access Journals (Sweden)

    He Liu

    2014-10-01

    Full Text Available Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay (EMSA) demonstrated that TetR regulator bound directly to the promoter of this gene cluster. Consistently, the results of quantitative real-time PCR also showed alterations in expression of associated genes. Moreover, the proteins affected by TetR under oxidative stress were revealed by comparing proteomic profiles of wild-type and mutant strains via 1D SDS-PAGE and LC-MS/MS analyses. Taken together, our results demonstrated that tetR gene in this novel gene cluster contributed to cell survival under oxidative stress, and TetR protein played an important regulatory role in growth kinetics, biofilm-forming capability, SOD and catalase activity, and oxide detoxicating ability.

  10. PCA Based on Graph Laplacian Regularization and P-Norm for Gene Selection and Clustering.

    Science.gov (United States)

    Feng, Chun-Mei; Gao, Ying-Lian; Liu, Jin-Xing; Zheng, Chun-Hou; Yu, Jiguo

    2017-06-01

    In modern molecular biology, identifying characteristic genes from gene expression data is both a research hotspot and a difficulty. Traditional principal component analysis (PCA), a matrix decomposition method based on minimizing reconstruction error, uses a quadratic error function, which is known to be sensitive to outliers and noise. Hence, a PCA method that remains reliable when outliers and noise exist is needed. In this paper, we develop a novel PCA method, called PgLPCA, that enforces a P-norm on the error function and adds a graph-Laplacian regularization term to the matrix decomposition problem. The heart of the method, designed to reduce the influence of outliers and noise, is a new error function based on the non-convex proximal P-norm. In addition, the Laplacian regularization term is used to find the internal geometric structure in the data representation. To solve the minimization problem, we develop an efficient optimization algorithm based on the augmented Lagrange multiplier method. This method is used to select characteristic genes and cluster the samples from explosive biological data, and it achieves higher accuracy than the compared methods.
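
    The graph-Laplacian regularization term can be illustrated separately from the full method. The sketch below is not PgLPCA: it has no P-norm error term and no augmented Lagrange multiplier solver. It only builds an unnormalized Laplacian L = D - W from a hypothetical sample-similarity matrix W and evaluates the smoothness penalty q^T L q for a one-dimensional embedding q of the samples. For symmetric W this penalty equals the sum over edges of W_ij (q_i - q_j)^2.

```python
def graph_laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    return [
        [(degree[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def smoothness(laplacian, q):
    """Quadratic form q^T L q; small when strongly connected samples get similar values."""
    n = len(q)
    return sum(q[i] * laplacian[i][j] * q[j] for i in range(n) for j in range(n))


# Hypothetical similarity graph: samples 0-1 and 2-3 are strongly linked, 1-2 weakly.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
L = graph_laplacian(W)

respects_graph = smoothness(L, [0.0, 0.0, 1.0, 1.0])  # only the weak edge is cut
ignores_graph = smoothness(L, [0.0, 1.0, 0.0, 1.0])   # every edge is cut
```

    The embedding that keeps linked samples together pays a penalty of about 0.1, against about 2.1 for the one that separates them, which is how such a term steers a decomposition toward the geometric structure of the data.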

  11. Unsupervised clustering of gene expression data points at hypoxia as possible trigger for metabolic syndrome

    Directory of Open Access Journals (Sweden)

    York David

    2006-12-01

    Full Text Available Abstract Background Classification of large volumes of data produced in a microarray experiment allows for the extraction of important clues as to the nature of a disease. Results Using multi-dimensional unsupervised FOREL (FORmal ELement) algorithm we have re-analyzed three public datasets of skeletal muscle gene expression in connection with insulin resistance and type 2 diabetes (DM2). Our analysis revealed the major line of variation between expression profiles of normal, insulin resistant, and diabetic skeletal muscle. A cluster of most "metabolically sound" samples occupied one end of this line. The distance along this line coincided with the classic markers of diabetes risk, namely obesity and insulin resistance, but did not follow the accepted clinical diagnosis of DM2 as defined by the presence or absence of hyperglycemia. Genes implicated in this expression pattern are those controlling skeletal muscle fiber type and glycolytic metabolism. Additionally myoglobin and hemoglobin were upregulated and ribosomal genes deregulated in insulin resistant patients. Conclusion Our findings are concordant with the changes seen in skeletal muscle with altitude hypoxia. This suggests that hypoxia and shift to glycolytic metabolism may also drive insulin resistance.
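
    For readers unfamiliar with FOREL, a minimal sketch of the basic procedure as it is commonly described follows. The radius, the rule of starting at the first unassigned point, the iteration cap and the toy profiles are all assumptions, and the multi-dimensional variant used in the study may differ. Each point stands for one sample's expression profile.

```python
from math import dist


def forel(points, radius, max_iter=100):
    """Cluster points with a basic FOREL loop.

    Start a sphere of the given radius at the first unassigned point, move its
    center to the centroid of the points it contains until the center is stable,
    then remove those points as one cluster and repeat on the rest.
    """
    remaining = [tuple(p) for p in points]
    clusters = []
    while remaining:
        center = remaining[0]
        inside = [center]
        for _ in range(max_iter):
            members = [p for p in remaining if dist(p, center) <= radius]
            if not members:  # sphere drifted off the data; keep the last non-empty set
                break
            inside = members
            centroid = tuple(sum(c) / len(inside) for c in zip(*inside))
            if centroid == center:
                break
            center = centroid
        clusters.append(inside)
        remaining = [p for p in remaining if p not in inside]
    return clusters


# Hypothetical two-gene expression profiles for six samples.
profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = forel(profiles, radius=2.0)
```

    With these well-separated toy profiles the loop returns two clusters of three samples each. On real data the result depends on the chosen radius and on the order in which starting points are visited.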

  12. A WDR Gene Is a Conserved Member of a Chitin Synthase Gene Cluster and Influences the Cell Wall in Aspergillus nidulans

    Directory of Open Access Journals (Sweden)

    Gea Guerriero

    2016-06-01

    Full Text Available WD40 repeat (WDR) proteins are pleiotropic molecular hubs. We identify a WDR gene that is a conserved genomic neighbor of a chitin synthase gene in Ascomycetes. The WDR gene is unique to fungi and plants, and was called Fungal Plant WD (FPWD). FPWD is within a cell wall metabolism gene cluster in the Ascomycetes (Pezizomycotina) comprising chsD, a Chs activator and a GH17 glucanase. The FPWD, AN1556.2 locus was deleted in Aspergillus nidulans strain SAA.111 by gene replacement and only heterokaryon transformants were obtained. The re-annotation of Aspergilli genomes shows that AN1556.2 consists of two tightly linked separate genes, i.e., the WDR gene and a putative beta-flanking gene of unknown function. The WDR and the beta-flanking genes are conserved genomic neighbors localized within a recently identified metabolic cell wall gene cluster in genomes of Aspergilli. The heterokaryons displayed increased susceptibility to drugs affecting the cell wall, and their phenotypes, observed by optical, confocal, scanning electron and atomic force microscopy, suggest cell wall alterations. Quantitative real-time PCR shows altered expression of some cell wall-related genes. The possible implications on cell wall biosynthesis are discussed.

  13. Evolution of C2H2-zinc finger genes and subfamilies in mammals: Species-specific duplication and loss of clusters, genes and effector domains

    Directory of Open Access Journals (Sweden)

    Aubry Muriel

    2008-06-01

    Full Text Available Abstract Background C2H2 zinc finger genes (C2H2-ZNF) constitute the largest class of transcription factors in humans and one of the largest gene families in mammals. Often arranged in clusters in the genome, these genes are thought to have undergone a massive expansion in vertebrates, primarily by tandem duplication. However, this view is based on limited datasets restricted to a single chromosome or a specific subset of genes belonging to the large KRAB domain-containing C2H2-ZNF subfamily. Results Here, we present the first comprehensive study of the evolution of the C2H2-ZNF family in mammals. We assembled the complete repertoire of human C2H2-ZNF genes (718 in total), about 70% of which are organized into 81 clusters across all chromosomes. Based on an analysis of their N-terminal effector domains, we identified two new C2H2-ZNF subfamilies encoding genes with a SET or a HOMEO domain. We searched for the syntenic counterparts of the human clusters in other mammals for which complete gene data are available: chimpanzee, mouse, rat and dog. Cross-species comparisons show a large variation in the numbers of C2H2-ZNF genes within homologous mammalian clusters, suggesting differential patterns of evolution. Phylogenetic analysis of selected clusters reveals that the disparity in C2H2-ZNF gene repertoires across mammals not only originates from differential gene duplication but also from gene loss. Further, we discovered variations among orthologs in the number of zinc finger motifs and association of the effector domains, the latter often undergoing sequence degeneration. Combined with phylogenetic studies, physical maps and an analysis of the exon-intron organization of genes from the SCAN and KRAB domains-containing subfamilies, this result suggests that the SCAN subfamily emerged first, followed by the SCAN-KRAB and finally by the KRAB subfamily.
Conclusion Our results are in agreement with the "birth and death hypothesis" for the evolution of

  14. Expression of preoperative KISS1 gene in tumor tissue with epithelial ovarian cancer and its prognostic value.

    Science.gov (United States)

    Cao, Fang; Chen, Liping; Liu, Manhua; Lin, Weiwei; Ji, Jinlong; You, Jun; Qiao, Fenghai; Liu, Hongbin

    2016-11-01

    Our study aimed to elucidate the role of Kisspeptin (KISS1) in tumor tissues of patients with epithelial ovarian cancer (EOC) and investigate the prognostic value of this biomarker. Forty EOC patients and 20 uterine fibroids female patients with healthy ovaries undergoing cytoreductive surgery between January 2010 and January 2014 in our hospital were enrolled in this study. KISS1 expression in tumor and normal tissues was detected. Correlations between clinic-pathologic variables and KISS1 expression in EOC tissues and the prognostic value of KISS1 for overall survival were evaluated. During the follow-up of 11.2 to 62.1 months, the overall survival rate and mean survival time were 28.9% (11/38) and 38.35 ± 2.84 months. Preoperative KISS1 mRNA was higher in tumor tissue than in normal tissue (P < 0.001), and it was associated with histologic grade of tumor, surgical FIGO stage, metastasis, and residual tumor size (all P < 0.05). Multivariate survival analysis indicated significant influence of residual tumor size (HR = 2.357, P = 0.039) and preoperative KISS1 mRNA (HR = 0.0001, P < 0.001) on mean survival time. Patients with low KISS1 mRNA expression had shorter survival time than those with high expression (P = 0.001). Preoperative KISS1 mRNA was a potential prognostic biomarker for EOC, and high preoperative KISS1 expression indicated a favorable prognosis.

  15. The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines.

    Directory of Open Access Journals (Sweden)

    Julian Dopstadt

    Full Text Available Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster.

  16. Anticipating the clinical use of prognostic gene expression-based tests for colon cancer stage II and III: is Godot finally arriving?

    Science.gov (United States)

    Sveen, Anita; Nesbakken, Arild; Ågesen, Trude H; Guren, Marianne G; Tveit, Kjell M; Skotheim, Rolf I; Lothe, Ragnhild A

    2013-12-15

    According to current recommendations for adjuvant treatment, patients with colon cancer stage II are not routinely offered chemotherapy, unless considered to have a high risk of relapse based on specific clinicopathological parameters. Following these criteria, it is challenging to identify the subgroup of patients that will benefit the most from adjuvant treatment. Contrarily, patients with colon cancer stage III are routinely offered chemotherapy, but due to expected adverse effects and frailty, elderly patients are often excluded from standard protocols. Colon cancer is a disease of the elderly and accordingly, there is a large subgroup of patients for which guidelines for adjuvant treatment remain less clear. In these two clinical settings, improved risk stratification has great potential impact on patient care, anticipating that high-risk patients will benefit from chemotherapy. However, microsatellite instability is the only molecular prognostic marker recommended for clinical use. In this perspective, we provide an updated view on the status and clinical potential of the many proposed prognostic gene expression-based tests for colon cancer stage II and III. The main limitation for clinical implementation is lack of prospective validation. For patients with stage II, highly promising tests have been identified and clinical trials are ongoing. For elderly patients with stage III, the value of such tests has received less focus, but promising early results have been shown. Although awaiting results from prospective trials, improved risk assessment for patients with stage II and III is likely to be achieved in the foreseeable future. ©2013 AACR.

  17. Sexuality generates diversity in the aflatoxin gene cluster: evidence on a global scale.

    Directory of Open Access Journals (Sweden)

    Geromy G Moore

    Full Text Available Aflatoxins are produced by Aspergillus flavus and A. parasiticus in oil-rich seed and grain crops and are a serious problem in agriculture, with aflatoxin B₁ being the most carcinogenic natural compound known. Sexual reproduction in these species occurs between individuals belonging to different vegetative compatibility groups (VCGs). We examined natural genetic variation in 758 isolates of A. flavus, A. parasiticus and A. minisclerotigenes sampled from single peanut fields in the United States (Georgia), Africa (Benin), Argentina (Córdoba), Australia (Queensland) and India (Karnataka). Analysis of DNA sequence variation across multiple intergenic regions in the aflatoxin gene clusters of A. flavus, A. parasiticus and A. minisclerotigenes revealed significant linkage disequilibrium (LD) organized into distinct blocks that are conserved across different localities, suggesting that genetic recombination is nonrandom and a global occurrence. To assess the contributions of asexual and sexual reproduction to fixation and maintenance of toxin chemotype diversity in populations from each locality/species, we tested the null hypothesis of an equal number of MAT1-1 and MAT1-2 mating-type individuals, which is indicative of a sexually recombining population. All samples were clone-corrected using multi-locus sequence typing which associates closely with VCG. For both A. flavus and A. parasiticus, when the proportions of MAT1-1 and MAT1-2 were significantly different, there was more extensive LD in the aflatoxin cluster and populations were fixed for specific toxin chemotype classes, either the non-aflatoxigenic class in A. flavus or the B₁-dominant and G₁-dominant classes in A. parasiticus. A mating type ratio close to 1∶1 in A. flavus, A. parasiticus and A. minisclerotigenes was associated with higher recombination rates in the aflatoxin cluster and less pronounced chemotype differences in populations. This work shows that the reproductive nature of

  18. A systematic computational analysis of biosynthetic gene cluster evolution: lessons for engineering biosynthesis.

    Directory of Open Access Journals (Sweden)

    Marnix H Medema

    2014-12-01

    Bacterial secondary metabolites are widely used as antibiotics, anticancer drugs, insecticides and food additives. Attempts to engineer their biosynthetic gene clusters (BGCs) to produce unnatural metabolites with improved properties are often frustrated by the unpredictability and complexity of the enzymes that synthesize these molecules, suggesting that genetic changes within BGCs are limited by specific constraints. Here, by performing a systematic computational analysis of BGC evolution, we derive evidence for three findings that shed light on the ways in which, despite these constraints, nature successfully invents new molecules: 1) BGCs for complex molecules often evolve through the successive merger of smaller sub-clusters, which function as independent evolutionary entities. 2) An important subset of polyketide synthases and nonribosomal peptide synthetases evolve by concerted evolution, which generates sets of sequence-homogenized domains that may hold promise for engineering efforts since they exhibit a high degree of functional interoperability. 3) Individual BGC families evolve in distinct ways, suggesting that design strategies should take into account family-specific functional constraints. These findings suggest novel strategies for using synthetic biology to rationally engineer biosynthetic pathways.

  19. A systematic computational analysis of biosynthetic gene cluster evolution: lessons for engineering biosynthesis.

    Science.gov (United States)

    Medema, Marnix H; Cimermancic, Peter; Sali, Andrej; Takano, Eriko; Fischbach, Michael A

    2014-12-01

    Bacterial secondary metabolites are widely used as antibiotics, anticancer drugs, insecticides and food additives. Attempts to engineer their biosynthetic gene clusters (BGCs) to produce unnatural metabolites with improved properties are often frustrated by the unpredictability and complexity of the enzymes that synthesize these molecules, suggesting that genetic changes within BGCs are limited by specific constraints. Here, by performing a systematic computational analysis of BGC evolution, we derive evidence for three findings that shed light on the ways in which, despite these constraints, nature successfully invents new molecules: 1) BGCs for complex molecules often evolve through the successive merger of smaller sub-clusters, which function as independent evolutionary entities. 2) An important subset of polyketide synthases and nonribosomal peptide synthetases evolve by concerted evolution, which generates sets of sequence-homogenized domains that may hold promise for engineering efforts since they exhibit a high degree of functional interoperability. 3) Individual BGC families evolve in distinct ways, suggesting that design strategies should take into account family-specific functional constraints. These findings suggest novel strategies for using synthetic biology to rationally engineer biosynthetic pathways.

  20. Biosynthetic gene clusters for relevant secondary metabolites produced by Penicillium roqueforti in blue cheeses.

    Science.gov (United States)

    García-Estrada, Carlos; Martín, Juan-Francisco

    2016-10-01

    Ripening of blue-veined cheeses, such as the French Bleu and Roquefort, the Italian Gorgonzola, the English Stilton, the Danish Danablu or the Spanish Cabrales, Picón Bejes-Tresviso, and Valdeón, requires the growth and enzymatic activity of the mold Penicillium roqueforti, which is responsible for the characteristic texture, blue-green spots, and aroma of these types of cheeses. This filamentous fungus is able to synthesize different secondary metabolites, including andrastins, mycophenolic acid, and several mycotoxins, such as roquefortines C and D, PR-toxin and eremofortins, isofumigaclavines A and B, and festuclavine. This review provides a detailed description of the main secondary metabolites produced by P. roqueforti in blue cheese, giving a special emphasis to roquefortine, PR-toxin and mycophenolic acid, and their biosynthetic gene clusters and pathways. The knowledge of these clusters and secondary metabolism pathways, together with the ability of P. roqueforti to produce beneficial secondary metabolites, is of interest for commercial purposes.

  1. Clustered Mutation Signatures Reveal that Error-Prone DNA Repair Targets Mutations to Active Genes.

    Science.gov (United States)

    Supek, Fran; Lehner, Ben

    2017-07-27

    Many processes can cause the same nucleotide change in a genome, making the identification of the mechanisms causing mutations a difficult challenge. Here, we show that clustered mutations provide a more precise fingerprint of mutagenic processes. Of nine clustered mutation signatures identified from >1,000 tumor genomes, three relate to variable APOBEC activity and three are associated with tobacco smoking. An additional signature matches the spectrum of translesion DNA polymerase eta (POLH). In lymphoid cells, these mutations target promoters, consistent with AID-initiated somatic hypermutation. In solid tumors, however, they are associated with UV exposure and alcohol consumption and target the H3K36me3 chromatin of active genes in a mismatch repair (MMR)-dependent manner. These regions normally have a low mutation rate because error-free MMR also targets H3K36me3 chromatin. Carcinogens and error-prone repair therefore redistribute mutations to the more important regions of the genome, contributing a substantial mutation load in many tumors, including driver mutations.

  2. Neandertal origin of genetic variation at the cluster of OAS immunity genes.

    Science.gov (United States)

    Mendez, Fernando L; Watkins, Joseph C; Hammer, Michael F

    2013-04-01

    Analyses of ancient DNA from extinct humans reveal signals of at least two independent hybridization events in the history of non-African populations. To date, there are very few examples of specific genetic variants that have been rigorously identified as introgressive. Here, we survey DNA sequence variation in the OAS gene cluster on chromosome 12 and provide strong evidence that a haplotype extending for ~185 kb introgressed from Neandertals. This haplotype is nearly restricted to Eurasians and is estimated to have diverged from the Neandertal sequence ~125 kya. Despite the potential for novel functional variation, the observed frequency of this haplotype is consistent with neutral introgression. This is the second locus in the human genome, after STAT2, carrying distinct haplotypes that appear to have introgressed separately from both Neandertals and Denisova.

  3. The 987P gene cluster in enterotoxigenic Escherichia coli contains an STpa transposon that activates 987P expression.

    Science.gov (United States)

    Klaasen, P; Woodward, M J; van Zijderveld, F G; de Graaf, F K

    1990-03-01

    The genetic determinant for the production of 987P fimbriae has been cloned into pBR322. Analysis of frequently occurring deletions in the resultant recombinant plasmid, pPK180, revealed that the 987P gene cluster contains a transposon that encodes the synthesis of heat-stable enterotoxin STpa and is flanked by inverted repeats of IS1. Hybridization experiments with STpa- and 987P-specific probes demonstrated that a variety of STpa+ 987P+ wild-type Escherichia coli strains contained contiguous STpa-987P DNA, most likely on their chromosome. Transcription of the 987P gene cluster appeared to be activated by the adjacent IS1 element.

  4. Analysis of healthy cohorts for single nucleotide polymorphisms in C1q gene cluster

    Directory of Open Access Journals (Sweden)

    Maria A. Radanova

    2015-12-01

    C1q is the first component of the classical pathway of complement activation. The coding region for C1q is localized on chromosome 1p34.1–36.3. Mutations or single nucleotide polymorphisms (SNPs) in the C1q gene cluster can cause the development of systemic lupus erythematosus (SLE) because of C1q deficiency or for other unknown reasons. We selected five SNPs located in a 7.121 kbp region on chromosome 1, which were previously associated with SLE and/or low C1q level but do not cause C1q deficiency, and analyzed them in terms of allele frequencies and genotype distribution in comparison with Hispanic, Asian, African and other Caucasian cohorts. These SNPs were: rs587585, rs292001, rs172378, rs294179 and rs631090. One hundred eighty-five healthy Bulgarian volunteers were genotyped for the selected five C1q SNPs by quantitative real-time PCR methods. The International HapMap Project was used for information about genotype distribution and allele frequencies of the five SNPs in Hispanics, Asians, Africans